[Pitch] Constrained Reference Types

Reference types (classes and actors) provide memory-safe heap allocations which

  • can be cheaper to pass as only a reference is passed rather than a copy of the value
  • can be recursive, and
  • can be bound to multiple variables with only a single in-memory value

The semantics of classes and actors today have a set of drawbacks:

  • All references provide mutation (although Actor is mutually exclusive)
    • Can't reason about where mutations come from
    • Can't reason about values being fixed
  • All references are reference counted
    • non-allocating/locking code can't use the references as they may be holding the final reference
    • performance overheads

This pitch suggests adding a new set of reference types with a common declaration, referred to here as box, which may use the same underlying representation as classes and actors but which can be constrained by the type system to reason about its use.

The reference types all refer to a single declaration and each provides different semantics for copying and mutating the referenced value.

Use-cases

  • Data structures which have a set-up procedure but which are fixed once they've been fully initialised
  • Immutable data structures, such as trees, which describe hierarchies with references to distinct states that share common unchanged subtrees

Explanation

The declaration for the type would have four associated types that each distinctly constrain mutability and uniqueness:

  • A mutable reference to a box has only a single value and can be mutated
    • it is ~Copyable and deinitialises the box when it goes out of scope
  • A unique reference to a box has only a single value but is immutable
    • it is ~Copyable and deinitialises the box when it goes out of scope
  • A shared reference to a box may have multiple references but is immutable
    • the box is reference counted
    • the box is deinitialised when the last reference is dropped
  • A copy on write reference to a box may have multiple references and is mutable
    • the declaration of the box must not be ~Copyable or contain ~Copyable stored properties
    • when the box is mutated via the reference it is first ensured to be unique, creating a copy of the underlying value if necessary

When mutating a box, the reference must be bound to a mutable var (as with value types like struct and enum), so the mutability is transitive in nested structures in the same manner as value types.

Declaration

If T is declared as follows

box T { /* ... */ }

Then the references above written as mutable T, unique T, shared T (or simply T), and cow T.

The self in any initialiser for T would probably need to be the mutable variant.

If the declaration is ~Copyable, i.e.:

box T: ~Copyable { /* ... */ }

Then cow T is not permitted as it requires the data structure to be copied on some mutations.

Casting

Several implicit casts are permitted between the reference types

From mutable unique shared cow
mutable Yes Yes If Copyable
unique No Yes If Copyable
shared No If Copyable If Copyable
cow No Yes Yes

Note: the copy-on-write variant would create a new unique copy if required to satisfy the cast requirements

Conversion

To support runtime-checked conversion to a unique value, the following could be provided

asUnique(
    _ value: consuming shared Foo
) -> Result<unique Foo, shared Foo>

And to provide an escape hatch for mutation, the following could also be provided

asMutable(
    _ value: consuming unique Foo
) -> mutable Foo

Representation

Rather than use distinct representations for each type, the same in-memory representation can be used for all (with the required header space for reference counting). This would mean that the same value and implementation can be used across all of them and that the casts are free (except for when copies occur for copy-on-write references).

4 Likes

Rather than introduce a new declaration, this could also just be modifiers to references to existing declarations of class.

1 Like

Isn't reifying copy-on-write into the language a bit too much? The standard library gets away with existing tools just fine.

Overall, I'd personally prefer to have a set of tools instead that would allow to express already existing reference concept as their own fully-fledged value types. I imagine it'd only require a way to delegate interfaces (similar to rust's deref) and an API to ARI over UnsafeMutablePointer

Honestly, if we ever get around to adding indirect for structs, and support for indirect for non-copyable types, these can all just be library types.

Hell, you can already implement these with classes, it just isn't as nice.

final class Box<T: ~Copyable> {
  public var value: T
  public init(_ value: consuming T) {
    self.value = value
  }
}

// The T: Copyable is redundant, but I think it's more clear.
extension Box where T: Copyable {
  func clone() -> Self {
    return Box(value)
  }
}

struct Mutable<T: ~Copyable>: ~Copyable {
  private var box: Box<T>
  public init(_ value: consuming T) {
    box = Box(value)
  }
  public var value: T {
    _read {
       yield box.value
    }
    _modify {
      yield &box.value
    }
  }
}

struct Shared<T: ~Copyable> {
  private var box: Box<T>
  public init(_ value: consuming T) {
    box = Box(value)
  }
  public var value: T {
    _read {
      yield box.value
    }
  }
}

struct Unique<T: ~Copyable>: ~Copyable {
  private var box: Box<T>
  public init(_ value: consuming T) {
    box = Box(value)
  }
  public var value: T {
    _read {
      yield box.value
    }
  }
}

struct CoW<T> {
  private var box: Box<T>
  public init(_ value: T) {
    box = Box(value)
  }
  private mutating func makeUnique() {
    if !isKnownUniquelyReferenced(&box) {
      box = box.clone()
    }
  }
  public var value: T {
    _read {
      yield box.value
    }
    _modify {
      makeUnique()
      yield &box.value()
    }
  }
}

EDIT: The conversions can either be (possibly consuming) properties, or initializers. There's some room in the design space there.

4 Likes