[Pitch] Box

Hi Evolution, below is the proposal I’ve written up to add Box to the standard library. The Evolution PR is here: Box by Azoy, Pull Request #3067, swiftlang/swift-evolution.

Box

Introduction

We propose to introduce a new type in the standard library, Box, which is a
simple smart pointer type that uniquely owns a value on the heap.

Motivation

Sometimes in Swift it's necessary to manually put something on the heap when it
normally wouldn't be located there. Types such as Int, InlineArray, or
custom struct types are stored on the stack or in registers.

A common way of doing something like this today is by using pointers directly:

let ptr = unsafe UnsafeMutablePointer<Int>.allocate(capacity: 1)

// Make sure to perform cleanup at the end of the scope!
defer {
  unsafe ptr.deinitialize(count: 1)
  unsafe ptr.deallocate()
}


// Must initialize the pointer first
unsafe ptr.initialize(to: 123)

// Now we can access 'pointee' and read/write to our 'Int'
unsafe ptr.pointee += 321

...

// 'ptr' gets deallocated here

Using pointers like this is extremely unsafe and error-prone. We previously
couldn't make wrapper types over this pattern because we couldn't perform the
cleanup happening in the defer in structs. It wasn't until recently that we
gained noncopyable types that allowed us to have deinits in structs. A common
pattern in Swift is to instead wrap an instance in a class to get similar
cleanup behavior:

class Box<T> {
  var value: T

  init(value: T) {
    self.value = value
  }
}

This is much nicer and safer than the original pointer code. However, this
construct is more of a shared pointer than a unique one. Classes also come with
their own overhead on top of the pointer allocation, leading to slightly worse
performance than using pointers directly.
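
To make the "shared rather than unique" point concrete, here is a minimal sketch. The name ClassBox is hypothetical and only used to avoid clashing with the proposed type; the behavior shown is ordinary class reference semantics.

```swift
// Hypothetical class-based wrapper, mirroring the pattern shown above.
final class ClassBox<T> {
  var value: T

  init(value: T) {
    self.value = value
  }
}

// Mutates through a second reference and reads back through the first.
func sharedMutationDemo() -> Int {
  let first = ClassBox(value: 1)
  let second = first      // Copies the reference, not the stored Int.
  second.value += 41      // Mutates the single shared heap object.
  return first.value
}

print(sharedMutationDemo()) // 42
```

Both names refer to the same allocation, so nothing in the type system stops the value from being aliased and mutated from several places.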

Proposed solution

The standard library will add a new noncopyable type Box which is a safe smart
pointer that uniquely owns some instance on the heap.

// Storing an inline array on the heap
var box = Box<[3 of _]>([1, 2, 3])

print(box[]) // [1, 2, 3]

box[].swapAt(0, 2)

print(box[]) // [3, 2, 1]

It's smart because it will automatically clean up the heap allocation when the
box is no longer being used:

struct Foo: ~Copyable {
  func bar() {
    print("bar")
  }

  deinit {
    print("foo")
  }
}

func main() {
  let box = Box(Foo())

  box[].bar() // "bar"

  print("baz") // "baz"

  // "foo"
}

Detailed design

/// A smart pointer type that uniquely owns an instance of `Value` on the heap.
public struct Box<Value: ~Copyable>: ~Copyable {
  /// Initializes a value of this box with the given initial value.
  ///
  /// - Parameter initialValue: The initial value to initialize the box with.
  public init(_ initialValue: consuming Value)
} 

extension Box where Value: ~Copyable {
  /// Consumes the box and returns the instance of `Value` that was within the
  /// box.
  public consuming func consume() -> Value

  /// Dereferences the box allowing for in-place reads and writes to the stored
  /// `Value`.
  public subscript() -> Value
}

extension Box where Value: ~Copyable {
  /// Returns a single element span reference to the instance of `Value` stored
  /// within this box.
  public var span: Span<Value> {
    get
  }

  /// Returns a single element mutable span reference to the instance of `Value`
  /// stored within this box.
  public var mutableSpan: MutableSpan<Value> {
    mutating get
  }
}

extension Box where Value: Copyable {
  /// Copies the value within the box and returns it in a new box instance.
  public func clone() -> Box<Value>
}
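
As a rough illustration of how a type with this shape could be built on top of UnsafeMutablePointer, here is a sketch. It assumes Swift 6.0 or later (noncopyable generics). SketchBox and its update method are hypothetical names, with update standing in for the proposed empty subscript; none of this is the proposal's actual implementation.

```swift
// A sketch only: 'SketchBox' is a hypothetical name, not the proposed type.
struct SketchBox<Value: ~Copyable>: ~Copyable {
  private let pointer: UnsafeMutablePointer<Value>

  init(_ initialValue: consuming Value) {
    pointer = UnsafeMutablePointer<Value>.allocate(capacity: 1)
    pointer.initialize(to: initialValue)
  }

  // Moves the value out, frees the allocation, and skips 'deinit'.
  consuming func consume() -> Value {
    let value = pointer.move()
    pointer.deallocate()
    discard self
    return value
  }

  // Hypothetical stand-in for the proposed empty subscript: gives the
  // closure in-place access to the stored value.
  mutating func update<R>(_ body: (inout Value) -> R) -> R {
    body(&pointer.pointee)
  }

  // Runs when the box goes out of scope without being consumed.
  deinit {
    pointer.deinitialize(count: 1)
    pointer.deallocate()
  }
}

func sketchBoxDemo() -> Int {
  var box = SketchBox(40)
  box.update { $0 += 2 }
  return box.consume()
}

print(sketchBoxDemo()) // 42
```

The deinit is what replaces the manual defer block from the motivation section, and discard self in consume avoids deinitializing the storage a second time after the value has been moved out.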

Source compatibility

Box is a brand new type in the standard library, so source should still be
compatible.

Given the name of this type, however, we foresee it colliding with existing
user-defined types named Box. This isn't a particular issue because the
standard library has special shadowing rules which prefer user-defined types by
default. This means that in user code with a custom Box type, that type will
always be preferred over the standard library's Swift.Box.

ABI compatibility

The API introduced in this proposal is purely additive to the standard library's
ABI; thus existing ABI is compatible.

Implications on adoption

Box is a new type within the standard library, so adopters must use at least
the version of Swift that introduced this type.

Alternatives considered

Name this type Unique

Following C++'s std::unique_ptr, a natural name for this type could be
Unique. However, there's a strong precedent of Swift developers reaching for
the Box name to manually put something on the heap. While C++ uses
std::unique_ptr, Rust does name their unique smart pointer type Box, so
there is prior art for both potential names. This proposal suggests Box as the
succinct and simple name.

Use a named property for dereferencing instead of an empty subscript

var box = Box<[3 of _]>([1, 2, 3])
box[].swapAt(0, 2) // [3, 2, 1]

The use of the empty subscript is unprecedented in the standard library or quite
frankly in Swift in general. We could instead follow in UnsafePointer's
footsteps with pointee or something similar like value so that the example
above becomes:

box.pointee.swapAt(0, 2) // [3, 2, 1]

// or

box.value.swapAt(0, 2) // [3, 2, 1]

We propose the empty subscript because the alternatives introduce too much
ceremony around the box itself. Box should simply be a vehicle with which you
can manually manage where a value lives, rather than a cumbersome wrapper type
you have to plumb through to access the inner value.

Consider uses of std::unique_ptr:

#include <memory>
#include <print>

class A {
public:
  void foo() const {
    std::println("Foo");
  }
};

int main() {
  auto a = std::make_unique<A>();

  a->foo();
}

All members of the boxed instance can be referenced through C++'s usual arrow
operator -> behaving exactly like some A*. C++ is able to do this via their
operator overloading of ->.

Taking a look at Rust too:

struct A {}

impl A {
  fn foo(&self) {
    println!("foo");
  }
}

fn main() {
  let a = Box::new(A {});

  a.foo();
}

There's no ceremony at all, and all members of A are immediately accessible
from the box itself.

Rust achieves this with a special Deref trait (or protocol in Swift terms)
that I'll discuss as a potential future direction.

Rename consume to take or move

There are plenty of good names that could be used here, like take or move.
take comes from Optional.take() and move from UnsafeMutablePointer.move().
Unlike consume, however, neither of those methods actually consumes the instance
it is called on. Calling consume ends the lifetime of the Box, and the box is
deallocated immediately after returning the instance stored within.
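
The difference can be seen with the two existing methods themselves. The sketch below uses only Optional.take() (available from Swift 6.0) and UnsafeMutablePointer.move(); the function names are hypothetical.

```swift
// Optional.take() empties the optional but leaves the variable usable.
func takeDemo() -> (taken: Int?, remaining: Int?) {
  var slot: Int? = 5
  let taken = slot.take()   // 'slot' is now nil, yet still alive.
  return (taken, slot)
}

// UnsafeMutablePointer.move() leaves the memory allocated but
// uninitialized, so the same pointer can be initialized again.
func moveDemo() -> (moved: Int, reinitialized: Int) {
  let pointer = UnsafeMutablePointer<Int>.allocate(capacity: 1)
  defer { pointer.deallocate() }

  pointer.initialize(to: 1)
  let moved = pointer.move()
  pointer.initialize(to: 2)
  let reinitialized = pointer.move()
  return (moved, reinitialized)
}
```

In both cases the receiver stays valid after the call, whereas a box that has been consumed cannot be used again.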

Future directions

Add a std::shared_ptr alternative

This proposal only introduces the uniquely owned smart pointer type, but there's
also the shared smart pointer construct. C++ has this with std::shared_ptr and
Rust calls theirs Arc. While the unique pointer is able to make copyable types
noncopyable, the shared pointer is able to make noncopyable types into
copyable ones by keeping track of a reference count similar to classes in Swift.

Introduce a Clonable protocol

Box comes with a clone method that will effectively copy the box and its
contents entirely returning a new instance of it. We can't make Box a copyable
type because we need to be able to customize deinitialization and for
performance reasons wouldn't want the compiler to implicitly add copies of it
either. So Box is a noncopyable type, but when its contents are copyable we
can add explicit ways to copy the box into a new allocation.

Box.clone() is only available when the underlying Value is Copyable, but
there is a theoretical other protocol that this is relying on, which is
Clonable. Box itself could conform to Clonable by providing the explicit
clone() operation while still not being Copyable. If this method were
conditional on Value: Clonable, then you could call clone() on a box of a
box (Box<Box<T>>).

Rust has a hierarchy very similar to this:

public protocol Clonable {
  func clone() -> Self
}

public protocol Copyable: Clonable {}

where conforming to Copyable allows the compiler to implicitly add copies
where needed.
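
As a sketch of how a noncopyable type could opt into explicit copying under such a protocol, assuming Swift 6.0 or later: the Clonable protocol below is the hypothetical one from this section, declared with ~Copyable so that noncopyable types may conform, and Token is an illustrative type that does not appear in the proposal.

```swift
// Hypothetical protocol; suppressing Copyable lets noncopyable types conform.
protocol Clonable: ~Copyable {
  func clone() -> Self
}

// A noncopyable type that still offers an explicit copy.
struct Token: ~Copyable, Clonable {
  var id: Int

  func clone() -> Token {
    Token(id: id)
  }
}

func cloneDemo() -> (Int, Int) {
  let original = Token(id: 7)
  var copy = original.clone()   // Explicit; never inserted by the compiler.
  copy.id += 1
  return (original.id, copy.id)
}
```

The copy is independent of the original, and every duplication is visible at the call site.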

Implement something similar to Rust's Deref trait

Deref is a trait (protocol) in Rust that allows types to access members of
another unrelated type if it's able to produce a borrow (or a safe pointer) of
said unrelated type. Here's what Deref looks like in Rust:

pub trait Deref {
  type Target: ?Sized;

  fn deref(&self) -> &Self::Target;
}

It's a very simple protocol that defines an associated type Target (let's
ignore the ?Sized) with a method requirement deref that must return a borrow
of Target.

This trait is known to the Rust compiler to achieve these special
semantics, but Swift could very well define a protocol very similar to this
today. While we don't have borrows, I don't believe this protocol definition in
Swift would require it either.

Without getting too far into specifics, array-like data structures in Rust also
conform to Deref with their Target being a type similar to Span<Element>.
This lets them put shared functionality on Span while leaving behaviors specific
to a data structure on the data structure itself. E.g. swap would be
implemented on MutableSpan<Element> and not Array in Swift terms. This is
worth mentioning because if we wanted to do something similar for our array-like
types, they wouldn't want to return a borrow of Span, but instead the span
directly. Rust can do this because its span type is spelled &[T] while [T]
is an unsized type (hence the ?Sized for Target), so returning a borrow of [T]
naturally leads to returning &[T] (the span) directly.

It could look something like the following in Swift for Box:

public protocol Deref {
  associatedtype Target

  subscript() -> Target { ... }
}

extension Box: Deref where Value: ~Copyable {
  subscript() -> Target { ... }
}

which could allow for call sites to look like the following:

var box = Box<[3 of _]>([1, 2, 3])
box.swapAt(0, 2) // [3, 2, 1]

Any reason not to call this HeapBox, to avoid the existing name collisions?

Also, it would probably be good to outline more concrete use cases rather than just saying sometimes this is useful.

Also, any plans for back deployment? On Apple platforms new types can't back deploy by default.

Edit: And also, is there a reason this is a class rather than final class? Do we anticipate subclassing?

It's a noncopyable struct, isn't it? Or am I misunderstanding your question?

The name collisions are fine because standard library types are always shadowed by default. The compiler will always prefer custom types over ones in the standard library.

Sure, I can add some more examples to the proposal. Another example I forgot to mention in the proposal is that this is needed to support indirect noncopyable enums right now (because indirect enums don't currently work when the enum is noncopyable). Taking this example from Learn Rust With Entirely Too Many Linked Lists:

typealias Link<T: ~Copyable> = Box<Node<T>>?

struct Node<T: ~Copyable>: ~Copyable {
  let next: Link<T>
  var value: T
}

We can make a basic linked list from this node pattern with Box.

There are no current plans for back deployment.

This is not a class, it is a noncopyable struct.

I'm not totally sure I understand this example. InlineArray is Copyable when its elements are Copyable.[1] This example built on InlineArray<Int> seems to imply that InlineArray<Int> is not Copyable because the subscript is defined when Value is also not Copyable:

Am I understanding correctly? Is there another way that subscript is also available when Value is Copyable?


[1] https://github.com/swiftlang/swift-evolution/blob/main/proposals/0453-vector.md

This constraint indicates that Value may or may not be Copyable. These APIs are available for all copyable and noncopyable types.

Yeah, I was looking at the original "current solution" example.

A lot of existing Swift projects also use the name Box for a generic reference-counted on-heap wrapper class (for instance, searching language:swift "class Box<" on Github gives more than a thousand hits), which might be part of the confusion here. As such, although that's the name Rust uses, Box might not be the best name for a uniquely-owned value buffer in the Swift standard library, given existing practice.

I was going to suggest Indirect to go with the built-in keyword for enums, but…enum indirect uses CoW, doesn’t it? And having those behave differently seems very subtle.

Maybe then UniqueBox?

I'm of two minds on this. Box forms a nice punchy three-letter taxonomy with other similar wrappers that might be added in the future (based on what's in swift-collections recently), like Mut, and Ref. And it has the bonus that the parameterless subscript [] literally looks like a box. Serendipity.

But... names like Mut and Ref are much more Rusty than Swifty and it's hard to say whether those are the names we'd be happy with for those constructs. So it might be premature to try to align things that aren't being proposed yet.

The considered alternative Unique does feel like an improvement. It's not terribly verbose and it's a lot more descriptive about what the thing is—a Box could have any sort of ownership over what's inside, but Unique answers that question more clearly.

However, I think the naming question for this type is very similar to the naming question we had for InlineArray (originally proposed as Vector). Are we defining a new term of art, or using existing prior art compatibly? (Seems like the latter if we consider Rust, but not if we consider existing Box patterns in Swift.)

Do we see this thing as a fundamental building block that will be exposed in APIs, or more of an implementation detail?

My understanding is also that the Rust Box shipped in 1.0. So any discussion that might have taken place in Rust about Box potentially being ambiguous WRT existing expectations in the engineering community might have been less important on a "new" language just hitting 1.0 with more freedom to coin those terms of art.

Swift today is far more mature than Rust was ten years ago. I would agree that Box is the clear prior art… but Rust chose Box as a term of art without ten years' worth of community evolution and momentum to take into consideration.

Maybe Box is the hero we want. Not the hero we need.

Box without Deref is a pretty subpar experience. And that empty subscript is neither discoverable nor easy to explain — I think I'd prefer a simple wrappedValue property.

(and speaking of wrappedValue, could box also be a property wrapper?)

Swift kinda-has Deref already though — @dynamicMemberLookup. Is there a reason Box couldn't or shouldn't expose its wrapped value's properties in that way?

Keypath-based @dynamicMemberLookup only works for properties; there's no way to do perfect forwarding of methods, which would require method keypaths (with the full suite of ownership modifiers, throwing, async, etc.).

I would also hate to rely on the optimizer to eliminate the key path overhead.

Was $box considered instead of the empty subscript for dereferencing? Although it would be kinda backward regarding property wrappers.

Because Box is a reference type, consuming it will not have the desired effect. It simply consumes that reference to the object, leaving other references intact.

Jonathan can't read too good today.

Box is not a class, it is a noncopyable struct.

Sorry, I saw this in the original post:

And my brain deduped it with this:

I'll be quiet and go back in my… box. :zipper_mouth_face:

I don't think the ceremony (or the cognitive load) is simply a number of characters to write or read. This looks quite cryptic to my eye:

print(box[])
box[].swapAt(0, 2)
print(box[])

box^ could work aesthetically but for that syntax to work in an lvalue position we'd need to change the language first.

IMHO this would be more Swifty:

print(box.val)
box.val.swapAt(0, 2)
print(box.val)

or even some longer name for the property like pointee, value, etc.

Could a property wrapper approach make it ceremony-free? box.swapAt(0, 2)
