[Pitch] Noncopyable (or "move-only") structs and enums

We've been building our way toward the concepts laid out in the ownership manifesto, introducing features that add new degrees of control over Swift's memory management and copying behavior. The ultimate level of control is to be able to declare a type that can't be copied at all, whether by value or by reference. This opens up the ability for Swift to safely and efficiently represent resources that always have unique ownership, using a model similar to Rust's in which borrowing statically controls access to the resource and its lifetime. We're ready now to pitch an initial set of functionality for noncopyable types. Here is the draft proposal:

This proposal builds on our other pitch for @noImplicitCopy values, since values of noncopyable type fundamentally behave the same way as values that have had implicit copying suppressed (with the difference that noncopyable types cannot even be explicitly copied, of course). Previous vision documents have called these "move-only types", but as we've proposed and reviewed related proposals, we've drifted away from exposing the term "move" elsewhere in the language. We've also found that when describing the functionality to potential users, they often get the impression that being "move-only" is a new capability, when really the inverse is true: existing Swift types all satisfy an implicit "copyable" requirement, and what we're adding is the ability for types to opt out of that requirement. We're proposing the term "noncopyable" instead, since we think it makes it clearer that a capability is being taken away, and links the feature to an explicit Copyable constraint that will likely become necessary in the near future.

Speaking of which, this initial proposal comes with a significant constraint that noncopyable types cannot yet conform to protocols, be used as type arguments to generic functions or types, stored in Any or other existentials, or satisfy associated type requirements. Swift's existing generics system pervasively assumes all types are copyable, and integrating noncopyable types into generics is another large design project. Although the initial lack of generic abstraction support is very limiting, we think that the feature will be very useful even without it. Full integration with existing runtime and standard library interfaces will also likely come with new runtime requirements which may impose back-deployment restrictions on some functionality, and we want to make sure that developers can adopt some level of noncopyable type functionality without backward deployment constraints.

Noncopyable types can be experimented with using recent nightly toolchains and the compiler flag -enable-experimental-move-only, using the attribute name @_moveOnly for now. We hope you all will start to experiment with it!

A non-@frozen class may add fields of noncopyable type without changing ABI.

Is there such a thing as a frozen class? I think I've seen some kind of fixed-layout annotation, but on classes it's not spelled @frozen, is it?

Would it be possible to do forget self within a defer block, or is it only possible within lexical(?) scope?

I can't think of a reason off the top of my head why it shouldn't work in a defer block. That would be a useful way to ensure the deinit is always disabled.

Until we specify the behavior of "finer-grained destructuring", I'm afraid we'll need to make it an error to "forget" any non-trivial type: any type with at least one member that (a) is a reference, (b) is a struct-with-deinit, or (c) has non-trivial members.

So this would be illegal:

class Ref {}

@noncopyable
struct Inner {
  deinit { print("destroying inner") }
}

@noncopyable
struct Outer {
  var ref = Ref()
  var inner = Inner()
  deinit { print("destroying outer") }
}

do {
  let outer = Outer()
  forget outer
}

Supporting such types in any way would be equivalent to just implementing finer-grained destructuring.

[EDIT] Alternatively, we say that forget performs memberwise deinitialization. That is implementable, since we're restricted to self. And it's compatible with future fine-grained destructuring.

Given that the compiler will already report an error when different code paths disagree, why not have it apply to the entire function declaration? This would also avoid a new keyword.

We're modifying compiler behavior from standard requirements on inputs & outputs of the function scope, much like @discardableResult or @nonescaping

The following ends up feeling more natural to me and "less surprising" when doing code reviews

@noncopyable
struct FileDescriptor {
  private let fd: Int32

  deinit {
    close(fd)
  }

  @forgetSelf // bike-shedding
  consuming func take() -> Int32 {
    return fd
    // we don't want this method to trigger 'deinit'
  }
}

That allows us to drop forget from the keyword lexicon, and makes it harder to overlook running it on a particular code branch.

Although, it does add more friction to later add a free forget function like Rust has, and it also removes the ability for developers to have super-fine grained control of when it happens.

That's definitely an interesting alternative to consider, since I suspect in the common case you will want a consuming method to either always forget or always destroy. In more intricate situations, where you may want to forget on some but not all paths through the method, I like that the expression-based approach makes sure you consider all paths.
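
As a rough sketch of that last point, here is what a consuming method with different outcomes on different paths could look like. It assumes the spelling that later shipped in Swift 5.9 (`~Copyable` and `discard self`) rather than the pitch's `@noncopyable` and `forget`. The method name `take(when:)`, the `demo()` helper, and the `print` standing in for `close(fd)` are all hypothetical.

```swift
// Sketch only: uses the Swift 5.9 spelling (`~Copyable`, `discard self`)
// in place of the pitch's `@noncopyable` and `forget`.
struct FileDescriptor: ~Copyable {
  private let fd: Int32

  init(fd: Int32) { self.fd = fd }

  deinit {
    // Stand-in for close(fd), so the example has no real side effects.
    print("closing \(fd)")
  }

  // Hypothetical method: hands the raw descriptor to the caller only when
  // `condition` is true. Each path has to say what happens to `self`.
  consuming func take(when condition: Bool) -> Int32? {
    if condition {
      let raw = fd
      discard self      // skip deinit; the caller now owns `raw`
      return raw
    } else {
      _ = consume self  // destroy normally, so deinit runs
      return nil
    }
  }
}

// Hypothetical driver exercising both paths.
func demo() -> (Int32?, Int32?) {
  let first = FileDescriptor(fd: 7)
  let taken = first.take(when: true)     // deinit is skipped for fd 7
  let second = FileDescriptor(fd: 8)
  let dropped = second.take(when: false) // deinit runs for fd 8
  return (taken, dropped)
}

let result = demo()
```

With a declaration-level attribute like the proposed `@forgetSelf`, the second branch could not opt back into normal destruction, which is the trade-off being weighed here.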

I wonder if there's a more general design to have around deinit and forget. It seems to me that the current deinit design implies that the lifetime of a non-copyable binding can be ended implicitly in any function. There are a lot of interesting things that can be done if, instead, you require that the lifetime ends by moving the object to an argument of another function that takes care of it in a specific way.

One way we could do this would be to have access modifiers on deinit and deinit overloads. Deinit would continue to be special in that it would be the only function that can forget about self. Instead of using forget, other methods would have to explicitly call an accessible deinit method. If there is an accessible parameterless deinit(), it is called implicitly at the end of the object's lifetime. If there is no accessible parameterless deinit, then you have to explicitly use a consuming function to get rid of it, be that a deinit or any other function.

As a general style matter, we can discourage widely-accessible deinit methods and suggest that people wrap calls to deinit in named, more accessible consuming functions.

To me, this seems like a "generalization of forget". Forget having access restrictions is a special way to have access controls on deinitializers.

This could be, quite naturally, forgetting func take() -> Int32.

That syntax could be misleading, though, because external to the method the only thing that matters is that it's consuming. How exactly the value gets consumed is an implementation detail. Since other changes to the method modifier among borrowing/consuming/mutating are ABI- and potentially-API-breaking, using a different name there could imply a bigger external behavior break than is really in play.

The example of using forget in the pitch didn't seem to me like it solved a problem that was very difficult to solve in other ways. For example, the FileDescriptor type in the example could use a sentinel value (e.g. fd == 0 or fd == nil) to signal to deinit that the handle is already closed and therefore shouldn't be closed a second time. In either approach, the programmer must remember to do something in the close() function to inhibit the deinit logic. Is the main benefit of forget in the file descriptor example that complexity in the implementation of deinit can be avoided? I do think that has value, but I'm wondering if I'm missing something else that makes forget a truly necessary tool.

I remember that, pre-1.0, Rust had experimented with strictly "linear types" (which require explicit consumption, as opposed to the "affine types" that Rust 1.0 ended up with, and which we're proposing here, that get implicitly consumed), and found them to be severely unergonomic and uncomposable. I can definitely foresee situations where there is no one "default" way to consume a resource, and forcing an explicit choice would be good, but it seems like those cases should be the exception rather than the rule.

We've done a pretty good job in Swift so far of keeping it possible to "make invalid states unrepresentable". You could use a sentinel state, but then the entire implementation of FileDescriptor needs to be prepared to possibly handle being in that state, and clients then might need to be able to check whether the descriptor is valid, and so forth.
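
To make the trade-off concrete, here is a minimal sketch of the sentinel approach. It uses an ordinary class as a stand-in, since it only needs to show the extra state. `CloseLog`, `SentinelFileDescriptor`, `isValid`, and `read()` are all made-up names, and appending to the log stands in for `close(fd)`.

```swift
// Records "closed" descriptors instead of calling close(2).
final class CloseLog {
  var closed: [Int32] = []
}

final class SentinelFileDescriptor {
  private var fd: Int32?          // nil is the "already closed" sentinel
  private let log: CloseLog

  init(fd: Int32, log: CloseLog) {
    self.fd = fd
    self.log = log
  }

  // Clients now have a state they may need to check.
  var isValid: Bool { fd != nil }

  // Every operation has to handle the sentinel.
  func read() -> String? {
    guard let raw = fd else { return nil }
    return "data from \(raw)"
  }

  func close() {
    guard let raw = fd else { return }
    log.closed.append(raw)
    fd = nil                      // tell deinit there is nothing left to do
  }

  deinit {
    if let raw = fd { log.closed.append(raw) }
  }
}

let log = CloseLog()
do {
  let a = SentinelFileDescriptor(fd: 3, log: log)
  a.close()
  // a.read() returns nil from here on, and deinit sees the sentinel.
}
do {
  _ = SentinelFileDescriptor(fd: 4, log: log)  // never closed explicitly
}
// log.closed is now [3, 4]: each descriptor was closed exactly once.
```

The double close is avoided, but only because `read()`, `close()`, and `deinit` each check for the sentinel, and the object stays usable (in an invalid state) after `close()`.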

I wasn't around to see that, but I can believe it. I'd like to re-pitch this slightly differently: deinit overloads more importantly as an alternative to a forget keyword rather than as a way to get linear types. A public deinit gets you the behavior as currently spelled in the proposal (anybody can drop the value), and forget can be expressed in terms of less-than-public deinit methods, with availability rules that don't need to be explained to anyone already familiar with Swift's access modifiers. Removing the way to end an object without calling any deinitializer seems like a win to me.

How does forget work if self contains other non-copyable types? Are they transitively forgotten or is their deinit run? I would imagine the latter, but it's easy to think it's the former and always going through a deinit might make the answer more intuitive.

From talking with Andy about the current implementation, I think we'll want to restrict the ability to forget to types that would be trivial if not for their noncopyability for now. Eventually we want to get to the point where, when you forget a type with nontrivial fields, you can selectively consume those fields, by forwarding their ownership elsewhere. Any remaining fields you don't consume would have their individual destructors run, so forget would only be skin-deep.

Oh hi, you're poking the ol' move-only bear again! The mem::forget stuff is interesting! :smile_cat:

Here are some useful links:

  • The Pain Of Linear Types In Rust - an article I wrote in the early days of the Swift team looking into "move-only types", discussing why "true linear types" (as opposed to the "affine types" that Rust has) are a huge mess. Also a look into what I originally believed to be the generics-migration story for Swift adopting move-only types back in 2017 (good call to punt on generics, Joe :joy_cat:).

  • Rustonomicon - Leaking - official documentation (also by me) on how to deal with the possibility of leaking (mem::forget) when designing safe abstractions around unsafe code, and some examples of APIs that had to be adjusted in The Leakpocalypse™ (where Rust came to terms with the fact that mem::forget could be emulated in safe code and so may as well be marked safe To Be Honest).

  • std::mem::ManuallyDrop - a wrapper type that prevents the compiler from automatically dropping a value, allowing the developer to either forget it or drop it when they think is the right time. In essence, this type is inverted mem::forget: everything is forgotten by default, and you need to manually not-forget it.

  • Two Memory Bugs From Ringbahn - some cases where incorrect mem::forget usage resulted in double-drops (relevant to the checking stuff you added).

The docs for mem::forget cite Rc as a way to invent forgetting in safe code without it, but is there a way in Rust to invent forget for a lifetime-dependent value (which Rc seems like it can't since it would impose indefinite lifetime on the thing being refcounted)? Indiscriminate forget has come up as a wrinkle in answering the question "are locks movable without storing them in an out-of-line cell". Apple's os_unfair_lock for instance doesn't have any lingering OS resources when it's not in a locked state, so its unique owner is free to move the OS_UNFAIR_LOCK_INIT bit pattern around in memory to give the lock away. It only needs to be pinned in memory while locked, which can be handled by ensuring locking always happens within the lifetime of a borrow on the lock. That works great unless some jerk mem::forgets the MutexGuard that would normally unlock it, leaving the lock in a locked state while statically appearing to be un-borrowed again while the OS still has guns pointed at its current address, leading to hijinx if the presumptive owner of the lock moves it in this state. It seems to me that the "if it didn't exist, you could invent it" argument doesn't hold as strongly for lifetime-dependent type families like Mutex and MutexGuard, but maybe I'm missing the trick to invent it in this case and it's all hopeless.

Apologies I am a bit sick so this might be slightly incoherent.

The Leakpocalypse™ really was about reference-counted cycles. Also, Mutex and MutexGuard were never actually the problem, because if you mem::forget, the lock just gets stuck "locked" and the program deadlocks (although your example is an interesting wrinkle!). It's actually exceptionally rare to encounter the issue; to create A Leakpocalypse™ Type you must:

  1. Create a type with some non-trivial validity invariant (Array)
  2. Create another type which mutably borrows (inout) Array and violates that invariant (Drain)
  3. Argue that it's ok for the invariant to be transiently violated because:
    a. The mutable borrow held by Drain prevents anyone from touching Array and observing the violated invariant
    b. The mutable borrow expiring implies Drain's destructor ran
    c. Drain's destructor repairs/enforces the invariant

All of this is valid except 3b, because you can write code like (pardon my rusty pseudo-swift):

// Make the contents of this non-POD if you want real UB
var arr = [1, 2, 3, 4];

// Start consuming the values out of the array
{
  var drain = arr.drain();
  drain.next();
  // At this point the Array is invalid, as index 0 is logically deinit
  // But it's ok, `arr[0]` should be a compiler error because 
  // `arr` is `inout` borrowed by `drain`!
  
  // ... now leak it with a strong refcounted cycle (Node is a class)
  var node1 = Node();
  var node2 = Node();
  // At this point the compiler sees that node1 holds the borrow
  // that drain held, so now we still can't access arr until
  // node1 is gone
  node1.value = drain;
  node1.other = node2;
  node2.other = node1;
}
// OK! drain and node1 are both DEFINITELY gone, it's impossible
// for the program to access them (correct!) so the borrow can be
// released (correct?) and we know the dtor ran (incorrect!!!)

// Reading deinit memory!!!
print(arr[0]);

You can do this kind of thing because Rust would type these nodes as like Node<Drain<'a, T>>, and anywhere a Node goes the compiler infects with the 'a borrow (or perhaps more accurately, expands 'a to cover those parts of the program, and the borrow follows). Basically refcounting isn't really indefinite. The compiler can follow all the places the strong references get to, and it knows when they all go away then the borrow can go away (anything that could really smuggle things forever like globals or passing things across non-scoped threads requires T: 'static, the forever lifetime, which would make the borrow never expire).
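
The leak half of that argument is easy to reproduce in today's Swift, no borrows required. A minimal sketch, where `DeinitCounter` and this cut-down `Node` are illustrative names only: a value dropped normally runs its deinit, while two nodes in a strong cycle become unreachable without either deinit running.

```swift
// Counts how many Node deinits have actually run.
final class DeinitCounter {
  var deinits = 0
}

final class Node {
  var other: Node?              // strong reference, so cycles are possible
  let counter: DeinitCounter

  init(counter: DeinitCounter) { self.counter = counter }

  deinit { counter.deinits += 1 }
}

let counter = DeinitCounter()

do {
  // No cycle: the node is released when the scope ends and deinit runs.
  let lone = Node(counter: counter)
  _ = lone
}
// counter.deinits is 1 here.

do {
  // Strong cycle: each node keeps the other alive.
  let node1 = Node(counter: counter)
  let node2 = Node(counter: counter)
  node1.other = node2
  node2.other = node1
}
// Both nodes are now unreachable, but neither deinit ran:
// counter.deinits is still 1.
```

This is why step 3b above does not hold: "nobody can reach the value any more" and "its destructor ran" are different facts.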

I'm afraid I can't really extrapolate to what this looks like in Swift. Like, classes were always going to be a nightmare for noncopyable types. I'm not really sure what it Means to put a noncopyable type into a class, since taking an inout borrow of that field is unknowable to some other code that happens to also have a reference to that class instance. This is why Rc/Arc in rust enforce immutable access to their contents (no inout), which you need to claw back with dynamic "interior mutability" types like RefCell or Mutex.

We have an existing scheme for dynamically enforcing exclusive access to classes, globals, and other dynamically shared-mutable things, which we're also planning to use here. So every class ivar is in essence wrapped in a RefCell.

To be clear it's not impossible to build a version of Rust that says "building your own safe version of mem::forget is unsound (and any API which can be used to do so is Also Unsound)". We just decided that on balance it's not worth the effort.

Someone actually tried to fork rust (or at least, the stdlib) to do exactly that in response to The Leakpocalypse™, although obviously that didn't get far politically.

One approach is to introduce a LeakSafe marker trait (analogous to Send/Sync) and then make Rc<T> require T: LeakSafe. Types like Drain would opt out of LeakSafe, and be disallowed from being put in refcounted cycles.

Another approach would be to introduce a marker trait that prevents interior-mutability types from being put in Rc, making cycles impossible (really restrictive...).

Another another approach would be to mandate no borrows stored in classes (equivalent to Rust requiring 'static). This one seems perhaps the most viable for Swift, since you've already embraced restrictions on where borrows can go. This would Suck for collections (an "Array of references" is pretty common), but maybe CoW emulated-value-semantics types can manually opt into it? Maybe that's sound?

This looks mostly like a self-contained step towards fully usable non-copyable types.
The only concern I have is whether we instead want a @noncopyable type to imply ?Copyable on its used/declared generic parameters.