[Pitch] Noncopyable (or "move-only") structs and enums

Oh hi, you're poking the ol' move-only bear again! The mem::forget stuff is interesting! :smile_cat:

Here are some useful links:

  • The Pain Of Linear Types In Rust - an article I wrote in the early days of the Swift team looking into "move-only types", discussing why "true linear types" (as opposed to the "affine types" that Rust has) are a huge mess. Also a look into what I originally believed to be the generics-migration story for Swift adopting move-only types back in 2017 (good call to punt on generics, Joe :joy_cat:).

  • Rustonomicon - Leaking - official documentation (also by me) on how to deal with the possibility of leaking (mem::forget) when designing safe abstractions around unsafe code, and some examples of APIs that had to be adjusted in The Leakpocalypse™ (where Rust came to terms with the fact that mem::forget could be emulated in safe code and so may as well be marked safe To Be Honest).

  • std::mem::ManuallyDrop - a wrapper type that prevents the compiler from automatically dropping a value, allowing the developer to either forget it or drop it when they think the time is right. In essence, this type is an inverted mem::forget: everything is forgotten by default, and you need to manually not-forget it.

  • Two Memory Bugs From Ringbahn - some cases where incorrect mem::forget usage resulted in double-drops (relevant to the checking stuff you added).
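For readers who have not used these two APIs, here is a minimal Rust sketch of the difference between mem::forget and ManuallyDrop. The Noisy type and the function name are made up for illustration; only std::mem::forget, ManuallyDrop::new and ManuallyDrop::drop are real standard library calls.

```rust
use std::cell::Cell;
use std::mem::{self, ManuallyDrop};

// Hypothetical type: bumps a shared counter when its destructor runs.
struct Noisy<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Returns the drop count observed after each step.
pub fn forget_vs_manually_drop() -> (u32, u32, u32) {
    let drops = Cell::new(0);

    // mem::forget takes ownership and never runs the destructor.
    mem::forget(Noisy { drops: &drops });
    let after_forget = drops.get();

    // ManuallyDrop: forgotten by default...
    let mut wrapped = ManuallyDrop::new(Noisy { drops: &drops });
    let after_wrap = drops.get();

    // ...until we explicitly opt back in to dropping it.
    // SAFETY: `wrapped` is dropped exactly once and never used afterwards.
    unsafe { ManuallyDrop::drop(&mut wrapped) };
    let after_manual_drop = drops.get();

    (after_forget, after_wrap, after_manual_drop)
}
```

The destructor only ever runs on the explicit ManuallyDrop::drop call, which is the "manually not-forget it" step.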

23 Likes

The docs for mem::forget cite Rc as a way to invent forgetting in safe code without it, but is there a way in Rust to invent forget for a lifetime-dependent value (which Rc seems like it can't since it would impose indefinite lifetime on the thing being refcounted)? Indiscriminate forget has come up as a wrinkle in answering the question "are locks movable without storing them in an out-of-line cell". Apple's os_unfair_lock for instance doesn't have any lingering OS resources when it's not in a locked state, so its unique owner is free to move the OS_UNFAIR_LOCK_INIT bit pattern around in memory to give the lock away. It only needs to be pinned in memory while locked, which can be handled by ensuring locking always happens within the lifetime of a borrow on the lock. That works great unless some jerk mem::forgets the MutexGuard that would normally unlock it, leaving the lock in a locked state while statically appearing to be un-borrowed again while the OS still has guns pointed at its current address, leading to hijinx if the presumptive owner of the lock moves it in this state. It seems to me that the "if it didn't exist, you could invent it" argument doesn't hold as strongly for lifetime-dependent type families like Mutex and MutexGuard, but maybe I'm missing the trick to invent it in this case and it's all hopeless.
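A rough Rust sketch of the scenario, using std::sync::Mutex as a stand-in (an assumption: std's Mutex is safe to move in this state, so this only shows the shape of the problem, not the os_unfair_lock hazard itself). The function name is hypothetical.

```rust
use std::mem;
use std::sync::{Mutex, TryLockError};

// Returns true if the mutex is still locked after its guard was forgotten
// and the mutex itself was moved.
pub fn forgotten_guard_leaves_lock_held() -> bool {
    let lock = Mutex::new(0u32);

    // Lock, then leak the guard: its destructor (the unlock) never runs.
    mem::forget(lock.lock().unwrap());

    // The borrow held by the guard is statically over, so the compiler
    // lets us move the mutex even though it is still locked.
    let moved = lock;

    // Bind the result so the temporary from try_lock dies before `moved`.
    let still_locked = matches!(moved.try_lock(), Err(TryLockError::WouldBlock));
    still_locked
}
```

With std's Mutex the only consequence is a lock that never unlocks. A lock whose locked state is tied to its address would be in trouble at the `let moved = lock;` line.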

1 Like

Apologies I am a bit sick so this might be slightly incoherent.

The Leakpocalypse™ really was about reference-counted cycles. Mutex and MutexGuard were never actually the problem, because if you mem::forget the guard, the lock just gets stuck "locked" and the program deadlocks (although your example is an interesting wrinkle!). It's actually exceptionally rare to encounter the issue. To create A Leakpocalypse™ Type you must:

  1. Create a type with some non-trivial validity invariant (Array)
  2. Create another type which mutably borrows (inout) Array and violates that invariant (Drain)
  3. Argue that it's ok for the invariant to be transiently violated because:
    a. The mutable borrow held by Drain prevents anyone from touching Array and observing the violated invariant
    b. The mutable borrow expiring implies Drain's destructor ran
    c. Drain's destructor repairs/enforces the invariant

All of this is valid except 3b, because you can write code like (pardon my rusty pseudo-swift):

// Make the contents of this non-POD if you want real UB
var arr = [1, 2, 3, 4];

// Start consuming the values out of the array
{
  var drain = arr.drain();
  drain.next();
  // At this point the Array is invalid, as index 0 is logically deinit
  // But it's ok, `arr[0]` should be a compiler error because 
  // `arr` is `inout` borrowed by `drain`!
  
  // ... now leak it with a strong refcounted cycle (Node is a class)
  var node1 = Node();
  var node2 = Node();
  // At this point the compiler sees that node1 holds the borrow
  // that drain held, so now we still can't access arr until
  // node1 is gone
  node1.value = drain;
  node1.other = node2;
  node2.other = node1;
}
// OK! drain and node1 are both DEFINITELY gone, it's impossible
// for the program to access them (correct!) so the borrow can be
// released (correct?) and we know the dtor ran (incorrect!!!)

// Reading deinit memory!!!
print(arr[0]);

You can do this kind of thing because Rust would type these nodes as like Node<Drain<'a, T>>, and anywhere a Node goes the compiler infects with the 'a borrow (or perhaps more accurately, expands 'a to cover those parts of the program, and the borrow follows). Basically refcounting isn't really indefinite. The compiler can follow all the places the strong references get to, and it knows when they all go away then the borrow can go away (anything that could really smuggle things forever like globals or passing things across non-scoped threads requires T: 'static, the forever lifetime, which would make the borrow never expire).
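The same trick written as compilable safe Rust, as a sketch: Repair is a made-up stand-in for Drain whose destructor "repairs" a flag through a mutable borrow, and Node is a hypothetical refcounted node. Only Rc and RefCell are real std types.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical stand-in for Drain: its destructor "repairs" the borrowed state.
struct Repair<'a> {
    repaired: &'a mut bool,
}

impl Drop for Repair<'_> {
    fn drop(&mut self) {
        *self.repaired = true;
    }
}

// Hypothetical refcounted node, mirroring the pseudo-Swift `Node` class.
#[allow(dead_code)]
struct Node<'a> {
    value: Option<Repair<'a>>,
    other: Option<Rc<RefCell<Node<'a>>>>,
}

// Returns whether the destructor ran by the time the borrow was released.
pub fn destructor_ran(make_cycle: bool) -> bool {
    let mut repaired = false;
    {
        let guard = Repair { repaired: &mut repaired };
        let node1 = Rc::new(RefCell::new(Node { value: Some(guard), other: None }));
        let node2 = Rc::new(RefCell::new(Node { value: None, other: None }));
        node1.borrow_mut().other = Some(Rc::clone(&node2));
        if make_cycle {
            // Close the strong cycle: neither refcount can reach zero now.
            node2.borrow_mut().other = Some(Rc::clone(&node1));
        }
    }
    // Both handles are gone, so the `&mut` borrow has expired and this read
    // compiles either way. Only without the cycle did Repair::drop run.
    repaired
}
```

The borrow checker accepts the final read in both cases, which is exactly step 3b failing: the borrow expiring tells you nothing about whether the destructor ran.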

I'm afraid I can't really extrapolate to what this looks like in Swift. Like, classes were always going to be a nightmare for noncopyable types. I'm not really sure what it Means to put a noncopyable type into a class, since taking an inout borrow of that field is unknowable to some other code that happens to also have a reference to that class instance. This is why Rc/Arc in Rust enforce immutable access to their contents (no inout), which you need to claw back with dynamic "interior mutability" types like RefCell or Mutex.
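A small sketch of what "clawing back" mutation looks like in practice, assuming two Rc handles to one RefCell. The function name is hypothetical; try_borrow_mut and borrow_mut are the real RefCell methods.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Returns (exclusive access refused while a reader is live, final contents).
pub fn shared_then_exclusive() -> (bool, Vec<i32>) {
    let a = Rc::new(RefCell::new(vec![1, 2, 3]));
    let b = Rc::clone(&a);

    // `Rc<T>` only hands out `&T`; mutation goes through RefCell's runtime check.
    let reader = a.borrow();
    // A second handle asks for exclusive access while the reader is live.
    let refused = b.try_borrow_mut().is_err();
    drop(reader);

    // No outstanding borrows now, so exclusive access is granted.
    b.borrow_mut().push(4);

    let contents = a.borrow().clone();
    (refused, contents)
}
```

The exclusivity check moves from compile time to run time, which is the price of letting several owners reach the same storage.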

1 Like

We have an existing scheme for dynamically enforcing exclusive access to classes, globals, and other dynamically shared-mutable things, which we're also planning to use here. So every class ivar is in essence wrapped in a RefCell.

To be clear it's not impossible to build a version of Rust that says "building your own safe version of mem::forget is unsound (and any API which can be used to do so is Also Unsound)". We just decided that on balance it's not worth the effort.

Someone actually tried to fork Rust (or at least, the stdlib) to do exactly that in response to The Leakpocalypse™, although obviously that didn't get far politically.

One approach is to introduce a LeakSafe marker trait (analogous to Send/Sync) and then make Rc<T> require T: LeakSafe. Types like Drain would opt out of LeakSafe, and be disallowed from being put in refcounted cycles.
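A very rough sketch of the shape this could take. Everything here is hypothetical: LeakSafe, LeakSafeRc and DrainLike do not exist in std, and a real design would presumably use an auto trait like Send/Sync rather than the manual opt-in impls shown (auto traits are not available on stable Rust).

```rust
use std::ops::Deref;
use std::rc::Rc;

// Hypothetical marker trait; nothing like this exists in std.
// Assumed contract: leaking a value of this type breaks no invariant.
pub unsafe trait LeakSafe {}

unsafe impl LeakSafe for i32 {}
unsafe impl LeakSafe for String {}
unsafe impl<T: LeakSafe> LeakSafe for Vec<T> {}

// Hypothetical refcounted pointer that only accepts LeakSafe payloads.
pub struct LeakSafeRc<T: LeakSafe>(Rc<T>);

impl<T: LeakSafe> LeakSafeRc<T> {
    pub fn new(value: T) -> Self {
        LeakSafeRc(Rc::new(value))
    }
}

impl<T: LeakSafe> Deref for LeakSafeRc<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}

// A Drain-like guard would simply not implement LeakSafe.
pub struct DrainLike;

pub fn demo_leak_safe() -> usize {
    let shared = LeakSafeRc::new(vec![1i32, 2, 3]);
    // LeakSafeRc::new(DrainLike); // would not compile: DrainLike is not LeakSafe
    shared.len()
}
```

The cost is that every API capable of leaking (not just Rc) would need the same bound, which is a large part of why this was judged not worth it.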

Another approach would be to introduce a marker trait that prevents interior-mutability types from being put in Rc, making cycles impossible (really restrictive...).

Another another approach would be to mandate no borrows stored in classes (equivalent to Rust requiring 'static). This one seems perhaps the most viable for Swift, since you've already embraced restrictions on where borrows can go. This would Suck for collections (an "Array of references" is pretty common), but maybe CoW emulated-value-semantics types can manually opt into it? Maybe that's sound?
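In Rust terms, the "no borrows stored in classes" rule is roughly a 'static bound on whatever goes into the refcounted box. A minimal sketch, with a hypothetical share function:

```rust
use std::rc::Rc;

// Hypothetical constructor: the `'static` bound rejects any payload that
// contains a borrow of a local.
pub fn share<T: 'static>(value: T) -> Rc<T> {
    Rc::new(value)
}

pub fn demo_static_only() -> usize {
    let owned = share(String::from("owned data"));
    // let local = String::from("local");
    // share(&local); // would not compile: `local` does not live long enough
    owned.len()
}
```

Owned data passes through untouched; only payloads carrying a shorter-lived borrow are rejected.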

1 Like

This looks mostly like a self-contained step towards fully usable non-copyable types.
The only concern I have is whether we would instead want a @noncopyable type to imply ?Copyable on the generic parameters it uses or declares.

2 Likes

Was consideration given to using NonCopyable as a marker protocol, rather than @noncopyable as an attribute? I didn't see it discussed in the "Alternatives Considered". It seems like Swift has recently preferred to express such features with protocols, rather than attributes. (For example, see Sendable.)

6 Likes

I think this is implicitly answered by:

NonCopyable is not a constraint, it is the lack of a Copyable constraint. It doesn't make sense to think of it like a protocol in any other context, since a generic parameter or protocol that allows for noncopyable types to conform would not require the type to be noncopyable, it allows the type to be noncopyable, and would still accept copyable types.

It could make sense to adopt a "don't require this constraint" syntax like Rust's ?Trait, so you'd write struct Foo: ?Copyable or something like that to opt the type out of satisfying the Copyable constraint. Then eventually you'd also be able to write foo<T: ?Copyable> to specify a generic parameter that doesn't have to be copyable and so on. "Copyable" as a constraint still doesn't quite make sense to think of as a protocol, though, because it's a fundamental trait of the type, and can't be retroactively added or have multiple independent conformances like a protocol can, and it's not a "marker protocol" either because it has a fundamental runtime ABI impact on the type. I think it would end up being more akin to a layout constraint like AnyObject, which is also implicitly "conformed to" by all types that satisfy its requirements (that is, single-refcounted-pointer types like classes and class existentials without witness table requirements), and cannot be explicitly or retroactively conformed to by types that don't fundamentally satisfy that constraint.
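For reference, this is how the existing Rust ?Trait syntax behaves with Sized, the one trait it currently applies to. The describe function is a made-up example; the point is that relaxing the bound admits unsized types without excluding sized ones.

```rust
use std::fmt::Debug;

// `?Sized` removes the implicit `Sized` requirement; it does not demand
// that `T` be unsized.
pub fn describe<T: ?Sized + Debug>(value: &T) -> String {
    format!("{:?}", value)
}

pub fn demo_relaxed_bound() -> (String, String) {
    let unsized_case = describe::<str>("hi"); // `str` is not Sized
    let sized_case = describe::<i32>(&7); // `i32` is Sized and still accepted
    (unsized_case, sized_case)
}
```

A hypothetical ?Copyable would presumably read the same way: "this parameter is not required to be copyable", not "this parameter must be noncopyable".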

4 Likes

Is “Unique” an accurate synonym for “NonCopyable”? That’s a positive constraint.

I also don’t understand the conclusion that copyable types would always satisfy NonCopyable type parameters. struct T<U: NonCopyable> doesn’t allow any copyable types to substitute for U.

There isn't anything fundamental that prevents a copyable type from being used as a noncopyable one. The implementation just wouldn't use the copy functionality. It's like talking about "types that don't have a foo() method" when you have a protocol Foo { func foo() }. We could make up a different requirement for "must-be-unique", but we'd have to decide what the important properties of types that satisfy that requirement are, and what sorts of code you'd write that takes advantage of those properties. It still wouldn't be something that noncopyable types just automatically satisfy, because nothing prevents a noncopyable type from implementing some manual-copy API, or being used to represent a handle to a shared resource that the client is not allowed to propagate but might have handles elsewhere in the program or OS.

1 Like

That’s precisely what I would want the @noncopyable attribute to do: prevent var x = MyNonCopyable(); let y = x. Generalized, this is func withHandle<Handle: NonCopyable, T>(_ handle: borrowing Handle, perform closure: (borrowing Handle) rethrows -> T) { closure(handle) }. Another name for withHandle could be ensuringUniquelyOwned.

I presume MyNonCopyable() is concretely a non-copyable type, so let y = x would in fact move x into y rather than copy it. When we generalize the generics system to include noncopyable types, then withHandle can't assume Handle is copyable, and it would act as a noncopyable type within the body of the function. So the function would not be able to introduce copies of whatever Handle is, but callers can still pass in copyable types for the Handle that they can copy themselves.
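Rust already works this way for Clone, so a rough analogue of withHandle can be sketched there. FileHandle and with_handle are hypothetical names, and Rust's lack of a Clone bound stands in for the proposed "does not require Copyable".

```rust
// Hypothetical move-only handle: no Clone or Copy impl.
pub struct FileHandle {
    fd: i32,
}

// No `H: Clone` bound, so the body cannot copy `handle`; it can only borrow.
pub fn with_handle<H, T>(handle: &H, perform: impl FnOnce(&H) -> T) -> T {
    perform(handle)
}

pub fn demo_with_handle() -> (i32, i32) {
    let unique = FileHandle { fd: 3 };
    let from_unique = with_handle(&unique, |h| h.fd);

    // A copyable type is accepted too; the caller remains free to copy it.
    let copyable: i32 = 40;
    let _duplicate = copyable;
    let from_copyable = with_handle(&copyable, |n| *n + 2);

    (from_unique, from_copyable)
}
```

Inside with_handle the type H behaves as move-only regardless of what the caller passed, which is the behaviour described above for Handle.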

You may also be interested in the @noImplicitCopy attribute, which we're implementing to suppress implicit copying on values that are of copyable type.

1 Like

It is useful to be able to write generic code that works on an arbitrary type, copyable or not. Many basic data structures and algorithms do not innately require copying values and so can naturally be generalized to work with non-copyable types. For correctness, code like that has to work generically with the type as if it were non-copyable. When given a copyable type, it effectively promises not to (locally) copy it.

From that perspective, copyability is a positive constraint, just like an ordinary protocol constraint. For example, if Array were generalized to allow non-copyable types, it would still be conditionally copyable:

extension Array: Copyable where Element: Copyable

(You might think that you could do the same with NonCopyable as a positive constraint, and in simple cases this does make sense:

extension Array: NonCopyable where Element: NonCopyable

But the basic direction of the logic is wrong, as you can see when you consider a similar generalization of Dictionary:

extension Dictionary: NonCopyable where Key: NonCopyable, Value: NonCopyable

This is wrong; it is making Dictionary non-copyable if both of the key and value types are non-copyable, but in fact Dictionary needs to be non-copyable if either argument is non-copyable. Similar logic applies to e.g. inferring non-copyability for structs. This is a clear sign that the direction of the constraint is backwards.)
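Rust's Clone follows the same positive-constraint direction, so the Dictionary case can be sketched there. Table and Unique are made-up types; the impl is written out by hand to make the bounds visible.

```rust
// Hypothetical two-parameter container standing in for Dictionary.
pub struct Table<K, V> {
    entries: Vec<(K, V)>,
}

// Copyability as the positive constraint: Table is clonable only when
// BOTH parameters are, so it is non-clonable when EITHER one is not.
impl<K: Clone, V: Clone> Clone for Table<K, V> {
    fn clone(&self) -> Self {
        Table { entries: self.entries.clone() }
    }
}

// Hypothetical move-only value type.
pub struct Unique(pub u32);

pub fn demo_conditional_clone() -> (usize, usize) {
    let copyable = Table { entries: vec![("a", 1), ("b", 2)] };
    let copy = copyable.clone();

    let move_only = Table { entries: vec![("k", Unique(7))] };
    // move_only.clone(); // would not compile: Unique is not Clone
    let moved = move_only; // moving is still fine

    (copy.entries.len(), moved.entries.len())
}
```

A conjunction of positive bounds gives the "either argument" behaviour for free, which a conjunction of negative bounds cannot express.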

12 Likes

How does the compiler know that it’s OK to pass a @noncopyable type to such a function? Just because a parameter is marked borrows doesn’t mean the function can’t copy it internally. There must be some other element of the function’s type signature that the compiler can check at the call site, no?

That's the part of the proposal we're punting to another day by saying "you can't do that" at all. But yeah, in the fullness of time, there will need to be some way to indicate generic parameters that don't require their types to be copyable. A few ways we've thought about doing this are to have a Rust-style T: ?Copyable syntax to say that T does not require Copyable, or a declaration level @allowNonCopyable that specifies that generic parameters don't implicitly require copyability, so you can then explicitly put T: Copyable on the ones that do require copying.

2 Likes

Without this, how can one actually do anything useful with a @noncopyable type? The pitch talks about borrowing/inout/consume, but none of these parameter decorations imply the callee won’t try to copy the argument. It seems like solving that problem is a prerequisite, not something that can be deferred.

The callee won't try to copy arguments that are concretely of noncopyable type, because it can't; there is no copy operation anyone can use on the type. The impacts this has on behavior inside the function are pretty much identical to those that apply to @noImplicitCopy bindings: if you have a borrow parameter, you can only do borrow things with it and can't consume it; if you have an inout parameter, you can borrow or modify the parameter, but have to give it back when you're done; if you have a consume parameter, it's yours to do with as you please, up until you give it to someone else to consume.
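The three conventions map loosely onto Rust's &T, &mut T and by-value parameters, so here is a sketch in those terms. Token and the function names are hypothetical, and the mapping to the pitch's borrow/inout/consume is an approximation rather than an exact equivalence.

```rust
// Hypothetical move-only type: no Clone or Copy impl.
pub struct Token {
    uses: u32,
}

// "borrow": read-only access; the caller keeps ownership.
pub fn peek(token: &Token) -> u32 {
    token.uses
}

// "inout": exclusive access that must be handed back on return.
pub fn bump(token: &mut Token) {
    token.uses += 1;
}

// "consume": takes ownership; the caller's binding is dead afterwards.
pub fn redeem(token: Token) -> u32 {
    token.uses
}

pub fn demo_conventions() -> (u32, u32) {
    let mut token = Token { uses: 0 };
    bump(&mut token);
    let seen = peek(&token);
    let redeemed = redeem(token);
    // peek(&token); // would not compile: use of moved value `token`
    (seen, redeemed)
}
```

None of the three functions can duplicate the Token, because the type offers no operation to do so.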

Ah, this explains why @noncopyable cannot be applied to classes: otherwise a value of @noncopyable class Sub : Super could be passed to a parameter of type Super, and the callee could try to copy it. The problem is any kind of polymorphism, not just generics.

One big difference is that @noImplicitCopy is documented to be ABI-neutral, whereas it would be an ABI break to remove the marker for “generic over copyable and non-copyable types”. So there would be another keyword or syntax which does almost the same thing as @noImplicitCopy.

One of the good changes that happened during the design process for concurrency checking is the change from @Sendable-everywhere to : Sendable. It reuses existing language concepts, making sendability feel more integrated with the standard library and the language. By contrast, all these variations, with and without @ prefixes, are reminiscent of Java annotations and make move semantics feel more like a bolt-on feature.

I'm not a great fan of the @noImplicitCopy naming or syntax, and would definitely love to come up with something better. The reason @noImplicitCopy doesn't affect ABI is that it's an artificial local constraint, that ties the compiler's hands from using its normal ARC and value semantics tricks to manage memory for a particular value, whereas whether a type even has a "copy" operation at all to stick in the value witness table is a fundamental ABI concern. I think we will however want to be able to allow some degree of retrofitting noncopyable type support into the ABI; the basic calling conventions for passing a noncopyable value don't change, and it would be useful to be able to replace standard library APIs with implementations that don't rely on copying, and then expose them as not requiring copyability in a new version of the OS.

The "bolted-on"-ness is maybe a little bit intentional, since we still expect a good majority of Swift code will remain oblivious to copy controls and still be fine with automatic memory management, so we're trying to keep noncopyable types and the various controls on copy behavior on the "progressive" side of "progressive disclosure". I like the idea of a Rust-style : ?Copyable constraint remover; that seems like a nice unifying syntax for opting types out of providing copyability and stopping generics from requiring it. Is var x: ?Copyable = copyableValue() too weird as a way of spelling @noImplicitCopy on an individual binding?

2 Likes