[Pitch] Noncopyable (or "move-only") structs and enums

This is very nicely subsetted out from the full feature set. Obviously I look forward to generics-compatible non-copyable types, but I agree that this has use cases on its own, and it is definitely easier to review these proposals in chunks rather than all at once, even as we treat them as part of a whole. Kudos!

People already seem to be discussing a number of things I'm interested in, especially "should the attribute affect the type's generic parameters" because that does affect future proposals. There's only one thing that's jumped out at me that no one's mentioned:

For a local var or let binding, or consume function parameter, that is not itself consumed, deinit runs after the last non-consuming use.

What does "after" mean here? The example shows a non-consuming use in a function call, followed by deinitialization on the next line. What if there are nested function calls? Thus far, Swift does not share C++'s concept of a "full-expression", so would the deinit happen between the inner and outer calls, like inout ending/cleanup? If the function is inlined, could the deinit happen before it has even completed? And if I want my binding to live longer, I can certainly use an explicit consume x, but forgetting to do so could result in a bug.
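To make the question concrete, here is a hedged sketch; the ~Copyable spelling and the Resource type are my own illustration, not taken from the pitch text:

```swift
// Hypothetical noncopyable type; the spelling and API are illustrative only.
struct Resource: ~Copyable {
    let id: Int
    deinit { print("deinit \(id)") }
    func use() { print("using \(id)") }  // a non-consuming (borrowing) use
}

func demo() {
    let r = Resource(id: 1)
    r.use()  // last non-consuming use of r
    // Under the pitched rule, r's deinit runs "after" this use.
    // Before or after the next line? Between inner and outer calls,
    // if the use had been nested inside another call?
    print("more work; r is no longer used")
    // Under an end-of-scope rule (Rust's model for Drop types),
    // the deinit would not run until here.
}

demo()
```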

Between the discussion about class instances being released unexpectedly early (which I'm having trouble finding at the moment), and the precedent set by Rust, I would appreciate more discussion on why the default behavior isn't to deinit at end of scope, with an early consume allowing for more control when needed.

EDIT: I can think of one reason why this rule isn’t sufficient: directly passing an owned return value to a borrow parameter, or discarding one. It would make sense to me if those behave like inout while named bindings behave like defer, just as struct members and enum payloads have their own ordering.
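Concretely, the case I mean (the types and function names here are hypothetical, just to illustrate the two shapes):

```swift
// Hypothetical noncopyable type and helpers illustrating the two cases.
struct FileDescriptor: ~Copyable {
    let fd: Int32
    deinit { print("closing \(fd)") }
}

func makeFD() -> FileDescriptor { FileDescriptor(fd: 3) }      // owned return value
func log(_ file: borrowing FileDescriptor) { print("fd \(file.fd)") }

// An owned return value passed directly to a borrowing parameter:
// there is no named binding, so "end of scope" gives no answer here.
// Ending the temporary right after the call, like inout, seems natural.
log(makeFD())

// A named binding could instead behave like defer, living to the end
// of its scope:
let fd = makeFD()
log(fd)
```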

5 Likes

Yes please. A major reason to adopt move semantics is to end reliance on optimizer magic. It seems counterproductive for the default mode to defer to the optimizer’s lifetime analysis.

2 Likes

Sorry for not being clear. The intent is to specify that the value is destroyed immediately after its last borrowing use ends. So that's stricter than end of scope, but should still be a well-defined location, not subject to optimizer whims, since borrows of noncopyable values begin and end at well-defined places. Looking toward values with lifetime dependencies, which may be lifetime-bound to borrows or directly contain borrows of other values, I expect that shrinkwrapping the lifetimes would get us closer to Rust's "non-lexical lifetimes" model, so code doesn't need to manually shorten lifetimes to avoid interfering borrows when values linger. I thought that was how Rust worked in general—is there a different rule for Drop types?

On that note, another area of design here has to do with library evolution and deinits—do we want to allow for public types to add deinits without affecting API or ABI? Lifetime aside, the presence of a deinit also puts some restrictions on how code outside of the type can consume it—there needs to be a whole value for the deinit to consume, so you can't partially destructure a value with a deinit by consuming some of a struct's fields or doing a consuming switch on an enum. Library evolution would also be a wrinkle in allowing for different lifetime semantics for types with or without deinits.
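A sketch of that restriction, with the ~Copyable spelling and the exact diagnostic assumed rather than quoted from the pitch:

```swift
enum Message: ~Copyable {
    case text(String)
    case count(Int)
    deinit { print("message destroyed") }
}

func unpack(_ m: consuming Message) -> Int {
    // The deinit needs a whole value to consume, so a consuming switch
    // that tears the value apart into its payloads has to be rejected
    // while the deinit exists:
    switch consume m {  // assumed error: cannot destructure a type with a deinit
    case .text(let s): return s.count
    case .count(let n): return n
    }
}
```

Removing the deinit would make the switch legal again, which is exactly why adding a deinit later is an API-affecting change for a public type.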

1 Like

Yeah, non-lexical lifetimes only apply to references; other types still use the classic “end of scope” model. See 2094-nll - The Rust RFC Book

3 Likes

Just a naming thing: I find myself thinking about what @John_McCall said about naming conventions, and my own vague intuition about there being an lvalue/rvalue-like distinction here that naming should respect.

Looking at the code examples in context in this proposal, my gut feeling is that the keyword should be a gerund (consuming / borrowing) in declarations:

  func write(_ data: [UInt8], to file: borrowing FileDescriptor) {
                                       ^^^^^^^^^

  func close(file: consuming FileDescriptor) {
                   ^^^^^^^^^

That reads better to my eye. John was skeptical of the gerund, but darn it, in context that just flows off the mental tongue, as it were. The write function writes to a file by borrowing a FileDescriptor. It’s right there in the code.

I do also see the case for a past participle:

  func write(_ data: [UInt8], to file: borrowed FileDescriptor) {
                                       ^^^^^^^^

The colon usually reads as “which is a”, and this naming fits: “Write data, which is a UInt8 array, to file, which is a borrowed FileDescriptor.” Both those options read better to me than the imperative verb borrow.

To be clear, the keyword should still be an imperative verb (consume / borrow) when applied to an expression that supplies a value:

  munge(borrow thinger)
        ^^^^^^

  funge(consume thinger)
        ^^^^^^^

Trying to articulate my intuition here: one describes what will happen elsewhere whenever the thing is used; the other describes what does happen right there in the usage site. That's post hoc explanation, to be clear; I'm just reacting to the fluency of the code itself.

5 Likes

@Paul_Cantrell Did you mean to post this over in the other review thread?

Perhaps? It's the code examples from this proposal I was referring to, though you're right that it's more on-topic for the other proposal. Feel free to move it (or let me know if I should).

I just think perhaps they'd like to hear about your opinions over there in that thread :)

munge(lend thinger) in that case, or?! :-)

1 Like

I had a similar thought. lend / loan seemed like a bridge too far at first blush…but you do have a point!

1 Like

You could argue for lend(ing)/borrow(ing) too…

That set of alternatives was already covered during the review for parameter ownership modifiers, which is also still ongoing here. Thoughts on naming the modifiers might be better discussed in the active review thread; we will use whatever modifiers get selected there in the noncopyable types proposal when it is ready for review.

4 Likes

Thanks Joe, apologies for the noise - I must say it is very hard to keep up with all threads and the preceding discussions in general, utmost respect for those of you who manage.

3 Likes

No worries, I'm not sure the review managers for those other proposals are keeping on top of this thread, so I want to make sure they see all of your naming feedback relevant to the proposals too.

1 Like

I would be interested to see whether the Rust community has a sense of how often lexically scoped lifetimes prevent issues that would've been caused by using non-lexical time-of-use lifetimes for everything instead (@Gankra maybe you've blogged about this in the past?).

Mutex guards are the classic C++ example…that immediately becomes irrelevant once you put the data inside the mutex like Rust does.
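For anyone who hasn't seen the Rust pattern: the idea is that the mutex owns its data, so the data is only reachable while the lock is held. A rough Swift analogue would look like this (a hypothetical type for illustration, not a standard library API):

```swift
import Foundation

// Hypothetical Rust-style mutex that owns the value it protects.
final class DataMutex<Value> {
    private var value: Value
    private let lock = NSLock()

    init(_ value: Value) { self.value = value }

    // The value can only be touched inside withLock, so there is no
    // separate guard object whose destruction timing matters.
    func withLock<R>(_ body: (inout Value) throws -> R) rethrows -> R {
        lock.lock()
        defer { lock.unlock() }
        return try body(&value)
    }
}

let counter = DataMutex(0)
counter.withLock { $0 += 1 }
```

With this shape, the classic C++ bug of a lock guard being destroyed too early simply has no analogue: releasing the lock and losing access to the data are the same event.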

5 Likes

Yeah, my current thinking is very much along those lines. Semantically relying on destructor ordering is inherently subtle in a way that makes it dangerous even without compiler action. It promotes bugs, and good APIs don't do that.

Drop running on scope exit is kinda just a sacrosanct truth in Rust that exists as a concession to making programs easier to reason about. It's most important for unsafe code where you may have untracked raw pointers into a buffer and Really Don't Want That Buffer To Go Away Early. I'm honestly a bit surprised you're still looking at this, given you were previously finding issues with this approach with nice and safe ARC code.

The Rust team looked into what you're proposing a bit a couple years ago, calling it "eager drop", although I wasn't around for it, so I'm just linking things:

The conclusion was "no", of course.

(Currently typing up a followup comment that just braindumps a bunch of extra random facts about drops+borrows in Rust.)

6 Likes

The Dream is still that there is a well-behaved-enough subset of types for which eager-drop semantics are actually safe and sound; the two biggest issues we tried to mitigate with ARC stem from the unknowable nature of dense reference-counted object graphs, and (as you noted) from the interactions with unsafe constructs.

Lexical lifetimes prevent the "buffer goes away too early" issue from happening. At the SIL level we are able to represent the notion of an escape, and in such a case we just leave the lifetimes alone.