[Pitch] Noncopyable (or "move-only") structs and enums

Perhaps? It's the code examples from this proposal I was referring to, though you're right that it's more on-topic for the other proposal. Feel free to move it (or let me know if I should).

I just think perhaps they'd like to hear about your opinions over there in that thread :)

munge(lend thinger) in that case, or?! :-)

1 Like

I had a similar thought. lend / loan seemed like a bridge too far at first blush…but you do have a point!

1 Like

You could argue for lend(ing)/borrow(ing) too…

That set of alternatives was already covered during the review for parameter ownership modifiers, which is also still ongoing here. Thoughts on naming the modifiers might be better discussed in the active review thread; we will use whatever modifiers get selected there in the noncopyable types proposal when it is ready for review.

4 Likes

Thanks Joe, apologies for the noise - I must say it is very hard to keep up with all threads and the preceding discussions in general, utmost respect for those of you who manage.

3 Likes

No worries, I'm not sure the review managers for those other proposals are keeping on top of this thread, so I want to make sure they see all of your naming feedback relevant to the proposals too.

1 Like

I would be interested to see whether the Rust community has a sense of how often lexical-scoped lifetimes prevent issues that would've been caused by using non-lexical time-of-use lifetimes for everything instead (@Gankra maybe you've blogged about this in the past?).

Mutex guards are the classic C++ example…that immediately becomes irrelevant once you put the data inside the mutex like Rust does.
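
To make that concrete, here is a minimal Rust sketch (the `count_in_parallel` function is made up for illustration). Because the counter lives inside the `Mutex`, the only way to reach it is through the guard returned by `lock()`, so the guard's last use is also the last access to the protected data, and releasing the lock right after that last use can't expose anything.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical example: the counter lives *inside* the mutex, so every
/// access has to go through the guard returned by `lock()`.
fn count_in_parallel(threads: usize, per_thread: u32) -> u32 {
    let counter = Arc::new(Mutex::new(0u32));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // The guard is a temporary: it is dropped, and the lock
                    // released, at the end of this statement.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Bind the result first so the guard is released before `counter`
    // itself goes out of scope.
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("total = {}", count_in_parallel(4, 1_000));
}
```

Contrast that with the C++ pattern where the guard is a standalone local that is never mentioned again after it is created: there, "drop after last use" would release the lock before the protected data is touched.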

5 Likes

Yeah, my current thinking is very much along those lines. Semantically relying on destructor ordering is inherently subtle in a way that makes it dangerous even without compiler action. It promotes bugs, and good APIs don't do that.

Drop running on scope exit is kinda just a sacrosanct truth in rust that exists as a concession to making programs easier to reason about. It's most important for unsafe code where you may have untracked raw pointers into a buffer and Really Don't Want That Buffer To Go Away Early. I'm honestly a bit surprised you're still looking at this, given you were previously finding issues with this approach with nice and safe ARC code.

The Rust team looked into what you're proposing a bit a couple years ago, calling it "eager drop", although I wasn't around for it, so I'm just linking things:

The conclusion was "no", of course.

(Currently typing up a followup comment that just braindumps a bunch of extra random facts about drops+borrows in Rust.)

6 Likes

The Dream is still that there is a well-behaved-enough subset of types for which eager-drop semantics are actually safe and sound; the two biggest issues with ARC we tried to mitigate come up from the unknowable nature of dense reference-counted object graphs, and (as you noted) the interactions with unsafe constructs.

Lexical lifetimes prevent the buffer-going-away-too-early issue from happening. At the SIL level we are able to represent the notion of what is an escape and, in such a case, just leave the lifetimes alone.

Ok so there's a LOT of weird little fiddly things going on with drops and borrows in Rust. A completely random braindump:

  • There's a weird specific rule/guideline in rust that certain aspects of borrows/drops are purely syntactic. I can never remember the details but the gist of it is that you don't want improvements to the borrowchecker or lifetimes to change what code visibly executes. The borrow checker checks for errors and then goes away forever. Lifetimes Do Not Affect Codegen. Syntactic Drops Good.

  • I do have one salient article on Designing Deinitialization In Programming Languages. Starting in sections 3.3 and 3.4 I discuss how Rust and Swift both expose Definite Initialization to the end user and allow for a variable to have delayed initialization. In sufficiently dynamic situations this necessitates a flag on the stack to track the current initialization state of a variable (basically making it an implicit Optional but you get compile errors if you ever read when the Option is maybe None). This necessarily affects whether Drop/deinit runs.

  • Once you introduce noncopyable types you also get dynamic deinitialization based on whether a value was moved out or not. We messed around with the concept of "static drop" and decided it was too spooky (see section 4.5 of the article in the previous bullet, I'm being rate-limited on links...). The TL;DR is that if you move a variable out in an if, the Drop would get hoisted up from where the variable went out of scope to the (implicit) else block of that if. In this way it would have guaranteed that variable initialization status is always statically known. But again, spooky as hell.

  • What immediately jumps out to me as the kind of thing that would break with "eager drop" is "unwind guard" types which exist to emulate finally-blocks. These often are otherwise unused, and strictly exist to run when the scope ends. I can't remember exactly but I'm 100% certain that Swift, the language with every feature, has some kind of finally/defer so this is less of a problem for y'all.

  • Another thing that's really sketchy is people reasoning about stable addresses under moves. I know this is a thing Swift has largely told people they're not allowed to do (see withUnsafeMutablePointer) but hey I'm dumping Rust stuff, not Swift stuff. As an example, you might want to pass a Box<T> and a *mut T that points into it to a function, and "know" that's fine because the Box's contents are at a stable address and won't get messed up by moves... except that we want to mark Box as noalias and so LLVM might get the wrong idea and believe that the raw pointer can't point in there! It's a whole fucking thing and I hate it. Under eager drop this is definitely also a busted pattern, since the compiler has no idea one borrows the other.

  • The scoping of ~temporaries sometimes surprises people, although the failure-mode is generally only observed with types like Mutex where keeping a value alive for too long is a correctness error (causes a deadlock). In particular how long locks get held when the matched-upon expression involved a lock. This situation would be improved by trying to more aggressively drop the MutexGuard and release the lock. There's a clippy lint to try to catch this but last I checked it's too aggressive (complains about using Drain idiomatically and correctly because a for loop is basically a match and Drain has a destructor): Clippy Lints

  • In a similar vein, people get surprised by the fact that let _ = x; and let _y = x; have different destructor behaviour. The former isn't a variable binding, but rather a pattern that captures nothing. As such the temporary x goes out of scope and is dropped. The latter actually captures it in a normal variable (the prefix underscore tells the compiler it's fine that it's seemingly unused) and drops it when the variable goes out of scope.

  • In another similar vein, people get surprised by the fact that let (x, y) = z and let x_and_y = z have different drop behaviour (iirc). This is because of the other surprising fact that Rust drops fields in declaration order, while it drops variables in reverse-declaration order (arguably a bug, but it is simply truth now). In practice neither the issue in this bullet nor the one in the previous bullet is a problem, because in 99.9% of the cases where it would matter you just get compiler errors from the borrow checker or definite-initialization checker.

  • Actually as far as the compiler is concerned, the two cases in the previous bullet are dropped at the EXACT SAME time. Specifically, if y borrowed x, the "lifetimes" of y and x are "equal". In conjunction with the extremely-sketchy-and-long-storied dropck eyepatch, this allows a destructor to run while a type contains dangling pointers! Safely! Correctly! Here's a demo with more details, but this is useful/important for supporting Arenas which end up being very intrusive and a mess for drop order.
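
For anyone who wants to see a few of the bullets above in action, here is a minimal sketch. The `Noisy` and `Pair` types and the `drop_order_demo` function are made up for illustration. It logs the order in which values are dropped, covering `let _ = ...` versus `let _name = ...` when the right-hand side is a temporary, reverse-declaration order for variables, and declaration order for fields.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared log recording the order in which values are dropped.
type Log = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical helper: pushes its name onto the log when dropped.
struct Noisy {
    name: &'static str,
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn noisy(name: &'static str, log: &Log) -> Noisy {
    Noisy { name, log: Rc::clone(log) }
}

/// Two fields, declared in the order `first`, `second`.
#[allow(dead_code)]
struct Pair {
    first: Noisy,
    second: Noisy,
}

fn drop_order_demo() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        // `_` is a pattern that binds nothing, so this temporary is
        // dropped at the end of the statement.
        let _ = noisy("wildcard", &log);
        // `_guard` and `_later` are real variables; they live until the
        // end of the block.
        let _guard = noisy("guard", &log);
        let _later = noisy("later", &log);
        log.borrow_mut().push("end of block");
        // Variables drop in reverse declaration order:
        // `_later` first, then `_guard`.
    }
    {
        // Fields drop in declaration order: `first`, then `second`.
        let _pair = Pair {
            first: noisy("first", &log),
            second: noisy("second", &log),
        };
    }
    let entries = log.borrow().clone();
    entries
}

fn main() {
    for entry in drop_order_demo() {
        println!("{entry}");
    }
}
```

The log comes out as wildcard, end of block, later, guard, first, second. The `_guard` binding is also the shape of the "unwind guard" pattern: it is never used again, it exists only to run its destructor at scope exit, and that is exactly what an eager-drop rule would move earlier.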

8 Likes

Oh also two things where I'm moreso talking out of my ass but know things are spooky:

  • There is some messiness in formalizing Rust around the fact that drop takes &mut self, which is supposed to mean that self is valid for the entire body of the function, but drop is explicitly doing things that invalidate the value. I don't know if there's been a real resolution to this; last I checked we were in "try not to think about it". I think this issue might have been the one I recall reading?

  • There is also some messiness around Drop and Pinning, which is why there are several sections in the docs on Pin<T> dedicated to precise interactions with Drop. I think there's generally a desire to have some kind of notion of "async drop" but I'm just gonna be bluntly honest and say that Pin/async is stuff I simply don't understand properly and is clearly pushing up against semantic limits of the language.

6 Likes

Thanks for the memory dump @Gankra!

In Rust, I recall that panics generally prevent &mut borrows from being temporarily invalidated, and it makes sense to me that that could also be a hazard for drop implementations, since you probably don't want every drop to have to do the reverse definite initialization thing and maintain a dynamic bitmap of the value state in case it panics and you have to destroy the currently-valid components as you unwind. We're initially taking the "self is inout in deinit" tack because we also don't want to hold up putting noncopyable types in developers' hands on implementing that partial invalidation analysis right away, but I think we want to in the fullness of time.

Just a comment on some code block in the document, isn't the second parameter missing an ownership keyword?

when a function parameter is declared with a noncopyable type, it must declare whether the parameter uses the borrow, consume, or inout convention:

func redirect(_ file: inout FileDescriptor, to otherFile: FileDescriptor) {
1 Like

I've incorporated some feedback from the discussion so far into the proposal; thanks everyone!

I'd like to continue the design discussion, and I have a few particular open questions I'd like to hear more feedback on:

  • Should noncopyable values have scoped lifetimes, or "eager drop" lifetime that ends after their last use, if they are not consumed?

  • Should noncopyable types be able to add a deinit without breaking ABI and/or API? In what circumstances? The potential existence of a deinit on a type imposes some interesting constraints on how the value can be used. Since there needs to be a complete value to be consumed at any point a deinit can run, this generally means that client code shouldn't be able to consume any part of the value, since doing so would invalidate the value without going through deinit:

    @noncopyable
    struct Foo {}
    
    @noncopyable
    struct Bar { var x, y, z: Foo }
    
    let bar = Bar(x: Foo(), y: Foo(), z: Foo())
    let foo = bar.x // ERROR: not allowed to take `x` away from `bar`
    
    @noncopyable
    enum Bas { case x(Foo), y(Foo), z(Foo) }
    
    let bas = Bas.x(Foo())
    switch bas {
    case .x(let foo): // ERROR: can't steal bas's payload, that might bypass deinit!
      ...
    }
    

    The restriction makes absolute sense for resource-managing types with meaningful deinits, but is inconvenient for types that really are intended to be simple aggregates, and it's particularly limiting for enums to not be able to pattern-match and consume their payloads. On the flip side, I think anyone who's worked in an object-oriented language with some kind of destructors in it has had reason to retroactively add a deinit to their classes at some point in their careers. So it stands to reason that some amount of resilience to adding deinits would be a good thing, but there are also types that will never have deinits which should allow for flexible destructuring of their members. Is an existing control like @frozen sufficient, or do we need finer-grained controls?

  • To reduce annotation burden, should we treat copyability like Sendable, and say that structs and enums are implicitly copyable when all of their members are, and they don't define a deinit (and they don't do other things we might add to the language in the future that require noncopyability)? That could significantly reduce how often developers need to explicitly tag noncopyable types with a @noncopyable attribute, Self: ?Copyable generic anti-constraint, or other annotation.
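
For comparison only, here is how Rust handles the deinit question above. This is Rust, not the proposed Swift design, and `Plain`, `Guarded`, `into_x` and the two `destructure_*` functions are hypothetical names. A type with a `Drop` impl cannot have its fields moved out by client code (error E0509), while a plain aggregate can be destructured freely. Adding `Drop` to an existing type is therefore a source-breaking change for any client that destructures it by move.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Plain aggregate with no `Drop` impl: clients may move fields out.
struct Plain {
    x: String,
    y: String,
}

/// Resource-like type with a destructor. The flag only exists so the
/// example can observe that `drop` ran.
struct Guarded {
    x: String,
    dropped: Rc<Cell<bool>>,
}

impl Drop for Guarded {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

impl Guarded {
    /// The type itself decides how a field may leave: an empty `String`
    /// is swapped in, so `drop` still sees a complete value.
    fn into_x(mut self) -> String {
        std::mem::take(&mut self.x)
    }
}

fn destructure_plain(p: Plain) -> (String, String) {
    // Fine: no destructor, so both fields can be moved out.
    let Plain { x, y } = p;
    (x, y)
}

fn destructure_guarded(g: Guarded) -> String {
    // Rejected by the compiler if uncommented:
    // let Guarded { x, .. } = g; // error[E0509]: cannot move out of type
    //                            // `Guarded`, which implements `Drop`
    g.into_x()
}

fn main() {
    let (x, y) = destructure_plain(Plain { x: "a".into(), y: "b".into() });
    println!("plain: {x}, {y}");

    let dropped = Rc::new(Cell::new(false));
    let x = destructure_guarded(Guarded {
        x: "fd".into(),
        dropped: Rc::clone(&dropped),
    });
    println!("guarded: {x}, destructor ran: {}", dropped.get());
}
```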

2 Likes

I think it'd be very interesting to see how far the Sendable analogy can be pushed here. If viable there's a really elegant consistency.

2 Likes