[Pitch] Noncopyable (or "move-only") structs and enums

I expect that at some point in the future Copyable will be a protocol, and that the existential any Copyable will be a synonym for today's Any type, useful for cases where copyability needs to be explicit.


Yes, I believe we should implement both of these when we get a chance. They're both useful ways to store data on the heap, even though they behave quite differently:

A non-copyable class does not need any reference-counting. That makes it a very efficient way to store data on the heap as long as you don't need multiple references.

An indirect struct is essentially a compiler-generated copy-on-write structure. It acts just like a regular struct in that assignment is semantically the same as a copy. But note that in order to support copy-on-write, the backing storage will need to be reference-counted.
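A minimal hand-written sketch of the copy-on-write pattern described above, assuming a reference-counted box behind a struct. `Box` and `IndirectPoint` are hypothetical names; an `indirect struct` is not an existing feature, so this only approximates what the compiler might generate.

```swift
// Hypothetical names throughout; this is manual copy-on-write, not compiler output.
final class Box<Value> {
    var value: Value
    init(_ value: Value) { self.value = value }
}

struct IndirectPoint {
    // The heap storage is a class instance, so it is reference-counted.
    private var storage: Box<(x: Int, y: Int)>

    init(x: Int, y: Int) { storage = Box((x: x, y: y)) }

    var x: Int {
        get { storage.value.x }
        set {
            // Copy the heap box only if another value still shares it.
            if !isKnownUniquelyReferenced(&storage) {
                storage = Box(storage.value)
            }
            storage.value.x = newValue
        }
    }
}

var a = IndirectPoint(x: 1, y: 2)
let b = a   // shares the box; only the reference count changes
a.x = 10    // the box is shared, so it is copied before the write
print(a.x, b.x)
```

Assignment still behaves like a copy from the caller's point of view, which is why the reference count is needed: it is the only way to tell whether a write can happen in place.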


I've made some more small revisions to the pitch:


I had an interesting (to me, at least) realization this afternoon: a value type that is not movable and not copyable is essentially a value type that can't escape. For instance, if you have a non-movable and non-copyable pointer to a local variable, lifetime-wise, that pointer is safe.

I don't know what to do with this information but I figured this would be a cool place to leave it.


@fclout interestingly this also applies to move only types passed as a non-consuming function argument.


Hi everyone! I've revised the proposal again with some new material. Even if you've read it and been involved in the discussion before, I'd ask you to check the proposal again, as we've encountered some new design issues with the ABI and borrowing behavior while working on the implementation:

One particularly important issue is that noncopyable types are incompatible with the ABI we currently use for properties in library-evolution-enabled libraries. Properties in these types are normally abstracted behind a get accessor, but a get returns its result, which for a stored property necessitates a copy in order for the return value to have independent ownership from the field inside the value whose property is being accessed. We can't do that for noncopyable values, of course, so instead, we would need to use a _read coroutine as the most general ABI, since a _read coroutine can yield access to the field value in-place, or if the property is in fact computed, it can invoke the getter, yield access to the temporary return value, then destroy the temporary.
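To illustrate the difference between the two accessor shapes, here is a small sketch using the underscored `_read` accessor, which is an unofficial language feature. `Resource` and `Holder` are hypothetical names, and the types are copyable so that both forms compile side by side; this is not the resilient ABI itself.

```swift
struct Resource {
    var name: String
}

struct Holder {
    private var stored = Resource(name: "stored")

    // A getter returns an independently owned value, so the stored field is copied out.
    var viaGetter: Resource {
        get { stored }
    }

    // `_read` yields access to the field in place instead of returning a copy.
    var viaRead: Resource {
        _read { yield stored }
    }

    // A computed property can also sit behind `_read` by yielding a temporary.
    var computedViaRead: Resource {
        _read {
            let temporary = Resource(name: "computed")
            yield temporary
            // `temporary` is destroyed once the caller's access has ended.
        }
    }
}

let holder = Holder()
print(holder.viaGetter.name, holder.viaRead.name, holder.computedViaRead.name)
```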

This is fine, but it means that either adding or taking away copyability from an existing type is potentially ABI-breaking, where previously we had hoped that copyability could be added resiliently. Hopefully noncopyable types don't need to become copyable that often in practice, but if they do, we would need some attribute to indicate when a type was formerly-noncopyable in order to ensure that properties of that type continue to use the noncopyable ABI (which would work fine for a copyable type).

Aside from the additional ABI discussion, I've also ported over the discussion of what existing language operations are borrowing, consuming, or mutating, which I had originally written up as part of the @noImplicitCopy attribute pitch; since we've pivoted away from that design direction, this seems like the next natural place to put it. I also added "future direction" discussion for handling tuples and Optional non-copyable types, which seem like high-value specific use cases we might want to handle ahead of fully general generics support for noncopyable types.


Thanks! I read the new proposal and I have a few comments and questions.

In Consuming operations, the consume operator is used on bindings to copyable values and it invalidates the binding. On the other hand, consuming parameters accept arguments that don't have to be variable bindings and don't invalidate variable bindings to copyable types. Is that intentional?

Since there are many consuming operations, including assignments and returns, it's not clear to me what consume does that requires a keyword. _ = x seems like it would be the same as consume x, no? This is almost certainly something that has been discussed while I wasn't looking; having a short paragraph on the matter would probably help a bit.

Also, as I understand it, this is legal:

consuming func foo() { /* ... */ }

consuming func bar() {
	if baz {
		foo() // deinit not called
	} else {
		return // deinit called
	}
}

but this is not:

consuming func bar() {
	if baz {
		forget self // deinit not called
	} else {
		return // error: must consume self
	}
}

Why is forget different in that regard?

Looking at future directions, if we had a language in which any consuming method could leave self in a partially deconstructed state, would forget still be necessary?

Lastly, I think that close as a consuming and throwing operation is problematic. In the current implementation, failure leaks resources and I don't think that anything appealing can be done about it. In order to make sure that what's being built is actually correct, I was wondering if there's another example on hand that we can have for an operation that consumes and throws.

If you use a copyable value as the operand to a consuming operation, it will generally get copied, and the copy will be the thing that's consumed; the ARC optimizer might run over the code and figure out it can forward the original value along without copying because nobody else uses it afterwards. consume on a copyable variable forces that forwarding to happen by ending the lifetime of the value. With a noncopyable type, you have no choice in the matter; a consuming operation on a noncopyable value must either end its lifetime, since we can't pass a copy along, or must be an error if we are unable to consume the operand at that point. The explicit consume operator is only supported on local variables because those are the only things we can generally know we have ownership of in order to consume, though we could generalize it to struct fields inside consuming methods and other situations where we allow for partial consumption in the future.
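A short sketch of both cases, written with the `~Copyable` spelling from later Swift releases (5.9 and up) rather than the `@noncopyable` attribute used in this pitch. `Ticket`, `redeem`, and `demo` are hypothetical names.

```swift
struct Ticket: ~Copyable {
    let id: Int
}

func redeem(_ ticket: consuming Ticket) -> Int {
    ticket.id
}

func demo() -> (Int, [Int]) {
    // Copyable value: `consume` ends the binding's lifetime and forwards the value.
    let numbers = [1, 2, 3]
    let moved = consume numbers
    // Using `numbers` past this point would be a compile-time error.

    // Noncopyable value: passing it to a consuming parameter always ends its lifetime.
    let ticket = Ticket(id: 7)
    let id = redeem(ticket)
    // Using `ticket` past this point would also be rejected.

    return (id, moved)
}

let result = demo()
print(result.0, result.1)
```

The work is done inside a function because `consume` applies to local bindings, as described above.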

It doesn't have to be, but it seemed like a good idea to us to avoid surprises if someone is explicitly trying to avoid invoking the default deinit.

I would say that forget would be necessary to allow yourself to leave self in a partially deconstructed state if you have a deinit, because having a deinit would otherwise imply that it's invalid to leave self in a partially-valid state.
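As a rough illustration of suppressing the deinit, the sketch below uses `discard self`, which is the spelling that `forget` took in later Swift releases, together with `~Copyable`. `Handle`, `release`, and the global counter are hypothetical and exist only to make the effect observable.

```swift
var deinitCount = 0  // global counter, used only to observe the example

struct Handle: ~Copyable {
    let id: Int

    deinit { deinitCount += 1 }

    // Hands the raw id back to the caller without running the deinit.
    consuming func release() -> Int {
        let raw = id
        discard self
        return raw
    }
}

func demo() -> Int {
    let kept = Handle(id: 1)
    _ = kept.id                // a borrowing use; `kept` is destroyed normally, so its deinit runs

    let released = Handle(id: 2)
    return released.release()  // the deinit does not run for this value
}

let raw = demo()
print(raw, deinitCount)
```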


POSIX leaves the state of the file descriptor unspecified. Linux documents it to be closed, but it's not portable behavior. XNU certainly does not close file descriptors for you when close fails. This can happen, among other reasons, if the executing thread is on the receiving end of pthread_cancel (in which case close returns EINTR without doing anything).


Ah, thank you for correcting me. What are those other situations? Is there a reliable way to dispose of a file descriptor on XNU? pthread_cancel is already going to cause resource leaks if you try to summarily terminate a Swift (or ObjC ARC, or C++) thread.

Aside from that, it can happen for any reason the descriptor-specific close routine can fail. In theory this isn't a large set of reasons, but in practice, pipes, FIFOs, kqueues, semaphores, shared memory, sockets, device files, various filesystems, etc. all have the opportunity to fail in their own specific, exciting ways. It's possible that some close anyway and some don't. I don't know enough about them to comment with much more accuracy, sorry.

That's unfortunate. I'm not sure what XNU's exact behavior is, but if it is indeed "unspecified" as POSIX indicates, it seems like there's still a reasonable argument to treat the file descriptor value you have as invalid even if close(2) fails, since an attempt to repeat the close could end up racing with another open that reused the file descriptor number if the unspecified close behavior did in fact close the file descriptor. It seems impossible to write anything portable, correct, and non-leaky as specified; I guess this is a "choose two" situation.
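One way the "treat it as invalid even if close fails" option could look, as a sketch only. It uses the later `~Copyable` and `discard self` spellings, and `simulatedClose` is a hypothetical stand-in for close(2) so the example stays platform-independent; `Descriptor` and `CloseError` are likewise made up.

```swift
struct CloseError: Error {
    let code: Int32
}

// Hypothetical stand-in for close(2): fails for negative descriptors.
func simulatedClose(_ fd: Int32) -> Int32 {
    fd < 0 ? -1 : 0
}

struct Descriptor: ~Copyable {
    let fd: Int32

    deinit { _ = simulatedClose(fd) }

    consuming func close() throws {
        let status = simulatedClose(fd)
        // Give up ownership whether or not the call succeeded,
        // so a failed close is never retried through the deinit.
        discard self
        if status != 0 { throw CloseError(code: status) }
    }
}

func closeReportsFailure(_ fd: Int32) -> Bool {
    let descriptor = Descriptor(fd: fd)
    do {
        try descriptor.close()
        return false
    } catch {
        return true
    }
}

print(closeReportsFailure(3), closeReportsFailure(-1))
```

This choice avoids racing with a reused descriptor number at the cost of possibly leaking on systems where a failed close leaves the descriptor open.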


IIRC, there is no portable way to use close safely across different POSIX systems (with regard to error handling). This is touched on here. And by the Austin group itself, here.

The bizarre thing about all this is that there is a POSIX error code specifically for situations where you need to retry the function that failed: EAGAIN. I have no idea why the systems that require close to be retried return EINTR in the first place.


I thought about this more and I think I found a counter-example. If there's eventually a swap function for move-only values, then you could escape move-only values if you have an inout reference to them:

withMoveOnlyInout { longerLifetime in
	withMoveOnlyInout { shorterLifetime in
		swap(&longerLifetime, &shorterLifetime)
	}
	// shorterLifetime has escaped!
}

If those are inout parameters, then they are not what I am talking about. I am talking specifically about cases where you have a normal non-consuming function argument that is not inout.
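A small sketch of the non-inout, non-consuming case, using the `borrowing` and `consuming` parameter modifiers and the `~Copyable` spelling from later Swift releases. `Token`, `inspect`, and `consumeToken` are hypothetical names, and the commented-out lines mark uses the compiler is expected to reject.

```swift
struct Token: ~Copyable {
    let id: Int
}

func consumeToken(_ token: consuming Token) -> Int {
    token.id
}

// A borrowed parameter can be read, but this function never owns it.
func inspect(_ token: borrowing Token) -> Int {
    // let stolen = token        // rejected: a borrowed noncopyable value cannot be moved out
    // _ = consumeToken(token)   // rejected for the same reason
    return token.id
}

func demo() -> Int {
    let token = Token(id: 42)
    let seen = inspect(token)          // `token` is only lent for the duration of the call
    return seen + consumeToken(token)  // the caller still owns it afterwards
}

print(demo())
```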

Some questions after re-reading...

invoking a consuming method on a value, or accessing a property of the value through a consuming get or consuming set accessor:

Do you happen to have an example of where one would ever use a consuming set? (Though I see why this operation could be included if just for the sake of composability.)

passing an argument to an init parameter that is not explicitly borrowing:

Does this imply all parameters of the generated memberwise init are consuming and implicit copies are made for all copyable types?

pattern-matching a value with switch, if let, or if case:

  • Assuming Optional<noncopyable T> were allowed, why is the explicit consume needed in if let y = consume x { ... }?
  • Is the idea that one could write switch borrow x to avoid this consume-by-default behavior?

Iterating a Sequence with a for loop:

let xs = [1, 2, 3]
for x in consume xs {}
use(xs) // ERROR: xs consumed by `for` loop

Same question about explicit consume, otherwise this case seems covered by “explicitly consuming a value with the consume operator”

guard let condition = getCondition() else {
  consume(x)
}

This would probably be clearer to the reader with a function called consumesAnX(x) and an explicit return.

Accessing a computed property or subscript through borrowing or nonmutating getter or setter borrows the self parameter for the duration of the accessor’s execution.

Does this imply that nonmutating would be discouraged in favor of borrowing in Swift 6?

it must declare whether the parameter uses the borrowing, consuming, or inout convention:

Is there a rationale for requiring this, as opposed to defaulting to borrowing like the methods mentioned just below?

class Foo {
  var fd: FileDescriptor

  init(fd: FileDescriptor) { self.fd = fd }
}

This appears to be missing a borrowing or consuming modifier on the init parameter.

A @noncopyable struct or enum may declare a deinit, which will run implicitly when the lifetime of the value ends

Is it possible to call MyType.deinit as an explicit consuming operation, or is the only way to write this consume myInstance?

A noncopyable type can be made copyable while generally maintaining source compatibility.

Do you have an example or deeper description of how this might work when making a noncopyable type with a deinit copyable? I assume you would be forced to remove the deinit, and if so, would that not be ABI-breaking, contrary to “However, an noncopyable type can be made copyable without breaking its ABI.”?

However, for progressive disclosure and source compatibility reasons, we still want the majority of types to be Copyable by default, without making them explicitly declare it; noncopyable types are likely to remain the exception rather than the rule, with automatic lifetime management via ARC by the compiler being sufficient for most code like it is today.

I could see a package/module where one would not want this default, e.g. if you’re writing embedded firmware or device drivers (aspirational). In this case you would want a way to spell the “positive” case. I argue that this spelling is naturally extension MyType: Copyable. In which case the negative spelling (to mirror Sendable) would be extension MyType: @unavailable Copyable.

This could be expressed to the compiler in a configurable manner as something like --implicit-sendable=internal or --implicit-copyable=public, so an embedded language mode could be made where one could not set these flags.

Additionally, is there ever a world in which we may want a type to be immovable, e.g. literals stored in a .rodata/STRINGS section? In this case I think the spelling would be extension MyType: @unavailable Moveable. While I don't necessarily think this would be written frequently, I think it would be nice to be able to show these semantics in a generated interface when jumping to a module definition in Xcode/VS Code.

I don't have a concrete example in mind.

Yeah, all initializer arguments are consuming by default, which is how the ABI is implemented today. Whether a copyable type is implicitly copied or not depends on whether it's the last use of the value; if the construction isn't the last use, and the parameter is used as part of the new constructed value, then a copy has to occur somewhere, but the consuming convention allows a copy to be avoided by forwarding when the construction is the final use of a value.
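A brief sketch of an initializer taking both a noncopyable and a copyable argument, written with the `~Copyable` spelling from later Swift releases. `RawDescriptor` and `Connection` are hypothetical names.

```swift
struct RawDescriptor: ~Copyable {
    let raw: Int32
}

struct Connection: ~Copyable {
    var descriptor: RawDescriptor
    var label: String

    // The noncopyable parameter states its ownership explicitly;
    // the copyable one relies on the default convention.
    init(descriptor: consuming RawDescriptor, label: String) {
        self.descriptor = descriptor
        self.label = label
    }
}

func demo() -> (Int32, String) {
    let descriptor = RawDescriptor(raw: 3)
    let label = "primary"
    let connection = Connection(descriptor: descriptor, label: label)
    // `descriptor` has been moved into `connection` and cannot be used here.
    // `label` is copyable and is used again below, so this is not its last use.
    return (connection.descriptor.raw, label)
}

let built = demo()
print(built.0, built.1)
```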

I use the explicit consume operator in these situations primarily because noncopyable types can't be used in these situations today, but I want to establish the conventions these operations have for future reference when we do have noncopyable types available in these situations. Extending if let, for, and friends to work with borrows is being discussed under the proposal for borrow and inout bindings. As you noted, you will most likely use a different syntax form to explicitly borrow.

Primarily that forgetting to specify the convention and getting the wrong default could leave you stuck with the wrong behavior for move-only types, whereas when working with copyable types the difference is more of an optimization.

As part of the borrow and inout bindings, we're proposing that those bindings don't allow for implicit copying, which I think will allow you to get the effect you like by choosing those bindings over the typical var and let.

Maybe, though nonmovable types are pretty niche in their applications. In Rust, you usually interact with global literals through &'static borrows. Since the values are owned by global constants, and globals can't be consumed out of because any code anywhere in the program can access them, they are effectively nonmovable.

I’m sorry for not checking whether this is already in the proposal, but what restrictions do we need to put in place around accesses to make consuming accessors on non-copyable types work? I’m thinking in particular about read-modify-write accesses.

Is consuming _modify legal in current Swift? I suppose I shouldn't be too surprised given that consuming set is, but they both seem fairly useless in this context.

It would be, yes. It really comes down to whether we have to do something else with the value after calling the accessor, and the calls to set and _modify always come at the end of their respective sequences. I think consuming willSet is the only one that needs to be straight-up impossible, because it’s never at the end. My bigger concern is about consuming get: we might have to require a _modify to make a property like that mutable, because otherwise we will have no way to do a read-modify-write operation.
