Proposed modification to SE-0377: should explicit parameter ownership modifiers suppress implicit copying?

Hi everyone, thanks for all the feedback on SE-0377 and our other ownership-related pitches. Although the current review is nearing its conclusion, we've gotten a lot of feedback, both on the forums and from early adopters, about the annotation burden on code that wants to be ownership-aware while working with normally-copyable types. That has guided our thinking about upcoming features in a way that may play back into the design of these parameter ownership modifiers. In particular, we want to ask: should marking a parameter as borrowing or consuming prevent it from being implicitly copied in the function body?

borrow, inout, and consume bindings

We plan to introduce new binding forms which allow for binding to a mutable or immutable value in-place within its original container, such as:

// Get a mutable handle to an element in a nested array of arrays
inout entry = &array[0][1][2]

// Perform multiple mutations on the value, without re-projecting it each time
entry += "foo"
update(&entry)

// Get an immutable reference to an element in a nested array of arrays, without
// copying it
borrow otherEntry = array[3][4][5]

// Perform multiple operations using the referenced value,
// without copying or re-projecting it
process(otherEntry)
doMoreProcessing(with: otherEntry)
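
For comparison, the closest thing available without the pitched binding forms is to route the mutations through an inout parameter, so the element is projected once for the whole call. This is only a sketch of that workaround: `appendSuffixes(to:)` and `nested` are made-up names for illustration, not part of the pitch.

```swift
// Hypothetical helper: performs several mutations through a single inout access.
func appendSuffixes(to entry: inout String) {
    entry += "foo"
    entry += "bar"
}

// A nested array of arrays, shaped so that nested[0][1][2] exists.
var nested = [[["a"], ["b", "c", "d"]]]

// The element is projected once, for the duration of the call,
// rather than once per mutation.
appendSuffixes(to: &nested[0][1][2])
```

An inout binding would allow the same pattern inline, without having to factor the mutations out into a separate function.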

One observation is that, if a developer has already made the choice to use one of these binding forms over let and var, they are probably doing so because they want to be aware of ownership and copying overhead. When a binding is explicitly borrowing something, it is probably surprising if the compiler is allowed to implicitly copy it if you try to do something consuming to the borrow:

func consume(_ value: Entry) {}

borrow entry = array[3][4][5]
consume(entry) // Would have to pass a copy of `entry`, not the borrow

So we're leaning in the direction that these bindings should not be implicitly copyable. We could avoid the need for a @noImplicitCopy attribute for local variables altogether by also having a consume binding form, which has ownership of its value like a let or var would, but which also:

  • does not allow implicit copies
  • requires that the bound value is consumable without copying
  • has "eager move" lifetime semantics, so its lifetime ends after a consuming use, or immediately after its last borrowed use if it isn't consumed

There is also a nice analogy between this consume binding and the way the consume operator from SE-0366 works, that parallels the relationship between inout bindings and the & operator, and borrow bindings and the (to-be-pitched) borrow operator. Each operator can then be thought of as expanding to a local scope that defines a temporary using the corresponding binding declaration, and passing the temporary as a parameter to the surrounding function call:

foo(&x.y.z)

// Can be thought of as shorthand for:
do {
  inout temp = &x.y.z
  foo(&temp)
}

foo(borrow x.y.z)

// Can be thought of as shorthand for:
do {
  borrow temp = x.y.z
  foo(temp)
}

foo(consume x.y.z)

// Can be thought of as shorthand for:
do {
  consume temp = x.y.z
  foo(temp)
}
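
Of the three operators, consume is the one that can already be tried on a local variable, assuming a toolchain that ships SE-0366 and SE-0377 (Swift 5.9 or later, as far as I know). A minimal sketch, where `sum` and `sumAndGiveUp` are hypothetical names:

```swift
// Takes ownership of its argument; the caller gives the value up.
func sum(_ values: consuming [Int]) -> Int {
    values.reduce(0, +)
}

func sumAndGiveUp() -> Int {
    let numbers = [1, 2, 3]
    // `consume` ends the lifetime of `numbers` at this point.
    let total = sum(consume numbers)
    // Referring to `numbers` after this line would be a compile-time error.
    return total
}
```

The consume binding sketched above would give a local variable the same "no use after consume, no implicit copies" behavior by declaration, rather than operation by operation.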

As an alternative to @noImplicitCopy as an attribute that you have to sprinkle all over your performance-sensitive code, we hope this design direction feels more integrated with the language. And it lets us give a more consistent story for working with ownership, whether with noncopyable or copyable types: if you want to (or must) think about ownership, use consume/borrow/inout bindings; if you don't care, use let and var and have the compiler figure it out.

Relation to parameter ownership modifiers

We've tried to align the spelling of the ownership modifiers with the related operators because we want the relationship between the various possible calling conventions and the mechanics of ownership and borrowing to be clear. To keep that consistency going, it would make sense to say that a parameter declared as borrowing T ought to act like a borrow binding within the body of the function, and likewise a consuming T ought to act like a consume binding and an inout T like an inout binding, including the lack of implicit copyability.

I had personally started from the position that these parameter modifiers should not affect anything except for the ABI of copyable parameters, either in callers or the callee, since it's important for libraries to be able to have the ABI changes as an optimization tool without imposing the disruptive need to think deeply about ownership on their clients. But other folks on the Swift team and community feedback have helped convince me that imposing copying constraints locally in the callee is likely a net positive—choosing to explicitly annotate your parameters' ownership already indicates an intent to consider ownership, and imposing the need for explicit copies on the modified parameter can serve as an indication whether changing the convention has the desired optimization effect in the body.
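
To make the callee-side effect concrete, here is a small sketch of what the rule looks like in practice. It assumes a toolchain in which SE-0377 shipped with this behavior and with the `copy` operator (Swift 5.9 or later, to my knowledge); `Entry` and `archive(_:into:)` are hypothetical names.

```swift
struct Entry {
    var name: String
}

func archive(_ entry: borrowing Entry, into log: inout [Entry]) {
    // `log.append(entry)` would need ownership of the element, and a
    // borrowing parameter is not implicitly copied, so it is rejected.
    // The copy has to be written out explicitly:
    log.append(copy entry)
}
```

The explicit `copy` is exactly the signal described above: if the body needs one, the borrowing convention is not saving that copy.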

One wrinkle in this plan is that the inout modifier already exists, and does not currently constrain the current value of the parameter from being implicitly copied, so we can't change its behavior without breaking source.
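
As a reminder of the existing behavior that would have to keep working, this compiles today, with the current value of an inout parameter being copied implicitly (`snapshotThenAppend` is a hypothetical name):

```swift
func snapshotThenAppend(_ values: inout [Int]) -> [Int] {
    // Implicit copy of the current value of the inout parameter.
    let before = values
    values.append(0)
    return before
}
```

Making inout parameters non-implicitly-copyable would turn the `let before = values` line into an error in existing code.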

I know it's unusual to propose a change to a proposal relatively late in the review process, but I think this is an interesting design choice to consider that leads to a more consistent and integrated-feeling overall model for ownership. What do you all think?


Is this consuming x.y.z? In other words, is it a shorter way to write this?

consume temp = consume x.y.z

Another question is whether these two things would be equivalent:

_ = consume x.y.z
consume _ = x.y.z

In general I like this direction, but I'm a bit confused by the effect on the initial assignment. And perhaps a bit weirded out by having consume used both as a variable declaration and as an operator at the same time.


That's our current thinking, yeah. These would be slightly different, still, for a non-_ binding:

since the former consumes x.y.z but assigns it to a possibly non-consume binding xyz. If xyz were a normal var binding, then it would still be treated as implicitly copyable in the rest of the function body.


I definitely think this is an improvement. As I mentioned in the other thread, I think that once you're in a mode of thinking about ownership, you want to be thinking about ownership everywhere, and you want the compiler to help enforce those invariants. Using the parameter annotations as a way to infer "ownership mode" is an elegant solution to the "how do I opt in" issue.

For the inout issue: do we have a sense of how source breaking it would be? I'd guess that if you combined it with ImplicitlyCopyable types (e.g. Int, Float etc. which never need a copy annotation) you might end up with fairly narrow breakage, but that's only a guess (albeit one based on skimming through my own code).

Mm. :-/ Is there consume var? Or are these always immutable? And borrow locals are as-of-yet unpitched, so I’m not sure how to think about how often I might want to copy them.

(I don’t think we can possibly change the existing behavior of inout either. Bah.)


No, consume takes the place of var. As a local variable binding, we want it to be assignable so that you don't need to lose control of ownership just to update a variable. The default assumption is that a consume parameter binding is not assignable, but maybe that should change.

"borrow variables" were pitched here:

I quite like the idea of making @noImplicitCopy unnecessary, and I like the general idea of tagging the binding with that semantic based on how it is spelled. At first I was concerned about more than doubling the set of assignment keywords, but after playing with it I think the concision is a serious advantage.

One lingering concern is how this will impact readability of structs and classes. Hunting for variable declarations isn’t as common as scanning a type for property declarations, and this change would increase the set of watch words to let, var, inout, borrow, consume, and presumably const, eventually.

Some other remaining questions:

  • If I declare inout x = &y, is y still in scope? I would assume not, because x now has exclusive access.

  • Since the same keywords are still being proposed as operators, would let x = consume y and consume x = y both mean the same thing?

  • Does this pitch also cover parameters? If so, does that mean that func foo(_ arg: inout T) now means arg is no-implicit-copy within the body of foo?

  • Can property declarations use borrow and consume?

  • I also feel like this doesn’t address @John_McCall’s point about the inconsistency of &. I’ll again plug my previous suggestion of making & mandatory for all borrowings or consumptions of bindings:

| Mutability | Local Variable | Local Borrow | Local Consume | Value Parameter | Borrowed Parameter | Consuming Parameter |
|---|---|---|---|---|---|---|
| Immutable | let x: T | inout let x: T | in let x: T | func f(x: T) (preferred) or func f(x: let T) | func f(x: inout let T) | func f(x: in T) (preferred) or func f(x: in let T) |
| Mutable | var x: T | inout x: T (preferred) or inout var x: T | in var x: T | func f(x: var T) | func f(x: inout T) (preferred) or func f(x: inout var T) | func f(x: in var T) (mutability is not visible to caller) |

| Assignment | Rule |
|---|---|
| let x = y | Implicit immutable copy of y. |
| var x = y | Implicit mutable copy of y. |
| inout let x = &y or in x = &y | Immutable borrow/consume of y. y can be mutable or immutable. |
| inout x = &y or in var x = &y | Mutable borrow/consume of y. y must be mutable. |
| inout let x: T; var y = copy(x) | Values must be explicitly copied out of in let, in var, inout let, or inout var. Mutability of copy is independent. |
| var x: T; func f(_: inout let T) { }; f(&x) | All bindings can be passed to inout let parameters. (Callee immutably borrows the binding for the duration of the call.) |
| inout var x: T; func f(_: inout x: T) { }; f(&x) | var, in var, and inout var bindings can be passed to inout var parameters. (Existing Swift semantics.) |
| let x: T; func f(_: in T) { }; f(&x) | let, var, in let, and in var bindings can be passed to in parameters. Consumes the caller’s binding; callee loses mutability. (Can’t pass inout because access is transferred to the callee, but the callee wouldn’t transfer ownership back.) |
| let x: T; func f(_: in var T) { }; f(&x) | let, var, in let, and in var bindings can be passed to in var parameters. Caller doesn’t see any difference from in parameters, but value is mutable within callee. |

// To end the lifetime of a binding early, assign the binding to _.
// Ending the lifetime of an inout binding writes its value back.
func addFortyTwo(to x: inout Int) {
    x += 42
    _ = &x
    print("Added!")
}

// To end the lifetime of a binding while passing it as a parameter, use the standard library’s drop() helper function.
// (note this won’t work with inout bindings; drop those explicitly with assignment to _):
func drop<T>(_ binding: in T) -> in T {
    return &binding
}

var heavyWeight = MyStruct()
doStuff(with: drop(heavyWeight))

// To read the value of a binding, use the standard library’s copy() helper function.
@_semantics("binding.copy")
func copy<T>(_ binding: inout let T) -> T {
    // Unused implementation; the compiler actually never emits a call to this function.
    // It just exists to make the type system happy.
    return _Builtin.copy(&binding)
}

inout var x: T
var copyOfX = copy(&x)

Rust has by-value as the default and & as the marked case, and it feels very backwards. Most arguments are borrowed; consuming is the unusual case. (EDIT: it also makes sense in Rust because references are first-class types, but that won't be the case for Swift.)


I think that’s an explicit non-goal for Swift, though?

That is the default state in Swift today for method params and subscript indexes (but not setter values or init params), and the most likely thing you'll want to use for non-copyable types. Universal annotation of borrow/consume is a non-goal for Swift, but if you want to think about copies explicitly, borrows are the ones you don't need to worry about.

EDIT: I should probably mention that borrow for Swift doesn't necessarily mean "pass an address" like it does in Rust. The "borrow" of a statically trivial value (like Int) uses the same ABI as the "consume" of a trivial value. (Or at least it was last time I looked into the implementation, which was admittedly several years ago.)

The logical argument for & everywhere is that it indicates the operation is happening on the binding, not on the value. But I suppose the real value of & is in knowing that subsequent access through that binding might see a different value.

If I’m applying your logic correctly, I think you’re saying that in, in var, and inout let shouldn’t require a &. That also means making a copy is spelled copy(x) no matter what kind of binding x is. But it also means the spelling of drop(&x) is no longer uniform. I’m not sure whether I like the lack of a signal that the binding’s lifetime is ending. I’m even more concerned about _ = x ending the lifetime of x if it happens to be a borrow.


Hmm, that doesn’t smell right to me. A reference x passed +0 to a borrow NSObject parameter does not count as a read of x that lasts for the entire call. If it did, we’d have exclusivity violations in existing code like

var obj = …
doSomething(obj) {
  obj = …
}

But maybe that’s because the compiler inserts an implicit copy here. So okay, borrow may have exclusivity implications and thus it may be worth marking.

Yes, borrows definitely have exclusivity implications. For example, you should be able to fork off immutable borrows of a mutable variable to concurrent code, but you shouldn’t be able to form a mutable borrow while any immutable borrows are in scope, to avoid the need for dynamic exclusivity checks.

EDIT: this also made me think about captures. It seems like it should be safe for a non-escaping closure argument to have immutable borrows—since the closure is non-escaping, the borrow formally ends no later than the return from the call. The closure can choose to terminate the lifetime of its binding earlier, but if the binding was created by capturing a mutable borrow as immutable, when does the writeback actually occur?


Borrow-by-bitwise-copy is powerful because it lets us change the representation of borrowed values without forcing the semantics of a copy. Particularly useful for sticking non-copyable types in enums and aggregates. Sadly, we can't do this for generics and resilient types because of ObjC weak refs.

But it’s also counterproductive if the point of adding borrow everywhere is to avoid copying large structures.

Large structs get copied for generally the same reasons that references get retained and released: there's the possibility of some interfering write that prevents sharing one copy. borrow makes those interfering writes impossible, so when a struct is big enough that we decide to pass it indirectly, it will generally be passed by address without copying. As Andy notes, we can however still bitwise-copy the struct into a larger borrowed aggregate if we need to, say if you want to go from a borrowed T to a borrowed Optional<T>. To me, though, the more interesting consequence of borrows of most types not requiring fixed addresses is that it allows us to represent borrows of small values, particularly refcounted object references, using pass-by-value at the machine code level, instead of needing double indirection like Rust's Rc or C++'s shared_ptr do.


For clarity, how does a potentially interfering write happen with normal let, var, and inout, given the Law of Exclusivity?

The downside is that one therefore cannot assume that a move-only struct has a stable address, which would be useful for storing locks and atomics alongside the data they control without having to switch to classes (which have retain/release traffic) or rewrite the entire codebase to only ever touch these structs via UnsafePointer (since presumably borrowing through UnsafePointer.pointee still doesn’t guarantee that the value won’t be temporarily moved out to a different location in memory).

It’s probably a better design not to conflate move semantics with fixed addresses, but it is a potential pitfall.

For local variables, they generally can't, and if you pass a large struct by borrow today, it ought to pass by address (if we're not, that's inefficiency in the compiler and/or optimizer that we can fix). But if you're working with shared mutable variables, like globals or class ivars, we generally have to assume that any function you call may try to mutate the same variable, so we'll usually defensively copy around a call, as in:

class Foo {
  var x: VeryBigStruct
}

let sharedFoo = Foo()

func doBadThing(with veryBigStruct: VeryBigStruct) {
  // Interfering write to sharedFoo.x...
  sharedFoo.x = veryBigStruct
}

func invokeBadThing(on foo: Foo) {
  // means we can't unconditionally borrow foo.x in place,
  // in case foo === sharedFoo, so we'll copy it here
  doBadThing(with: foo.x)
}

Using an explicit borrow binding or operator on foo.x here would specify that you know for sure that an interfering write won't happen, and it's OK to trap on an exclusivity failure if it does.
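
The reason the defensive copy is needed for correctness, not just caution, is that value semantics are observable here. The sketch below is a self-contained variant of the example above, with the shared object kept local so it runs on its own; `Holder` and `observeInterferingWrite` are hypothetical names.

```swift
struct VeryBigStruct {
    var payload: [Int]
}

final class Holder {
    var x = VeryBigStruct(payload: [1, 2, 3])
}

func observeInterferingWrite() -> (seenByCallee: Int, storedAfterCall: Int) {
    let shared = Holder()

    func doBadThing(with value: VeryBigStruct) -> Int {
        // Interfering write to the same storage the argument was read from.
        shared.x = VeryBigStruct(payload: [])
        // The argument is an independent value, so it still has 3 elements.
        return value.payload.count
    }

    let seen = doBadThing(with: shared.x)
    return (seen, shared.x.payload.count)
}
```

The callee sees three elements while the stored property ends up empty. Borrowing the property in place for the duration of the call would instead make the write inside the callee an exclusivity violation.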

We haven't quite gotten there yet, but some move-only types will necessarily have fixed addresses and not be "bitwise borrowable". Like Andy noted, this is already the case for types that contain ObjC weak references, since they need a stable address for the ObjC runtime to be able to update. My vague idea for how to implement locks and atomics is that there would be a way to ask for a raw memory reservation within a class or move-only type, and that doing so would imply a fixed address for values of the containing type, so you could do something like:

struct AtomicCounter: ?Copyable {
  @Buffer(of: Int.self, count: 1) var atomicVar: UnsafeMutableBufferPointer<Int>

  func increment() { atomicAdd(to: atomicVar.baseAddress!, value: 1) }
}

Borrows of AtomicCounter would then be by-address rather than by bitwise copy in order to share the buffer.
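
Until something like that exists, one way to approximate a stable address is to have a noncopyable struct own a heap allocation. This is a rough sketch only: it assumes a toolchain with noncopyable types (Swift 5.9 or later, spelled `~Copyable`), `HeapCounter` and `countTwice` are hypothetical names, and the increment is a plain read-modify-write, not an atomic operation.

```swift
struct HeapCounter: ~Copyable {
    // The pointee has a stable address even if the struct itself is moved.
    private let cell: UnsafeMutablePointer<Int>

    init() {
        cell = UnsafeMutablePointer<Int>.allocate(capacity: 1)
        cell.initialize(to: 0)
    }

    deinit {
        cell.deinitialize(count: 1)
        cell.deallocate()
    }

    // NOT atomic: stands in for an atomic add on the stable address.
    func increment() { cell.pointee += 1 }

    var value: Int { cell.pointee }
}

func countTwice() -> Int {
    let counter = HeapCounter()
    counter.increment()
    counter.increment()
    return counter.value
}
```

This pays for a separate allocation, which is the cost an inline memory reservation with a fixed address would avoid.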


…and what do these mean??

consume foo = borrow bar
borrow foo = consume bar
inout foo = consume bar
// etc

I find myself asking many questions along these lines, trying to form my mental model of this feature family.

I more or less understand these modifiers applied to expressions; it’s the application to declarations I’m somehow having trouble getting my head around. I want the latter to mean “always apply this modifier to the rvalue.” But within that model, I’m confused by Joe’s answer here:

Would there still be a difference between these two?

let foo = consume bar
consume foo = bar

The way I see it, the binding forms effectively imply the corresponding operator on the thing being bound, so this is like writing:

consume foo = consume (borrow bar)
borrow foo = borrow (consume bar)
inout foo = &(consume bar)

Of those, only borrow (consume bar) is theoretically valid—you're ending the lifetime of bar, but then holding on to a borrow of its final value—and the others are clearly errors. It's probably most straightforward to disallow the combination of an operator with one of these bindings.

let foo would still be implicitly copyable, whereas consume foo would not be.
