[Pitch] Formally defining consuming and nonconsuming argument type modifiers

Lantua · December 28, 2021, 9:16pm

How does it work with protocol declaration? Does the call convention written as part of the protocol get included in the functions conforming to that?

Similarly, how do functions of the same name but different calling conventions affect each other? Can I have func foo(_: consuming Foo) and func foo(_: nonconsuming Foo) in the same scope? My intuition would be to say no, but I'll ask anyway.

It might help to show the retain and release calls (for an empty function body):

func foo(cons: consuming Foo, non: nonconsuming Foo) { }
func bar() {
  foo(cons: arg1, non: arg2)
}

// turns into

func foo(cons: consuming Foo, non: nonconsuming Foo) {
  // Use cons, use non

  // release cons -1
}
func bar() {
  // retain arg1 + 1
  // retain arg2 + 1
  foo(cons: arg1, non: arg2)
  // release arg2 - 1
}

That is,

consuming is a caller+1-0 (1 retain, 0 release) and callee+0-1, and
nonconsuming is a caller+1-1 and callee+0-0.

Whether the caller actually retains/releases or simply forwards the retained count could be seen as optimization.

It also illustrates why it pays to get the convention right; if the caller doesn't use the consuming argument afterward, then there's nothing to be done on the caller's side. OTOH, if the caller usually uses the nonconsuming argument after the function call, it will just balance out the release-after-call, allowing the caller to continue using non-consuming argument after call w/o an extra retain. This is already a big win from the ARC perspective alone.
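The bookkeeping above can be sketched as a small model. This is only an illustration: `RefBox`, `callConsuming`, and `callNonconsuming` are hypothetical names, and the retain/release calls are simulated by hand rather than emitted by the compiler.

```swift
// Hypothetical model: RefBox stands in for a reference-counted object and
// records who performed each simulated retain/release.
final class RefBox {
    private(set) var count = 1            // the reference held by the creating scope
    private(set) var log: [String] = []

    func retain(_ who: String) { count += 1; log.append("\(who) retain") }
    func release(_ who: String) { count -= 1; log.append("\(who) release") }
}

// consuming: caller +1-0, callee +0-1
func callConsuming(_ box: RefBox) {
    box.retain("caller")      // caller hands an extra reference to the callee
    // ... callee body uses box ...
    box.release("callee")     // callee is responsible for the release
}

// nonconsuming: caller +1-1, callee +0-0
func callNonconsuming(_ box: RefBox) {
    box.retain("caller")      // caller keeps the object alive across the call
    // ... callee body uses box without touching the count ...
    box.release("caller")     // caller balances its own retain
}

let consumed = RefBox()
callConsuming(consumed)
let borrowed = RefBox()
callNonconsuming(borrowed)

print(consumed.log, consumed.count)
print(borrowed.log, borrowed.count)
```

In both cases the count ends where it started; the conventions differ only in which side performs the release, which is why the caller can skip its retain when it no longer needs the consumed value.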

I think it could be made clearer that changing the heuristic is off the table. I doubt we can do much at this point, so might as well prevent people from getting distracted.

I vaguely remember that we used to have a guaranteed calling convention. Is that related to these two? It feels like guaranteed just turned into nonconsuming.

Michael_Gottesman · December 29, 2021, 12:19am

In situations that are not along module boundaries, the optimizer is able to change the convention. In fact, it has been doing this for many years. These keywords are most important along ABI boundaries.

The reason I asked you to not do that is because once words like that are raised in a thread, a storm of responses can come in that change the topic of the thread. I am afraid that this thread is turning into a thread about whether or not to ask the core team to change the default ABI (which as I mentioned above is very unlikely). That is a completely different subject from what this thread is actually supposed to be about: adding consuming/nonconsuming argument type modifiers to the language given the way the language is today, not hypothetically in the future. I am afraid that this thread is now about the former (the default ABI), rather than the latter (the actual work that I am trying to pitch). I would appreciate it if we could re-focus the thread onto the specific work and whether given the current model they are important or not.

Michael_Gottesman:

When I look at this on ToT, I see that sum is not inlined into doSomething() and that the retains that you are talking about are on the dictionary, not in sum due to the iterator.

Okay - well, they are actually due to the iterator (@_assemblyVision shows that a Builtin.BridgeObject is retained and a _NativeDictionary<String, Int>.Iterator is released).

I tried out 3 versions of sum() using the OSSA modules flag - Godbolt.

Using a for loop, or the default makeIterator(), seems to cause a retain/release pair, whilst a custom makeIterator() (or using indexes) does not.

Adding __consuming to the custom makeIterator() doesn't add a retain/release pair though, so I guess it's off the hook. Maybe the flag won't apply OSSA optimisations to code which is inlined across modules or something.
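For readers without the Godbolt link, the loop shapes being compared might look roughly like the following. This is a reconstruction under assumptions, not the code that was actually tested: the function names are hypothetical, the custom `makeIterator()` variant is omitted, and the snippet makes no claim about which version produces retain/release traffic on any given compiler.

```swift
// Hypothetical reconstruction of three ways to sum a dictionary's values.

// 1. Plain for-in loop (uses the default iterator implicitly).
func sumWithForLoop(_ counts: [String: Int]) -> Int {
    var total = 0
    for (_, value) in counts { total += value }
    return total
}

// 2. Explicit call to the default makeIterator().
func sumWithIterator(_ counts: [String: Int]) -> Int {
    var iterator = counts.makeIterator()
    var total = 0
    while let entry = iterator.next() { total += entry.value }
    return total
}

// 3. Walking the collection by index, with no iterator involved.
func sumWithIndices(_ counts: [String: Int]) -> Int {
    var total = 0
    var index = counts.startIndex
    while index != counts.endIndex {
        total += counts[index].value
        counts.formIndex(after: &index)
    }
    return total
}

let sample = ["a": 1, "b": 2, "c": 3]
print(sumWithForLoop(sample), sumWithIterator(sample), sumWithIndices(sample))
```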

Michael_Gottesman:

As I mentioned above, the heuristics are part of Swift's ABI so they aren't something that we can discuss without talking about a large ABI break that is so fundamental that I doubt it would be an acceptable proposition.

Not even with thunks?

Thunks add code size for minimal benefit, considering that the only time the optimizer cannot specialize conventions is along ABI boundaries. It would add much complexity to the compiler, slow down code generation, etc. Even if we just added them along ABI boundaries, that could increase the code size of real-world frameworks in an unacceptable way. We have over time in fact been figuring out ways to eliminate thunks specifically to reduce code size since the cost is so significant.

Michael_Gottesman · December 29, 2021, 12:26am

xwu:

Michael_Gottesman:

Alternatives considered

We could reuse owned and shared and just remove the underscores. This was viewed as confusing since shared is used in other contexts: it can mean a Rust-like "shared borrow", which is a much stronger condition than nonconsuming is. Additionally, since we already will be using consuming to handle +1 for self, for consistency it makes sense to also rename owned to consuming.

What if we just called these owned and unowned? Seems to me this would most directly demonstrate that they're talking about the concept of ownership. Would that conflict at all with the existing usage of unowned captures? If so, how about retained and nonretained?

At least for me, I find the terminology about "consuming" to be actively working against intuition. In common usage, that which is consumed no longer exists, since to consume something is to eat it up or destroy it. But, as in the example where an element is appended to an array, it's precisely because the array wants to keep the value around that we want it to be "consuming" the element—this is the exact opposite of the natural meaning of the word.

By contrast, to "own" a thing suggests the correct reason why one might want to use this feature here, which is to keep the thing around.

I suspect one reason that some folks are surprised or alarmed at the already existing heuristics is how unnatural it sounds that an initializer or setter, for instance, "consumes" its arguments by default when clearly initializing or setting a property means you'd want to hang on to its value.

I am using the term consuming with respect to arguments since Consumed Argument is a specific term of art from Automatic Reference Counting that specifically means a +1 parameter. For instance, one can see consumed parameters documented as such in the Objective-C ARC Reference in Clang: Consumed Parameters

nikitamounier · December 29, 2021, 12:29am

I agree, case constructors and enums with associated values in general are an interesting...case. It seems reasonable to me that case constructors should already be consuming, since the associated data belongs as part of the enum itself.

However, I'm interested in what happens when we switch / pattern match over the enum itself. When we extract the associated value into a constant or a variable using case let / case var syntax, maybe we should be allowed to mark that as consuming. This would be useful if it's the last time we're about to use the enum instance, so that we're able to use the same memory as the associated value. Like this:

switch productBarcode {
case consuming let .upc(numberSystem, manufacturer, product, check):
    state.manufacturer = manufacturer
case consuming var .qrCode(productCode):
    productCode.normalize()
    state.productCode = productCode
}
move(productBarcode) // since we're not using it anymore

And in if statements:

if case .qrCode(consuming let productCode) = productBarcode {
    let product = Product(name: "Apples", qrCode: productCode)
}

I see this being especially useful in Redux-like architectures, where we send an Action (which is always an enumeration with or without associated values) to a Reducer, switch on that action, and mutate the state, without ever referencing back to that action afterwards. For example:

let reducer = Reducer { (state: inout State, action: Action) in
    switch action {
    case consuming let .setName(newName):
        state.name = newName
        return
    case consuming let .changeProfileColor(to: color):
        state.profileColor = color.toHex()
        return
    }
}

We're fully discarding the action afterwards, so it could be interesting to be able to consume the enum's associated value's memory.

dabrahams · December 29, 2021, 6:08am

Just so my position remains perfectly clear, I'll clarify that I'm not suggesting anything that adds hint keywords. I am saying the hint and semantic modifiers should be one and the same. We should allow @escaping for all types; that should mean consuming, and non-@escaping arguments shouldn't be allowed to escape… perhaps unless they are known to be copyable non-closures. I suppose @nonEscaping may be needed for initializer arguments, because they are currently assumed to be consumed and the ABI won't change. All of this is a compatibility/migration headache but the pain is worth it because where we'll end up otherwise is too awful.

rayx · December 29, 2021, 7:36am

(Never mind. I have understood the basic model with the help of @Lantua and @Avi.)

Original Questions

If I understand it correctly, there are three scenarios:

A) The caller gets the value from a non-consuming argument. In this case step 1 is necessary
B) The caller gets the value from a consuming argument. In this case step 1 isn't necessary
C) The caller constructs the value (or gets it from function call). In this case step 1 isn't necessary.

I wonder how the compiler tells which scenario it is?

So one can mix consuming and non-consuming functions like the following:

func a() {
  let o = Foo()
  b(o)
  print(o) // this should be fine because b() doesn't consume it
}

func b(_ o: nonconsuming Foo) {
  c(o)
}

func c(_ o: consuming Foo) {
  ... 
}

I found this was confusing to me at first, because I thought a consuming function would deallocate an object because it owned that object. But it turns out what it actually does is just to decrease the reference count by 1.
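A runnable approximation of the example above, assuming the existing underscored spellings `__shared` and `__owned` in place of the pitched `nonconsuming` and `consuming`. `Tracker` is a hypothetical helper added only to observe deallocation, and `withExtendedLifetime` is used so the check is not affected by the optimizer shortening the lifetime of `o`.

```swift
// Hypothetical helper that counts how many Foo instances are currently alive.
final class Tracker { var live = 0 }

final class Foo {
    let tracker: Tracker
    init(_ tracker: Tracker) { self.tracker = tracker; tracker.live += 1 }
    deinit { tracker.live -= 1 }
}

// Stand-in for `consuming`: the callee owns (and releases) its reference.
func c(_ o: __owned Foo) { }

// Stand-in for `nonconsuming`: the callee only borrows the caller's reference.
func b(_ o: __shared Foo) {
    c(o)   // forwarding to a consuming callee requires a fresh reference
}

func a(_ tracker: Tracker) -> Bool {
    let o = Foo(tracker)
    b(o)
    // o is still usable here: c() released only the reference it was given.
    return withExtendedLifetime(o) { tracker.live == 1 }
}

let tracker = Tracker()
let aliveAfterB = a(tracker)
print(aliveAfterB, tracker.live)   // prints "true 0"
```

The object is destroyed only once `a()` drops its own reference, which matches the observation that consuming decrements the count rather than deallocating outright.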

John_McCall · December 29, 2021, 8:13am

I’m really excited to see this.

I don’t love consumed/nonconsumed. Among other things, we’ll want a local immutable-reference binding eventually, and it would be good if it could share the same keyword as parameters to emphasize the connection, but nonconsumed would be terrible on a local declaration. owned/borrowed is really nice except that owned (owning?) doesn’t read well as a function modifier for self. There’s precedent from mutating/inout for the keywords to differ a lot, but I do kind of agree that if we used consuming for one, we should use consumed for the other. And borrowed is not a great complement to consumed. Maybe we just have to go with borrowed on the local declaration and make that the inconsistent one. I’d love to hear a more unifying option.

Should consumed parameters be mutable? Or are we just going to say that people who need that should move into a local var, and the optimizer will promise to eliminate any actual work associated with that?

Nobody1707 · December 29, 2021, 10:35am

Isn't a move from a parameter into a local variable semantically just a renaming operation?

Avi · December 29, 2021, 11:09am

I don't think so. The binding of the parameter would be undone after the move, and presumably the compiler would be free to insert a release() for the parameter at that point if it had been passed in as a +1.

rayx · December 29, 2021, 11:37am

One more question: is it valid to access a value after passing it to a consuming method? In scenario A, it's apparently OK. But what about scenarios B and C? Take C as an example: I thought it would be invalid because the value should have been deallocated by the consuming method, but the experiment below showed that's not the case. I'm not sure if it's because I misunderstand how the mechanism works or because deinit() isn't executed at a specific time in the current compiler.

class Foo {
    var x: Int = 1
    
    deinit {
        print("Foo is deallocated()")
    }
}

class Bar {
    init(_ foo: __owned Foo) {
        // Do nothing. Note we don't save foo.
    }
}

func test() {
    let foo = Foo()
    let _ = Bar(foo)
    print("After Bar.init()")
}

test()

// Expected output:
//   Foo is deallocated()
//   After Bar.init()

// Actual output:
//   After Bar.init()
//   Foo is deallocated()

Lantua · December 29, 2021, 12:15pm

It would be valid. The caller just needs an extra retain call prior to the consuming call to prevent the object from deallocating. (There are more nuances around move-only types and no-implicit-copy instances, but we aren't there yet.)

The main semantic difference (beyond ARC) is that you can't refer to the old binding/variable if you move it.

rayx · December 29, 2021, 12:32pm

Ah, that should work. This is much more complicated than I had thought.

When I first read about ownership, I thought either the caller or the callee could have the ownership. This is an example which demonstrates that's not the case. The callee consumes the value, so it takes the ownership away from the caller. But on the other hand the caller also owns the value, in the sense that it can still access the value. The exact meaning of the "own" becomes a bit confusing here, I think.

Avi · December 29, 2021, 12:37pm

I think the way to frame it is as follows:

To say that Bar() owns its parameter is to say that the function is responsible for releasing it. It does not mean that no other strong references may exist.

Lantua · December 29, 2021, 12:47pm

Right, your mental model is much closer to Rust's ownership. In Swift, it's more like you own a reference to that particular object. There can be multiple references to the same object, each with a different owner, and each owner is responsible for discarding their reference (i.e., stop using the reference and call release). You can then own a new reference to the object by copying the reference and calling retain.
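One way to observe "multiple owned references to the same object" is the standard library's `isKnownUniquelyReferenced`. The sketch below is only illustrative: `Node` and `ownershipDemo` are hypothetical names, and `withExtendedLifetime` is there to make sure the second reference really exists at the moment of the check.

```swift
final class Node {}

// Returns the uniqueness of `first` before, during, and after a second
// owner holds a reference to the same object.
func ownershipDemo() -> (before: Bool, whileShared: Bool, after: Bool) {
    var first = Node()
    let before = isKnownUniquelyReferenced(&first)

    // Copying the reference creates a second owner (conceptually a retain).
    var second: Node? = first
    let whileShared = withExtendedLifetime(second) {
        isKnownUniquelyReferenced(&first)
    }

    // The second owner discards its reference (conceptually a release).
    second = nil
    let after = isKnownUniquelyReferenced(&first)

    return (before, whileShared, after)
}

let result = ownershipDemo()
print(result.before, result.whileShared, result.after)   // prints "true false true"
```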

rayx · December 29, 2021, 1:03pm

In the non-consuming case, the call doesn't increase the reference count. So it can be thought of as a single reference shared by two (or multiple) owners. Maybe this is why it's called __shared in the current implementation.

Lantua · December 29, 2021, 1:23pm

Yes, well, maybe. It's more useful to treat each reference as having a single owner (i.e., the one that will call release). If you're using the reference but did not participate in the reference counting, you could also say that you're simply borrowing the reference (and will return it to the original owner afterward). It does help you keep track of

Who will call release (owner), and
When can release be called (after all borrowers return the reference).

This is why you can see borrowing being proposed as an alternative for nonconsuming and sharing.

FWIW, much of the ownership concept predates ARC; see Memory Management Policy and Ownership Policy. Though I'm not sure where the concept of borrowing begins in this ecosystem. If you're still curious about that, you can spin up a new thread.

asdf · December 29, 2021, 2:23pm

borrowing can mean immutable borrowing and mutable borrowing, e.g.:

func testBorrowing(_ arg1: borrowing Int){
...
}

is it clear what kind of borrowing it means? Mutable or immutable?

Lantua · December 29, 2021, 2:54pm

Anyone can mutate the object (simply by assigning it to a new variable*). Whether the mutation is observable to the caller depends on whether the argument is marked as inout** (or the object has reference semantics). If you want the mutation not to incur CoW, you must ensure that the reference is unique and pass the argument as consuming.

Also, these markings don't mean much to trivial types, which have no ARC traffic. We wouldn't want to reject that outright, though. That'd be too magical, and we need to consider such a combination anyway for cases like generics.

Ok, many people seem to be confused about the term borrowing. It shows that that may not be the best word choice or that Rust's influence is simply too strong.

* Arguably, that just creates a copy, then mutates that copy. However, that is also the case for all CoW types with non-unique references.
** We might want to be careful around inout. In addition to ARC, it also plays a role in Exclusivity Rules.
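To make the distinction between mutating through a new variable and mutating through inout concrete, here is a minimal sketch with an array (a CoW type). The function names are hypothetical, and nothing here depends on the pitched modifiers.

```swift
// Mutation through a new local variable: the caller never sees the change.
func appendLocally(_ values: [Int]) -> [Int] {
    var copy = values   // new binding; the buffer is copied on first mutation if shared
    copy.append(4)
    return copy
}

// Mutation through inout: the change is written back to the caller's variable.
func appendInPlace(_ values: inout [Int]) {
    values.append(4)
}

func mutationDemo() -> (afterLocal: [Int], local: [Int], afterInout: [Int]) {
    var numbers = [1, 2, 3]
    let local = appendLocally(numbers)
    let afterLocal = numbers      // caller's array is unchanged
    appendInPlace(&numbers)
    return (afterLocal, local, numbers)
}

let demo = mutationDemo()
print(demo.afterLocal, demo.local, demo.afterInout)
```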

Alejandro_Martinez · December 29, 2021, 3:56pm

I'm quite happy to finally have more official tools in the language to control memory ownership when necessary. +1

Thanks for that, it clarified a bit how it worked. While reading I felt like I forgot my days using MRC! I'm actually still a bit confused why the +1 of a consuming arg needs to be in the caller and not the callee, probably obvious but I'm not seeing it right now.

In terms of naming, I guess consume/nonconsume is getting into people's heads already, but for me it is more confusing than the own/borrow words used in Rust. That said, in that language things are more clear/explicit, so they are also less confusing.

For example, right now using these keywords doesn't change the semantics of our code, since Swift will insert ARC/copies when needed anyway, right? It would only matter for semantics once we have move-only types (or the @noImplicitCopy mentioned in the roadmap), since at that point a consumed argument won't be able to be used in the caller after the call. Correct?

From the roadmap:

Would it be useful to specify in the proposal what role inout plays (if any) in this?

ryansobol · December 29, 2021, 4:23pm

One of the unique things about the inout parameter modifier is that any corresponding argument variable must be prefixed with an & symbol.

func doubleInPlace(number: inout Int) {
    number *= 2
}

var myNum = 10 
doubleInPlace(number: &myNum)
print(myNum)  // 20

It lets the reader (and presumably the memory safety checker in the compiler) know that the caller intends to pass an argument by-reference (or by-value-result) instead of the default heuristic.

@Michael_Gottesman what are your thoughts on prefixing caller arguments with the & symbol (or any symbol) for consuming and/or nonconsuming callee parameters? Would this extra syntax be useful for the compiler’s safety checker at all? Should the reader be informed about an intentional change away from the default heuristic at the call-site?