[Pitch] Formally defining consuming and nonconsuming argument type modifiers

Lantua · December 29, 2021, 6:09pm

The idea is that the caller, bar, retains the consuming argument arg1 (+1), then gives that ownership to the callee, foo. At this point, foo owns the reference, and bar does not*. Since foo now owns the reference (which bar gave it), it is responsible for releasing it (-1).

This is the main idea of consuming. The caller retains and gives the retained reference to its callee. The callee then releases. It is ARC-efficient even across modules if

The caller does not use the argument afterward, and
The callee does keep the argument around.

Depending on the level of deviation from this scenario, we may see more (unnecessary) ARC traffic.

* If needed, it can retain another reference.
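
A minimal sketch of the efficient scenario above, assuming the underscored `__owned` spelling that (as I understand it) this pitch wants to formalize as `consuming`. `Resource` and `Cache` are hypothetical names; only `foo`, `bar`, and `arg1` come from the description.

```swift
// Hypothetical reference type standing in for "the argument".
final class Resource {
    let name: String
    init(name: String) { self.name = name }
}

// Hypothetical callee owner: foo keeps the argument around.
final class Cache {
    private(set) var stored: Resource?

    // `__owned` asks for the consuming (+1) convention:
    // the caller hands over an owned reference.
    func foo(_ arg1: __owned Resource) {
        stored = arg1  // the callee keeps the reference it was given
    }
}

func bar(cache: Cache) {
    let arg1 = Resource(name: "payload")
    cache.foo(arg1)  // last use of arg1 in the caller
}

let cache = Cache()
bar(cache: cache)
```

Here both conditions hold: bar does not touch arg1 after the call, and foo stores it. If bar used arg1 again afterward, or foo merely read it, that is the kind of deviation where extra ARC traffic could appear.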

It matters for cross-module interfaces when the compiler can't optimize away the ARC traffic across module boundaries. Libraries like the standard library would be the biggest consumers (ha!).

That sounds like it would be either inconsistent (implicit non-consuming arguments would not get the & prefix, while explicit ones do) or source-breaking (implicit non-consuming arguments now require the prefix). It's quite a high price to pay in either case.

Could you elaborate more on this? AFAICT, let a = b is the non-consuming local variable assignment, let a = move(b) is the consuming one, and ref a = b, or whatever it's called, would be a +0 anyway. So I'm definitely missing something.

Nobody1707 · December 29, 2021, 9:04pm

Why would it ever be allowed to do that? Moving into a local variable wouldn't change the refcount, so it should never cause a release.

John_McCall · December 29, 2021, 9:12pm

Well, I don’t know what a local ref means because that’s not concretely proposed. If you want to guarantee that something isn’t copied, you may need to borrow it from its current location, which means enforcing exclusivity on that location for as long as you need the borrow. Like inout but without the implication of mutation.

asdf · December 29, 2021, 9:31pm

Using owned instead of consuming could mislead because Swift already has the unowned keyword, which means unowned references, but owned would mean owned values.
And I doubt consuming is a good keyword.

Saklad5 · December 29, 2021, 9:45pm

The thing is, the compiler could actually know when that is the case. If it was left implicit, could the compiler actually choose between the two at each call site? It’d have the necessary information, unlike the programmer.

I feel like there’s room for compile-time heuristics that are more sophisticated than “Is this an initializer or setter?”. If there is, we may want to treat this more as a compiler hint (if you can’t tell which is better, do this) than a hard rule.

Lantua · December 29, 2021, 10:00pm

Right, I only vaguely remember ref from a post long ago, and I couldn't find it anymore. Please ignore it.

That said, I'm still not sure if consuming and nonconsuming are the right tools for the local borrowing (name TBD) you're looking for. consuming and nonconsuming as proposed don't even have any notion of exclusivity attached to them. The local borrowing seems more akin to inout and whatever the non-mutating pass-by-reference argument convention is (normal arguments would make a copy, and don't even hold read access to the original value).

This hinges on the requirement that the compiler can emit different binaries for the same function, which isn't the case everywhere, and most definitely not at module boundaries.

In the scenarios where the compiler can do that, such as within the same file or the same module (with WMO enabled), many of these calling conventions won't do much (maybe a bit, since they could help guide the compiler toward a better calling convention).

Saklad5 · December 29, 2021, 10:02pm

That’s what I’m thinking: we should define these modifiers such that a hypothetical future compiler has room to overrule them.

ksluder · December 29, 2021, 10:05pm

Without commenting on the larger point, isn’t this exactly what @usableFromInline does?

Saklad5 · December 29, 2021, 10:10pm

There are many situations where the compiler could be able to switch between the two, most notably inlining and specialization.

As for module boundaries, cross-module optimization (which may or may not be stable right now, it’s frustratingly unclear) allows the compiler to ignore them entirely in favor of optimizing everything it can. Ideally, the compiler should be able to produce a binary that is literally impossible to improve upon (Pareto optimal) at that point. You know, eventually.

Lantua · December 29, 2021, 10:17pm

That sounds like we're putting the cart before the horse. To allow the caller to decide the calling convention even for libraries in binary form, we would require a certain level of dynamism. I don't think these small ARC optimizations would outweigh the cost of such dynamism.

Yes, but that just pushes these boundaries a little deeper into the module, not eliminating them. Compilers can see @inlinable functions and optimize ARC around them, but it still needs to follow the calling convention around @usableFromInline functions, which would only be available in binary forms.
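
A rough sketch of that boundary, assuming a library file with hypothetical names (`record`, `_store`) and the underscored `__owned` spelling for the consuming convention. The body of `record` is visible to clients, while `_store` is only a symbol they can call.

```swift
// Not inlinable: clients only see the symbol and its calling convention,
// so whatever convention is declared here has to be honored at the call.
@usableFromInline
internal func _store(_ value: __owned [Int], into sink: inout [[Int]]) {
    sink.append(value)
}

// Inlinable: the body is exposed to clients, so the optimizer may inline it
// and clean up ARC traffic around it. The call to _store remains a boundary.
@inlinable
public func record(_ value: [Int], into sink: inout [[Int]]) {
    _store(value, into: &sink)
}

var sink: [[Int]] = []
record([1, 2, 3], into: &sink)
```

Inlining `record` moves the opaque call one level deeper, to `_store`, rather than removing it.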

If you want to eliminate such boundaries, you'd need to distribute the library as source code, which is not ideal, or even possible for many cases.

Lest we forget dynamic linking, where the libraries are compiled prior to the application(s) using them, and we can't change the compiled libraries.

I think we're getting off-topic, or rather, out of scope. I'm not exactly sure.

Saklad5 · December 29, 2021, 10:20pm

In most scenarios the compiler probably wouldn’t be able to tell which would be better, most obviously when the caller isn’t being compiled at the same time as the callee. I’m merely pointing out that there are many scenarios where it could, and we should avoid a scenario where using either of these modifiers results in a worse outcome.

This is how @inlinable works: the compiler may or may not inline code with or without that attribute. In fact, CMO ignores it entirely! Code may be inlined in some places but not others, based on whether binary size or runtime performance is being emphasized, etc. There’s a lot of variables, and most of them are complete unknowns when the code is written.

That’s how Swift Package Manager works, which is where I expect the bulk of the usage is going to be.

Saklad5 · December 29, 2021, 10:21pm

It’s from the performance roadmap.

A roadmap for improving Swift performance predictability: ARC improvements and ownership control

Borrow variables

When working with deep object graphs, it’s natural to want to assign a local variable to a property deeply nested within the graph:
let greatAunt = mother.father.sister
greatAunt.sayHello()
greatAunt.sayGoodbye()
[…]
We can do this by introducing a new kind of local variable binding that binds to the value in place without copying, while asserting a borrow over the objects necessary to access the value in place:
// `ref` comes from C#, as a strawman starting point for syntax.
// (It admittedly isn't a perfect name, since unlike C#'s ref, this would
// actively prevent mutation of stuff borrowed to form the reference, and
// if the right hand side involves computed properties and such, it may not
// technically be a reference)
ref greatAunt = mother.father.sister
greatAunt.sayHello()
mother.father.sister = otherGreatAunt // error, can't mutate `mother.father.sister` while `greatAunt` borrows it
greatAunt.sayGoodbye()
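
For contrast, a small sketch of what plain let does today: it copies the reference, so the later assignment is allowed and the local keeps pointing at the original object. The `Person` class and the names below are hypothetical, and optional chaining stands in for the roadmap's non-optional properties.

```swift
// Hypothetical model of the object graph from the roadmap excerpt.
final class Person {
    let name: String
    var father: Person?
    var sister: Person?
    init(name: String) { self.name = name }
}

let mother = Person(name: "Mother")
let grandfather = Person(name: "Grandfather")
grandfather.sister = Person(name: "Great-aunt")
mother.father = grandfather

// Plain `let` copies the reference out of the graph; nothing is borrowed.
let greatAunt = mother.father?.sister

// Allowed today, because `greatAunt` holds its own reference.
// Under the strawman `ref` binding, the roadmap says this would be an error.
mother.father?.sister = Person(name: "Other great-aunt")

let localName = greatAunt?.name               // still the original object
let graphName = mother.father?.sister?.name   // the replacement
```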

Lantua · December 29, 2021, 10:31pm

Yes, there are definitely such cases, common even. Better yet, in some cases, such as when the callee is non-public, the compiler can even ignore the marked convention entirely (though I'm sure figuring out the optimal convention is also a rather complex process).

OTOH, these keywords are essential for where the compiler most definitely can't do such things, by design and necessity. It's not just about optimization, it's also about the interfaces that the library can craft.

Saklad5 · December 29, 2021, 10:33pm

I think you’re assuming that what I’m describing is already in the pitch. I don’t think it is.

Lantua · December 29, 2021, 10:40pm

I suppose, but what you said essentially boils down to "the compiler can optimize the code while maintaining the original semantics." So, I guess. (Note though, that public ABI needs to follow the overridden convention, but that's probably not what you're thinking about anyway.)

Saklad5 · December 29, 2021, 10:44pm

What I’m thinking is, in order of priority:

1. If the compiler can tell which convention produces less ARC traffic, it will use that. (Assuming it doesn’t impede binary size unnecessarily, etcetera, much like how it chooses to inline or specialize)

   I don’t think the compiler does this right now, but it definitely could someday.

2. If consuming or nonconsuming is specified, it will use that.

Lantua · December 29, 2021, 10:48pm

You should also note that it's a priority list where 1 > 2 > 3, took me a while to figure that out. But sure, let's also add

Since the calling convention is ABI, if public or @usableFromInline functions specify consuming or nonconsuming, it will use that.
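
The ordering being discussed could be sketched like this. It is only an illustration of the priority list, not anything the compiler does: `Convention`, `CallSite`, and `chooseConvention` are hypothetical names, and the case where nothing applies is left as nil rather than guessing a default.

```swift
enum Convention { case consuming, nonconsuming }

// Hypothetical summary of what is known at one call site.
struct CallSite {
    var isABIBoundary: Bool     // callee is public or @usableFromInline
    var declared: Convention?   // modifier written in the source, if any
    var inferred: Convention?   // what an optimizer could prove is cheaper, if anything
}

func chooseConvention(for site: CallSite) -> Convention? {
    // The calling convention is ABI, so a declared modifier wins at the boundary.
    if site.isABIBoundary, let declared = site.declared { return declared }
    // Otherwise, prefer whatever the compiler can show produces less ARC traffic.
    if let inferred = site.inferred { return inferred }
    // Failing that, fall back to the written modifier.
    if let declared = site.declared { return declared }
    // Nothing declared and nothing inferred: not covered by this sketch.
    return nil
}
```

For example, a non-public callee marked consuming where the optimizer prefers nonconsuming would come out as nonconsuming, while the same pair at an ABI boundary would stay consuming.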

Saklad5 · December 29, 2021, 10:50pm

I’m counting that as not being able to tell. It obviously can’t inline functions in other binaries either.

Lantua · December 29, 2021, 10:52pm

It may be the cause of the overlong discussion, but I genuinely can't tell if you didn't think of it, or just assumed that everyone does.

Saklad5 · December 29, 2021, 10:54pm

The latter. I’m assuming a range of scenarios from “the compiler knows nothing” (library evolution) to “the compiler knows everything” (cross-module optimization).