@Andrew_Trick before we make more progress on this, I have one remaining concern: I want to make sure this doesn't impact anything about what @John_McCall is exploring. If this will fit in with what he wants to do, then I am fine with taking this. I just don't want to put this in and have it not gel with that.
@Michael_Gottesman @John_McCall I'll address that, as briefly as I can ;)
We may want to create a user model (for various reasons) that allows us to constrain a particular reference to be unique (in the "unique SSA reference" sense that I mentioned above). In other words, the reference is guaranteed to be unique at all of its use points, but otherwise may alias. None of the "flags" we're talking about so far in this thread constitute a unique reference constraint.
First consider function arguments. We actually need two properties:
-
A unique constraint as part of the function type. Only unique references can be passed to that argument.
-
A uniqueness-preserving property that results from some -Onone level interprocedural analysis. Unique references could potentially be passed to these even if they don't have a unique constraint. Think support for no-escape closures.
In this thread, we've only been discussing problem #2. And the only reason we could consider representing the uniqueness-preserving property as a SILFunctionArgument flag is that it's easy to verify and nothing should invalidate it, but that's really just an implementation detail.
An interesting question is whether the unique constraint #1 should also imply #2. I'm not sure that's possible, because we may need a way to opt-out of the constraint with unsafe code.
In addition to constraining function arguments, I expect the type system to tell us some day that allocating certain classes requires uniqueness, that loading certain properties requires uniqueness, or that a particular variable declaration was annotated "unique".
Additionally, we need to be able to dynamically impose a uniqueness constraint after performing the runtime check in cases where we can't statically prove the incoming reference is unique.
This is different from what end_cow_mutation
does today, which simply asserts uniqueness, but still allows the reference to be copied later. Let's generalize the end_cow_mutation
semantics and call it an assert_unique
operation.
Let's call an operation that imposes the uniqueness constraint constrain_unique
. For the sake of discussion we can reduce the problem of implementing all the constraints I mentioned above to emitting constrain_unique
instructions.
To diagnose constraint_unique
, we need the same basic analysis that we're talking about in this thread, although we would only do it in OSSA which makes the analysis simpler and much more powerful. @zoecarver originally wanted this in non-OSSA, but I think we've found other ways to workaround the issues in the short term. So if we add any sort of uniqueness-preserving analysis, I now think we should just focus on OSSA.
What information would constrain_unique
provide to optimization passes?
Uniquely identified: No. We can't really say anything about the origin of the reference.
Uniquely referenced: No. It can be derived from a non-unique reference.
Unique SSA reference: Yes, constrain_unique
implies assert_unique
, so we can prove that the reference is still unique at its use points, and for example we can remove any redundant uniqueness checks. At least this would work in OSSA as long as any unsafe opt-out is a consuming operation.
Agreed, I'm not sure this would be necessary.
At that point, you don't really need the alloc_ref flag anymore, since you might as well fold it's discovery into the same analysis.
I'm thinking this flag is needed less and less but...
I could potentially see adding both an alloc_ref uniquely identified flag and a "uniqueness preserving" flag on SILFunctionArgument since that would be much more powerful that the alloc_ref flag alone, and would still be easy to check in SILVerifier. But I can't justify that unless there ends up being a widespread need to call "isUniquelyReferenced" and we don't want to recompute interprocedural analysis each time.
I think this would be extremely powerful and would probably fix @dabrahams CoW issue, maybe even without breaking ABI. Right now, this seems like the best argument for keeping this flag (and adding another flag for SILFunctionArgument
s).
- A unique constraint as part of the function type. Only unique references can be passed to that argument.
We could even have two overloads of this function so that if only one was present (i.e before a certain version of swift) then we could use that one. This is how we would fix the problem without breaking ABI.
An interesting question is whether the unique constraint #1 should also imply #2. I'm not sure that's possible, because we may need a way to opt-out of the constraint with unsafe code.
I agree. I don't think this is possible, especially without exposing this at a language level.
Additionally, we need to be able to dynamically impose a uniqueness constraint after performing the runtime check in cases where we can't statically prove the incoming reference is unique.
Hmm. I think we should aim not to have to do this. Maybe just for debugging. But, in theory, if the argument says that the incoming reference has uniqueness constrained, we should take that at face value. It is then the responsibility of the caller to enforce this, which they should be able to do statically. Even if these functions don't have visibility into each other, this should work. And it should all be statically verifiable.
Let me outline what I'm thinking with "unique SSA reference arguments." Still just brainstorming. But, if there's interest in further investigation, I'm happy to put together a sample implementation.
// Module 1
sil @mutator_for_unique(@unique %ref) {
// Mutate ref in place.
}
sil @mutator_old(%ref) {
// Copy then mutate.
}
// Module 2
sil @test1() {
%r = end_cow_mutation
// When both are known to exist:
apply @mutator_for_unique(%r)
// When module 1 was built before this feature existed:
apply @mutator_old(%r)
}
sil @test2() {
%r = unkown
// Either way:
apply @mutator_old(%r)
}
I'm thinking this would be automatically generated by the compiler. So the user would just write something like:
var r = Class()
mutator(r)
Now, after the apply
, those references are no longer unique. That could be fixed with another argument flag, though.
How would the necessary dynamic bit be communicated without any change to the ABI?
That's what I mean. There wouldn't be a dynamic bit. This would all be static. It would only benefit code built after this feature lands but, it wouldn't need to have visibility through the various functions.
If a reference could be proved to be a "unique SSA reference." Then it would call out to the "new" function overload that takes a unique SSA reference. It doesn't matter if it had visibility into that function or not.
I don't have the mental bandwidth to solve another CoW implementation problem. Those problems are tricky because the reference is stored in mutable memory. Each time we read the reference, we're creating a new value, so it's hard to boil it down to the properties of a single reference value.
If we could agree on a conceptual model for unique reference values, then it would be easy for me reason about what's being proposed. Then maybe we could extend those ideas out to some value-like abstraction on top of mutable memory which doesn't really exist in the type system today.
I can imagine three primitives:
assert_unique: reference value is unique here-and-now (maybe because we just checked)
constrain_unique: statically requires its reference operand to originate from assert_unique
. Trivially produces an assert_unique
reference.
preserve_unique: reference value is not copied. If its operand is unique, its result remains unique at all uses.
assert_unique
is useful on its own to remove redundant uniqueness checks.
constrain_unique
implies assert_unique, but does not otherwise provide more information to the optimizer. It is useful to enforce some user model where the compiler diagnoses uniqueness.
preserve_unique
can be discovered by analyzing the uses of any reference value. That information could be "cached" as SIL flags, but that's just an implementation detail. It doesn't affect the soundness of any proposal.
The combination assert_unique + preserve_unique
tells the optimizer that a reference is has flow-insensitive uniqueness. That reference can't alias any other reference in the function.
The combination assert_unique + preserve_unique + constrain_unique
guarantees the absence of ARC operations. That would make it possible to implement user-level guarantee of unique references or move-only types at the level of SIL values. The real implementation complexity comes up when dealing with addressable memory locations that hold unique things.
alloc_ref
already implies assert_unique
. Knowing that preserve_unique
is true for an alloc_ref
would therefore give us stronger aliasing guarantees.
Knowing that preserve_unique
is true for functions arguments simply makes preserve_unique
much more broadly applicable to other values.