ABI compatibility: hazards and new language proposals

typesanitizer · March 6, 2020, 2:51am

Recently SE-0280 Enum Cases as Protocol Witnesses has been proposed. It doesn't make much sense for me to summarize the proposal here, so please read it before reading the rest of this post.

One implication of the proposal is that it introduces a new way to break binary compatibility without breaking source compatibility. More specifically, one can start with a static property as a witness and later replace it with an enum case (which also acts as a witness), but this would break binary compatibility. (The other direction breaks source compatibility, as a client may be pattern-matching on the enum case you're getting rid of.) There are already other ways in which this happens today (such as with SE-0267), but this proposal adds one more.
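To make the scenario concrete, here is a minimal sketch assuming Swift 5.3 or later (where SE-0280 is available). The protocol is adapted from the proposal's own example; `LibraryErrorV1`, `LibraryErrorV2`, and `makeCorrupted` are hypothetical names standing in for two releases of the same library type and a generic client.

```swift
// Requirement adapted from the example in SE-0280.
protocol DecodingError {
    static var fileCorrupted: Self { get }
}

// Hypothetical "release 1": the requirement is witnessed by a static property.
enum LibraryErrorV1: DecodingError, Equatable {
    case generic(String)
    static var fileCorrupted: LibraryErrorV1 { .generic("file corrupted") }
}

// Hypothetical "release 2": the same requirement is witnessed by an enum case.
enum LibraryErrorV2: DecodingError, Equatable {
    case generic(String)
    case fileCorrupted
}

// Client code is spelled identically against either release, which is why
// the change looks harmless at the source level.
func makeCorrupted<E: DecodingError>(_: E.Type) -> E {
    E.fileCorrupted
}

let fromV1 = makeCorrupted(LibraryErrorV1.self)
let fromV2 = makeCorrupted(LibraryErrorV2.self)
```

The client compiles unchanged in both cases; whether an already-compiled client keeps working is the ABI question this thread is about, and the sketch does not demonstrate that part.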

Compared to most other languages, Swift has a much better story around binary compatibility. Part of the reason why I really like the design is that (in the majority of cases) changes that "should" not break binary compatibility do not break binary compatibility. To reappropriate the phrase, fast and loose reasoning is morally correct wrt binary compatibility -- (in the majority of cases) source-equivalent code is ABI-equivalent.

If we add more cases like this, that adds an extra wrinkle that users need to be wary about. I suspect that a large fraction of users of this feature will probably not be affected by this change. A small percentage of people might be affected: should we have a backup plan for them?

Other people have tried to assuage this concern in the proposal review. Owen points out:

With regard to the resilience issues brought up by @jrose, I don't think the inability to switch between satisfying a requirement with an enum case and other static members should block this proposal. A similar issue came up during the review of SE-0267 (where clauses on contextually generic declarations) where writing equivalent declarations with different syntax results in mangling differences, and it's likely parameterized extensions will run into similar issues. In comparison to those two features, I think it's intuitively clearer that enum cases and static vars have different ABI.

However, I think there is a meaningful distinction between the situation for SE-0267 (or parameterized extensions) and the situation here for SE-0280. In the former case, if one wants to have the same representation for the two similar-semantics-but-distinct-ABI cases, one needs additional special-casing for when/which generic parameters should be "floated out" to achieve the desired effect. Such special-casing would be complex as it would need to account for type nesting + multiple methods in a protocol. One can make an argument based on simplicity/predictability that we should accept the similar-semantics-but-distinct-ABI representations over this additional complexity of understanding exactly when it is acceptable to migrate generic parameters/constraints outwards/inwards if one wants to preserve ABI compatibility. [At this point SE-0267 has already been accepted, but this is speaking hypothetically if it were still under discussion. This point will again apply to parameterized extensions.]

To the best of my understanding, no such complex special-casing is necessary for SE-0280. Instead, we can mandate that witnesses backed by enum cases must have the same ABI as the corresponding property/method, and use thunks to mediate between the enum case and property/method representations. The thunks would only apply in the case when the witness is an enum case, so existing code is unaffected. The cost of this is some additional implementation complexity, some extra compile time, some extra binary size, and some performance lost to thunks. The runtime costs can be eliminated in the presence of @frozen, as usual. The benefit of such a design is it creates one less additional wrinkle that users may need to worry about.
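The thunk idea can be modeled by hand. This is only a rough sketch, not what the compiler actually emits: `DecodingErrorWitness` is a hypothetical stand-in for a single witness table entry, and `PropertyBacked`, `CaseBacked`, and `produce` are illustrative names.

```swift
// Hypothetical stand-in for one witness table entry. Clients only call
// through this function value and never see how it is implemented.
struct DecodingErrorWitness<E> {
    let fileCorrupted: () -> E
}

enum PropertyBacked: Equatable {
    case generic(String)
    static var fileCorrupted: PropertyBacked { .generic("file corrupted") }
}

enum CaseBacked: Equatable {
    case generic(String)
    case fileCorrupted
}

// Property-backed witness: the entry forwards to the static getter.
let propertyWitness = DecodingErrorWitness<PropertyBacked>(
    fileCorrupted: { PropertyBacked.fileCorrupted }
)

// Case-backed witness: the entry is a small thunk that constructs the case,
// so it has the same shape as the property-backed entry.
let caseWitness = DecodingErrorWitness<CaseBacked>(
    fileCorrupted: { CaseBacked.fileCorrupted }
)

// A "client" that depends only on the entry's signature.
func produce<E>(using witness: DecodingErrorWitness<E>) -> E {
    witness.fileCorrupted()
}

let viaProperty = produce(using: propertyWitness)
let viaCase = produce(using: caseWitness)
```

Because `produce` sees the same function type either way, swapping one entry for the other does not change what the client has to call; the extra closure is where the code size and call overhead mentioned above would come from.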

The question I have here is two-fold:

1. How high of a bar should there be for proposals which introduce cases where source-equivalent code is not ABI-equivalent? Should proposal authors strive to not break this unless the alternative is "strictly worse"?
2. Alternately, are we ok with adding extra wrinkles wrt ABI compatibility because the decisions people make will be increasingly tool-driven, so we don't expect people to reason about the ABI manually?

To be clear, this post is not meant as a -1 to the proposal. On the whole, I'm +0: I think it is a good addition to the language that solves an actual pain point, but I think there is one axis along which we can improve the proposal. The point of this post is to start a broader conversation on how language proposals should be evaluated along the axis of new ABI compatibility hazards.

jrose · March 6, 2020, 4:27am

To the best of my understanding, no such complex special-casing is necessary for SE-0280. Instead, we can mandate that witnesses backed by enum cases must have the same ABI as the corresponding property/method, and use thunks to mediate between the enum case and property/method representations.

To be honest, that was my expectation all along. By default*, protocol conformances don't reveal how their requirements are satisfied, i.e. clients are not allowed to "devirtualize" a call through a conformance if the conformance is in a module that has library evolution enabled. Additionally, protocol witnesses already have a different ABI than normal functions because the conformance is passed in, whereas for normal functions on an enum you know the concrete type you're working with. So (IIUC) there's almost certainly going to be a thunk anyway.

The point I was bringing up wasn't really about conformances at all; it's more about enums in general. It's not binary-compatible to replace an enum case with a static property or a static function with an enum case, even if you ignore matching, because enum case construction doesn't go through any sort of function call at all today. That's true whether or not it's satisfying a conformance, though we certainly could make the rule that any enum case generates a "shadow property/function" that you can call (at the cost of extra code size).
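The distinction between the two access paths can be sketched as follows, assuming Swift 5.3 or later. `DefaultProviding`, `Mode`, `viaConformance`, and `direct` are hypothetical names, and the comments about what happens at each call site restate the description in this thread rather than anything the code itself verifies.

```swift
protocol DefaultProviding {
    static var fallback: Self { get }
}

enum Mode: DefaultProviding, Equatable {
    case fallback
    case custom(Int)
}

// Access through the conformance: the caller only knows that T conforms,
// not whether `fallback` is a case or a static property.
func viaConformance<T: DefaultProviding>(_: T.Type) -> T {
    T.fallback
}

// Direct access on the concrete type: per the discussion above, constructing
// a case here does not go through any function call today, so this is the
// use that replacing the case with a static property would affect.
func direct() -> Mode {
    Mode.fallback
}

let throughProtocol = viaConformance(Mode.self)
let throughType = direct()
```

Both paths produce the same value at the source level; the difference being discussed is only in how a compiled client reaches it.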

* That "by default" can actually be a problem in practice; currently the devirtualization of conformances is tied to whether the conforming type is frozen, which isn't really correct. I mentioned this very briefly in Backwards-deployable Conformances, since that's where I was proposing an attribute syntax for conformances.

The wrinkle that SE-0280 adds is that you're making additional promises: not only does this case claim a name forever, but also that name has to satisfy the requirements of the protocol forever. That's why formalizing @_implements would be a way out in this particular case.

typesanitizer · March 6, 2020, 4:49am

To be honest, that was my expectation all along

That's my hope too, but I don't think this is part of the proposal, is it? If it is, I think there is a big misunderstanding on my part and I apologize.

The point I was bringing up wasn't really about conformances at all; it's more about enums in general. It's not binary-compatible to replace an enum case with a static property or a static function with an enum case, even if you ignore matching, because enum case construction doesn't go through any sort of function call at all today.

Yep, that makes sense, that's why I focus exclusively on the static property/method -> enum case direction in my post. The issue with the opposite transformation is two-fold: (1) clients may pattern-match on the case and (2) there is no indirect dispatch for enum cases today.
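Point (1) can be illustrated with a small sketch. `JSONDecodingError` follows the naming in SE-0280's example, while `describe` is a hypothetical client function.

```swift
enum JSONDecodingError {
    case fileCorrupted
    case keyNotFound(String)
}

// Hypothetical client code that pattern-matches on the cases.
func describe(_ error: JSONDecodingError) -> String {
    switch error {
    case .fileCorrupted:
        return "file corrupted"
    // Binding the payload requires `keyNotFound` to be a real case. If the
    // library replaced it with `static func keyNotFound(_:) -> Self`, this
    // pattern would stop compiling.
    case .keyNotFound(let key):
        return "missing key: \(key)"
    }
}

let message = describe(.keyNotFound("id"))
```

This is why the case -> static member direction is already a source break for clients like `describe`, independent of any ABI considerations.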

sveinhal · March 6, 2020, 7:09am

Isn’t it already the case that we don’t expect people to reason about ABI? The vast majority of Swift-users are application developers, and not framework engineers, methinks.

We should certainly strive for ABI to follow the “principle of least astonishment”. As someone who only has a very hand-wavy understanding of the ABI, I certainly wouldn’t expect swapping a case for a static var to be ABI-compatible, in the sense that accessing it directly on the type would work if used from an app that was linked against a version of the library that swapped one for the other. But I guess I expect some kind of indirection when accessing an enum case as part of a protocol in generic code, and that accessing it through that indirection should work before/after the change. I’m not sure if my expectations are valid.

jrose · March 7, 2020, 9:23pm

I didn't look at the implementation at all. I guess I just assumed it was going to work that way because any other way would be harder to implement in library evolution mode. But making that change (changing a static property to an enum case) still isn't binary-compatible today, with or without protocols involved. I agree with you that the proposal ought not to be implemented in a way that makes that problem harder to fix, though, and that indeed could be a good guiding principle of new proposals / implementations, though not a strict requirement. (We could decide that no one will ever want to change a case into a static property, for example, and not worry about introducing more incompatibility in that direction.)