[Pitch] Controlling the ABI of a declaration

I've written up an early draft proposal for a new @abi attribute.

This idea originated as a way to make subtler changes than @preconcurrency allows, but it looks like it will be a very flexible tool. It's sort of like a safer, more ergonomic version of @_silgen_name that's compatible with a lot more declarations.

I'm also working on an early prototype of an implementation. It does not have nearly all of the capabilities described here, but once I've worked out a few bugs, I think it'll be able to replace most of the @_silgen_name backwards compatibility hacks in the standard library (about a quarter of all its uses of the attribute).

This is still pretty early, so there are a lot of challenges I haven't worked out yet. A big one is how to keep this feature from breaking name lookup in module interfaces emitted against older versions of the library; I may need to make module interfaces use ABI names rather than user-facing names, but that gets tricky with inlinable code.

There are also major questions about safety checking—both the exact traits that it should make sure line up, and the philosophy of how tolerant it should be of semi-safe hacks.


This new way of spelling ABI has been on my wishlist for, well, as long as I cared about ABI. Thank you for bringing it forward!

Off the top of my head, I have the following questions:

  • Since it’s not explicitly stated, is adding @isolated(any) to the type of a closure argument a compatible change?
  • With @_silgen_name, we can only change one mangled name per declaration. This is limiting if the changing declaration has multiple related mangled names (e.g. an opaque return type has its own descriptor separate from the function declaration that returns it). Does the proposed attribute lift this limitation?

Thanks!


Signs point to “no”—from what I can tell, @isolated(any) functions have a different memory layout because they carry their isolation around with them. However, if you want to replace a non-@isolated(any) function with an @isolated(any) one, this feature would let you rename the old one out of the way without an ABI break; combine that with @backDeployed on the new one and you might have the tools you need.

Yes, this fixes that. @_silgen_name has those limitations because it actually avoids calling into the mangler at all for the symbols it renames, while leaving the mangler’s behavior for all other uses unchanged. With @abi, the compiler calls into the mangler like usual and then the mangler substitutes in the ABI decl whenever it’s needed, so it composes correctly with all the other mangler behavior. (The tradeoff is that it doesn’t let you specify an arbitrary symbol like @_silgen_name does, but I regard that use case as simply being out of scope for @abi.)
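As a rough illustration of the contrast drawn above, here is a minimal sketch. The function `sum`, the symbol string `demo_legacy_sum`, and the name `legacySum` are hypothetical, and the @abi spelling in the comment follows the pitch rather than any shipping compiler, so it is left uncompiled.

```swift
// Hypothetical function and symbol name, for illustration only.
// `@_silgen_name` is an underscored, unofficial attribute: the string below is
// used verbatim as the symbol name, so the mangler is never consulted for it.
@_silgen_name("demo_legacy_sum")
func sum(_ a: Int, _ b: Int) -> Int {
    a + b
}

// Under the pitch, the same intent would be written as a declaration that the
// mangler consumes like any other (spelling assumed from the pitch, not
// compiled here):
//
//   @abi(func legacySum(_ a: Int, _ b: Int) -> Int)
//   func sum(_ a: Int, _ b: Int) -> Int { a + b }

// Callers are unaffected either way; only the emitted symbol differs.
let total = sum(2, 3)
```

The first form can name any symbol at all, which is the capability the reply treats as out of scope for @abi.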


Future directions:

It might be possible to allow an ABI-providing decl to belong to a different context than the original—for instance, turning a global variable into a static property, or moving a method to a single-property @frozen wrapper struct.

Another use case is to allow existing top-level protocols to be nested. Would the original context be given as another parameter, or as a declaration qualifier?

extension Unicode {
  @abi(public protocol _UnicodeParser) // FIXME
  public protocol Parser {
    /* Existing requirements … */
  }
}

@available(*, unavailable, renamed: "Unicode.Parser")
public typealias _UnicodeParser = Unicode.Parser

(Unicode.Parser existing implementation and missing documentation.)


I don't do much resilient library work so I don't have a lot to add from that side but it looks like a great solution to this problem. There's no better way to express the declaration you want to imitate than to just write it out as Swift code. Super elegant.

From a tooling point of view, having the declaration be bare tokens inside the attribute parentheses seems like it might pose problems? To parse that, you'd either have to parse it as an unprocessed token list to be re-parsed later (keeping track of parentheses for balancing), or the parser would need to drop into a declaration parsing state after seeing @abi(, which I imagine could lead to its own large complexities. Should it just be a string literal instead that the compiler parses on-demand when it needs to determine the ABI?

I don't remember the context but there was a conversation recently about how attributes usually fall into two groups: (1) look like function calls, or (2) an ad hoc stream of tokens, and there was a strong feeling that we shouldn't introduce more of the latter into the language.


Or, taking a page from @_silgen_name, can it be an actual naked declaration without body with some other syntax to associate it with the 'real' implementation? For example, something like:

extension Collection {
    @binaryInterface(for: __rethrows_map)
    public func map<T>(_ transform: (Element) throws -> T) rethrows -> [T]

    @usableFromInline
    func __rethrows_map<T>(
        _ transform: (Element) throws -> T
    ) throws -> [T] {
        try map(transform)
    }
}

This was in the context of the steering group's discussion of trailing comma support.


I thought about this, but there are plenty of use cases where the API name is genuine API surface (not something you can give an arbitrary __whatever prefix), and we don’t have a good way to name another declaration with extreme specificity. For instance, what if there are a couple of overloads and the only thing that distinguishes the one you want is a particular generic constraint or type attribute?
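To make the naming problem concrete, here is a small sketch with two hypothetical overloads (`describe` is not from the standard library). Both have the same full name, `describe(_:)`, and only the generic constraint tells them apart.

```swift
// Two hypothetical overloads that share the full name `describe(_:)`.
// Only the generic constraint distinguishes them.
func describe<S: Sequence>(_ value: S) -> String {
    "sequence"
}

func describe<C: Collection>(_ value: C) -> String {
    "collection"
}

// A call site selects one through its argument type: the more specialized
// overload wins when the argument is a Collection.
let fromArray = describe([1, 2, 3])                       // Array is a Collection
let fromStride = describe(stride(from: 0, to: 3, by: 1))  // StrideTo is only a Sequence

// A bare name reference has no such handle: both candidates are spelled
// `describe(_:)`, and that syntax has no place to write a constraint.
```

An attribute that pointed at another declaration by name would need some way to pick between these two, which is the gap this reply describes.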

Yeah, it is a downside of this approach.

However, in full generality, the ABI-compatible entrypoint could also want to invoke a new API that takes an extra argument, or (as shown in your example here) that throws instead of rethrows, or that requires performing a simple arithmetic operation on an argument or converting it to a different type. The general workaround for all of the above is (as shown in your example) to write an additional @usableFromInline internal __whatever shim that can have an arbitrary prefix.

It's not clear to me a priori that the scenarios where, say, multiple overloads differ only in a generic constraint and can't be disambiguated in today's Swift[*] so outnumber the scenarios where there are differences in number of parameters, throwing versus rethrowing, isolation, etc. that they should totally drive the design of this feature.

However, I am open to being convinced if, empirically, you're finding that this really is the overwhelming scenario in which you'd use this feature.

[*] That we can't ergonomically disambiguate among certain overloads is something that we probably should work on, orthogonally. But I would consider it a solvable expressivity problem rather than a fact-of-life that we should yoke other designs to.
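The shim workaround mentioned in this reply can be sketched in today's Swift, without any new attribute. All names below are hypothetical; the sketch only shows the shape of a @usableFromInline internal shim forwarding to a newer API that gained an extra argument.

```swift
// New API: gained a `scale` parameter (hypothetical names throughout).
public func scaledWidth(_ width: Int, scale: Int) -> Int {
    width * scale
}

// Compatibility shim: keeps the old one-argument shape and forwards to the
// new API, supplying the value that old callers implicitly relied on.
// Being internal, it can carry an arbitrary `__` prefix.
@usableFromInline
internal func __legacy_scaledWidth(_ width: Int) -> Int {
    scaledWidth(width, scale: 1)
}
```

Whether the old entry point is then tied to the shim by name or by a full declaration is the open design question in this part of the thread.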


A very early prototype implementation, including standard library adoption: Implement experimental @abi attribute by beccadax · Pull Request #76878 · swiftlang/swift · GitHub

My feeling on this is that we should not go off and invent totally new ad-hoc syntaxes for attribute parameters anymore, but when there's an existing grammar production which is a very good fit for an attribute, we should probably allow the attribute to use it. For instance, if we were stabilizing @_specialize(where <constraints>) today, I wouldn't tell the authors to find a way to replace the where clause with a function-call-style syntax, because a where clause is the natural way for Swift developers to represent constraints on a generic signature. I see this design as similar.

(But I am thinking about Xiaodi's suggestion to put @abi(<name of API decl>) on a stub declaration that's already at top level, and I might try to prototype it.)


I'm willing to be convinced! Now that I can see an implementation, the parser changes in the PR you linked seem pretty reasonable (I really wish we were rid of the C++ parser by now...). If we can parse a nested declaration head there without causing ambiguity issues, it'd be nice to see that compared to Xiaodi's suggested approach to see which one is more readable in a decent-sized corpus like the standard library.

As someone who constantly has to resort to stuffing mangled names into @_silgen_name to keep ABI compatibility in our concurrency library, I really welcome this change 🥳

I really like the proposed way of expressing it: writing out the complete declaration of the method whose mangling this is supposed to keep. It is simpler to reason about than another cross-referencing thing, as the signature is local to what I'm looking at -- I really like this property of the design. Jumping around between various "almost the same" methods becomes more of a nightmare the more legacy implementations we accumulate, so the simple "here's the signature" works very well IMHO.

For what it's worth, the exact same pattern is very interesting for distributed method versioning, and I'd love to catch up about it as I'm figuring out the direction for that. That would not reuse @abi per se, but the mechanism would be very similar, I think... and again, I think the pattern of having the "base signature" in the attribute is very easy to reason about, so I like it as a general pattern to adopt for these "backwards compatibility" things.

  • Kind: Must match (func for a func, class for a class, etc.).

In my experiments, actors and classes seemed to mangle identically, so maybe that can be an exception there.

I would like to use this feature to convert a class’s async member to an actor’s non-async member. This is pretty much source-compatible from a caller’s perspective, though given isolation inference, maybe it needs that @abi(unsafe:) variant you were thinking about.

This feature is intended to help ease the adoption of other new features by allowing a declaration's ABI to be "pinned" to its original form even as it continues to evolve. Note that there is only ever a need to specify the original form of the declaration, not any revisions that may have occurred between then and the current form; there is therefore never a reason you would need to specify more than one @abi attribute, nor to tie an @abi attribute to a specific platform version.

I largely disagree with this paragraph’s conclusions. Fully transitioning callers of a deprecated ABI can take a while, and evolution may continue while such work is outstanding. There could be a need for several defunct ABIs to exist at the same time, depending on which callers haven’t transitioned yet and which iteration they started at.


I hope that with this proposal, we’ll be able to undo some of the mistakes made in the Concurrency module. For example, the generic parameters of the UnsafeContinuation and CheckedContinuation types have the names “T” and “E” instead of proper names.

Also, AsyncStream’s initializer and makeStream type method both have an elementType: Element.Type = Element.self parameter. Usually, such parameters are used to specify the generic parameters of an initializer or method, not to specify the generic parameters of the type. It’s unnecessary to have this parameter because Swift already allows you to specify an AsyncStream’s generic argument by writing it like AsyncStream<TheElementType>.

MemoryLayout’s parameter is also called “T”; that should be fixed as well.

Those are a slightly different issue. Someone can correct me if I'm wrong, but the names of generic type parameters aren't encoded in the ABI; they're represented as two integers, a depth and an index.

Where they become a user-facing nuisance is when you want to write your own extensions and you have to reference them by that name; i.e., extension UnsafeContinuation where T: Whatever. The standard library could rename T without breaking binary compatibility (which is what this proposal is dealing with), but it would be a source breaking change.
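A short sketch of why such a rename is source-breaking, using MemoryLayout (whose parameter is also named T, as noted earlier in the thread). The `sizeInBits` member is hypothetical and stands in for any client extension that spells out the parameter name.

```swift
// A client extension that has to refer to the generic parameter by its
// declared name, `T`.
extension MemoryLayout where T == Int {
    // Hypothetical helper, for illustration only.
    static var sizeInBits: Int { size * 8 }
}

let bits = MemoryLayout<Int>.sizeInBits
```

If the standard library renamed T, the `where T == Int` clause here would stop compiling, even though no symbol in the binary interface would have changed.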


I think you're right — I thought the generic parameters were included in mangled names, but it seems they aren't.

If we added a public type alias named T, would it still be a source-breaking change?

I thought that this would be a breaking change, because I thought I remembered getting errors in the past when trying to reference type aliases in generic constraints, but somewhat to my surprise the following compiles:

struct Foo<Value> {
    var t: T
    typealias T = Value
}

extension Foo where T == Int {
    func test1() {
        
    }
}

I wonder if something changed in the past couple years and I haven't fully noticed, or if there are additional restrictions that come into play in certain situations (or if no such restrictions have ever existed and I’m just remembering wrong).
