Special Case Protocol Conformance: Long-term Tradeoffs for Near-term Conformance

As I see it, there are source-level and ABI-level commitments we'd be making with these special-case conformances. At the source level, as it stands today, nobody can make structural types conform to any protocols. If we lift this restriction later, it seems inevitable that we would provide Equatable, Hashable, and Comparable conformances for tuples in the standard library at the same time. Under our current rules, that would in turn prevent any other module from declaring overlapping Equatable, Hashable, or Comparable conformances on tuples. If we accept the builtin conformances, we end up in an intermediate position where the conformances are effectively provided by the standard library, but nobody else can declare any conformances at all, which seems like a situation that will be source-compatible with the future world where conformances are generalized.

On the ABI side of things, code that instantiates generic definitions using these conformances will need to be able to reference symbols for the witness table template and conformance descriptor that identify the conformance. These will be special case data structures if the conformances are built-in, but equivalent data structures would exist for the conformances declared by the standard library. To maintain ABI compatibility, we may need to continue exporting the special-case symbol names from the standard library, but chances are good that these could become aliases for the more general data structures that would be generated by the compiler for a library-defined conformance. (This would mean that people looking at the data structure of the builtin conformance descriptor for reflection purposes would need to be aware that the exact nature of that descriptor may change in a future runtime.)

Based on this, I think the long-term impact of these conformances should be relatively small and manageable.

6 Likes

I would ask that this thread be kept focused on the tradeoffs of implementing builtin conformances, and that design issues about whether tuples can conform to protocols, and how their conformances specifically ought to behave, go into the appropriate threads for those subjects.

1 Like

Yes, I noted that. The problem isn't specific to tuples either; it's a restriction for type classes (protocols) in general. I keep giving the example that there are two obvious orders for integers: up and down.

Type classes (protocols) basically do two things: they package up a set of functions, and they do so statically. This allows us to reduce the number of uncoupled arguments otherwise required by general parametric polymorphism, and it allows replacing a type check with an instantiation problem. Type class "constraints" can be ignored when doing polymorphic type checking, pretending that they're universal, with errors issued during monomorphisation when no instance can be found. (Haskell actually does check constraints, since it has run-time polymorphism through boxing, but constraint checking is hard.)

The downside is that, since constraints with a given name can only be applied once to a type or combination of types, in cases where a second instance is required you have to revert to parametric polymorphism, or provide a boilerplate copy of the type class. This is basically because the methods have names fixed in the type class (protocol): instead of the usual coupling of a parameter and an argument, the name is used directly.
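To make the example concrete in Swift terms: Int's Comparable conformance is fixed to ascending order, and a second conformance can't be declared, so descending order needs exactly such a boilerplate wrapper. A minimal sketch (the Descending type is mine, for illustration):

// A wrapper type whose Comparable conformance inverts Int's ordering.
struct Descending: Comparable {
    let value: Int
    static func == (lhs: Descending, rhs: Descending) -> Bool {
        lhs.value == rhs.value
    }
    static func < (lhs: Descending, rhs: Descending) -> Bool {
        lhs.value > rhs.value   // inverted: larger values sort first
    }
}

let up = [3, 1, 2].sorted()                                         // [1, 2, 3]
let down = [3, 1, 2].map(Descending.init).sorted().map { $0.value } // [3, 2, 1]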

OCaml uses a different mechanism, which does not have this problem, by allowing functionality to be packaged in modules, and by giving modules types so you can write functions from modules to modules (called functors). The downside, of course, is that you have to explicitly couple the parameters and arguments, the very thing type classes don't require (though that is also what limits their generality).

Roughly speaking, the rule is that you should use a type class (protocol) when the operations are canonical. One could argue (though I'm not arguing it) that lexicographic ordering for tuples, arrays, and strings is canonical. A librarian would disagree: book titles are normalised, case is removed from the ordering, and a leading "A" or "The" is thrown out. If you really want to get into it, just skim the rules for Unicode and its encodings. One should note the extreme effort Swift went to, with grapheme clusters of code points, to get a sane universal ordering for Unicode.

The "type class"-ness of Swift protocols is artificially imposed by the compiler. At the implementation level, protocol conformances are more akin to ML modules; conformances are distinct runtime entities, and at a use site where a generic argument is bound, a specific conformance is chosen and bound as an independent argument, so the model supports multiple conformances of the same type to the same protocol. Indeed, because Swift supports separate compilation and post-hoc conformances, this is impossible to prevent, because two modules can independently extend the same type to conform to the same protocol without the compiler being able to stop them.

2 Likes

It means (1), and as I recall, SE-0015 was meant to state that as the plan of record for the language. (2) cannot be supported: only one conditional conformance to Equatable is possible at a time, and while the compiler does not stop you, it is not supported for users to conform types they do not own (tuples) to protocols they do not own (Equatable).

For clarity, there is no special-case behavior proposed. This is about allowing the compiler to synthesize a conditional conformance that cannot yet be expressed in Swift. The only thing special about it is the under-the-hood implementation. It will behave identically to a future extension (T...) : Equatable where ...T : Equatable (or whatever the spelling ends up being).
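To illustrate the practical difference (the contains helper here is hypothetical):

// Direct comparison already works via the ad-hoc stdlib overloads:
let equal = (1, "a") == (1, "a")               // true

// But tuples cannot satisfy a generic Equatable constraint today, so
// this fails to compile without the builtin conformance:
func contains<T: Equatable>(_ value: T, in array: [T]) -> Bool {
    array.contains(value)
}
// contains((1, "a"), in: [(1, "a")])          // error: (Int, String) does not conform to Equatable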

@Joe_Groff just answered the question below; aliases would be the answer for ABI compatibility. I think that’s satisfactory.

Off-topic question: Is this actually an official position documented somewhere?

I’ve seen plenty of people recommend against such conformances, and in general I agree that they are potentially fragile, but they are also extremely useful, and in my opinion it would be a major and detrimental regression to the language if we prohibited such conformances.

For example, if I import both the Numerics module and, say, BobsCustomPrecisionFloatingPointTypes (which does not depend on Numerics but does implement all the elementary functions), then I definitely want to extend Bob’s types to conform to Real, and I would strongly oppose any attempt to make the language ban that conformance in my own code.

I believe it was @jrose who mentioned at some point fully banning such conformances. It's probably too code-breaking at this point, but the reality is that such code is liable to break at any time. Less so when the type and protocol are owned by two disparate vendors but definitely the case if you try to conform Apple's types to Apple's protocols, or Swift's types to Swift's protocols.

Your example here would be a role for cross-import overlays. Bob would then vend the conformance.

Still off-topic

When Bob supplies such an overlay, then I can happily remove my retroactive conformance. Until then, I have to take care of it myself.

I think one pair of questions we should explore is:

  1. When we eventually have generalized tuple conformances, how will they work?

  2. Can we do enough of that design up front for us to build something now that will be compatible?

For instance, suppose we can decide today that, when we eventually allow generalized tuple conformances, they will work by converting the tuple into a generic variadic parameter. Strawman syntax to illustrate:

extension (Elements...: Equatable) : Equatable {
  static func == (lhs: (Elements...), rhs: (Elements...)) -> Bool {
    // #zip pairs up corresponding elements of the two tuples.
    for (l, r) in #zip(lhs, rhs) {
      if l != r { return false }
    }
    return true
  }
}

It would, of course, be lovely to build all of that into the language today. But if we're not ready to do that yet, all we really need to know is the ABI this particular implementation will eventually have. For instance, suppose we decide that:

  • The generic parameter will be represented by a pointer to a structure of the form:
    struct variadic_generic_param {
      uintptr_t flags : 16;     // meaning of bits TBD, but will be 1 for Equatable, 2 for Hashable...
      uintptr_t size : REST_OF_BITS;
      variadic_generic_param_elem *elems[];   // of `size` elements
    };
    struct variadic_generic_param_elem {
      value_witness_table *vwt;
      protocol_witness_table *pwt[];          // width known from constraints on generic param
    };
    
  • Each variadic generic parameter will be represented by a pointer to the elements as they would be laid out in a tuple.
  • The mangling will represent variadic generic parameters using 'X', and will represent tuple conformances with 'T' in the "type" slot.
  • …additional bullet points for metadata, etc…

(To reiterate: Some of those details are just made up on the spot; the point is not exactly what I've written above, it's the kind of information I've written above.)

Then we can hand-write the necessary implementations and add special cases to the compiler and runtime that cause these implementations to be called for the conformances we support today. As we implement the features underpinning this, we can delete parts of the special-casing until eventually, it's all gone.


If we can do this, I think the backwards compatibility concerns are very small. The main downsides are:

  1. Designing the ABI of variadic generics and generalized tuple conformances would take less time than designing the whole feature, but it would still take some time, possibly from the same people who don't have the time to design the full feature now.

  2. If we discover that the pre-designed ABI needs to be changed when we design the full features, we might need to leave in some backwards compatibility hacks.

But if #1 seems feasible, I think this is a great way to go.

If we can only implement these conformances as special cases we'll have to accommodate later, it might still be worth it, but it's certainly a more difficult question.

1 Like

Hi Matt,

Thanks for starting this thread. I was one of the people on the core team who raised this concern. To level set, here's my worldview; I use Equatability for tuples as the example, but this applies to the other conformances as well:

  1. Independent of how it works, I completely agree tuples should be Equatable when their elements are equatable. I would like to see the existing 6 overloads of ==/!= (roughly sketched after this list) be removed from name lookup (they should stay in the ABI for compatibility), as they were a stop-gap and make overload sets larger.

  2. My general worldview is that Swift features should aim over time to push "special case" complexity out of the language/compiler and into the standard library. This is done by introducing "fewer and more general" features that are composed by libraries, instead of adding "more special case features" to cover the individual cases that come up.

  3. Building "more general" features is harder than building one-off features, and I'm open to pragmatically taking stop-gap solutions (e.g. the existing 6 == overloads) to help Swift programmers "today" if the right solution won't be available for years.
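For reference, the overloads mentioned in point 1 have roughly this shape (a paraphrase, not the standard library's exact source; it provides them for arities two through six):

// Paraphrase of the standard library's ad-hoc tuple equality at arity 2.
func == <A: Equatable, B: Equatable>(lhs: (A, B), rhs: (A, B)) -> Bool {
    lhs.0 == rhs.0 && lhs.1 == rhs.1
}

func != <A: Equatable, B: Equatable>(lhs: (A, B), rhs: (A, B)) -> Bool {
    !(lhs == rhs)
}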

This question has come up many times before. For example, the compiler has builtin logic to synthesize specific protocol conformances like Hashable when asked. It would be way, way better (but also harder) to make this a library feature that the Hashable protocol implements in the standard library, allowing other user-defined types to participate in the same feature.
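A minimal example of that builtin synthesis:

// The compiler synthesizes the Equatable and Hashable witnesses here;
// user-defined protocols have no way to offer the same convenience.
struct Point: Hashable {
    var x: Int
    var y: Int
}
// Roughly equivalent to hand-writing:
//   static func == (l: Point, r: Point) -> Bool { l.x == r.x && l.y == r.y }
//   func hash(into hasher: inout Hasher) { hasher.combine(x); hasher.combine(y) }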

The question to me boils down to how long to wait, how predictable the solution to the general case is, and how much technical debt the stop-gap implies.

In this case, from my brief examination of the patch, this would add new runtime entrypoints that need to be maintained "forever" and (depending on how it is rolled out) could be a backward deployment problem. This is suboptimal, but isn't a showstopper.

To me, the question really comes down to "what is the alternative?" If there is a reasonable short path to introducing variadic generics, making (T, U, V) be sugar for Tuple<T,U,V> and then making Tuple be equatable, then I'd rather see that. OTOH, if that would never be possible or is a bad idea, or if it requires other longer term things (like non-type generic arguments to represent the labels) then doing the short term thing makes sense.

Another question I'd love to see explored is "how do we see sugared types"? My worldview has always been that they are sugar for library defined types (e.g. X? is Optional<X>, [X] is Array<X>, X! is IUO etc) which means that we should eventually have Tuple and Function. OTOH, some smart people I respect have argued that X? being sugar for Optional is a bad thing and they regret it. Resolving this debate would give me more confidence on the right path for tuple equality, because it would flow naturally from that design point.
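For reference, the sugar correspondences in question:

// Each sugared spelling denotes exactly the library generic type:
let a: Int? = 5                // Optional<Int>
let b: Optional<Int> = a       // same type; no conversion involved
let c: [Int] = [1, 2]          // Array<Int>
let d: [String: Int] = [:]     // Dictionary<String, Int>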

-Chris

4 Likes

It's also worth considering that things like tuple Equatable/Hashable conformance are limited enough and likely to be used frequently enough that they might deserve an optimized ABI regardless. In these common cases, the type arguments to the tuple are already available directly from the tuple metadata, and so the witness table only has to capture a witness table for each of the elements in order to work. Being able to instantiate the witness table by passing those arguments directly up to some threshold could be a good code size optimization, similar to what we do generally with generic metadata with < 4 arguments.

3 Likes

IMO, tuples and functions have enough special functionality that they ought to always remain as structural types. In my eye, having tuples be the primitive thing that variadics build off, rather than the other way around, has the potential to lead to a much simpler and more usable design. Furthermore, for ergonomic reasons, we really want to preserve the identity (T) == T in the single-element case, because otherwise every function that returns a variadic number of values will need a special case for the single element case to avoid unwanted tuple unwrapping. Functions, on the other hand, carry a bunch of non-type modifiers, so a single Function type would be insufficient; you'd need an infinite number of Function2Inout0<T, U>, Function2Inout0Inout1<T, U>, etc. types to represent everything functions can today, or else invent a type-level encoding of those modifiers.
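For concreteness, a few of those modifiers in today's Swift:

// Function types differ in more than their argument and result types:
let f: (Int) -> Int = { $0 }
let g: (inout Int) -> Void = { $0 += 1 }   // parameter modifier
let h: (Int) throws -> Int = { $0 }        // effect
// A single generic Function<Args, Result> could not distinguish g from
// (Int) -> Void without encoding `inout` at the type level.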

At a high level, nominal vs structural strikes me as orthogonal to whether structural types ought to be able to conform to protocols, though. Regardless of the language design, it seems like a given that we should allow them to some day.

2 Likes

Hey Joe,

I agree with what you are saying, but I don't think it necessarily argues for tuple types being special in this way. To clarify, I see three possible paths here:

  1. Have TupleType in the AST which is a special structural type and have no corresponding Stdlib type. Any functionality (e.g. protocol conformance) provided by nominal types would have to be special cased.

  2. Have a first class TupleType in the AST, and have a standard library Swift.Tuple type. Protocol conformances on TupleType (and other things that nominal types support, like reflection) would be implemented by just mapping them over to using Tuple.

  3. Have TupleType as a "sugar only" type in the AST, which desugars to a standard library Tuple type. This is how Optional works.

In my opinion, option #2 is an interesting path to consider, one that should reduce compiler complexity yet not lead to the weird problems we have with Optional. If this works well, we could move Optional to the same model.

I'll go point by point to just explain how option #2 would work:

IMO, tuples and functions have enough special functionality that they ought to always remain as structural types.

They'd still be structural types in the AST. Special rules can still exist on them because they are still first class things in the AST. We don't need to handle Swift.Tuple in the AST and see if it came from sugar or something else weird like that.

In my eye, having tuples be the primitive thing that variadics build off, rather than the other way around, has the potential to lead to a much simpler and more usable design.

I don't think there is any problem with that, but see below on the topic of single-element tuples.

Furthermore, for ergonomic reasons, we really want to preserve the identity (T) == T in the single-element case, because otherwise every function that returns a variadic number of values will need a special case for the single element case to avoid unwanted tuple unwrapping.

Agreed, this is just saying that (T) desugars to T like it currently does, but (T,T) has a semantic type of Swift.Tuple<T, T>. There should be no problem with that.

Having a standard library type would allow people to explicitly use Swift.Tuple<T> to get single-element tuples. This seems important for variadics, since single-element tuples are something that a parameter pack should eventually boil down to. Not having the ability to represent this would lead to complexity in the system.
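Concretely, the situation today (Swift.Tuple being the hypothetical type under discussion):

let x: (Int) = 5            // (Int) is just Int; the parentheses only group
let y: (Int, Int) = (1, 2)  // a genuine tuple type
// There is no way to spell a one-element tuple today; a hypothetical
// Swift.Tuple<Int> would make one expressible.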

Functions, on the other hand, carry a bunch of non-type modifiers, so a single Function type would be insufficient; you'd need an infinite number of Function2Inout0<T, U>, Function2Inout0Inout1<T, U>, etc. types to represent everything functions can today, or else invent a type-level encoding of those modifiers.

I don't think that mangling it into the type name is acceptable; it would have to be modeled with a type-level encoding or non-type generic arguments. Tuples have the same problem because they have labels as well, which is something that reflection should be able to get access to.

At a high level, nominal vs structural strikes me as orthogonal to whether structural types ought to be able to conform to protocols, though. Regardless of the language design, it seems like a given that we should allow them to some day.

I think this whole discussion is really about how to factor complexity. By saying that there is a correspondence between structural types and a stdlib type, my sense is that you end up with less redundancy in the system. Nominal types have a LOT of cool stuff that they support, and it would be sad to replicate all of that.

-Chris

3 Likes

As I see it, the main things structural types don’t support today are extensions and protocol conformances, and I don’t see why mirroring them as nominal types or otherwise changing their representation would be necessary to allow them to do either.

1 Like

What about using tuple syntax as literal syntax? We now have callable types; wouldn't it be an improvement to also allow other types to be represented by tuple literals, or, in the other direction, to allow an instance to be destructured into a tuple?

Wouldn't that require tuples to become nominal types?

2 Likes

Cool! Thanks for the info. I'm looking at parts of the compiler now. Strangely:

public:
  FunctionTypeRepr(GenericParamList *genericParams, TupleTypeRepr *argsTy,
                   SourceLoc throwsLoc, SourceLoc arrowLoc, TypeRepr *retTy,
                   bool GenericParamsAreImplied = false,
                   ArrayRef<TypeRepr *> GenericSubs = {})

it looks like, internally at least, functions actually use tuples anyhow.

I had guessed as much. Separate compilation is a pain. Does this also explain why associated types returned as opaque types with a protocol don't work? The associated type is only known in the instance used at the call site, which then has to support the protocol specified by the opaque return type. The problem is that the function doesn't know what the associated type is until run time, because it's passed in from the caller. So the information is available, but not at compile time, and static conformance verification can't work. I'm just guessing....

As others have noted, it used to be the case that Swift represented all functions as taking a single argument, but that is no longer true. Bits of the old representation still linger in the compiler.

That used to be the case, but protocol witness tables now record all of the associated type information. The "Self or associated type" limitation was an artifact of that missing information, and it has lost its technical reason for existing. You can see this thread for discussion about lifting the restriction:

1 Like

Since the core team has already said that protocols such as Equatable and Hashable are coming to tuples, effectively that means that methods are coming to tuples.

If tuple conformance is meant to be a public feature (i.e. not only for stdlib protocols), I would expect that the following proposals would also be accepted:

  • Extensions and protocol conformances for fixed-length, non-generic tuples, like (Int, String).
  • Parameterized extensions, to extend this to fixed-length, generic tuples.

Perhaps it would be a good idea to start with those. The latter already has an implementation, but is there anything that would make the former very difficult for a community member to implement?
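For illustration, the first bullet might be spelled like this (hypothetical syntax, not valid Swift today):

// Hypothetical: an extension of a concrete, non-generic tuple type.
extension (Int, String): CustomStringConvertible {
    var description: String { "(\(self.0), \(self.1))" }
}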

I agree (that it is the critical question of this thread).

What would be the size/scope of the changes needed for a barebones, stdlib-only version of variadic generics? (so we can bypass discussions about syntax or complete feature-set, and focus on the minimum required to implement these few conformances)

1 Like