A few interesting things to catch this thread up on.
The first thrust is that I've done a lot of work on reabstraction, which is to say, transformation of values (mostly functions) that were built under one unsubstituted pattern so that they can be used under a different pattern. This is a more difficult problem with variadic generics because variadic generics can change arities, and arity is one of the main things that abstraction patterns are sensitive to. In fact, there are going to be cases that simply aren't implementable under our current ABI: put broadly, we cannot support variadic-arity functions as type arguments. That's worthy of a longer explanation.
Swift generally stores function values using the optimal calling convention for the generic context they appear in: for example, a `(T, Int) -> Bool` passes the first argument indirectly (as a pointer to a `T` value), passes the second argument directly (probably in a register), and returns the result directly (in a register). If you pass a function value between contexts that need different conventions, we'll actually rewrite it; we call that "reabstraction". We do this kind of rewrite when you pass values of these types directly (or wrapped in `Optional`, which we treat as a special case), but we don't do it when you use function types as arbitrary generic arguments to generic types like `Array`. That's because doing the rewrite on arbitrary generic types would be much more expensive, complex, invasive, and problematic. To rewrite a `Pair<X, Y>` type, we'd just have to rewrite the individual stored properties. To rewrite an `Array`, we'd have to allocate a new array and rewrite all the elements, which the `Array` implementation would have to be involved in. To rewrite a class object — well, there's actually just no way to do this without breaking semantics.
If we can't rely on rewriting arbitrary generic types like that, how is it possible to use an `[(Int, Int) -> Bool]` from contexts that expect `[T]` or `[(T, T) -> Bool]` or `[(T, Int) -> Bool]`? The answer is that we treat the generic type like its own context, and the declarations dictate an abstraction pattern and therefore a particular set of conventions. In particular, whenever `Array` receives or produces an element value, it does it opaquely, using its unrestricted type parameter `Element`; for example, the getter of `Array.subscript` returns `Element`. If we're using that subscript from a context that knows that `Element` is `(T, Int) -> Bool`, we know that the original declared return type doesn't have any function structure. So we assume that the return value will be produced using the appropriate conventions for a totally opaque function value (more on this in a second), and we need to rewrite that value to use the expected conventions for our only-partially-opaque type `(T, Int) -> Bool`. (Ideally, we'd see that we just use the return value in some way that doesn't require us to rewrite it. But in the worst case, we have to do this rewrite.)
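As a concrete (if simplified) illustration of that flow, consider pulling a function back out of an array in a context that only partially knows the element type; the function here is my own example:

```swift
// Inside this function, all we know is that the array's Element is
// (T, Int) -> Bool. Array's subscript getter produces its Element
// opaquely, so the returned function value has to be rewritten into
// the conventions this context expects before we can call it.
func applyFirst<T>(_ fns: [(T, Int) -> Bool], to value: T) -> Bool {
    return fns[0](value, 0)
}

let predicates: [(Int, Int) -> Bool] = [{ $0 >= $1 }]
let ok = applyFirst(predicates, to: 7)   // true: 7 >= 0
```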
What are the appropriate conventions for a totally opaque function value? Swift relies on being able to derive a "most general" abstraction pattern for any particular type. For the function type `(Int, Int) -> Bool`, the most general abstraction pattern we use is `<T, U, V> (T, U) -> V`. We call this the most general abstraction pattern because we assume that all possible generalizations of the original type (other than the fully-opaque generalization `<T> T`) are specializations of it. In particular, we assume that (1) nothing that isn't a specialization of this pattern can substitute to the original type and (2) everything that's looking at a specialization of this pattern will derive the same most general abstraction pattern from it. As a result, we know that any context which knows that the opaque type is actually a function type will agree on the most general abstraction pattern of the function type and thus also agree on the conventions that the function type stored under that pattern will use. They can then rewrite the value from those conventions to their locally-preferred conventions as they see fit.
Unfortunately, variadic generics mess this up on two levels. The first is that the "most general" abstraction pattern we chose before is no longer actually the most general possible; for example, `<each T> (repeat each T) -> Bool` is a possible generalization of `(Int, Int) -> Bool` but is not a specialization of `<T, U, V> (T, U) -> V`. It's tempting to try to solve this problem narrowly by adjusting the algorithm for selecting the most general abstraction pattern to respect the existence of pack parameters; then the most general pattern for this function would be `<each T, U> (repeat each T) -> U`. Unfortunately, the combination of parameter ownership and empty packs makes it impossible to do this; for example, the type `<each T, each U> (repeat each T, repeat inout each U, Int, Int) -> Bool` is a possible generalization of `(Int, Int) -> Bool` that is not a specialization of `<each T, U> (repeat each T) -> U`. (We don't currently support `inout` parameter packs, but that's likely to be a temporary restriction.) There are function conventions that can generalize all possible functions, but they cannot be expressed as a generalized function type — functions would have to take an array of pointers to the concrete parameters and decompose that array dynamically according to the actual dynamic pack sizes. And switching the most-general abstraction pattern to that (or making any other change to the pattern) would be a pervasive ABI break, which cannot be done on our target platforms with ABI stability.
So we're not going to be able to allow function values with pack expansions in their parameter lists to be abstracted in and out of totally opaque contexts. You'll have to wrap them in structs or something else that establishes a concrete abstraction pattern for the function. We'll need to work that into the proposal.
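That workaround might look something like this hypothetical wrapper (the `Predicate` name and shape are mine, purely to illustrate the idea):

```swift
// The struct fixes a concrete abstraction pattern for the function
// it stores, so values of Predicate<...> can flow through fully
// opaque generic contexts even though a bare function value with a
// pack expansion in its parameter list could not.
struct Predicate<each T> {
    var test: (repeat each T) -> Bool
}

func stash<V>(_ value: V) -> [V] { [value] }   // fully opaque context

let p = Predicate<Int, Int>(test: { $0 < $1 })
let stashed = stash(p)            // fine: V == Predicate<Int, Int>
let smaller = stashed[0].test(1, 2)   // true: 1 < 2
```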
On an implementation level, reabstraction also posed a surprising problem for my nice routines for doing parallel walks on orig/subst tuples and parameter lists. It turns out that they were unfortunately limited by the fact that I'd written them as higher-order functions. The reabstraction machinery really wants to simultaneously abstract out of one pattern and into another, and the problem with that is that it's a parallel walk across two different combinations of orig/subst types. You can't do parallel walks with higher-order functions: there's an inversion-of-control problem. So I had to rework my routines to present themselves as generators, which also solved a different annoyance I had about how verbose my callbacks were.
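The inversion-of-control problem here is the standard one; in miniature (and in Swift rather than the compiler's C++, purely for illustration):

```swift
// Callback style: each walk drives its own loop, so there's no way
// to advance walk A by one step, then walk B by one step. Shown
// only for contrast with the iterator version below.
func walk(_ values: [Int], _ body: (Int) -> Void) {
    for v in values { body(v) }
}

// Generator/iterator style: the caller owns the loop and can
// interleave any number of walks in lockstep.
var a = [1, 2, 3].makeIterator()
var b = [10, 20, 30].makeIterator()
var sums: [Int] = []
while let x = a.next(), let y = b.next() {
    sums.append(x + y)
}
// sums == [11, 22, 33]
```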
I rewrote how we did pack substitution. It might be easiest to just link to the PR for this one.
The latest interesting conceptual issue is vanishing tuples. It turns out that variadic pack expansion can naturally create singleton tuples. This is actually a pretty serious annoyance when working with singleton packs, which for some APIs can be really common, so we'd prefer to flatten those singleton tuples to their underlying element type; that's what I'm calling a vanishing tuple. This means that e.g. a function that returns `(repeat each T)` will actually just return `Int` if `T` is a pack whose single element is `Int`. But that causes some problems for type and expression lowering because we do all these walks over tuple patterns, and suddenly it's possible to have a tuple pattern without having a substituted tuple type. (And even if you do have a substituted tuple type, you might need to ignore it — maybe the singleton element type was a nested tuple.) Fortunately, I already did all this abstraction to use generic tuple-element traversals, and it's really easy to update it to recognize vanishing tuples; now we've just got a few places where we do unnecessary casts or otherwise pay too much attention to the substituted type. There was even one place that was doing a whole parallel walk that turned out to be unnecessary — we should always have been doing that computation by just considering the orig type.
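In code, the flattening looks like this (a minimal sketch; it assumes a compiler recent enough to have the vanishing-tuple behavior described above):

```swift
// Returns its arguments re-packed as a tuple of the pack's types.
func makeTuple<each T>(_ value: repeat each T) -> (repeat each T) {
    return (repeat each value)
}

let pair = makeTuple(1, true)    // (Int, Bool): a real tuple
let single = makeTuple(42)       // Int, not (Int): the tuple vanishes
let nested = makeTuple((1, 2))   // (Int, Int): here the singleton
                                 // element type was itself a tuple
```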
Next up: more reabstraction.