SIL instructions for variadic generics

I'm going to start with the lowered SIL instructions, i.e. those going into IRGen. I think we'll also need instructions to support the use of opaque values coming out of SILGen, but I don't want to focus on those quite yet.

Explicit pack syntax

As usual, textual SIL needs to be more explicit about some things than ordinary Swift source code. There are several places in SIL, such as apply instructions, which have explicit generic arguments for arbitrary generic signatures. Arbitrary generic signatures can contain multiple parameter packs, so we can't flatten argument packs in these argument lists without losing key structure. Instead, we have to have a way to spell packs explicitly in textual SIL. (I anticipate that we'll eventually want this in source as well, but we don't need to design that now.) The current pretty-printer for PackType uses the syntax Pack { ... }, which seems perfectly adequate, so we just need to teach the parser to accept this in SIL mode.

Status: implemented

Pack ownership and inline-ness

While pack types in the source language are always simply lists of types, pack values in SIL must express layout and ownership of their element values.

For layout, the chief question is whether values are stored inline in the pack or as addresses. We really want pack element to have uniform layout within the pack — that is, we want the pack value itself to be a flat array. So storing values inline in a pack is basically only imaginable in very specific circumstances, like if we have a parameter pack of class-constrained type, or a parameter pack of repeat Array<each U>. It's unclear if those optimizations are worthwhile.

As for ownership, we want to make sure that just changing ownership of a pack (borrowing the elements of a consuming parameter pack, for example) doesn't require extra operations dynamically. And we also want to encourage SIL patterns that still permit exclusivity checking.

Status: still being designed

Opened pack archetypes

When iterating over a pack, we need to be able to refer to the current pack element type in basically arbitrary type positions in SIL. Naturally, we use an archetype for this; the AST calls it an "opened element archetype". Like opened existentials, these archetypes are resolved dynamically within the body of a function and can take on different dynamic types within a single activation, and they can't be resolved earlier than a certain point. We therefore have very similar problems to opened existentials in terms of needing to mark type dependencies on the point where the pack element archetype is bound. As a result, we need to generalize the type-dependency tracking for opened existential archetypes to work for all kinds of "local archetype". This is straightforward.

Unlike (current) opened existentials, we may need to bind multiple pack element archetypes simultaneously. This is because two packs may be not only constrained to have the same length, but also constrained to refer to one another. For example, you could have <T..., U... where T: Collection, U: Collection, T.Element == U.Element>. In this situation, type derivations rooted in one element archetype can end up being expressed in terms of a different element archetype. The only reasonable way to avoid this problem is to bind them at the same time. This isn't really a deep problem for SIL's type-dependency tracking; it just needs to be taught that a single instruction can bind multiple archetypes. (Note that there's no need for the different types bound by an instruction to use different type dependencies.)

Currently, for subtle reasons, this is set up to bind all packs in a signature with the same length (to be precise, the same "shape class"). We may want to have a high-level optimization which recognizes that some of this data is unneeded in a particular iteration and so removes unnecessary parameters and requirements from the opened generic signature. (The opened signature has no inherent relationship to the enclosing contextual signature; see the section below on open_pack_element.)

Status: implemented

Pack index type

We need some way to refer to a specific index of a pack. On a primitive level, this could just be an integer. But we really want to try to keep pack indices well-typed: we want to make sure we don't use an index in the wrong pack. To make that possible, we need to preserve the use/def structure of how pack indices are immediately produced. That means we need to use a different type that discourages abstraction (e.g. hiding the production of a value behind a basic-block argument) and algebraic manipulation, which basically means "not just an integer type".

For this purpose, we introduce a Builtin.PackIndex, which is basically just a UInt32.

An alternative design would to be reflect the pack that's being indexed into the type of the pack index. But the specific pack isn't necessarily meaningful — we might want to use the same pack index to index into several different packs that share the same shape class. We can just require that things have the same shape class, and ultimately we are going to have to do that. Still, I think it's better to not overstate the meaningfulness of some specific pack type. It also should lead to more readable SIL.

Status: in progress

Instructions for indexing into packs

We define three instructions for indexing into packs, in service of two different basic patterns. These instructions are collectively the pack indexing instructions. All pack indexing instructions produce a $Builtin.PackIndex and define a pack shape that they index into.

dynamic_pack_index

The first pattern for pack indices is that we may need to do something uniformly for an entire pack, treating its elements opaquely. This is what we'll naturally produce in SILGen when we expand over a pack parameter. For this, we need to basically produce a for-like loop pattern. The easiest way to do that is to generate integer indices normally with arithmetic, then translate them into a specific pack index value, locking down the structure of any subsequent derivation.

It is undefined behavior to use an index that is greater than or equal to the length of the given pack.

%packIndex = dynamic_pack_index %i : $Builtin.Int32 of $Pack{repeat each T}

Status:: in progress

scalar_pack_index and pack_pack_index

The second pattern for pack indices is that we may need to decompose a pack. For example, if we are passing values of type Int and Float to a variadically-generic function, we will need to construct a concrete pack of type Pack{Int, Float} (or maybe borrows thereof). But we could also be doing the same thing to produce a Pack{Int, repeat each T, Float}. We need some way to produce pack indices for specific statically-known components of this. This actually requires two different instructions: one for deriving a scalar pack index, and one for deriving a pack index of an element of a component pack.

The scalar pack index case is easy: we just provide the constant index of the desired component within the pack. The component must not be a pack expansion. SIL substitution may need to change this index if there are packs preceding the index.

%packIndex = scalar_pack_index 2 of $Pack{Int, repeat each T, Float}

The component pack index is a little more complex: we need to select the component pack, but after substitution, this may no longer be a single component. The shape constraint is therefore that a slice of the pack components starting at the given constant index must have the same shape as the pack that the given pack index indexes into.

%subPackIndex = dynamic_pack_index %i : $Builtin.Int32 of $Pack{Int, repeat each T, Float}
...
%packIndex = pack_pack_index 1, %subPackIndex of $Pack{String, Int, repeat each T, Float, Double}

Status: in progress

open_pack_element

We need an instruction to bind a pack element archetype. In order to work after inlining (or any other form of substitution), it must not rely on the pack being a contextual pack parameter of the enclosing function. It also needs to stably define the generalization pattern of the pack elements independently of the enclosing environment. Consider, for example, a substitution that replaces a pack parameter with the concrete pack Pack{Int, Float}; when iterating this pack, what protocol requirements should we bind for each type? We derive this pattern (called the opened generic signature) from the original function and generally preserve its structure through passes (although it would be acceptable to reduce it when information is provably unused). The structure of this signature will therefore be the same as the enclosing context during initial emission in SILGen, but not necessarily at any later stage. We then provide contextual-meaningful substitutions for the parameters in this signature, which for packs will be PackTypes. SIL substitution (SILCloner) then just needs to substitute these substitutions.

The shape operand must be a pack parameter in the signature. The pack parameters in the signature with the same shape as the shape operand are collectively called the opened pack parameters. The instruction binds an opened element archetype for each opened pack parameter. The substitutions for each opened pack parameter must have the same shape, which must also be the shape of the pack indexed by the pack index operand.

This instruction expresses a relationship between each opened element archetype and a type pack in the substitutions; this relationship is important for the type-matching algorithm below. The instruction doesn't work with any pack values itself and has a value result solely for the use of type-dependency tracking.

%typeToken = open_pack_element %packIndex of <T...> at <Pack{Int, Float}>,
               shape $T, uuid "01234567-89AB-CDEF-0123-000000000000"

Status: implemented

Structural type matching for pack indices

When manipulating a pack value at a specific pack index, we must generally provide or produce an element value of an appropriate type for the pack index. These instructions will generally state the SIL type that they expect or produce. This type is then required to be well-typed for the element according to the following algorithm:

  1. The inputs to the algorithm are a SIL type for the pack value (e.g. $Pack{*Int, *repeat each T, *Float}), a pack indexing instruction that indexes into a matching shape class, and a SIL type purportedly for the element at that index.
  2. If the pack indexing instruction is scalar_pack_index N of $Pack{...}, then N must be the index of a non-pack-expansion component of the pack value type. The element type is well-typed if it is exactly this component type.
  3. If the pack indexing instruction is pack_pack_index N, %idx of $Pack{...}, then %idx must index into a pack with the same shape as a slice of the pack value type starting at component N. The element type is well-typed if it is well-typed for that slice of the pack value type and %idx.
  4. If the pack indexing instruction is dynamic_pack_index %idx of $Pack{...}, then let S be the set of opened element archetypes in the purported element type which use this same pack indexing instruction as their index. Transitively by construction, the substitution packs corresponding to these archetypes in their opening instructions must have the same shape as the value pack type. Then the element type is well-typed if, for component i of these shapes, there is a substitution which replaces opened element archetypes in S with component i of their corresponding substitution packs that makes the element type equal to component i of the value pack type (handling pack expansions appropriately).

Status: not yet implemented

alloc_pack and dealloc_pack

Pack values, in this design, are mutable arrays, generally of addresses. That's not great — it'd be much better if we had a way to iteratively construct them as immutable values — but it's the easiest thing to do in the short term, because all we really need to do is allocate them and then insert and extract values.

Pack allocation uses the stack allocator and therefore must obey stack discipline with every other instruction that does.

%pack = alloc_pack $Pack{*Int, *repeat each T, *Float}
...
dealloc_pack %pack

Status: not yet implemented

pack_set_element_addr and pack_get_element_addr

This instruction is specifically designed for packs of addresses; inline value packs have ownership questions that shouldn't be conflated with this. Honestly, the ownership questions posed by this are bad enough; it ends up creating abstraction of addresses, which we specifically do not generally want in SIL.

The type of the element must match the type of the pack at the given pack index.

pack_set_element_addr %addr: $*@opened(...) U, %packIndex, %pack: $Pack{*Int, *repeat each T, *Float}

%addr = pack_get_element_addr %packIndex, %pack: $Pack{*Int, *repeat each T, *Float} as $*@opened(...) U

Status: not yet implemented

tuple_pack_element_addr

In lowered SIL, tuples containing packs must be address-only, since we don't know their size. We can also force address-only patterns on tuples when we use them in element-abstracting ways (I think?). So at that point, the only thing we need is an instruction which dynamically projects out a single element of a tuple, treating the tuple as a sort of pack with heterogeneous layout. Imagine a pack type that contains all the elements of the tuple. Then a pack index into that pack is concretely the index of the tuple element, and we can use the pack index manipulations to construct that element index.

The pack index must index into a pack with the same shape as the imputed pack of the tuple elements. The result type must be an opening of the tuple element type at the pack index.

%componentPackIndex = dynamic_pack_index %i of $Pack{repeat each T}

%tuplePackIndex = pack_pack_index 1, %i of $Pack{Int, repeat each T, Float}
...
%eltAddr = tuple_pack_element_addr %tupleAddr : $*(Int, repeat each T, Float), %tuplePackIndex as $*@opened("01234567-89AB-CDEF-0123-000000000000") U

Status: not yet implemented

6 Likes

Looks good to me

1 Like

PR for the pack indexing instructions. That brings us up to implemented for Builtin.PackIndex, dynamic_pack_index, pack_pack_index, scalar_pack_index, and structural pack indexing. The pack allocation and manipulation instructions are up next.

I haven't been updating the SIL documentation so far, but this thread can stand in until I have things working, at which point it should be painless to condense it into SIL.rst.

1 Like

I'm thinking that what I'll do for pack types is introduce a SILPackType whose element type can carry address-ness and ownership. The basic spelling for that can be the same as for the AST type (Pack { ... }), and we'll distinguish them by position, the same way we do for function types. Lowering will introduce SILPackType in concrete positions.