The syntax for variadic generics

For a tuple of length n, the complexity of converting a tuple value to a pack is O(n).

That's disappointing. What is the difference between the ABI representations of a tuple and a pack? Is it possible to keep them the same?

The discussion on the ABI for variadic generics is over here:

Two reasons:

  1. A tuple value stores its elements contiguously in memory; the offset of each field is computed either at compile time or run time from the layouts of the element types. A value pack is not a value in and of itself, but is passed as a list of pointers to values. So the tuple expansion operator takes the address of each element of the tuple and forms a pack from these addresses.
  2. The runtime metadata for a tuple type stores a list of element/label pairs, where each element is a pointer to metadata and the label is a string. A type pack is just a list of pointers to metadata. So again, we have to iterate over the tuple elements because we need to extract the element types and skip over the labels.

The tradeoff with 1) is that forming a pack from individual values is cheaper than if a value pack stored its elements contiguously, like a tuple. We expect this to be more common than a tuple expansion. If value packs stored their elements contiguously, forming a value pack from individual values would instead require copying values around.

My assumption is that tuples and value packs aren't going to be huge in practice, so this might not be a big problem. If the tuple has a fixed size on the caller's side, we already have various optimizations that split tuples up into their constituent fields, so forming the pack might not always require O(n) operations.
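
To make the second reason concrete, here is a toy model of the two metadata shapes. `TupleElementRecord` and `typePack(from:)` are hypothetical names invented for this sketch; they do not mirror the runtime's real data structures. The point is only that building the flat list means visiting every record once, which is where the O(n) comes from.

```swift
// Toy model only: these types are hypothetical and are not the Swift runtime's actual layout.

/// One entry in a tuple type's metadata: an element type plus an optional label.
struct TupleElementRecord {
    let elementType: Any.Type
    let label: String?
}

/// A type pack modeled as a flat list of element types with no labels.
/// Building it visits each record once, so the cost is linear in the tuple's length.
func typePack(from records: [TupleElementRecord]) -> [Any.Type] {
    records.map { $0.elementType }
}

// Models the metadata for the tuple type (x: Int, y: String).
let records = [
    TupleElementRecord(elementType: Int.self, label: "x"),
    TupleElementRecord(elementType: String.self, label: "y"),
]
let pack = typePack(from: records)
```

The labels are dropped and only the element types survive, which matches the "skip over the labels" step described above.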

It doesn't; there are in fact three operators which could in theory use different syntaxes:

  1. Declaring a type pack in a generic parameter list, <T...>
  2. Expanding a type pack T in a type context, like a function parameter type: args: Array<T>...
  3. Expanding a value pack x in expression context, like an argument to a call: forward(args: foo(x)...)

However I suspect that at least 2 and 3 should use the same syntax to avoid confusion, especially since types can appear in expression context, like Array<T>.self... to form a value pack of metatypes by expanding a type pack T.

Tuple expansion also makes sense with and without an immediately following ..., so I think overloading ... to mean tuple expansion might not be the best idea. E.g., suppose foo(x:y:) takes two pack parameters, and in the caller's scope, myTuple is a tuple and t is a value pack:

  • foo(x: t, y: myTuple.expand)... forms a value pack by applying foo() to each element of t and myTuple pairwise
  • foo(x: t, y: myTuple.expand...)... forms a value pack by applying foo() to each element of t in turn, together with all elements of myTuple
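
The difference between the two bullets can be approximated with ordinary arrays standing in for packs. This is only an analogy: `foo`, `t`, and `myTupleElements` below are made-up stand-ins, and real packs can hold elements of different types, which arrays cannot.

```swift
// Analogy only: arrays stand in for packs, so every element has the same type here.
func foo(x: Int, y: [String]) -> String {
    "\(x):" + y.joined(separator: ",")
}

let t = [1, 2]                      // stands in for the value pack t
let myTupleElements = ["a", "b"]    // stands in for the elements of myTuple

// First bullet: pair the i-th element of t with the i-th element of myTuple.
let pairwise = zip(t, myTupleElements).map { pair in foo(x: pair.0, y: [pair.1]) }
// pairwise is ["1:a", "2:b"]

// Second bullet: each element of t is combined with all elements of myTuple.
let withAll = t.map { foo(x: $0, y: myTupleElements) }
// withAll is ["1:a,b", "2:a,b"]
```

Both results have one entry per element of `t`; what changes is how much of the tuple each call sees.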

I'm a bit confused right now. Aren't variadic generic functions/types (and thus parameter packs) a feature that only the compiler sees, because it then specializes them to normal non-generic functions/types? Why would there be runtime metadata for parameter packs?

I suggest moving this discussion over to the dedicated thread on variadic generics ABI to keep this thread focused on syntax. Thanks!

Short answer

No, specialization of generic code is just an optimization in Swift; generally, generic code is separately compiled, and substitution happens at runtime.
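
A small sketch of what runtime substitution looks like in practice, assuming Swift 5.7 or later (which can pass an `any` value to a generic parameter). `describe` is a hypothetical helper. At the call site the static type is only `any CustomStringConvertible`, so `T` is bound from each value's dynamic type while the program runs.

```swift
// Hypothetical helper: reports the type that T is bound to for this call.
func describe<T: CustomStringConvertible>(_ value: T) -> String {
    "\(T.self)"
}

// The concrete element types are only known dynamically.
let values: [any CustomStringConvertible] = [1, "two"]

// Each call opens the existential and binds T from the dynamic type of the value.
let names = values.map { describe($0) }
// names is ["Int", "String"]
```

A single compiled body of `describe` serves both calls, which is why type metadata has to exist at run time.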

I agree; I think it's important for pack expansion to be expressed in the same way at the type and value level.

Using a keyword for pack declarations also might help indicate that * means something different than a pointer type in Swift for those familiar with other languages where * means pointers. That said, the pack declaration keyword wouldn't always be directly visible, e.g. if you're in an extension over a variadic generic type:

struct Container<pack Value> { ... }

// in another file
extension Container {
  func getTuple() -> (Value*) { ... }
}
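
For comparison, here is a sketch in the parameter pack syntax that later shipped in Swift 5.9 (`each` in the declaration, `repeat each` at the use site), which is not the `pack`/`*` spelling under discussion here. The stored `values` property and the initializer are assumptions added to make the sketch self-contained. The declaration keyword is again absent from the extension, but the expansion in the return type is spelled out in words.

```swift
// Requires Swift 5.9 or later. The stored property and initializer are illustrative assumptions.
struct Container<each Value> {
    var values: (repeat each Value)

    init(_ values: repeat each Value) {
        self.values = (repeat each values)
    }
}

// Could live in another file: `each Value` is not restated here.
extension Container {
    func getTuple() -> (repeat each Value) {
        values
    }
}

let container = Container(1, "two")
let tuple = container.getTuple()
```

Here `tuple` has the type `(Int, String)`.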

The problem isn't non-pack variadic parameters using .... The issue is that ... is used as a postfix and infix range operator, and attempting to migrate this family of operators would be a big undertaking that comes with a large source break:

If we end up going with a keyword I think I like @xwu’s idea of variadic Element more than pack Element. I think it sounds more grammatically fluent/natural when I read it out loud/in my head while also aligning with the proper mental model (as far as I understand it) as well or better, and more importantly it’s the ideal search term for inexperienced readers of the code looking to understand it. It seems to me like variadic covers the needs of both the experienced and the inexperienced variadic generics programmer better than pack. Just for the sake of putting ideas on the table, zeroOrMore could be argued to be the most instantly/broadly meaningful option, but so far I quite prefer variadic.

This reminded me of a problem that we already have, and then I realized that maybe the two share a solution. I'll describe the possible solution here since it was directly inspired by this but if you consider it off-topic I'm happy to move the discussion elsewhere. This solution is a breaking change which would obviously need to be introduced in Swift 6, but it may be considered too radical/breaking even for that, or just not a good idea in the first place - I don't know.

The problem that we already have, unrelated to variadic generics

The problem we already have is that when I extend a generic type or a PAT I don’t get any help from code completion for the generic parameters/associated types until I have correctly typed at least the first letter, which makes discoverability poor. I often have to jump to the definition of a generic type to remind myself of the names of its generic parameters, and jump to definition doesn't always work in a complex and modular code base, so in my experience this problem is not entirely insignificant.

The solution to this current problem which leaves us just one small step from solving the problem Holly pointed out

The possible solution to this problem that occurred to me would be that we introduce and require a new syntax for declaring extensions, in which the “extension signature” (do we call it that?) is analogous to a case pattern, and a generic type or PAT is analogous to an enum case with associated values.

It's fine to extend the bare type without binding any of the generic parameters/associated types**, but it means that you can't reference them within the body of the extension:

extension Array {
    func mutated (by mutation: (inout Self)->()) -> Self {
        var copy = self
        mutation(&copy)
        return copy
    }
}

** Is there a shorter way to say "generic parameters/associated types"? Can we come up with one? I don't think "subtypes" is correct/available to take on that meaning.

If you want to reference the "subtypes" then you have to bind them (similar to having to bind the associated values of an enum case in order to use them when pattern matching using case). Autocomplete could make this generic type pattern matching very easy because after you type extension Arr it offers an autocompletion to extension Array<Element>. Using this syntax you could enable references to the "subtypes" in the extension:

extension Array<Element> {
    var first: Element? {
        guard self.count > 0 else { return nil }
        return self[0]
    }
}

If you want to extend an array of Int then you'd write:

extension Array<Element == Int> {
   
}

If you want to extend Array where the Element conforms to CustomStringConvertible then you'd write:

extension Array<Element: CustomStringConvertible> {

}
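
For reference, the two constraints above can already be written in current Swift with a `where` clause, without the proposed binding syntax. The member names `total` and `joinedDescriptions` are made up for this sketch.

```swift
// Current Swift: constrain Element with a where clause on the extension.
extension Array where Element == Int {
    /// Sum of all elements (hypothetical member for illustration).
    var total: Int { reduce(0, +) }
}

extension Array where Element: CustomStringConvertible {
    /// Descriptions of all elements joined by commas (hypothetical member).
    var joinedDescriptions: String {
        map { $0.description }.joined(separator: ", ")
    }
}

let total = [1, 2, 3].total                 // 6
let text = [1.5, 2.5].joinedDescriptions    // "1.5, 2.5"
```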

The final small evolution of the above solution that would address the issue Holly raised

And lastly, if you wanted to extend a generic type with a variadic generic parameter you would have to write:

extension Container <variadic Value> {
    func getTuple() -> (Value*) { ... }
}

Caret too tiny; didn't expand (ctt;de): Below is the syntax that would theoretically solve the issue @hborla raised:

extension Container <variadic Value> {
    func getTuple() -> (Value*) { ... }
}

This doesn't directly address your issue, but FWIW in a protocol or protocol extension, associated types are actually member types of Self:

extension Sequence {
  var first: Element { ... }

  // equivalent to:
  var first: Self.Element { ... }
}

What if the extended type has generic requirements, would you have to re-state those too? Like an extension of Set, whose Element is Hashable.

They're called type parameters. A type parameter is either a generic parameter, or a member type of another type parameter, recursively.

Wouldn't this just be the following then?

extension Array<Element> where Element == Int

I know I've made this point in Language Workgroup meetings, but perhaps not fully in public before:

One of the first prototypes of the regex builder DSL used postfix operators like * instead of named quantifiers like ZeroOrMore, based on its use in the regex literal syntax. This resulted in code samples that looked something like this:

// Adapted from WWDC22 "What's new in Swift" -- not the shipping syntax!
let regex = Regex {
    CharacterClass.horizontalWhitespace*
    Capture(CharacterClass.noneOf("<#")+?)??
    CharacterClass.horizontalWhitespace*
    "<"
    Capture(CharacterClass.noneOf(">#")+)
    ">"
    CharacterClass.horizontalWhitespace*
    ChoiceOf {
        "#"
        Anchor.endOfSubjectBeforeNewline
    }
}

When we looked at code samples like this, we noticed that single punctuation characters in postfix position, like * and +, were easily lost in the clutter of identifiers and parentheses. A single incorrect or missing quantifier can dramatically change the behavior of a line in a builder, so rendering these behaviors near-invisible in the syntax seemed like a bad design, and we chose a different direction.

I think the same logic likely applies to using postfix * for variadic generics. The invisibility of postfix * is kind of a problem when all of these would be valid and would mean slightly different things:

printPack(tuple, pack*)             // Concatenating tuple with pack
printPack(tuple, pack)*             // Expanding tuple with pack
printPack(tuple.element*, pack*)    // Concatenating tuple element pack with pack
printPack(tuple.element, pack)*     // Expanding tuple element pack with pack

It would get worse in more complicated examples where the expansion was buried in a subexpression.

One thing I'll say for the map-style approach in particular is that it makes the section of code covered by the expansion very clear:

printPack(tuple, pack.map { $0 })                         // Concatenating tuple with pack
pack.map { printPack(tuple, $0) }                         // Expanding tuple with pack
printPack(tuple.element.map { $0 }, pack.map { $0 })      // Concatenating tuple element pack with pack
zip(tuple.element, pack).map { printPack($0.0, $0.1) }    // Expanding tuple element pack with pack

Outside of expressions, though, I agree that it's not a great fit.


I don't have a single set of recommendations at this stage, but here are some things I'm thinking about:

struct VariadicZip<Collections: many Collection>: Collection {
    var underlying: each Collections

    typealias Index = (each Collections.Index)
    typealias Element = (each Collections.Element)

    subscript(i: Index) -> Element { (each underlying[i.element]) }

    var startIndex: Index { ((each underlying).startIndex) }
    var endIndex: Index { ((each underlying).endIndex) }

    func formIndex(after index: inout Index) {
        for (c, inout i) in each (underlying, index.element) {
            c.formIndex(after: &i)
        }
    }
}

(Pardon any expression syntax mistakes—I'm still struggling a little with the scope of non-map-style keywords, particularly in expressions like the one in the subscript.)

Why?

  • many and each get across similar ideas to variadic/pack and .../expand without using jargon or overloaded symbols. I have to admit that the many/any rhyme is kind of pleasing too.

  • many is after the colon, not before the argument, because the fact that a generic argument is variadic feels type-y to me. (many would be allowed as a standalone keyword in these positions, short for many Any.)

  • Why two different keywords? I like something like each in expression context where there's literal iteration happening, and it feels strange to have either many or each used in opposing ways in a generic signature (labeling constraint) vs. a concrete type (labeling generic parameter).

An alternative that would avoid this last problem is:

struct VariadicZip<each Collections: Collection>: Collection {
    var underlying: each Collections
    // ...as before...

By moving the keyword before the generic parameter, it's no longer labeling the constraint in generic context, so it no longer has an opposing meaning in those two positions.

And, of course, I'm still thinking about map-style syntax. This seems way clearer than the version with an each keyword, or a ... or * suffix, or whatever else:

    subscript(i: Index) -> Element {
        (zip(underlying, i.element).map { $0.0[$0.1] })
    }
  • A new keyword in expression context would break existing code that uses that keyword name, e.g. as the name of a function.

I just remembered that we conveniently already have a keyword that fits with the concept of pack expansion that would not break existing code. I've been hammering the concept of "repetition patterns" for variadic generics, so... we could potentially use the repeat keyword, which already cannot be used as a regular identifier in expressions.

func zip<variadic T, variadic U>(t: repeat T, u: repeat U) -> (repeat (T, U)) {
  return (repeat (t, u))
}
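
For readers comparing spellings, the same function can be written in the syntax that eventually shipped in Swift 5.9, where `each` marks both the pack declaration and each pack reference, and `repeat` marks the expansion. This is a sketch under that assumption; `zipPacks` and its argument labels are hypothetical names chosen to avoid clashing with the standard library's `zip(_:_:)`.

```swift
// Requires Swift 5.9 or later. `zipPacks` is a hypothetical name for this sketch.
func zipPacks<each T, each U>(
    firsts: repeat each T,
    seconds: repeat each U
) -> (repeat (each T, each U)) {
    // Using both packs in one expansion requires them to have the same length.
    return (repeat (each firsts, each seconds))
}

let pairs = zipPacks(firsts: 1, "a", seconds: true, 2.5)
```

Here `pairs` has the type `((Int, Bool), (String, Double))`: the i-th element of the first pack is paired with the i-th element of the second.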

No, only the name. I think of it as similar to this:

enum CandleUpgrade {
    case multipleWicks (wickCount: Int)
}
func demo1 (upgrade: CandleUpgrade) -> String {
    switch upgrade {
    case .multipleWicks (let wickCount):
        return """
            Here, we don't have to restate that wickCount is an Int,
            but we do have to write the name of the associated value
            if we want to use it in the body of this switch case.
        """
    }
}
func demo2 (upgrade: CandleUpgrade) -> String {
    switch upgrade {
    case .multipleWicks (wickCount: 3):
        return """
            Here, we have added an additional specification, letting us
            know that in this context wickCount is not just an Int
            but more specifically the integer 3.
        """
    default:
        // Needed for the switch to be exhaustive: wickCount may be any other Int.
        return "Any other wickCount."
    }
}
extension Set<Element> {
    // Element is known to be Hashable
    // just like wickCount was known to be an Int
}
extension Set<Element == Int> {
    // Element is known to not only be Hashable
    // but more specifically an Int, just like in demo2
    // where wickCount was bound with extra specificity
}

Yes, but if we were to force people to declare extensions in this somewhat new way starting in Swift 6 then perhaps allowing my slightly more concise syntax would be an important way to ease the transition. But it's true that the extension Set<Element == Int> syntax is technically orthogonal to my suggestion/idea and could be considered separately.

I think not having to re-declare generic parameters is a feature of extensions. If generic parameters in extensions don’t show up in code completion, I think we should fix code completion instead of adding more verbosity to extensions.

In any case, my point wasn’t that extensions specifically are a problem, but rather that having a declaration keyword doesn’t really fix the subtlety of using * for pack expansion. Becca’s examples above show how easily * can get lost in an expression.

I really love that we have variadics on the horizon, thank you for pushing this forward. But could you quickly explain two things that I don't quite understand:

  • Why do we need a different keyword on the parameter side of things, can't we simply use variadic?
  • Why do we need a keyword at all for the parameters, we've already established that T and U are variadic?

For the second question, I am presuming that naked T and U can't be used at all, which is why the repeated keyword seems like it might be superfluous. I mean, these are not well defined, right?

func f<variadic T>(t: T) { }
func f<T>(t: variadic T) { }

func f<variadic T>(t: variadic T) { 
    let s: T = ...
}

But I suppose it's partly about clarity, and partly because we wouldn't otherwise be able to, say, distinguish a variadic tuple (the output) from a tuple of variadics (the input) without introducing boilerplate like a variadic V ... where V = (T, U) or something.

Another thing, more general, is that this particular function seems a bit too magical to me. It seems to be a kind of implicit zip, or can we say that the last repeat is doing the zipping? I think for me there is a general confusion about what the U and T mean on their own, it's not really clear to me what the pack is and what an individual member is. For example, in your syntax, what would these mean, if anything:

By the way, I feel that variadic is a bit misleading since that usually means that the number of arguments is variable, which isn't really the main thing here - for that we could just use arrays. The magic is that the arguments can be different, and hopefully we will also soon have fixed length variadics, which just doesn't sound right....

Anyway, my own syntax of choice is probably curly braces, reminiscent of set notation, for both parameters and generics. It also makes it more concrete I feel, a { T } is simply an ordered set of types, rather than say a sequence of types that varies over time/per call or something. It also gives us a "free brace", so we don't need extra () around complex expressions like { T: Codable }.

func zip<{T}, {U}>(t: {T}, u: {U}) -> { (T, U) } {
  return { (t, u) }
}

I'm not sure that already having a reserved word repeat is a strong argument. In the long run, isn't it more important that the syntax is clear than that we save one reserved word? Especially since repeat currently means a different thing.

I really like the overall idea of many and each. I disagree on some small details.

I’m not sure if pack properties are in the scope of this discussion, but I think they shouldn’t be allowed. A tuple would be much more straightforward in this case.

I’m not sure I understand your argument here. Since type packs are abstract entities and cannot (or at least I don’t foresee they can) conform to a protocol/class as a pack, it’s quite unambiguous that the conformance constraints refer to individual pack elements. Also <T: many>, if I understand your proposed syntax correctly, looks pretty awkward. The only advantage I can think of is that if we were to adopt labels for type parameters, the keyword after the colon would simplify the syntax.

My only reservation with this is that many & each mean entirely different things from some & any, yet they are all quantitative words. I wonder what the impact of this will be on readability when they are used together, especially for first-time learners. It may become quite ambiguous.

Or combined… many some Foo?

Is there any conceivable/reasonable way that double angle brackets could be used?

<<T>>

What would that look like in practice? Functor<Result, <Args>>?