Variadic generics and tuple shuffle conversions

So this topic has come up a few times in the past. We currently allow three kinds of conversions between tuple types:

  • if two tuples (x1: T1, x2: T2, ..., xn: Tn) and (y1: U1, y2: U2, ..., yn: Un) have the same length and labels, one converts to the other if T1 converts to U1, ..., Tn converts to Un.
  • a labeled tuple (x1: T1, ..., xn: Tn) converts to an unlabeled tuple (T1, ..., Tn), and vice versa.
  • a labeled tuple (x1: T1, ..., xn: Tn) converts to (x(sigma(1)): T(sigma(1)), ..., x(sigma(n)): T(sigma(n))), where sigma is a permutation of the indices 1, ..., n; that is, the same labeled elements in a different order.
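As a minimal sketch, the three conversions look like this when written against concrete types. It assumes a recent compiler in Swift 5 language mode; the names `point`, `widened`, `plain`, `relabeled`, and `shuffled` are illustrative only and do not come from the post.

```swift
let point: (x: Int, y: String) = (1, "a")

// 1. Same length and labels; each element converts (here Int to Int?,
//    String to String?).
let widened: (x: Int?, y: String?) = point

// 2. Labeled to unlabeled, and unlabeled back to labeled.
let plain: (Int, String) = point
let relabeled: (x: Int, y: String) = plain

// 3. Tuple shuffle: the same labeled elements in a different order.
let shuffled: (y: String, x: Int) = point

print(widened, plain, relabeled, shuffled)
```

Only the first of these is left untouched by the pitch; the other two are the ones under discussion.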

The first conversion is totally fine, but the last two are tricky to implement correctly with variadic generics, because a pack expansion type on one side can absorb multiple elements on the other side.

Also, the conversion that drops labels is problematic because it allows for violating the invariant where a tuple element with a pack expansion type must be followed by a labeled element; e.g., `(T..., x: U...)` would convert to `(T..., U...)`, which is no longer well-formed.

In my current work-in-progress implementation, the last two conversions are disabled if either side involves a pack expansion. I think we should get rid of them altogether in Swift 6 mode, even for tuples only involving scalar types.

Another oddity is that today the constraint solver distinguishes between Subtype and Conversion constraints. Conversion admits a few more conversions than Subtype, and is used in a few positions like the arguments of a call expression. A Conversion of function types though, e.g. (T) -> U converting to (T') -> U', only solves if T' is a Subtype of T (parameters are contravariant) and U is a Subtype of U'; that is, it's always downgraded from Conversion to Subtype when we walk into function parameter or result position. The two problematic tuple conversions are only allowed with Conversion constraints, not Subtype. The first one is allowed with Subtype. Eliminating these conversions would remove most of the distinction between the two constraint kinds, which would eventually simplify the language if we ever remove pre-Swift 6 mode.
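A small sketch of the two situations described above, under the assumption of Swift 5 language mode. The names `takesUnlabeled`, `labeled`, `describe`, and `narrowed` are hypothetical. The first part shows a label-dropping conversion at a call argument, which is a Conversion position; the second shows an ordinary function conversion, where the parameter type may only become more specific and the result type more general.

```swift
func takesUnlabeled(_ pair: (Int, Int)) -> Int {
    pair.0 + pair.1
}

let labeled: (x: Int, y: Int) = (1, 2)

// Call arguments are a Conversion position, so the labels can be dropped here.
let sum = takesUnlabeled(labeled)

// Function conversion: a function accepting Any can stand in for one
// accepting Int, and a String result can be viewed as Any.
let describe: (Any) -> String = { "\($0)" }
let narrowed: (Int) -> Any = describe

print(sum, narrowed(7))
```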

Does anyone have compelling examples that rely on these conversions in practice, and how awkward would it be to write them out by hand instead?

Thoughts?


Would dropping #2 also affect enum payload matching? For example,

enum E {
  case a(this: Int, that: String)
}

let e = E.a(this: 1, that: "one")

switch e {
case .a(let this, let that):
  // This works today, would I be forced to write
  // `.a(this: let this, that: let that)`?
  print(this, that)
}

I admit to being frequently lazy and dropping the labels when the name of the variable matches the label, but I think it could be a large source of incompatibility if pattern matching is also relying on that conversion.


No. Enum payload matching is entirely position-based. The labels are optional, but it does neither #2 nor #3.

(Which is another good reason - for consistency - to drop these.)
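To make the point concrete, here is a sketch showing that the labeled and unlabeled pattern forms bind the same payload elements by position. The helper names `describeUnlabeled` and `describeLabeled` are made up for this illustration.

```swift
enum E {
    case a(this: Int, that: String)
}

// Pattern without labels: elements are bound by position.
func describeUnlabeled(_ e: E) -> String {
    switch e {
    case .a(let first, let second):
        return "\(first) \(second)"
    }
}

// Pattern with labels spelled out: binds the same elements.
func describeLabeled(_ e: E) -> String {
    switch e {
    case .a(this: let first, that: let second):
        return "\(first) \(second)"
    }
}

let value = E.a(this: 1, that: "x")
print(describeUnlabeled(value), describeLabeled(value))
```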


I'm all for dropping tuple shuffles, but dropping the labeled-to-unlabeled conversion worries me a bit more. Even if enum matching is not affected, I know that I have in the past written code that is generic over n-element (for some n) tuples (which are unlabeled since there's no labels that make sense devoid of context) and then wanted to use it on code that uses concrete labeled tuples (since the labels are helpful for disambiguation in-context). As I understand it you're suggesting that this would be forbidden?

I'd want to see a pretty thorough source-compat analysis to justify dropping this conversion altogether.

EDIT: "source-compat" not "source-combat" :slight_smile:


Did I get it correctly that one of the pitched removals would be the conversion from unlabeled tuples into labeled tuples? I use this conversion all the time in transformation functions like map or compactMap. It's not a dealbreaker for me if the language gets better this way, but it would break quite a bit of code, not that I'm against it.


What does that usage look like? I imagine this wouldn't have a huge effect on code that is generic over all T (since substituting a labeled tuple for T would maintain the labels) and I don't believe Slava is proposing getting rid of the ((T...)) -> U to (T...) -> U special-case argument conversion either.


I think eliminating that second conversion would cause a fairly significant level of source breakage, without much tangible benefit for Swift's users. One conversion site that I see all the time is returning an unlabeled tuple from a function with labeled tuple return type.

func divide(_ x: Int, by y: Int) -> (quotient: Int, remainder: Int) {
    let quotient = x / y
    let remainder = x % y
    return (quotient, remainder)
    // or even `return (x / y, x % y)`
}

This kind of usage makes sense — the labels in the return type are useful for users of the function, because they give names to the tuple components, but inside the function the components already have names.

Another example is when working with dictionaries, which have the labeled tuple element type (key: Key, value: Value). The conversion is valuable at various times when working with dictionary elements (instead of just keys or just values).
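A short sketch of the dictionary case, assuming Swift 5 language mode; `counts`, `entry`, and `render` are illustrative names. The element keeps its `key`/`value` labels where they help, and the labels are dropped when the element is handed to code written against plain pairs.

```swift
let counts = ["apple": 3]

// Dictionary.Element is the labeled tuple (key: String, value: Int).
let entry = counts.first!

// The labels are available where they help readability...
let viaLabels = "\(entry.key)=\(entry.value)"

// ...and are dropped implicitly when calling code that takes an unlabeled pair.
func render(_ pair: (String, Int)) -> String {
    "\(pair.0)=\(pair.1)"
}
let viaConversion = render(entry)

print(viaLabels, viaConversion)
```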

Semi-relatedly, I think the Conversion/Subtype distinction may explain some of the oddity we see with initializing a dictionary from sequences with different element types:

// Dictionary(uniqueKeysWithValues:) takes `some Sequence<(Key, Value)>`
// Note: no labels on the element type

// OK: `(Int, Character)` element type
let zippedPairs = zip(1...3, "abc")
Dictionary(uniqueKeysWithValues: zippedPairs)

// Error: Initializer 'init(uniqueKeysWithValues:)' requires the types '(key: ClosedRange<Int>.Element, value: String.Element)' (aka '(key: Int, value: Character)') and '(Key, Value)' be equivalent
let labeledZippedPairs = zippedPairs.lazy.map { (key: $0, value: $1) }
Dictionary(uniqueKeysWithValues: labeledZippedPairs)

// OK: Array made from labeledZippedPairs, despite `(key: Int, value: Character)` element type
let labeledArray = Array(labeledZippedPairs)
Dictionary(uniqueKeysWithValues: labeledArray)

Is the fact that the last one works and not the middle one intended, or a type-checker bug?


Basically what @nnnnnnnn just showcased in the first example. The labels are dropped during the return and only maintained via the inferred / return type. This type of conversion is all over the place in our codebases.


Given that one of the big problems for variadic generics is the muffle (conversion dropping labels), but real-life code uses the stuffle (conversion adding labels), is it reasonable to split the difference and disallow the former but keep the latter?


I wonder if there's a middle ground where we could continue to support this conversion when substituting into generic contexts and also for tuple literals, but disable it for arbitrary values. So:

func split<T, U>(_ tuples: [(T, U)]) -> ([T], [U]) {
    return (tuples.map(\.0), tuples.map(\.1))
}
let pairs = [(x: 0, y: "abc")]
split(pairs)

func divide(_ x: Int, by y: Int) -> (quotient: Int, remainder: Int) {
    let quotient = x / y
    let remainder = x % y
    return (quotient, remainder)
    // or even `return (x / y, x % y)`
}

would both continue to be accepted but:

let x: (y: Int, z: Int) = (0, 0)
func f(_ vals: (Int, Int)) {
    print(vals)
}

f(x)

would fail, and you'd have to write

f((x.y, x.z))

Still source breaking, but maybe not as much of an ergonomic issue?

Dictionaries are a great place to focus for this. Because Slava's second tuple conversion type doesn't actually work with them directly, I use this:

/// Remove the labels from a tuple.
/// - Parameter tuple: A tuple that may have at least one label.
@inlinable public func removeLabels<T0, T1>(_ tuple: (T0, T1)) -> (T0, T1) {
  tuple
}
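For illustration, a usage sketch of the helper with a plain `Dictionary`. The helper is repeated (without `@inlinable public`) so the snippet stands alone; `scores`, `plainPairs`, and `copy` are hypothetical names, and the closure form `{ removeLabels($0) }` is used here so that the label drop happens at a call argument.

```swift
// Same shape as the helper above, repeated so this snippet is self-contained.
func removeLabels<T0, T1>(_ tuple: (T0, T1)) -> (T0, T1) {
    tuple
}

let scores = ["a": 1, "b": 2]

// Each element is (key: String, value: Int); the helper returns (String, Int).
let plainPairs: [(String, Int)] = scores.map { removeLabels($0) }

// The unlabeled pairs are what Dictionary(uniqueKeysWithValues:) asks for.
let copy = Dictionary(uniqueKeysWithValues: plainPairs)

print(copy == scores)
```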

That allows for writing missing overloads, e.g.:

extension Dictionary: DictionaryProtocol { }
extension OrderedDictionary: DictionaryProtocol { }

public extension DictionaryProtocol {
  /// Creates a new dictionary from the key-value pairs in the given sequence.
  ///
  /// - Parameter keysAndValues: A sequence of key-value pairs to use for
  ///   the new dictionary. Every key in `keysAndValues` must be unique.
  /// - Returns: A new dictionary initialized with the elements of `keysAndValues`.
  /// - Precondition: The sequence must not have duplicate keys.
  /// - Note: Differs from the initializer in the standard library, which doesn't allow labeled tuple elements.
  ///     This can't support *all* labels, but it does support `(key:value:)` specifically,
  ///     which `Dictionary` and `KeyValuePairs` use for their elements.
  @inlinable init(uniqueKeysWithValues keysAndValues: some Sequence<Element>) {
    self.init(uniqueKeysWithValues: keysAndValues.lazy.map(removeLabels))
  }

  /// `merge`, with labeled tuples.
  ///
  /// - Parameter pairs: Either `KeyValuePairs<Key, Value>`
  ///   or a `Sequence` with the same element type as that.
  @inlinable mutating func merge(
    _ pairs: some Sequence<Element>,
    uniquingKeysWith combine: (Value, Value) throws -> Value
  ) rethrows {
    try merge(pairs.lazy.map(removeLabels), uniquingKeysWith: combine)
  }
}

Relevant bits of `DictionaryProtocol`:

public protocol DictionaryProtocol<Key, Value>: Sequence where Element == (key: Key, value: Value) {
  associatedtype Key
  associatedtype Value

  init(uniqueKeysWithValues: some Sequence<(Key, Value)>)

  mutating func merge(
    _: some Sequence<(Key, Value)>,
    uniquingKeysWith: (Value, Value) throws -> Value
  ) rethrows
}

It's worth maintaining this kind of thing, for usage at call site, but the language should make it so it doesn't need to be written. I think what we need is the ability to manually dictate how to handle labels, via a decoration on either each tuple label, or just the tuple itself, if that's decided to be enough control.

For input, labels need to match, or not. I don't know if the matching is ever actually important, but Slava is suggesting to throw out the conversion, which I think is much more useful. So we at least need an option.

For output, there's currently no way to preserve labels after de/restructuring. You either throw them out or assign new ones. I haven't followed the variadic generics discussion and don't know if a solution for this is built in there. It also may be out of scope for this discussion but it seems to me that addressing input and output should be handled for the same release.

For example, make it possible to write a function
with the argument: (a: true, b: 1)
that returns: (a: true, b: 2)
whose body is only: (tuple.0, tuple.1 + 1)
and never makes internal reference to the labels a or b in code
(and of course isn't assigned back to a tuple variable with those labels).
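A sketch of where things stand today, to contrast with the wish above. The names `bumpConcrete`, `bumpGeneric`, and `input` are hypothetical. The concrete version returns the labels only because they are restated in its signature; the generic version never mentions them, and its result comes back unlabeled.

```swift
// The body never mentions `a` or `b`, but the signature has to restate them.
func bumpConcrete(_ tuple: (a: Bool, b: Int)) -> (a: Bool, b: Int) {
    (tuple.0, tuple.1 + 1)
}

// A generic version accepts the labeled tuple, but the result type is the
// unlabeled (T, Int), so the caller's labels are not carried through.
func bumpGeneric<T>(_ tuple: (T, Int)) -> (T, Int) {
    (tuple.0, tuple.1 + 1)
}

let input = (a: true, b: 1)
let concrete = bumpConcrete(input)   // elements reachable as .a and .b
let generic = bumpGeneric(input)     // elements reachable as .0 and .1

print(concrete, generic)
```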

Both adding and dropping labels are valuable. For example, we only recently approved a proposal to improve the ergonomics of pointer types which includes, among other changes, the addition of labels to the tuple return type of an existing method to clarify which tuple element (of two) is which.

This was minimally source-breaking given the possibility of unintended tuple shuffles (the third item on the list), but deemed not plausibly problematic as the behavior change would arise only when users unintentionally thought one tuple element was actually the other. However, in a world where conversions that drop labels are prohibited, such an API change would be massively source-breaking in practice.

Independent of Swift Evolution considerations, I can say anecdotally that when I destructure a returned tuple, I frequently drop the labels when the APIs are familiar to me. One concrete example: (partialValue: ..., overflow: ...) for the *ReportingOverflow APIs on standard library integer types. I find the self-documenting clarity achieved by having labeled tuple elements in the method declaration doesn't carry through to writing out the labels at the use site, and in fact it can get quite noisy to do so repeatedly.
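For instance, with `addingReportingOverflow` on `Int8` (the variable names are illustrative), both the labeled access and the label-free destructuring are accepted today:

```swift
// Labeled access, using the names declared in the standard library.
let result = Int8.max.addingReportingOverflow(1)
let labeledPair = (result.partialValue, result.overflow)

// Destructuring the same result without writing the labels.
let (wrapped, didOverflow) = Int8.max.addingReportingOverflow(1)

print(labeledPair, wrapped, didOverflow)
```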

[Edit: On the other hand, restricting the addition or removal of labels when pack expansions are involved seems appropriate if the restrictions can be narrowly tailored to only those tuple conversions where there is an actual possibility of ambiguity. There wouldn't be any, for instance, going from (Int...) to (foo: Int...), so it doesn't seem so intuitive to disable the feature in that case.]
