Structural Sum Types (used to be Anonymous Union Types)

Paul_Cantrell · September 23, 2023, 11:16pm

[quote="taylorswift, post:63, topic:67432”]
requirements of P will still dispatch to the conforming type’s witness at run-time, unless the compiler can specialize it to a known type.
[/quote]

Right, my point is that in many cases, the compiler can specialize it to a known type. For starters, AIUI (others please correct me if I’m wrong), some P always specializes to a statically known type; that’s what some means. And I assume that means calls to requirements of P on a value declared as some P do not require a witness table dispatch under normal circumstances.

And isn’t it true that when a function f has a type parameter P of protocol type, the Swift compiler can at its discretion emit a type-erased version of f that uses witness table dispatch and / or a specialized version of f for specific concrete types implementing P?

At the risk of getting lost in the details, I think (again AIUI) it is not correct to say that protocols in Swift only provide dynamic polymorphism. Or perhaps I misunderstood the OP.

taylorswift · September 23, 2023, 11:19pm

this is only true if all your code lives in a single module. in larger projects, failure to specialize is a common occurrence and a major pain point, and is something that takes a significant amount of planning and effort to prevent.

toph42 · September 23, 2023, 11:24pm

That’s new information to me. I’ve paraphrased some P as, “It’s a specific type that conforms to P, but don’t worry about knowing which one. You don’t need to know, but the compiler knows.” So that’s not really true? There are instances where the compiler doesn’t know what concrete type your opaque type is? Does that code still compile?

Nobody1707 · September 24, 2023, 3:17pm

some P as a parameter type is exactly the same as func foo<T: P>(_ value: T) with all of the cross module issues that implies. And while some P as a return type acts the way you describe, I'm not sure that it's completely transparent to the compiler accross module boundries, since the underlying type is allowed to change in later versions of the code.

e.g.

// version 1.0
func foo() -> some P { return A() }
// version 1.1
func foo() -> some P { return B() }

is allowed.

Douglas_Gregor · September 25, 2023, 4:02am

I apologize as well for overreacting; I'm going to remove my reply, and let's get back to language design.

Doug

zienag · September 25, 2023, 1:24pm

Sounds amazing, a very long-awaited feature!

Also, I think it's worth looking at Scala union types Union Types | Scala 3 — Book | Scala Documentation

aggie33 · September 25, 2023, 11:32pm

For this particular case you could use buildExpression

sspringer · September 26, 2023, 3:44am

Does not really help, because to my understanding you then would have to convert all possible input to a common type, so I would have to convert all single items to sequences of items (e.g. by letting them conform to Sequence, but then still an iterator has to be initialized for each item). You can do that, but besides efficiency that feels like programming around a deficiency of the language which is exactly the “missing” sum types.

(It reminds me of using optionals as empty sequences to allow for optional chains in for-in loops which I really “need” in my XML processing — if I do not want to use forEach which is commonly seen as bad habit and also has some real disadvantages. It is those edge cases where I feel Swift could be more “streamlined”: Swift is such a nice language, it is a pity that in those cases you have to resort to strange or discouraged patterns.)

sveinhal · September 26, 2023, 6:52am

I wish both anonymous sum types as well as untagged "pure" union types welcome in Swift!

I think the former will fill a nice gap in the type system where we're currently missing a light-weight anonymous enum syntax for one-off use, similar to how tuples are light-weight unnamed structs.

I think the proposed syntax (String | Int) fits nicely with the equivalent (String, Int) tuple syntax, and I think they both work with and without labels. We could reuse most of enum pattern matching syntax like suggested, with unnamed cases getting tagged like unnamed tuple elements .0(let string) etc.

I also think it would be useful with proper untagged union types, what uses unordered set unions rather than nesting for composition. It would be useful for Error propagation, but also for a new Optional-like type (e.g. Maybe) where Maybe<String> | Maybe<String> would be structurally equivalent to Maybe<String>, that is a type whose values are either .some(String) or .none, in sharp contrast to today's Optional, which would have to be composed into a nested double-optional where the values would be .none, .some(.none), or .some(.some(String)).

I can also envision proper untagged Either unions, combined with Maybe, so that Maybe<String> | Maybe<String> | Maybe<Int> is just flattened to Maybe<String> | Maybe<Int>. That is, actually just either a string or an int.

Nickolas_Pohilets · September 28, 2023, 10:44am

When designing this features, it is also needed to consider what happens when types being summed up can overlap.

protocol P {}
protocol Q {}
struct R: P {}
struct S: Q: {}
struct T: P, Q {}

let x: (any P | any Q) = T()
switch x {
case let p as any P:
    print("It's a P: \(p)")
case let q as any Q:
    print("It's a Q: \(q)")
}

IMO, the switch above should desugar into the following:

if let p = x as any P {
    print("It's a P: \(p)")
} else if let q = x as any Q {
   print("It's a Q: \(q)")
} else {
    fatalError()
}

So if type cases can overlap, the first matching one is executed. Order of case's in switch matters.

Now, in terms of memory layout, I think layout of Any is a good starting point - i.e.value buffer + type. As an optimization, instead of using value buffer of hardcoded size, we can compute it precisely based on types being used. E.g. value buffer for (Int | String) is max(sizeof(Int), sizeof(String)). If any of the types in question in an existential, then only value buffer part of the existential is taken into account.

I'm not sure though what to do with protocol witnesses. If we are sure we will not need to support private/multiple conformances, then it is ok not to store them and lookup dynamically when matching. But if we assume that we will need to support private/multiple conformances, then protocol witnesses need to be stored. And probably all of them. So that layout of the (any P | any Q) would be: value buffer + type + witness for P + witness for Q. And when assigning T() to x both witnesses will be stored. Then with private witnesses, it would behave like this:

let x: (any P | any Q) = T()
switch x {
case let p as any P:
    print("It's a P: \(p)") // prints "T()"
    print("P as Q: \(p as? Q)") // prints nil - when casting to any P witness for Q was lost
    print("x as Q: \(x as? Q)") // prints .some(T()) - `x` has witness for Q
case let q as any Q:
    ...
}

wadetregaskis · September 28, 2023, 7:57pm

Yep, always has.

Furthermore this situation already arises, e.g.:

class A {}
class B: A {}
class C: B {}

switch C() {
case is A:
    print("A")
case is B:
    print("B")
default:
    print("Other")
}

(outputs "A")

The compiler already warns that "Case is already handled by previous patterns; consider removing it". It would presumably continue to do so for sum types.

technogen · September 28, 2023, 8:23pm

I've been meaning to get into contributing to the Swift compiler repository for a while now. Once I get myself familiar with the contributor's documentation, codebase architecture, and the build process, I'll start working on implementing this for the formal proposal. In the meantime, I'll try putting together a formal pitch once I have some time to spare.