Structural Sum Types (used to be Anonymous Union Types)

How about something like this?

let aOrB: (A | B) = <#...#> // where A, B: ExpressibleByIntegerLiteral & AdditiveArithmetic
let twiceAOrB = aOrB * 2

That is to say, a union type can be used to form an expression where all of its cases individually would be valid. Such an expression is of a new union type whose cases are the results of the original expression, but applied to the specific cases of the original union.

You could then pass a union into something like a generic function and get a similar-shaped union of the result.

If all cases of a union have the same type, an explicit syntax could be used to get the value out.
No implicit type conversion happening anywhere. Only union transformations and an explicit collapse at the end, if applicable.

EDIT:

For example, the explicit syntax to get the value out could take the form of an explicit unconditional type cast:

aOrB as Int // only valid if all cases are known to be of the same type.
// OR
aOrB as Equatable // only valid if all cases are known to conform to Equatable.

EDIT 2:

It would behave exactly like quantum superposition: It would continue to be on one of multiple possible states, even when you "rotate" that set of possible states. And once you "measure" it, it collapses into a single value.

Again, no implicit type conversion, just explicit operations on unions.

1 Like

Regarding the discussion of anonymous unions as opposed anonymous sum types, I could imagine wanting several possibilities with the same type in an anonymous enum, which would have significant problems with anonymous unions.

Take the following example, which also makes use of labels in an anonymous enum:

struct AlertState<Action> {
  var title: String?
  var options: [Option]
  
  struct Option {
    var title: String 
    var action: Action 
  }
}

// ...

let alertState = AlertState<(yes: Void | no: Void)>(
  title: "Do the thing?",
  options: [
    Option(title: "Yes", action: .yes(())),
    Option(title: "No", option: .no(()))
  ]
)

switch await presentAlert(alertState) {
case .yes: //...
case .no: //...
}

This sounds to me more like an anonymous protocol than an anonymous enum. I personally would much prefer that these be two separate language features.

Yeah, that makes total sense. Labels are certainly useful and even without labels, The distinction between two seemingly identical cases is still necessary.

I agree that none of this is strictly necessary for unions to exist and be useful. The superposition idea is entirely aimed at extra ergonomics. Even_"unwrapping_" the union for the purpose of expression evaluation can have explicit syntax (much like unwrapping a type pack or a value pack is done explicitly via repeat and each).

I would think that the core of a sum types feature would be that you just can write "foo" because else you can use enums (ok, you first have to define them). I am working with result builders where String, a certain class or sequences of both can be used and I have to “simulate” this sum type with a protocol in a feelingly awkward and imperfect way. I do not want to use enums because I like to present a nice convenient API, and if a sum type still needs a enum-like syntax I do not see a sense in it. Maybe the sum type I would like to have here is rather syntactic sugar / kind of conformances to the enum cases. So what does a sum type offer (besides anynomous usage) above enums if it is not for a nicer syntax e.g. when using according result builders? Maybe I am missing a point here.

2 Likes

It's important to note that syntax like:

switch mySumTypeInstance {
    case let int as Int: // ...
    case let string as String: // ...
    // no default, all cases are handled
}

…implies you can also do:

if let numericValue = mySumTypeInstance as? Numeric { // Might need an `any` or `some` in there too, I'm not sure.
    return numericValue * 2
} else …

i.e. you can treat the "sum type" somewhat generically as opposed to it being purely syntax sugar for a regular enum. I'm not saying that's good (or bad), just pointing it out explicitly.

In the other case, e.g.:

switch mySumTypeInstance {
    case let .0(int): // ...
    case let .1(string): // ...
    // no default, all cases are handled
}

I'm not sure the syntax sugar for declaring mySumTypeInstance's type is that valuable - as soon as you start having semantics exactly like enums today, including being centred around cases rather than types, you've already lost most of the potential ergonomic and capability gains. Plus you've now got naming problems (the case names, .0 etc), one of the hardest problems in computer science and language design by forum thread. :wink:

2 Likes

I would say that‘s nice.

This is actually exactly why I think this isn't a worthwhile feature for the language. The syntax being discussed with situations like this:

let x: (Int | String) = 4
switch x {
let int as Int: //...
let str as String: //...
}

...is already almost doable with protocols. The only thing missing is either exhaustive conformance checking or being able to limit the conformances of a protocol, and I would prefer that be its own feature separate from any anonymity of the type(s) involved.

5 Likes

I couldn't agree more! This is protocol territory and things like sealed protocols and exhaustiveness checking would be great, but outside the scope of this topic.

Yes, sounds like doing sum types via (better) protocols.

Hmm (sorry for citing myself). I wonder if the combination of sealed protocols and exhaustiveness checking for them is not kind of bending the protocol concept too much, and if sum types (with the “|” syntax) even if “internally” the same thing would not be a better way to express this. You then have one definition instead of many conformances plus a “sealed” annotation.

So conceptually the last few comments might be completely right, but I think the sum syntax could be the right way to express it. (Just my two cents and maybe not completely thought through.)

These are similar features in some respects but have very different use cases. Anonymous sum types would be useful in the same places tuples are: ad-how return types, in generic code that performs composition, etc. Sealed protocols with exhaustiveness checking would be useful when introducing an abstraction that has clear semantics and is not intended for conformance outside the declaring module.

4 Likes

About .0(T). I believe this isn't the way.

  1. There is no such thing as index in a union.
  2. Even if we introduce indices for a union X = A ∪ B, we can't remap them for Y = X ∪ C
  3. These indices are redundant. A type itself is an identifiable entity, there is no point to give them yet another tag.
  4. The proposed syntax with parenthesis hints, that there could be more than one type specified in them like .0(A, B). This should be illegal.
5 Likes

Stepping aside from syntax concerns, let's focus on properties we want:

  1. A ∪ B ≡ B ∪ A
  2. (A ∪ B) ∪ C ≡ A ∪ (B ∪ C)
  3. A ∪ A ≡ A

These properties imply:

  1. There is no "tags" on elements of a union. Element is identified by itself
  2. There is no "namespace" of a union. They are flat. (Technically union can be named and declare a namespace. This would be useful to write extensions. And we also would be able to check if a union is a strict subset of another union. But the elements of the union are not nested in it.)
  3. There is no order of elements.

I hope it's a common ground we can agree on.


Now a bit harder part. When one element is an ancestor of the other one we should reduce them into one element. I think natural thing to do would be to collapse them to the ancestor.
A ∪ B ≡ A where B: A
But there could be another option where we just demand invariance and produce error if it breaks.

4 Likes

Yes, but somehow I think the non-anonymous kind should “conceptually precede” the anonymous kind (think of functions), or in other words it feels awkward to have the anonymous kind without having non-anonymous kind, and besides the scope of definition they really “do” the same, aren’t they?

1 Like

You kinda can have a sealed protocol today if you want. Or better say you can achieve exhaustiveness.

protocol IntOrString {
  func match<R>(
    int: (Int) throws -> R,
    string: (String) throws -> R
  ) rethrows -> R

  func print1()
}

extension IntOrString {
  func print2() {
    match(
      int: {
        print($0)
      },
      string: {
        print("\"\($0)\"")
      }
    )
  }
}

extension Int: IntOrString {
  func match<R>(int: (Int) throws -> R, string: (String) throws -> R) rethrows -> R {
    try int(self)
  }

  func print1() {
    Swift.print(self)
  }
}

extension String: IntOrString {
  func match<R>(int: (Int) throws -> R, string: (String) throws -> R) rethrows -> R {
    try string(self)
  }

  func print1() {
    Swift.print("\"\(self)\"")
  }
}

Unfortunately Swift can't reason about control flow within a match call. It's not prohibitively bad, just a thing to keep in mind if you're going to do side effects within it.

It's interesting to consider that this:

let x: (Int | String) = 4
switch x {
case let int as Int: break
case let str as String: break
}

could get desugared like this:

protocol _AnonymousIntStringUnion {}
extension Int: _AnonymousIntStringUnion {}
extension String: _AnonymousIntStringUnion {}

let x: _AnonymousIntStringUnion = 4
switch x {
case let int as Int: break
case let str as String: break
default: fatalError("unreachable")
}

The later already works today, except you can't prevent other types from conforming to the protocol.

1 Like

The nominal equivalent of structural / anonymous sum types is enum. Sum types have similarities with protocols but many differences as well.

4 Likes

Swift's enum is at best a subset of sum types (not equivalent).

I absolutely agree! Among those differences are the performance implications, which are going to be critical for embedded systems, where such an abstraction is more likely to be used as opposed to protocols, which (in those circumstances) are only usable as constraints on force-specialized generics and never for existentials (which would be necessary to replicate this functionality).

1 Like