In this thread about supporting existentials in embedded Swift, I commented on some techniques to un-existentialize code, and that prompted @wadetregaskis to bring up language supported "sum types".
I have always wanted type disjunctions, which would use the syntax given in the example, like Int | String
. Two issues worth pointing out are: this is not the same as "sum types" (the dual to tuples the PointFree crowd has brought up), despite there being some confusing overlap, and type disjunctions have been vehemently rejected by the language team (probably the clearest "no, never" I've ever seen from them regarding any language feature).
On the first point, what is the difference between a type disjunction and a sum type? As mentioned a sum type is dual to a tuple (product type), which is a structural type (it's not an "anonymous" type as is often said, tuples are named types and can be referred to by name, such as in generic code) defined by the type of its members. A sum type is also a structural (not anonymous) type defined by the type of its cases.
We can almost express a tuple in Swift using variadic generics: struct Tuple<each Ts>
. We can't express the internals though. But the point is a tuple is a variadic generic struct. A sum type would be a variadic generic enum: enum Sum<each Ts>
. Just as a struct has members 0, 1, 2, ...
, a sum has cases 0, 1, 2, ...
.
Since I would use the syntax Int | String | ...
for a type disjunction, I would instead use more tuple-like syntax for a sum type. Something like (Int / String / ...)
.
Contrast this with a type disjunction. A type disjunction is dual to a type conjunction, Encodable & Decodable
. It is different from a sum type in the same way a tuple is different from a type conjunction.
A type conjunction is an abstract type and as such can only appear as a generic constraint, an existential and an opaque type. As a constraint, it's really just shorthand for two separate constraints T: A & B
is the same as where T: A, T: B
. As an existential, any A & B
is a box that can hold only concrete types that are subtypes of both A
and B
, exposes all the requirements available on both any A
and any B
, and is automatically upcastable to any A
and any B
. As an opaque type, some A & B
wraps a concrete type conforming to both A
and B
and conforms to both itself by forwarding all requirements to the concrete type.
This is of course very different than a tuple. Some immediate mechanical differences between the two are: (A, A)
is quite different than A
, but A & A
is identical to A
, and A & B
is a subtype of A
(in terms of casting) but (A, B)
is not.
A type disjunction A | B
is also an abstract type (well, maybe not, see below), and thus can only appear in generic constraints, existentials and opaque types. One immediate challenge is its use as a constraint is not mere shorthand for existing functionality (two separate constraints). The only way to mimick a disjunctive constraint would be to define two identical generic functions/extensions, each one constraining to one of the two types in the disjunction. Swift forbids this and directly calls out that you can't define non-overlapping constraints.
This also begs the question of what you could do in the generic code with T: A | B
. You certainly don't have access to the requirements of A
or B
. The only answer to what you could do with T
is either "nothing" (so it acts as if there are no constraints), or the compiler finds all supertypes common to A
and B
(let's say it finds C
and D
) and treats it like the constraint C & D
. This has led to the question, "what's the point", since you can write the constraint as C & D
today. However, while this is identical in terms of what you can do in the generic body, that's not identical to users because it's less restrictive in what it accepts. Also, I would want to be able to switch on the T
without an open-ended default, with a case for a concrete A
and a case for a concrete B
, and have full access to the concrete A
or B
inside the cases. That exploits the stronger guarantee of A | B
over C & D
.
It also begs the question of when both A
and B
happen to have identical requirements, whether those requirements should be available in generic code (if both A
and B
define func thisFunction() -> Int
, should you be able to call thisFunction
on a T: A | B
in generic code?). The answer consistent with Swift's type system is "no" because two protocols with identical requirements are not the same protocol. But this might be surprising to users who expect otherwise.
For an existential any A | B
, this would box any concrete type that either conforms to A
or B
. The duality to conjunctions is clear here through the opposite casting rules: any A | B
is a supertype of both any A
and any B
(so the latter both upcast automatically to the former). Casting the other way is a downcast and has to be conditional. This begs the question: what would you do with this existential? Once again you don't have access to the requirements of A
or B
and either have access to nothing or to the requirements of common conformances. Why not just use an existential box of the common conformances?
Similar to in generics I would want to be able to switch on its type without an open-ended default
. I want to switch on an any A | B
with cases only for any A
and any B
. This is of course "enum like" and hints at why it's confusing how this is different from a sum type.
It's also notable that conjunctions of concrete types, like Int & String
are pointless (no type besides Never
could qualify), but disjunctions of concrete types, like Int | String
, are definitely not pointless. And this wouldn't really be an existential (it wouldn't be any Int | String
). It would be closer to a structural enum, but still not exactly that because of its transparency in casting (an Int
would be automatically assignable to an Int | String
, and going the other way would be an as?
, not an if case let ...
).
Opaque disjunctions some A | B
have the same problem of what they should conform to (nothing, or the common conformances?), and what you'd do with it. I would again want closed switching on it with some A
and some B
cases.
Similar to conjunctions vs. tuples, some immediate differences between disjunction and sums are: (A / A)
is distinct from A
(it has two cases, 0
and 1
, and on access it matters which one was used, it doesn't matter the types are the same), but A | A
is identical to A
, and there is no subtype relationship between (A / B)
and A
. You have to manually box and unbox values into/out of sum types rather than just casting. The only similarity is that both would allow switching without open-ended defaults.
Whenever we talk about this, we need to be clear which one we're talking about. I've seen threads where some people are talking about sums and others disjunctions and they didn't realize they were talking about different things. I've even seen one person seem to switch between the two.
Given that, when people ask for sum types, they should ask themselves: are you really asking for type disjunctions? Are you imagining that String | String
is distinct from String
, and that you can create one either with (String | String).0("hello")
or (String | String).1("hello")
, and you have to switch separately on cases 0
and 1
later? Or are you imagining that String | String
is the same as String
and that an Int
should automatically upcast to an Int | String
?
I believe the core team has rejected type disjunctions because they have limited use and put a lot of strain on the type checker. I don't recall them discussing closed switching. That's the key capability they offer that answers the "what's the point/what would you do with them anyways" question, so I wonder if that would change things.
I want to propose an alternative though that might be easier to implement, taking inspiration from Kotlin: sealed types.
This would be a non-final class or protocol declared as sealed
, which tells the compiler the subtypes are defined all within the same translation unit. You can then switch on one, and as long as you have a case for each subtype, you don't need an open-ended default.
So then whenever you'd want to use a type disjunction like Int | String
, you can just do this:
sealed protocol IntOrString {}
extension Int: IntOrString {}
extension String: IntOrString {}
...
func process(value: some IntOrString) {
switch value {
case let int as Int: ...
case let string as String: ...
// No default
}
}
Using this I could eliminate a lot of boilerplate enums I define just to bring together a few types that I want to visit later, where I would have used just a protocol but that would force me to include an open-ended default on the switch
.
There are definitely some drawbacks to this compared to genuine type disjunctions. First of all, I can't do this with protocols I don't define, like sealed protocol EncodableOrRawRepresentable
because you can't retroactively make a protocol extend another protocol. For that you'd need generic extensions (extension<T> T: EncodableOrRawRepresentable where T: Encodable
), and even then I don't know if that would imply that any Encodable
becomes a subtype of any EncodableOrRawRepresentable
).
Also it introduces potential incompatibility of the exact sort that tuples avoid over defining bespoke structs (and that presumably sum types would also avoid). Two different libraries need an IntOrString
union and define their own sealed protocol for it, but this appears as different incompatible types to the type system.
It also makes it possible to misdefine a disjunction. In the example I named the protocol then had to explicitly opt the two types it's covering into it. If I had forgotten one or accidentally added a third, then I'd have an IntOrString
whose name is a lie. A true disjunction Int | String
couldn't be broken in that way.
But if true type disjunctions are simply off the table because they can't be practically implemented, I think sum types would be a decent alternative, as would sealed types. Both seem to have non-overlapping pros and cons.