Type Disjunctions are a commonly rejected proposal, but the recent addition of typed throws will, in (not just) my opinion, make the need for them much more pressing. I think it would be unfortunate to create a one-off solution that works only for typed throws but not for the general case, like what was briefly mentioned in this thread.
What I want to propose here is what, to me (admittedly with no knowledge of the Swift compiler), would be a hopefully easy way to get general type disjunctions. I believe it avoids all the problems that have been raised about type checker performance.
We only really need one new feature, which seems minor, to provide an underlying implementation of type disjunctions. That new feature is "closed" or "sealed" protocols.
A "closed" protocol is a protocol with the restriction that it cannot be conformed to outside its own module. Alone this is a useful feature, such as for a library that wishes to hand out instances of a protocol, perhaps as a some
type, but does not intend clients to define their own conformances, e.g. Metal (eliminating the common “please don’t conform to this yourself” documentation, which also indicates the library is doing unsafe downcasts somewhere).
Since no type outside the defining module can conform to the closed protocol, there is no point in refining the protocol outside its module either (no one would be able to conform to this refined protocol), so that is also not allowed. This is important for other reasons I'll explain later.
But a restriction on a type has a corresponding new capability (forbidding X at design time means Not X is now known at design time). The new capability that pairs with being "closed" is exhaustive switching. If I define a closed protocol MyProtocol
and, say, three conforming types struct MyStruct1: MyProtocol
, struct MyStruct2: MyProtocol
and struct MyStruct3: MyProtocol
in its module, then the compiler knows an any MyProtocol
instance is "inhabited" by one of these three concrete types. Correspondingly I should be able to exhaustively switch on one:
let value: any MyProtocol = getValue()
switch value {
case let struct1 as MyStruct1: ...
case let struct2 as MyStruct2: ...
case let struct3 as MyStruct3: ...
// No open-ended default
}
And not just from within the module, but anywhere where all the conforming types are visible (i.e. if they're all public everyone can exhaustively switch on it, otherwise that's only possible inside the module).
There is precedent for this feature in Kotlin, which supports "sealed" classes and interfaces and allows exhaustive visitation of them. This is in fact how in Kotlin devs write type-safe unions or variants... the same thing Swift supports through enums with associated values.
Speaking of which, while it is possible to use one to emulate the other (in both directions), type disjunctions and tagged unions, either nominal or structural (people here call the latter "anonymous sums" but this is incorrect, anonymous types don't have an accessible name, tuples do have an accessible name and so would the dual to tuples) are not the same thing, and a lot of confusion has been bred by conflating them. In my opinion type disjunctions are far more important than structural unions, and using the latter to emulate the former (I do this often with nominal unions by writing a type eraser for a protocol as an enum) is a hack to work around a missing language capability.
The above is the simplest example: an existential of a protocol with only concrete conformances and no refining protocols.
What if we add a refining protocol, like protocol MySubProtocol: MyProtocol
? The important question is: is this protocol also closed? This might seem strange at first but I believe the default answer (no added keywords) should be no: that is, a protocol should not inherit "closedness" from protocols it refines. The consistent universal rule is a protocol is open unless it is explicitly defined not to be.
But this means introducing a (non-closed) refining protocol partially "re-opens" the base protocol: types outside the module can now conform to it but only by conforming to the refined protocol. What does this mean for switching?
What it means is to exhaustively switch, you must handle an existential of the refined protocol as one of the cases:
let value: any MyProtocol = getValue()
switch value {
case let struct1 as MyStruct1: ...
case let struct2 as MyStruct2: ...
case let struct3 as MyStruct3: ...
case let subProtocol as any MySubProtocol: ...
// No open-ended default
}
Since the conformances to MySubProtocol
are open-ended, the only way to handle all possibilities is to handle all conformances to MySubProtocol
together. You can also handle concrete conformances to or further refinements of MySubProtocol
but you then must handle the existential or have a default
case.
I’m open to suggestions from others that “re-opening” a closed protocol this way shouldn’t even be allowed, but I believe it doesn’t really complicate things that much.
On the other hand, if a refined protocol also declares itself as closed, this problem does not exist, and the behavior of the top closed protocol cascades down to the refined protocol. Not only can we exhaustively switch on the refined protocol by handing all of its conformances or existentials of its refinements, we can exhaustively switch on the top protocol by handling all its conformances/existentials of refinements besides any MySubProtocol
plus all the conformances/existentials of refinements of MySubProtocol
.
For example, let's say MySubProtocol
has two concrete conformances struct SubStruct1: MySubProtocol
and struct SubStruct2: MySubProtocol
, and one open refining protocol MySubSubProtocol: MySubProtocol
. We can then exhaustively switch on MyProtocol
like this:
let value: any MyProtocol = getValue()
switch value {
case let struct1 as MyStruct1: ...
case let struct2 as MyStruct2: ...
case let struct3 as MyStruct3: ...
case let subStruct1 as SubStruct1: ...
case let subStruct2 as SubStruct2: ...
case let subSubProtocol as any MySubSubProtocol: ...
// No open-ended default
}
So in general, the list of all the types that must be handled to fully cover a switch is generated by walking the tree of subtypes (where concrete conformances are leaves and refinements are branches) and terminating a search down a branch when a non-closed refinement is encountered, instead adding the existential for that refinement to the list. A similar algorithm would determine whether a switch is really exhaustive.
This I believe covers the behavior of existentials for closed protocols. All of its behavior for anything besides switching should be exactly the same as with normal protocols. In particular even though we might imagine fancy stuff like the compiler looking for members with matching signatures on all conforming types and expose a way to call them through the existential (since there's a trivial way to implement that by exhaustively switching over the conformances), this shouldn't be supported. It's not consistent with Swift's nominal notion of requirements (i.e. a concrete type cannot implicitly conform to a protocol even if it fulfills all the requirements). You can either do that yourself as an extension or just add the member as a requirement on the protocol.
Speaking of extensions, closed protocols allow extensions that are effectively dynamically dispatched to be added to protocols from anywhere (visitation is really just dynamic dispatch on a hierarchy bolted on from the outside). You have to write the dispatch table yourself but that's better than not being able to do it at all. Another fancy feature we can think of is the compiler allowing you to write an extension with only a declaration and then it forces you to extend every conforming type to implement what is declared, which it then internally turns into the exhaustive switch you would otherwise write by hand:
extension MyProtocol {
func doSomethingNew() // By adding this, the compiler will raise an error here until every type conforming to `MyProtocol` defines this function in an extension
}
extension MyStruct1 {
// This will get called when an `any MyProtocol` holding a `MyStruct1` receives a `doSomethingNew` call.
func doSomethingNew() {
print("I'm a MyStruct1!")
}
}
...
This would be cool, and (unlike the previous hypothetical feature) consistent with Swift's type system. But to me it's a bonus that can be added, if ever, later. I wouldn't miss it dearly if it's never added because I can accomplish the same thing by implementing the "base" extension with an exhaustive switch (and even dispatch to corresponding extensions on each concrete type), which isn't a ton of boilerplate and only has to be done once (the kind of boilerplate I don’t like is the kind that scales with how much you use something).
One more thing to say about existentials is to discuss the existential for the metatype any MyProtocol.Type
. This should support exhaustive switching just like with the existential for the protocol itself:
let value: any MyProtocol = getValue()
switch type(of: value) {
case let struct1 as MyStruct1.self: ...
case let struct2 as MyStruct2.self:: ...
case let struct3 as MyStruct3.self:: ...
case let subStruct1 as SubStruct1.self: ...
case let subStruct2 as SubStruct2.self: ...
case let subSubProtocol as (any MySubSubProtocol).self: ...
// No open-ended default
}
How could the compiler implement these existentials? Well the optimally efficient way would be to compile them to tagged unions (enums), so that the memory size of one is equal to the largest size of any of the possible inhabitants (plus the space needed for the type tag) and implement the members by dispatching on the type tag instead of jumping over to a witness table. But if this would make them too incompatible with other existentials, then they should just be implemented the same way existentials always are. Really this is an entirely compile time feature: marking a protocol as “closed” tells the compiler when it’s okay to accept a switch without a default
because it knows one of the other cases will always get hit. The resulting compiled code doesn’t need to change at all (except for omitting the default branch). The extra knowledge just enables new optimization opportunities.
Now what about generics? How does a closed protocol interact with generics when it’s used as a constraint? I believe there's nothing different about them (and same for opaque types). Since they behave just like protocols except when switching on them, and you only switch on them when you have an existential and need to recover the underlying concrete type (which you already have in a generic), there's no difference in capabilities inside a generic. For a T: MyProtocol
you have access to the requirements on MyProtocol
, same as always, and that's that.
What's important is what that constrained type parameter means: it guarantees it can only be bound to one of a known closed set of concrete types or a type that conforms to a non-closed refinement. This is a restriction on how that generic code can be called, and it's a powerful guarantee to know generic code can only be invoked with one of a closed set of known types. Simply having that guarantee strengthens type safety, even if values simply pass through generic code. But the way this emerges is simply as a consequence of a constraint to a protocol that doesn't allow additional conformances. This means if a library exposes generic code, either functions or types (including other protocols with associated types constrained to the closed protocol), those functions/types cannot be used with anything except types defined within the library itself. This restriction alone can prevent incorrect usages of a library at compile time.
But since every restriction has a corresponding new capability, what is the new capability within this generic code, made possible by the fact you can enumerate all the possible concrete types the type parameter is bound to?
The answer is simply the same exhaustive switching. You can do this in generic code just as you can in code with existentials because a value of type T: MyProtocol
trivially erases to an any MyProtocol
. But while this isn't required and can be a later enhancement, this would be much more powerful if the type checker can track the restriction on the type parameter in switch cases. I think it would be better and easier on the compiler to add specific syntax for this.
For example, you should be able to exhaustively switch on a type parameter T
, so that T
gets rebound within the body of the switch case:
private var currentInt = 5
private var currentString = "Hello!"
func makeAValue<T: MyProtocol>() -> T {
switch T: {
case MyStruct1:
print("Making a MyStruct1")
return MyStruct1(intValue: currentInt)
case MyStruct2:
print("Making a MyStruct2")
return MyStruct2(intValue: currentInt, stringValue: currentString)
case MyStruct3:
print("Making a MyStruct3")
return MyStruct3(stringValue: currentString)
case SubStruct1:
print("Making a SubStruct1")
...
case SubStruct2:
print("Making a SubStruct2")
...
case MySubSubProtocol:
print("Making some MySubSubProtocol")
...
// No open-ended default
}
}
struct MyStruct1: MyProtocol {
let intValue: Int
}
struct MyStruct2: MyProtocol {
let intValue: Int
let stringValue: String
}
struct MyStruct3: MyProtocol {
let stringValue: String
}
...
Notice the lack of .self
, and of any
on the last case. This is different than switching on the runtime metatype.
In the first switch cases, T
is rebound to the corresponding concrete type (so within the case the function is no longer generic). If one or more parameters of type T
were in scope they could then be passed to non-generic code taking the matching concrete type. In the last case of the refined protocol, T
is rebound to a further constrained generic parameter T: MySubSubProtocol
and can then be used to call generic code with that same constraint.
Really this is a more general enhancement to generics that is just as usable with normal open protocols (you'd just need an open ended default). But if this isn't feasible or problematic for any reason you can hack around it by falling back to erased existentials and force casts:
func makeAValue<T: MyProtocol>() -> T {
switch T.self as any MyProtocol.Type: {
case is MyStruct1:
print("Making a MyStruct1")
return MyStruct1(intValue: currentInt) as! T
case is MyStruct2.self::
print("Making a MyStruct2")
return MyStruct2(intValue: currentInt, stringValue: currentString) as! T
case is MyStruct3.self::
print("Making a MyStruct3")
return MyStruct3(stringValue: currentString) as! T
case is SubStruct1.self:
print("Making a SubStruct1")
...
case is SubStruct2.self:
print("Making a SubStruct2")
...
case is (any MySubSubProtocol).self:
print("Making some MySubSubProtocol")
...
// No open-ended default
}
}
Here we're just exercising the same exhaustive switching capability on the existentials. The compiler can't tell this is type safe so we have to tell it to just trust us. I expect the earlier type safe example would just compile to this (possibly omitting the conditional check and fatalError
pathway).
You can also do the same thing in a generic type:
struct Container<T: MyProtocol> {
static var description: String {
switch T.self as any MyProtocol.Type: {
case is MyStruct1.self:
"Container for MyStruct1"
case is MyStruct2.self::
"Container for MyStruct2"
case is MyStruct3.self::
"Container for MyStruct3"
case is SubStruct1.self:
"Container for SubStruct1"
case is SubStruct2.self:
"Container for SubStruct2"
case is (any MySubSubProtocol).self:
"Container for MySubSubProtocol"
// No open-ended default
}
}
Here we don't even need that feature of rebinding the type parameter and can just use switching on the existential.
You can also do this with values of T
. This allows the behavior of a generic type to vary with its type parameter in more significant ways that don't require offloading the variation to T
itself. It's really just another use of visitation from the outside:
final class OptionsViewModel<T: MyProtocol>: ObservableObject {
init(value: T) {
self._value = value
}
private let _value: T
var options: [String] {
switch _value as any MyProtocol {
case let struct1 as MyStruct1: options(for: struct1)
case let struct2 as MyStruct2: options(for: struct2)
case let struct3 as MyStruct3: options(for: struct3)
case let subStruct1 as SubStruct1: options(for: subStruct1)
case let subStruct2 as SubStruct2: options(for: subStruct2)
case let subSubProtocol as any MySubSubProtocol: options(for: subSubProtocol)
// No open-ended default
}
private func options(for value: MyStruct1) -> [String] { ... }
private func options(for value: MyStruct2) -> [String] { ... }
private func options(for value: MyStruct3) -> [String] { ... }
private func options(for value: SubStruct1) -> [String] { ... }
private func options(for value: SubStruct2) -> [String] { ... }
private func options(for value: any SubSubProtocol) -> [String] { ... }
}
}
Here again we don't even need the fancy feature of rebinding T
and can just use existentials, although it would be nice to switch on T
itself and have the compiler know we can pass _value
directly to those private functions.
Aside from this feature of rebinding type parameters by switching on them, all of these added capabilities within generic code are simply taking advantage of the added capability on existentials. I don't believe any changes to generics specifically are warranted here. The feature of rebinding type parameters by switching isn't specific to closed protocols or exhaustive switching.
Swift could also handle non-open classes in the same way. This would just be a matter of treating non-open classes as equivalent to closed protocols, so that you'd be able to exhaustively switch on the subclasses. I consider this less important and wouldn't really miss it much if it never gets supported, but that's because I don't use class hierarchies all that often.
I believe that covers closed protocols. If anyone is unsure of why this is useful, I can provide copious examples. The lack of this feature is the main reason Swift devs gratuitously overuse enums and in doing so destroy significant amounts of type safety that could otherwise be used to validate business rules at design time (i.e. declaring a struct X
with a corresponding XType
enum added to the struct as a let type: XType
member, and then doing runtime checks in code that requires specific "types" of X
and fatalError
ing or throwing errors on a mismatch. I see this sort of thing all the time. It’s devs creating their own type system the compiler can’t validate. But making X
a protocol and then making a struct for each corresponding type loses the necessary ability to exhaustively switch on the type, which can only be dealt with using workaround boilerplate and tedious wrapping/unwrapping).