Revisiting Union Types in Swift for typed throws

This topic has been discussed many times, and Union Types were always rejected for the language.
However, I believe the situation has changed significantly since Swift introduced typed throws.

Currently, if a throwing function calls multiple throwing methods with different typed errors, we have two options:

  1. Erase the type by declaring plain throws (that is, any Error), losing all static guarantees.
  2. Create an enum listing every possible error type, and then wrap every try in a do/catch block to map those errors manually.

Both solutions add unnecessary boilerplate and reduce readability.
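To make the first option concrete, here is a minimal sketch of what type erasure costs at the call site. It assumes a Swift 6 toolchain; NetworkError, ParseError, and the stub bodies of fetchData() and parseData() are hypothetical placeholders, not code from any real project.

```swift
// Hypothetical error types and stubs, for illustration only.
enum NetworkError: Error { case offline }
enum ParseError: Error { case malformed }

func fetchData() throws(NetworkError) { throw .offline }
func parseData() throws(ParseError) {}

// Option 1: plain `throws` erases both error types to `any Error`.
func load() throws {
    try fetchData()
    try parseData()
}

func describeLoad() -> String {
    do {
        try load()
        return "ok"
    } catch let error as NetworkError {
        return "network: \(error)"
    } catch let error as ParseError {
        return "parse: \(error)"
    } catch {
        // Required: the compiler cannot prove that the casts above
        // cover every error `load()` might throw.
        return "unknown: \(error)"
    }
}
```

The final catch-all branch is the visible cost: the signature of load() no longer says which errors can occur, so the caller has to handle a case that can never actually happen.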

Union Types could solve this problem elegantly.
The compiler could infer the union of all thrown error types inside a function, even without explicitly declaring it - just like it already does with other type inference.

For example:

func load() throws (NetworkError | DecodingError) {
    try fetchData()
    try parseData()
}

or simply

func load() throws { // NetworkError | DecodingError is inferred implicitly
    try fetchData()
    try parseData()
}

And if fetchData() or parseData() change their error types, the compiler could update the union automatically.

Given that Swift already supports anonymous types (like tuples and closures), I think Union Types would fit naturally into the language model and make typed errors far more practical in real projects.


Implicitly inferring the types of the thrown errors would mean that the whole body of the function has to be parsed in order to know which error types are thrown. And if fetchData() or parseData() don’t explicitly declare their thrown types, then you’d also have to parse their function bodies in order to type-check the functions, recursively until you find a function that either declares all its thrown types, or that doesn’t throw at all. In a pathological case, you’d have to parse the entire project just to know the types thrown by this one function.

I’m not an expert on the implementation of the compiler, but my understanding is that one of the goals is that the compiler should only need the declaration of a given function to know whether it is valid to call it. Requiring it to parse the body of the function as well could result in significant compilation-time slowdowns.


While I agree that union types should be added, this statement is not even true for closures.

{ try fetchData() } // () throws -> _
{ () throws(NetworkError) in try fetchData() } // () throws(NetworkError) -> _

I believe the current way that you could match Swift’s implementation for:

func load() throws { ... }

would be:

func load() throws(any Error) { ... }

Are you wanting something to generate an equivalent to the following?

protocol NetworkDecodingError: Error { }

extension NetworkError: NetworkDecodingError { }
extension DecodingError: NetworkDecodingError { }

To then write

func load() throws(any NetworkDecodingError) { ... }

If we were to infer the error types, it could very well break ABI, and determining them would take more effort when many functions are nested recursively. It would also be more verbose to state that the error should not be typed, since you would need to write throws(any Error), and it would create a layer of obfuscation where two apparently untyped throws declarations are incompatible.

The current method is to write this

enum NetworkDecodingError: Error {
    case network(error: NetworkError)
    case decoding(error: DecodingError)
}

Declaring a protocol is the easiest approach for now, but it’s not very useful because you can’t exhaustively switch over all possible conforming types. The compiler won’t help you, so there’s little difference between this approach and completely erasing the error to any Error.

An enum is better:

enum NetworkDecodingError: Error {
    case network(NetworkError)
    case decoding(DecodingError)
}

But you’ll still have to manually map all errors:

func loadData() async throws(NetworkDecodingError) {
    do {
        try fetchData()
    } catch {
        throw .network(error)
    }

    do {
        try parseData()
    } catch {
        throw .decoding(error)
    }
}
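As a rough sketch of how the mapping boilerplate can be trimmed today, and of the exhaustive switch the enum buys at the call site: the helper withMappedError below is hypothetical (it is not a standard library function), and LoadError, ParseError, and the stub bodies are placeholders chosen for the example. It assumes a Swift 6 toolchain.

```swift
// Hypothetical error types and stubs, for illustration only.
enum NetworkError: Error { case offline }
enum ParseError: Error { case malformed }

enum LoadError: Error {
    case network(NetworkError)
    case decoding(ParseError)
}

func fetchData() throws(NetworkError) {}
func parseData() throws(ParseError) { throw .malformed }

// Hypothetical helper: runs `body` and converts its typed error with `transform`.
func withMappedError<T, E: Error, F: Error>(
    _ transform: (E) -> F,
    _ body: () throws(E) -> T
) throws(F) -> T {
    do {
        return try body()
    } catch {
        // `error` has the concrete type E here, so no cast is needed.
        throw transform(error)
    }
}

func loadData() throws(LoadError) {
    try withMappedError(LoadError.network, fetchData)
    try withMappedError(LoadError.decoding, parseData)
}

func describeLoad() -> String {
    do {
        try loadData()
        return "ok"
    } catch {
        // `error` is a LoadError, so this switch is checked for exhaustiveness.
        switch error {
        case .network(let underlying): return "network: \(underlying)"
        case .decoding(let underlying): return "decoding: \(underlying)"
        }
    }
}
```

The helper removes the repeated do/catch blocks, but every call still has to name the wrapping case by hand, which is the part a union type would make unnecessary.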

It is also incompatible with one of the main motivations for introducing typed throws in the first place (i.e. Embedded Swift). From SE-0413:

Existential error types incur overhead

Untyped errors have the existential type any Error, which incurs some necessary overhead, in code size, heap allocation overhead, and execution performance, due to the need to support values of unknown type. In constrained environments such as those supported by Embedded Swift, existential types may not be permitted due to these overheads, making the existing untyped throws mechanism unusable in those environments.


As much as I like the idea in principle, I seem to remember the core team being very explicitly against it. But also, how would you catch the errors? Would the union of two error enums behave like a bigger enum, with all cases from both enums, meaning you don’t have to un-nest them when catching them?


Yeah, it's on the list of frequently rejected proposals.

I guess like so:

do {
    try load()
} catch let error as NetworkError {
    ...
} catch let error as OtherError {
    ...
}

Union types could be implemented as if they were written:

enum AnonymousUnionType_123 {
    case case1(NetworkError)
    case case2(DecodingError)
}

automatically wrapped / unwrapped by the compiler, so you won't see or use them as enum.

How I’ve always imagined this feature is more or less syntactic sugar that the compiler uses to create a sort of enum-like structure under the hood. This is more or less how I envision it:

// Error type declaration
//
// The keyword `errortype` would function very comparably to `typealias`, with
// the addition that the compiler typechecks the "unioned" types to ensure that
// they're error types and that the types aren't circular
// (i.e. `errortype ErrorA = ErrorB | ...`, `errortype ErrorB = ErrorA | ...`).
errortype NetworkDecodingError = NetworkError | DecodingError

This is valid for and can be used in the same contexts as type aliases (e.g. globally, to satisfy a protocol associated type requirement, etc.). The declared error type can be implicitly used anywhere any Error is accepted, either as an existential or for constrained types.

Under the hood, the compiler would create and manage an “enum-like” structure that would look much like the following:

enum NetworkDecodingError: Error {
    case network(NetworkError)
    case decoding(DecodingError)
}

This makes the declared type a more materially concrete type that Swift can pass around to avoid the overhead created by existential types (unless of course one of the nested error types is itself an existential, which might not actually be allowed as you could create an implicit circular error type by obscuring the circular references indirectly through a protocol).

Since this error type is usable anywhere any Error is accepted, we would work this into typed throws modifiers:

func loadData<T>(from url: URL, ofType: T.Type) async throws(NetworkDecodingError) -> T where T: Decodable {
    ...
}

Then when we go to use it in a do-catch statement, it might look something like the following:

do {
    let decodedType = try await loadData(from: url, ofType: T.self)
} catch let error as NetworkError {
    // handle networking specific error
} catch let error as DecodingError {
    // handle decoding specific error
}

Under the hood Swift would do all of the “unpacking” necessary to do the matching and would look something like the following:

do {
    let decodedType = try await loadData(from: url, ofType: T.self)
} catch let NetworkDecodingError.network(error) {
    // handle networking specific error
} catch let NetworkDecodingError.decoding(error) {
    // handle decoding specific error
}

Like enums, errortype declarations could also accept the @frozen attribute. When a declaration is not marked @frozen, the compiler would require callers to handle error types that might be added to it in the future:

// Library A:
public errortype NetworkDecodingError = NetworkError | DecodingError
public func loadData<T>(from url: URL, ofType: T.Type) async throws(NetworkDecodingError) -> T where T: Decodable

// Library B:
do {
    let decodedType = try await loadData(from: url, ofType: T.self)
} catch let NetworkDecodingError.network(error) {
    // handle networking specific error
} catch let NetworkDecodingError.decoding(error) {
    // handle decoding specific error
}
// error: 'Errors thrown from here are not handled because the enclosing catch is not exhaustive'
// note: 'handle unknown errors using "catch let @unknown error"'

In essence, this would be more or less a syntactic wrapper around enums to support this feature. Would love some feedback on this approach for supporting this feature.


“A small thing that composes” would be to think of those as anonymous/structural enums, parallel to how tuples relate to structs currently.

So:

typealias Foo = (Int | Float)
typealias Bar = (integer: Int | real: Float)

Should be roughly equal to, respectively

enum _AnonymousFoo {
    case `0`(Int)
    case `1`(Float)
}

enum _AnonymousBar {
    case integer(Int)
    case real(Float)
}

Other nuances should be similar to tuples (labeled/unlabeled subtyping, standard protocol conformances, etc)

The only possible brand-new addition would be some new type-based syntax for pattern matching:

if case let i as Int = foo { ... }
// effectively the same as
if case .0(let i) = foo { ... }
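For comparison, the closest thing expressible in today's Swift is a nominal generic enum. The sketch below uses a hypothetical OneOf2 type (the name and case labels are assumptions made for this example) to stand in for the structural (Int | Float), with ordinary enum patterns in place of the proposed type-based matching.

```swift
// Hypothetical nominal stand-in for the structural type `(Int | Float)`.
enum OneOf2<A, B> {
    case first(A)
    case second(B)
}

typealias Foo = OneOf2<Int, Float>

func describe(_ foo: Foo) -> String {
    // Exhaustive over both alternatives, so no default branch is needed.
    switch foo {
    case .first(let i): return "Int \(i)"
    case .second(let f): return "Float \(f)"
    }
}

// Today's equivalent of the proposed `if case let i as Int = foo`.
func intValue(of foo: Foo) -> Int? {
    if case .first(let i) = foo { return i }
    return nil
}
```

Unlike a structural union, values have to be wrapped explicitly (for example `Foo.first(42)`), and OneOf2<Int, Float> and OneOf2<Float, Int> are distinct, unrelated types.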

Anyhow, the point is that this should be a separate feature, introduced before we get into the error use case.
