Status check: Typed throws

ksluder · September 21, 2023, 12:12pm

I don’t know how to state my position more clearly than “Swift made the correct decision a decade ago to regard typed errors as a mistake, and the language team should not accidentally undo that correct decision for the sake of avoiding existential errors in embedded environments.”

technogen · September 21, 2023, 12:15pm

I still keep hearing baseless opinions without any rational justification and counter-argumentation. An answer like "because said so" is not an answer and will not be taken seriously by anyone worth listening to.

This is the classical "argument from authority" logical fallacy.

ksluder · September 21, 2023, 12:25pm

The justifications for rejecting typed errors were thoroughly discussed and debated on the Swift mailing list at the time of the original discussions of Swift’s error system.

The typeless error design makes Swift a rather unique language among its peers. Since the beginning, there has been a large contingent of folks who have wanted a typed error system instead. The biggest reason they lost the argument at the time is because of ABI. The Swift team’s combined decades of lived experience made it clear that errors are only categorizable in trivial cases. Most functions call other functions, and once you compose functions you must compose their errors. This quickly scales beyond tractability.

Furthermore, Swift not only lacked a stable ABI at the time, the final shape of the ABI was still a ways off. Any type that appears in a type signature has ABI impact. It was known that resilience was going to be a critical part of the Swift ABI story, and a single universal error type is far easier to design resilience for than a type parameter.

These are not arguments from authority. They might just be arguments you weren’t around to witness.

technogen · September 21, 2023, 12:44pm

Douglas_Gregor:

Sajjon:
But what would be amazing, would be some kind of Swift compiler magic, which would allow me to just do:
public func fetchUser(...) async throws(MyError) -> User {
	let data = try await makeNetworkRequest(path: "user/..")
	return try decode(data: data, as: User.self)
}
fetchUser is an API that depends on several different subsystems, which are likely to change over time, and decoding of fairly arbitrary data. This is a prime use case for untyped throws, and I think it's wrong to seek syntactic sugar for it.

I agree, the nature of this particular throwing function (along with many others like this) necessitates handling any possible type of error.

But that doesn't necessarily preclude adding predetermined information to that error.
I can't imagine this being any less correct than simply throwing any Error:

func decodeSomething() throws(NewDecoringError) -> Something {
    <#...#>
}

in this example NewDecodingError is essentially equivalent to DecodingError.Context.

The error is still capable of representing any specific error type, but it also always tell you where in the object graph the error happened.

It's also very easy to modify the decoder to catch the error during every step of the decoding process and wrap it into this error type while adding the coding path.

ksluder · September 21, 2023, 12:50pm

On the contrary, it is this exact error wrapping process that was discussed heavily during the addition of try.

technogen · September 21, 2023, 12:59pm

Answering in order of appearance:

Swift has evolved way beyond what it was years ago. What used to be reasonable to assume back then isn't necessarily so now. The very fact that this thread exists and is being actively discussed by the core team is a testament that to that.
The ABI stabilization has not only brought a stable ABI at that point, but has made ABI stability an ongoing concern that every evolution proposal must address. This is specifically to enable features like this to be possible.
Making swift "a unique language" is a non-goal. The goal is making Swift a practical language that is capable of solving practical problems. This is not a "language of the year" contest.
Making features "far easier" is also a non-goal. It would've been far easier to just keep pure object-oriented approach that Objective-C has provided, add new syntax to it and call it a day. There are specific problems that need to be solved. It's not a matter of "too hard", it's only a matter of "not enough time right now" at best.
For those chunks of code that have to be resilient, they can either revert to using any Error or a technique that I've outlined in my previous post in this thread (e.g. wrapping any Error into a struct with some optional meta-information).
I've started leaning swift right after WWDC 2014 and have been keeping in touch with its development ever since. I'm plenty aware of its history.

technogen · September 21, 2023, 1:05pm

Again, it's not the language's business to dictate how specific API's should be designed. This is not a matter of memory safety or thread safety that can only be deterministically guaranteed by heavy language support and limitation. Even @Douglas_Gregor considered fully specific decoding error type a "bad idea", instead of something like "categorically unacceptable". The programmer should always be assumed to know better what they're doing than the programming language, when the programmer specifically requests the functionality.
any Error is still the default, which is perfect, because at the very least, if the programmer doesn't care or doesn't know - the language will choose the "safest" option.

technogen · September 21, 2023, 1:41pm

The fact that error type polymorphism is essential is beyond doubt. My argument is simply that there should be a choice between static polymorphism and dynamic polymorphism.

Swift has both enums with associated values and protocols. They both exist because choosing between static polymorphism and dynamic polymorphism is a necessity.

Why wouldn't the possibility of that choice be equally valid here as well? It's still very much possible by way of a dedicated union type enum.

michelf · September 21, 2023, 1:42pm

It occurs to me that we could allow typed throw without preventing evolution of the error type. For instance, this declares a typed throw:

func fetch() throws(NetworkError) { ... }

And because of this, fetch can only throw errors of this type. But that doesn't necessarily imply that at the call site catching NetworkError is enough. If we assume the type can change at a later time, we still need a catch-all clause to make sure no other errors have been thrown to handle them:

do {
   try fetch()
} catch let error as NetworkError {
   ...
} catch let error {
   ...
}

This solves the problem of making the ABI around thrown errors more efficient without changing the language model. That said, it'd be nice to know this last catch is a fallback that isn't expected to be triggered, so we could add a mechanism similar to switching on non-frozen enums, with @unknown in the last catch:

do {
   try fetch()
} catch let error as NetworkError {
   ...
} @unknown catch let error {
   ...
}

Here @unknown catch would emit a warning for any declared typed throws not handled by the previous catch clauses. It's a warning though, it won't stop code from building if the typed throw becomes untyped or changes to another type.

In other words, with @unknown catch you can be exhaustive at the call site, but you don't have to (just don't use @unknown).

This could be extended more generally by allowing more types in typed throws, even when it has no ABI or performance benefit:

func fetch() throws(NetworkError, JSONError, *) { ... }
// Here the * denotes that the function is able to throw anything,
// but the two first types are "worth" checking for.

And if you're using @unknown catch the compiler would warn you about any types comming from typed throws unhandled by previous catch clauses:

do {
   try fetch()
} catch let error as NetworkError {
   ...
} @unknown catch let error { // warning: missing catch for JSONError
   ...
}

So the benefits are:

At the ABI level, single-type typed throws can skip the existential box (except at the boundary of resilient libraries because evolution is allowed to change the type).
At the call site, you can use @unknown catch and let the compiler tell you about the error types "worth" checking. There's no pretence those are the only types however, hence the trailing @unknown catch for handling less expected errors.

But if you don't use @unknown catch, you are free to ignore that typed throw is a thing that exists.

Future direction: I suppose we could add "frozen" typed throws, like frozen enums, which could allow exhaustive catching without the need for @unknown catch at the end. That could be a liability however, so I don't know.

technogen · September 21, 2023, 2:12pm

Isn't this what the default behavior is for?
Unless anyone specifically wants to have a concrete error type, they can just write throws and not care. Any specific error type that are thrown out of their body should get automatically promoted to any Error, so even if the dependencies are using this feature, the client code still doesn't have to.

That makes sense, but that's what documentation is for. In a similar way, nothing in the language specifically says "don't use classes unless you have a good reason to". A lot of people (actually, most of the people I worked with, unfortunately) will just go with what they know, instead of using value types instead.

Feels like this is one of those cases where no language design could reasonably compensate for ignorance. We'd have to just communicate the dangers of over-specification of errors in documentation form (the swift book perhaps).

This looks amazing to me! It combines the runtime efficiency of statically-typed errors with the flexibility of untyped errors. With some extra thought, this could become the default. If the function body throws more than one type of error, the type under some would end up being any. This would keep the behavior exactly as it is now, but also open up optimization opportunities.

To reiterate, the choice between typed throws and untyped throws is not black and white. Swift provides both static polymorphism, type erasure, and type composition. All of those a very useful, while leaving the opportunity to still carry an untyped error as well.

Tino · September 21, 2023, 3:29pm

Sure?
At least it is not what I'd expect from union types: (A | A) should be A, whereas Either<A, A> can't be reduced.

technogen · September 21, 2023, 3:38pm

I agree that A | A doesn't make sense and will have to be collapsed. If the two types are known to be equal (e.g. they're concrete types), it should be a compilation warning. When they are not known to be equal, (e.g. in a generic context), the union type would be preserved, but when looking at a specialization of the generic context (where both types happen to be the same), the exact type would be collapsed into just one.

This is a limitation on top of what an enum can do, so if anything, it's reducing type complexity, because a union of two types that are known to be equal would never exist.

The only complexity would be to implement the type collapsing behavior if and when the two types are known to be the same.

A union with more than two types is equivalent to a union of two types where one of those types is also a union, so it's a simple recursive type collapsing algorithm.

Another simplicity comes from the fact that these unions are unordered: A | B is exactly equivalent to B | A.

Joe_Groff · September 21, 2023, 4:16pm

It may seem like a complexity reduction, but unions make type checking harder, since if unions collapse, then every individual type T is also potentially a substitution for T | U | V | ... where the generic parameters are all equal, potentially needing an exponential search to attempt the unification. It's also not something that you can always assume is safe to do, since in a generic context working with T | U the provenance of the T and U may be semantically important even when T == U, but if the distinction is collapsed when the types are the same, then you can't test for it, and the reachability of the else in a construct like if x is T { } else { } would change depending on the generic parameters.

technogen · September 21, 2023, 5:05pm

That's a very good point! Thank you for clarifying!

From what you described (and I agree), it seems like automatically collapsing a union into a single type is not desirable even if there was no exponential type checking problem.

If the distinction between T and U is useful even if T == U, then perhaps being able to refer to the generic type name itself would be a way of discriminating. Perhaps something like this:

    func getUnion<T, U>(_: T.Type, _: U.Type) -> T | U {
        // ...
    }

    func useUnion() {
        /// Simple case
        switch getUnion(String.self, Int.self) {
            case x as String:
                // ...
            case x as Int:
                // ...
        }
        // no need for default, the switch is known to be exhaustive

        // complex case (highly bikesheddable syntax)
        switch<T, U> getUnion(T.self, U.self) where T == String, U == String {
            case x as T:
                // ...
            case x as U:
                // ...
        }
    }

I know I've seen such local generic parameter syntax somewhere in the parameter pack discussion thread, so the concept of declaring local generic parameters for the purpose of type composition is not new. Another place where such an ad-hoc generic type declaration was mentioned was in parametrized extensions.

With this approach, the fact that the two types are the same gets abstracted away behind a local generic type declaration, which allows one to refer to two types that may be the same in local scope, but different in the scope of the callee (which in this case is significant to preserve).

EDIT 1:

With what I've been calling local generic parameter syntax (please, correct me if this technique already has a name), the union type doesn't need to be collapsed at all, which makes it a self-sufficient type, just like enums are. The type resolution becomes identical to that of enums and the whole point of such a union type would be reduced to essentially syntactic sugar for a dedicated variadic generic enum (with variadic number of cases) whose purpose is to facilitate ad-hoc static polymorphism. The fact that trivial decomposition of such a union type without clear type distinction (either by having distinct types or by using local generic parameter syntax) is impossible would not be a problem. Just like one can't refer to a member of an optional without unwrapping it first (using optional chaining, something like if let, or a switch on the optional itself), one also can't decompose a union type without being able to discriminate its component types (by either having them already distinct or introducing a distinction via the aforementioned syntax).

EDIT 2:

Even without this local generic parameter syntax, one can still pass the union type into another generic function that does specify two different generic parameters. It's a workaround, for sure, but it demonstrates that distinction can be introduced even if in a certain context, it might not be there.

func decompose<T, U>(_ union: T | U, onT: (T) -> Void, onU: (U) -> Void) {
    switch union {
        case x as T:
            onT(x)
        case x as U:
            onU(x)
    }
}

In fact, such a function could be made variadic generic and part of the Swift standard library (as a necessity for now, and as a convenience for the future).

EDIT 3:

Such a union would relate to enums in the exact same way as a tuple relates to structs, as in: provide anonymous alternative to an explicitly defined type, which is useful for any situation where definition a whole type for this one use case would be suboptimal. Granted, tuples also serve the purpose of providing bridging with C structs, but unions would also have the potential to solve a similar use case.

toph42 · September 21, 2023, 5:29pm

That doesn’t work, though, does it? I thought some P can’t be an any P because existentials can’t conform to protocols.

technogen · September 21, 2023, 5:30pm

They do for compiler-magic exceptions like Error. It's one of the types (aside from @objc protocols) whose existentials conform to the protocol itself.

dmt · September 21, 2023, 5:35pm

Side question. Why existential errors (and existentials in general) are considered to be unsupported in restricted environments? It seems like when a value of existential type is the only such value in a stack frame and immutable(throw/return operands satisfy this requirements), we can store it on top of the stack. We would need to sacrifice the ability to statically deduce the type size and instead store its size along the value. But with that we could do appropriate sub/add to the stack pointer.

technogen · September 21, 2023, 5:39pm

The existential still has a stable ABI, which dictates that its in-line storage is at most 3 * pointer size. If it ended up allocating the out-of-line storage on the stack (by virtue of forced optimization), then the existential would not be allowed to escape the function, meaning that you'd have to open it somehow before returning it. If that's the case, then using an existential becomes pointless, since you could either pass a statically-typed value as a generic parameter somewhere or return it as an opaque return type.
From stack frame perspective, throwing an error counts as returning a value.

dmt · September 21, 2023, 5:55pm

I'm aware about that, but it's not what I'm talking about. Let's forget about current layout of any Any for a moment.
Suppose we have new kind of structure, whose size is statically unknown, but instead stored within the value itself in the preamble.

|          header           |  value   |
| size | meta | other flags | raw data |

Total size of such object would be header.size + sizeof(header).
We can trivially construct such object from a local variable of known type.
And it seems like we can always return such object from a function by emplacing it on stack at start address of the current frame.

technogen · September 21, 2023, 5:59pm

Okay, but what would be the point? You're essentially reinventing opaque result types. Granted, this version would allow returning more than one concrete type, but it would be paid for with a change to the ABI to enable emplacing values on the caller's stack (recursively, because that value may also be returned further up the stack frame). This would only be useful for achieving dynamic polymorphism in an embedded environment. It's my subjective opinion of course, but it feels like embedded environments are unlikely to benefit from dynamic polymorphism, where static polymorphism can do the trick in most of the cases. Especially if that dynamic polymorphism comes at such a big implementation complexity cost.