Structural Sum Types (used to be Anonymous Union Types)

The language is ultimately not allowed to undermine the programmer by second-guessing their decisions. That's why, despite the fact that Swift is a type-safe and memory-safe language, it still offers UnsafePointer and unsafeBitCast.

In this case, any Error is unacceptable because it relies on heap allocation, which rules it out for embedded systems and performance-critical code (like audio processing).

There are cases where exhaustive error checking is absolutely necessary and no error can possibly occur other than the ones that are known.

A limited-error function may catch and re-throw errors from multiple other limited-error functions. In such a case, the only reasonable solution is to throw an anonymous sum type containing both error types. Otherwise, a dedicated single-use enum would need to be defined for every such function. This is the exact same reason why we have tuples and don't force the programmer to define a struct every time a function needs to return multiple values.
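To make the "dedicated single-use enum" cost concrete, here is a minimal sketch of what this looks like today without anonymous sum types. It assumes a Swift 6 toolchain with typed throws; ParseError, RangeError, LoadError, parse, validate and load are all hypothetical names invented for the illustration.

```swift
// Hypothetical error types and functions, for illustration only.
struct ParseError: Error { let text: String }
struct RangeError: Error { let value: Int }

// Each step declares the single error type it can throw (typed throws).
func parse(_ text: String) throws(ParseError) -> Int {
    guard let value = Int(text) else { throw ParseError(text: text) }
    return value
}

func validate(_ value: Int) throws(RangeError) -> Int {
    guard (0...100).contains(value) else { throw RangeError(value: value) }
    return value
}

// The single-use enum that exists only so `load` can report both failures.
enum LoadError: Error {
    case parse(ParseError)
    case range(RangeError)
}

func load(_ text: String) throws(LoadError) -> Int {
    let value: Int
    do {
        value = try parse(text)
    } catch {
        // `error` is statically a ParseError here, so no downcast is needed.
        throw LoadError.parse(error)
    }
    do {
        return try validate(value)
    } catch {
        // `error` is statically a RangeError here.
        throw LoadError.range(error)
    }
}
```

With an anonymous sum type, LoadError would not need to be declared at all; the signature of load would name the two error types directly.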

EDIT:

Given the ability to define limited-error functions, Swift's error handling mechanism can be used as a form of control flow that isolates the happy path from the variety of unhappy paths, making the code a lot more ergonomic and maintainable in light of the many possible (yet perfectly predictable) outcomes of the operation.

Swift's error handling system is NOT built like typical exception handling (as in, errors can never unwind the stack or be accidentally ignored, if you don't count throwing main). This means that errors are a legitimate control flow mechanism that can and should be used to the fullest in order to express how an operation can end aside from the happy path. There's no good reason why this control flow should be somehow less type-safe than any other part of Swift. Type-safe means type-safe. Either the language is type-safe indiscriminately, or the language cannot be claimed to be type-safe.

I feel like we may be straying back to the original thread’s purpose here, but I also don’t see a “type-unsafe Swift” argument here. Type-erased any Error existentials are still type-safe. They don’t let you put anything “in the box” that is a non-Error so I feel like you’re misusing “type-safe” here. (I am not a Computer Scientist, so perhaps there’s a strict definition for which your assertion on type safety is sound, but I’m approaching it from the layman’s perspective.)

Right. I think what @technogen means is static typing - deterministic types at compile time rather than at runtime. Or at least that's closer to the right terms.

To the point, it's not just about efficiency (existentials vs concrete types), it's about the coding experience (e.g. knowing what type(s) you're actually dealing with up front, exhaustive case checking, etc). Runtime type-safety doesn't provide those things (even though it's much better than no type safety, of course).

It errors at compile time if you try to put a non-error in an any Error var:

var err: any Error
err = "error"
// Error: Cannot assign value of type 'String' to type 'any Error'

I don't think it's important enough to warrant a special syntax like (Int | String), but I do think it would be worth adding something like this to the standard library, with a type named something like Either<each Option>. I have encountered plenty of situations where having a type like this would be useful, especially while using the new parameter packs feature.
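For reference, a fixed two-option version of such a type can already be written as an ordinary generic enum. This is only a sketch of the idea, not the variadic Either<each Option> suggested above; the case names and the describe function are made up for the example.

```swift
// A hypothetical two-option sum type; not part of the standard library.
enum Either<First, Second> {
    case first(First)
    case second(Second)
}

// The switch is checked for exhaustiveness at compile time,
// so adding a third case would force this function to be updated.
func describe(_ value: Either<Int, String>) -> String {
    switch value {
    case .first(let number):
        return "number \(number)"
    case .second(let text):
        return "text \(text)"
    }
}
```

The obvious limitation is arity: three options need either nesting (Either<A, Either<B, C>>) or a separate three-case enum, which is the gap a variadic or structural form would close.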

Yes, which is better than nothing (e.g. Python or JavaScript), but still far from ideal in some cases.

See the origin thread for details, but in short, if Swift only had "Any" (a bit like how Objective-C, by convention, used id a lot) we'd all be pretty sad, or at least our users would be because of our much buggier apps. It's a bit weird that Swift goes to great lengths to have really good type inference and static typing, and actively discourages unnecessary use of existentials, for everything except exceptions.

That protocol has exactly zero user-facing requirements (technically, there are error domain and error code requirements for Darwin platforms, but they're hidden, so for all intents and purposes they don't exist).

Anything that can have a user-provided protocol conformance can be turned into an error with a single line of code and no forethought whatsoever.

For this reason, the error versus non-error distinction is practically nonexistent.
The Error protocol is essentially a marker protocol.

Exactly! To me, the term "type-safe" means (among other things) having information about what can safely be assumed to definitely be there, and what can safely be assumed to definitely NOT be there.

In this case, static type information in the form of an exact set of error types fits the bill, while an absolutely arbitrary and completely un-actionable value wrapped in a marker protocol existential certainly does not.

If one simply has an instance of any Error, there's absolutely nothing they can do other than print it out and then cancel whatever they're doing. By definition, it's any error. Anything other than simple log-printing and triggering some sort of cancellation would require them to downcast it, which immediately invalidates the whole narrative about any Error being useful. The only reason why any Error is useful at all is to be able to execute an arbitrary throwing operation and rethrow its error without stopping to think why that error happened. At the very least, it's a lot better to do something like this instead:

enum MyError: Error {
    case firstPartFailure(any Error)
    case secondPartFailure(any Error)
    // ...
}

They're still retaining the exact error that occurred without necessarily having to care about what the error was, but they're at least no longer mindlessly rethrowing an unknown error, because they can't be bothered to actually think about proper error propagation. But even then, I'd prefer this instead:

enum MyError<FirstPartFailure, SecondPartFailure>: Error where
    FirstPartFailure: Error,
    SecondPartFailure: Error
{
    case firstPartFailure(FirstPartFailure)
    case secondPartFailure(SecondPartFailure)
    // ...
}

Because again, they have no business arbitrarily losing type information just because they're too lazy to actually think things through.
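A rough sketch of what the preserved type information buys at the handling site, compared with the downcast that any Error forces. The concrete failure types (NetworkFailure, DecodingFailure), the FetchError name and both recoveryHint functions are hypothetical and exist only for this comparison.

```swift
// Hypothetical concrete failures for the two parts of an operation.
struct NetworkFailure: Error { let code: Int }
struct DecodingFailure: Error { let field: String }

// Same shape as the generic enum above, under a different name.
enum FetchError<FirstPartFailure, SecondPartFailure>: Error
where FirstPartFailure: Error, SecondPartFailure: Error {
    case firstPartFailure(FirstPartFailure)
    case secondPartFailure(SecondPartFailure)
}

// Statically typed: the compiler verifies that every case is handled.
func recoveryHint(for error: FetchError<NetworkFailure, DecodingFailure>) -> String {
    switch error {
    case .firstPartFailure(let failure):
        return "retry (code \(failure.code))"
    case .secondPartFailure(let failure):
        return "report bad field \(failure.field)"
    }
}

// Type-erased: every interesting branch is a runtime downcast, and a
// catch-all is unavoidable because the set of possible types is open.
func recoveryHint(forErased error: any Error) -> String {
    if let failure = error as? NetworkFailure {
        return "retry (code \(failure.code))"
    }
    if let failure = error as? DecodingFailure {
        return "report bad field \(failure.field)"
    }
    return "unknown error"
}
```

In the typed version there is no "unknown error" branch to write, and forgetting a case is a compile-time error rather than a silent fall-through.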

Exactly! One of the biggest wins in Swift, coming from Objective-C, was static type information when it was needed. That is everywhere except for errors. I don't see any good reason why this would be an exception to the rule "have as much static type information as you want, but be able to lose it if you don't want it".

Thank you so much for weighing in! What you said is pretty much exactly what I was going for: language design encouraging lazy error handling, just like you said.

Sorry for the sarcasm, @Douglas_Gregor. I got frustrated by being misunderstood like this and dismissed as an "angry troll" while trying to solve a problem that I (and I'm sure, many others) have. I hope arguments in these discussions will be taken rationally rather than emotionally, in order to keep these forums productive.

my two cents here is that i can't remember many instances where i promoted an enum union to a protocol and regretted it afterwards. "exhaustiveness checking" has a funny way of becoming irrelevant once you've figured out what the various enum cases actually have in common.

the pain point for me personally is that the “protocol unions” are rarely ever important enough to justify introducing a top-level protocol for them; enum unions on the other hand have the benefit of being nestable. so for purely lexical reasons, i find that i still have to stick with the awkward old enums.

i imagine being able to nest protocols in namespaces will go a long way towards mitigating this issue.

That does make sense, but it still doesn't solve the performance problem. We need an option that involves zero heap allocations. The exhaustiveness checking is not just for guaranteeing that you'll never have an unexpected type, but also contains enough information to determine the static size of the value. For the purpose of error handling, my use case is the necessity to gracefully handle all possible errors, where rethrowing them or ignoring them is not an option. Solving the exhaustiveness problem would also help solve the performance problem.

are static enums actually a performance win though? an enum’s stride is driven by its largest payload, so if you have a single large metadata struct, that just adds padding to all the other cases.
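The padding effect can be inspected with MemoryLayout. The sketch below uses made-up types (Small, Large, Outcome) and deliberately prints the numbers instead of quoting them, since exact sizes depend on the platform and compiler version.

```swift
// Hypothetical payloads: one tiny, one comparatively large.
struct Small { var code: UInt8 }
struct Large { var a, b, c, d: UInt64 }

// Every value of Outcome reserves room for the largest payload,
// even when it currently holds the small one.
enum Outcome {
    case small(Small)
    case large(Large)
}

print("Small   size/stride:", MemoryLayout<Small>.size, MemoryLayout<Small>.stride)
print("Large   size/stride:", MemoryLayout<Large>.size, MemoryLayout<Large>.stride)
print("Outcome size/stride:", MemoryLayout<Outcome>.size, MemoryLayout<Outcome>.stride)

// For comparison: the existential itself is small, because the value
// it refers to is stored elsewhere.
print("any Error size:", MemoryLayout<any Error>.size)
```

An array of Outcome pays the large stride for every element, while the enum value itself stays inline with no separate allocation.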

existentials and generics are not great for small POD types like Point2 or whatever. but i've found that they are excellent for abstracting over more complex data, especially if you can constrain them to AnyObject.

That's true. But with static polymorphism, the performance is reliable and if your data structures are not intended to be big (e.g. statically-typed errors), then the padding wouldn't be too big. For embedded systems, you'd probably make heavy use of static storage in the executable, which would make the padding largely irrelevant.

With existentials you do get to save some memory by avoiding the padding, but you pay for it by dynamic allocation. In this context, it comes down to low time complexity versus low space complexity. Both options have to be there, because both options are critical in different cases.

For instance, in audio processing, memory is plentiful, but responsiveness is critical, so you'd easily be willing to pay for the speed with some potentially big padding.

On the other hand, for bare-metal microprocessor code that works with extremely limited resources, you may need to squeeze every bit out of the available memory, so you wouldn't be able to afford to use such polymorphism much anyway.

I don’t think that’s quite correct…? Both some P and Foo<P> provide polymorphism that is resolved at compile time.

only with respect to members of P, requirements of P will still dispatch to the conforming type’s witness at run-time, unless the compiler can specialize it to a known type.

Right, my point is that in many cases, the compiler can specialize it to a known type. For starters, AIUI (others please correct me if I’m wrong), some P always specializes to a statically known type; that’s what some means. And I assume that means calls to requirements of P on a value declared as some P do not require a witness table dispatch under normal circumstances.

And isn’t it true that when a function f has a type parameter P of protocol type, the Swift compiler can at its discretion emit a type-erased version of f that uses witness table dispatch and / or a specialized version of f for specific concrete types implementing P?

At the risk of getting lost in the details, I think (again AIUI) it is not correct to say that protocols in Swift only provide dynamic polymorphism. Or perhaps I misunderstood the OP.

this is only true if all your code lives in a single module. in larger projects, failure to specialize is a common occurrence and a major pain point, and is something that takes a significant amount of planning and effort to prevent.

That’s new information to me. I’ve paraphrased some P as, “It’s a specific type that conforms to P, but don’t worry about knowing which one. You don’t need to know, but the compiler knows.” So that’s not really true? There are instances where the compiler doesn’t know what concrete type your opaque type is? Does that code still compile?

some P as a parameter type is exactly the same as func foo<T: P>(_ value: T), with all of the cross-module issues that implies. And while some P as a return type acts the way you describe, I'm not sure that it's completely transparent to the compiler across module boundaries, since the underlying type is allowed to change in later versions of the code.

e.g.

// version 1.0
func foo() -> some P { return A() }
// version 1.1
func foo() -> some P { return B() }

is allowed.
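For the parameter-position case, the three spellings side by side, as a small sketch. Shape, Square and the area functions are hypothetical names; whether the generic forms actually get specialized is up to the optimizer and, as noted above, is not guaranteed across module boundaries.

```swift
// Hypothetical protocol and conforming type, for illustration only.
protocol Shape {
    func area() -> Double
}

struct Square: Shape {
    var side: Double
    func area() -> Double { side * side }
}

// Opaque parameter: shorthand for the generic function below.
func areaOpaque(_ shape: some Shape) -> Double {
    shape.area()
}

// Explicit generic parameter: same semantics as `some Shape` above.
// The compiler may emit a specialized copy for Square when it can see
// both the caller and this body.
func areaGeneric<T: Shape>(_ shape: T) -> Double {
    shape.area()
}

// Existential parameter: the concrete type is erased at the call site,
// so the call to area() goes through the protocol witness table.
func areaExistential(_ shape: any Shape) -> Double {
    shape.area()
}

let square = Square(side: 3)
print(areaOpaque(square), areaGeneric(square), areaExistential(square))
```

All three calls produce the same value; the difference is only in how much the compiler is allowed to know about the argument's type.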

I apologize as well for overreacting; I'm going to remove my reply, and let's get back to language design.

Doug
