Precise error typing in Swift

Yes, I agree you don’t need typed Errors for exposing underlying errors. That’s not the point being made.

The discussion is about how, if typed Errors are introduced, you can discourage their use in situations where they’re not needed.

My opinion is that to discourage the use of typed throws in situations where they’re not required, a mechanism needs to be provided for a developer to get as much context as possible about the situation in which a throw occurs. Properly using an underlying error would be one such mechanism; attaching context to the throw in the form of a stack trace generated at the point the error is thrown would be another.

Hopefully, between these things, we are able to give developers enough context about the situation in which an Error is thrown that they don’t feel they need to encode their own context into their Errors via the type system.

2 Likes

I like how an idea I'd never heard before is emerging from this thread: that some precise error hierarchies are really an encoding of a stack trace.

In other words, maybe the need for a precise hierarchy can hide a need for a precise stack trace, as a form of the XY problem:

  • I need typed throws.
  • (long and tiring discussion)
  • Actually I needed to know where an error happens.

Maybe this discussion will provide arguments for grabbing a stack trace whenever an error is caught (which can be logged, for example)? I'm very curious.

11 Likes

I sympathise with that view, but typed throws can answer other questions as well. For example, typed throws document (to some extent) what errors are even possible, not just where they originated after the fact. So a solution that merely provides a way to get the stack trace is only partial.


"Generic" throws (ie. the status quo) are a subset of typed throws. It feels strange that we'd be adding functionality that is far more general but not embrace it and effectively guide users to ignore it and opt for generic throws by default. I wonder if we should instead try to imagine what would be the best way to utilise typed throws, and teach that. We'd then be able to introduce generic throws as a special case to use when the downsides of typed throws (which exist, as John lays out in this thread's top post) outweigh the advantages they bring.

In my view, the biggest downside is that it is inconvenient for error types thrown from deeper layers to propagate upwards: one would typically have to wrap them up and rethrow. This problem mostly stems from the use of concrete types (which aren't really "malleable"). A potential alternative is perhaps something that involves what Gwendal hints at above -- error type hierarchies. That is, a few basic, semantically meaningful protocols that form a hierarchy. We could then encourage people to create one or more protocols for every meaningful bunch of functionality they introduce and type their throws with these protocols instead. Error would then be the ultimate, catch-all error type. (Presumably, this would also alleviate the need for error type "combinators", like errorUnion.)
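To make that concrete, here is a minimal sketch of such a hierarchy in today's Swift (all names are hypothetical); the closing comment shows how it could combine with a typed throws syntax:

protocol DatabaseError: Error {}
protocol TransientDatabaseError: DatabaseError {}   // callers may retry these

struct ConnectionLost: TransientDatabaseError {}
struct ConstraintViolation: DatabaseError {}

func fetchRow(id: Int) throws -> [String: String] {
  throw ConnectionLost()
}

do {
  _ = try fetchRow(id: 1)
} catch let error as TransientDatabaseError {
  print("retryable: \(error)")
} catch let error as DatabaseError {
  print("database failure: \(error)")
} catch {
  print("some other failure: \(error)")   // Error remains the ultimate catch-all
}

// Under the idea above, the signature could instead be typed against the
// protocol, e.g. `func fetchRow(id: Int) throws(DatabaseError)`, so new
// concrete error types can be added without changing the declared thrown type.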

In time, Swift could potentially come up with a standard set of protocols in its standard library. The obvious counter-example for "native" error hierarchies is Java, where exception types in my opinion feel somewhat forced. I think, though, that this is mostly due to single inheritance, a limitation protocols don't share, and I think we could do a lot better if we decide that Swift should come with a basic set of protocols.

Up to now, I've only seen explorations that involve concrete error types, so hopefully protocols might provide a reasonable alternative way forward.

I agree that would be nice, but in practice does that not mean you’ll effectively encode your dependency tree?

If you call a throwing function at the application level that touches a few subsystems, the set of possible errors is going to consist of the base error types of all your subsystems (and their subsystems, and their subsystems).

Is that actually useful?

(I’m fully plus one on typed throws, btw, but I agree with the premise of the original post that their utility is limited to tasks such as local control flow rather than deeply nested hierarchies.)

2 Likes

Thanks for bringing this up, because typed errors would have an impact on APIs if we get to the place that @beccadax pitches (and I hope we do) in [Pre-Pitch] Import access control: a modest proposal, where the default behavior of import would not leak everything and you would need to use public import for anything whose types you want to use as part of the public APIs that you export.

When precise error types aren't encoded as part of the API, you can do something like this:

  • ModuleA.floop calls ModuleB.blip and throws anything that blip throws
  • Otherwise, ModuleB is purely an implementation detail of ModuleA; the latter doesn't use any of the former's types in its APIs, so ModuleA imports ModuleB as implementation-only
  • MainModule imports ModuleA to call floop. It can either:
    • Import ModuleB directly if it wants to handle those errors specifically
    • Ignore ModuleB and treat any of its errors as existentials with generic handling

The benefit of implementation-only imports is that dependency graphs can be pruned, which is very important for build performance of large projects. If MainModule doesn't care about the specific types of errors that might come from ModuleB and just wants to handle them generically, it doesn't have to include ModuleB (and its transitive closure) as a compile-time dependency. Or it can choose to explicitly opt-in to using ModuleB by depending on it and importing it.

If APIs started adopting typed errors everywhere, then those error types would have to become part of the public API of the module. In the example above, if ModuleA.floop was declared as throws(ModuleB.BlipError) then it would have to import ModuleB as a public import instead of an implementation-only import, thereby "infecting" ModuleA with the requirement that ModuleB also be present in the dependency closure passed to the compiler whenever anyone imports it. MainModule would have no choice but to ensure that ModuleB was present, whether it actually used it or not.
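To make the contrast concrete, here is a rough sketch of the two shapes ModuleA could take. The throws(ModuleB.BlipError) and public import spellings are the hypothetical syntax under discussion, and the implementation-only import is shown with today's underscored attribute:

// ModuleA with untyped throws: ModuleB stays an implementation detail.
@_implementationOnly import ModuleB

public func floop() throws {
  try blip()   // whatever blip() throws propagates as an existential Error
}

// ModuleA with typed throws: ModuleB.BlipError becomes part of the public
// API, so the import must become public and ModuleB (plus its transitive
// closure) joins every client's compile-time dependencies.
//
//   public import ModuleB
//
//   public func floop() throws(ModuleB.BlipError) {
//     try blip()
//   }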

This can be subtle, but discouraging people from naïvely leaking implementation details into their API when they're really not necessary is critical if we want to move toward a model where public imports are no longer the default.

So I agree with much of the sentiment that I've seen so far in this thread: typed errors have their place in certain situations, but the default should be to not use them and most users should be strongly discouraged from doing so. I'm not sure what the best way for the language/compiler to do that is, though.

It almost feels like there should be a higher barrier to declaring that you throw errors declared in a different module than errors declared in your own. In a future public vs. implementation-only import world, that barrier could be having to change an import to a public import (assuming the module wasn't already public imported). And maybe that would be enough to force an author to think about that API boundary and its implications.

6 Likes

This is a great summary write-up and must have taken a considerable amount of time to create!

It does a very good job capturing the technical issues and real-world ramifications of typed Errors.

A common pattern is to use an enumeration of potential error cases for a module. At the module level, you will regularly find that additional functionality introduces new error conditions and even new error sources, and over-specifying errors will break ABI.

Use of module-level errors often means you have to choose between exhaustive error handling that must deal with errors which aren't even possible within a given activity, or a different error type for each activity (breaking monadic processes and increasing code).

This is exacerbated by the need to represent sum types through type composition into an enum. "Common errors plus this one special case" requires additional code both to generate and to handle the resulting error representation.

With regard to low-level code, all of these issues are present in POSIXError. Different operating systems have both added new error code values and overridden existing values with additional meanings. The same value may have wildly different interpretations based on which function was called, or even which flags were passed to the function.

While having an exact POSIXError thrown may give efficiency benefits for callers, I'd argue that the resulting error interface binds callers precisely to the ABI of a particular OS and version - and that solving this binding issue means creating a higher-level interface for which POSIXError is no longer appropriate.

1 Like

Yes, knowing where errors originate is nice.

Swift implements errors differently than other languages. It doesn’t unwind the stack but effectively returns the error through a hidden out parameter (something like that, correct?). So the “penalty” or “burden” is much less, and using thrown errors as a control-flow mechanism is just fine, or so I read somewhere long ago.

I’m using errors in that way in a library. I throw a few specific errors, catch them, expand my backend storage, and try again. A typed error seems appropriate here (?) but other errors could occur too (ref. FileHandle).

Now “my” errors could occur in several places. Knowing where they originate would be very useful during debugging. I’m using #fileID and friends as associated enum payloads in my error type. But that just doesn’t feel right: that information is not required for the proper operation of my library, yet without it I would be stepping through a lot of code before hitting the bug.
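Roughly, the pattern looks like this (a simplified sketch with made-up names, not my actual library code):

enum StorageError: Error {
  case capacityExceeded(file: String, line: Int)
}

func reserve(_ bytes: Int, capacity: Int) throws {
  guard bytes <= capacity else {
    // #fileID and #line expand at the throw site, so the payload records
    // where the error originated.
    throw StorageError.capacityExceeded(file: #fileID, line: #line)
  }
}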

On the other hand, though, some of these errors (when caused by faulty user input) might have to be (todo:) logged at runtime. So context (user input, document state, and probably the origin of the error too) would be important to log for post-analysis.

Again, just sharing my experience as an ordinary user.

I’ve always thought of the current approach as equivalent to Result, with the exception that Failure is always the Error existential rather than a specific type conforming to it.

Here’s my position: if you expect an error to be handled by consumers of your API, it needs to be part of the API in some form. I think the most powerful form of this is with protocols: you don’t need to specify what the errors are, necessarily, but you can guarantee they have certain traits. For instance, imagine how much more useful Foundation.RecoverableError would be if you could actually guarantee its use?

Beyond that, I think it would be useful simply to describe the general reason for an error being thrown. This could be done like so:

enum RefreshError: Error {
  case networking(Error)
  case persistence(Error)
  // …
}

Even if a “miscellaneous” case is needed, you’ve still done something very useful: you’ve ruled out the defined categories. I’d argue that anyone using Throws callouts at all could easily benefit from this sort of thing. Even now, if you change Throws in a meaningful way, that should be considered an API change. This would just make the compiler understand it. And with enumeration cases conforming to static method requirements, there’s a lot of potential for handling nested errors in a useful way.
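As a rough sketch of that last point (the protocol and helper names here are hypothetical), an enum case can already satisfy a static protocol requirement, which makes the wrapping generic:

protocol NetworkingFailureConvertible: Error {
  static func networking(_ underlying: Error) -> Self
}

enum SyncError: NetworkingFailureConvertible {
  case networking(Error)   // this case witnesses the static requirement above
  case database(Error)
}

// Any error type that can express "a networking failure" can absorb a
// nested error without the caller knowing the concrete wrapper.
func wrappingNetworkingFailures<T, E: NetworkingFailureConvertible>(
  as _: E.Type,
  _ body: () throws -> T
) throws -> T {
  do {
    return try body()
  } catch {
    throw E.networking(error)
  }
}

// Usage (performRequest is hypothetical):
// let data = try wrappingNetworkingFailures(as: SyncError.self) { try performRequest() }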

Finally, one of the most important considerations in my book: internal methods. In non-public code, especially smaller methods that can only fail in extremely specific ways, precisely defining errors could be a significant boon for both optimization and code readability.

I’d argue a similar degree of precision could be handy for code that never propagates other errors, since there is no risk of exposing implementation details in that scenario. Whether that amount of detail should be embedded in the API contract is up to the programmer, of course.

1 Like

As a rule, I think any errors that are propagated through multiple call sites are unlikely to get handled in any specific way. Precise error typing is useful purely for the immediate call site, and I imagine most outcomes would consist of retrying the call.

But that doesn’t make precise typing useless. HTTP status codes are similarly vague in many cases, but they’re invaluable for determining how to handle failure. You don’t need to handle all of them to benefit.

2 Likes

I’m not too familiar with compiler optimization, but I have to wonder: would there be any value in being able to throw an opaque Error type?

I could see compile-time checks to ensure you aren’t propagating different errors unintentionally being useful in very specific circumstances, at least.

Stepping back from specific use cases, I think there’s a certain elegance in having throwing described similarly to returning.

In Swift, there aren’t actually returning and non-returning functions. There’s no need: we have functions that return Never, functions that return Void (the implicit default), and so on.

I’d like to see throwing treated the same way: rather than having throwing and non-throwing functions, let’s have functions that throw Never (the implicit default when not marked with throws), functions that throw Error (the implicit default when marked with throws), and so on.

That’s a considerable improvement in terms of language complexity, I feel.
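In rough, hypothetical syntax, that might look like:

struct ParseFailure: Error {}

func alwaysSucceeds() throws(Never) -> Int { 42 }        // today's non-throwing function
func anythingGoes() throws(Error) -> Int { 0 }           // today's plain `throws`
func parse(_ s: String) throws(ParseFailure) -> Int {    // a precisely typed throw
  guard let n = Int(s) else { throw ParseFailure() }
  return n
}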

1 Like

Now “my” errors could occur in several places. Knowing where they originate would be very useful during debugging. I’m using #fileID and friends as associated enum payloads in my error type. But that just doesn’t feel right: that information is not required for the proper operation of my library, yet without it I would be stepping through a lot of code before hitting the bug.

Note that with a debugger you can break on the errors to see the origination point.

Capturing a stack trace would be less useful because Swift doesn't implicitly encourage error/exception 'wrapping', and more explicitly doesn't have an 'innerError' that could be used to get at the initial cause in a generalized fashion if errors were wrapped.

Stack traces can be useful in a post-mortem, especially in languages which will attempt to keep going in the face of logic errors. Swift has tended a bit more toward single-user systems and will often fatalError out of the process completely - so you can theoretically do a post-mortem on the core file and see not just the stack trace but the local state at the time of the error.

The fact that crashing the system is the best way to get good failure context seems like an opportunity for improvement. It would be interesting to hear the thoughts of the SSWG.

But even as someone primarily focused on single-user systems, especially with today's software business of converting free users to paid, reviews and ratings, and so on, a crash isn't an ideal way to end a user's session if it can be avoided. And even then, if you catch the errors you expect but crash out of the ones you don't, your stack trace will be useless because it points at the catch block.

It also requires that you're extra diligent about every error type that could occur, even the least likely ones. If you forget that there's a DB validation error buried deep in a module, or a third-party module changes one of their dependencies and introduces a new common error type, now you're crashing. And as mentioned before, that might be why some people want the compiler to check statically for them.

It would be great if we could get this context without a crash. For me, that would mean being able to handle all the common errors I expect, but then log and track the ones I don't, or the ones that appear as a result of changes elsewhere in the system. Then, if it's clear that something needs specific handling, I can deal with it armed with the information I need rather than some vague POSIX error that occurred somewhere after calling this method.

I think solving this context problem is why a lot of people lean on underlying errors or think to wrap errors in their own custom types. In fact, there's a write-up of using underlying errors this way on JustEat's Tech Blog: https://tech.justeattakeaway.com/2018/01/26/effective-ios-error-management/

But if we can create a mechanism that solves this problem better than an underlying error can, then I think the need for wrapping errors evaporates and the community will begin to regard it as an anti-pattern.

6 Likes

fatalError(_:file:line:) should be a last resort, not a common design pattern. If you expect it to happen, you’ve designed your code wrong. This is especially true of code that others use.

If you throw an error, you at least flag that the code might fail. If code should never hit a certain point, use assertionFailure(_:file:line:) and proceed anyway if possible. Skip the nil case, etcetera. The compiler can optimize it out anyway.

3 Likes

Wrapping errors is an effective form of progressive disclosure. I initially thought it was a crutch, but I’ve since realized it’d be the right call in many cases regardless of language limitations.

1 Like

Honestly, I'd be satisfied with this outcome, but my feeling is that this concept was discouraged in Swift. I think wrapping errors makes a lot more sense if it's supported by the standard library. A WrappingError protocol with a single underlyingError: Error? member would serve to standardise the practice and make it simple to create tools that can log across modules. It loses utility if each module has its own way of doing things.
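A minimal sketch of what I have in mind (names are only illustrative):

protocol WrappingError: Error {
  var underlyingError: Error? { get }
}

// Once the practice is standardised, generic tooling becomes possible:
// walk the chain of wrapped errors regardless of which module threw them.
func errorChain(from error: Error) -> [Error] {
  var chain: [Error] = [error]
  var current = error
  while let wrapped = (current as? WrappingError)?.underlyingError {
    chain.append(wrapped)
    current = wrapped
  }
  return chain
}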

Maybe it's all that's needed to prevent the proliferation of precise Error types.

1 Like

That feels redundant. Heck, not every case necessarily has an associated value.

It is never the best way to get local failure context; that’s why we have debuggers and logging.

The proper approach with respect to server-side code is to design for each request to be an effective process, with each individual request being capable of failing due to violated invariants.

Optimizing that by combining requests to be serviced by a single OS process should not compromise that behavior, nor lead to cascading failures of other requests.

There has been talk previously of supporting fatalError-style errors at the actor level rather than a full system level. There has also been discussion I believe of an AppDomain style interface. I don’t know if there has been development work toward either optimization.

fatalError should be reserved for what it was meant for: exiting when violated invariants make the underlying system invalid.

assertionFailure, IMHO, is only useful to trigger a debugger earlier in the cleanup process, when an unexpected error is going to lead to the process exiting or the major activity failing after cleanup. The double-edged sword is that developers may never actually test that cleanup path as a result of the debug assertion.

1 Like

Right. So I think we're all agreed then that using fatalError(_:) as a surrogate for gleaning developer context is heavy-handed. But that brings us back to whether the status quo with regards to debuggers and logging might be leaving developers wanting when it comes to failure context.

During development, the debugger is perfectly sufficient. But in production we must lean on logging.

So, currently, we must hope that the Error we catch provides enough context for us to identify, diagnose and solve any issue that might arise during the execution of our program in a production environment.

My feeling is that many developers will achieve this by wrapping errors, to leave a breadcrumb trail of where in the stack an issue has arisen. It feels like a 'good enough' solution for logging failure context. They might do this by adopting an underlyingError: Error? member in their module's own base error type, or perhaps they might be tempted to create a union type of all their sub-modules' errors to achieve the same thing. And if I understand the original post correctly, that is what we're trying to avoid.

So, if we are to convince developers that they don't need to wrap their errors, how do we give them a breadcrumb trail?

1 Like

I just read the initial post (but only once and quickly) and for now, I'm confused.

My main takeaways are:

  • Precise Error Types are bad for application-level programming
  • Precise Error Types are possible though and should be considered for low-level programming

Before I re-read the long text over and over again to understand it better, can someone maybe explain to me in simple terms why the following readFromFile function from one of my apps shouldn't use a precise error type?

import Foundation

public enum ReadWriteFileError: Error {
  case fileToReadIsDirectory(fileUrl: URL)
  case fileToReadDoesNotExist(fileUrl: URL)
  case foundationReadFromFileError(fileUrl: URL, error: Error)
  case typeDecodingError(error: Error)
}

public struct SomeCodableType: Codable {
  // Assumed here so the example is self-contained; defined elsewhere in the real app.
  private static let decoder = JSONDecoder()

  /// - Throws: ``ReadWriteFileError``
  public static func readFromFile(at url: URL) throws -> SomeCodableType {
    // fileExistsAndIsDirectory(atPath:) is a small FileManager helper from the app.
    let (fileExists, fileIsDirectory) = FileManager.default.fileExistsAndIsDirectory(atPath: url.path)

    guard fileExists else { throw ReadWriteFileError.fileToReadDoesNotExist(fileUrl: url) }
    guard !fileIsDirectory else { throw ReadWriteFileError.fileToReadIsDirectory(fileUrl: url) }

    let fileContentData: Data
    do {
      fileContentData = try Data(contentsOf: url)
    }
    catch {
      throw ReadWriteFileError.foundationReadFromFileError(fileUrl: url, error: error)
    }

    do {
      return try Self.decoder.decode(Self.self, from: fileContentData)
    }
    catch {
      throw ReadWriteFileError.typeDecodingError(error: error)
    }
  }
}

Although my function guarantees that callers can only ever get a ReadWriteFileError, I still have to add a catch-all to make the compiler happy when using it, because this guarantee is only documented, not communicated to the compiler.
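For concreteness, this is roughly what a call site looks like today (fileURL is just a placeholder):

let fileURL = URL(fileURLWithPath: "/tmp/example.json")   // placeholder

do {
  let value = try SomeCodableType.readFromFile(at: fileURL)
  print(value)
} catch let error as ReadWriteFileError {
  // Handle the documented cases.
  print("read/write failure: \(error)")
} catch {
  // Unreachable in practice, but required to make the catch exhaustive.
  assertionFailure("unexpected error: \(error)")
}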

I understand that we should not use precise error types everywhere, but in some places I can imagine they make a lot of sense in high-level code, too ... what am I missing?

2 Likes

Do you intend for this enum to be @frozen or not? If it's not frozen, then users will still need a fall-through for unknown cases. If you want to add @frozen but keep the ability to throw new kinds of errors, you could add a case unknown(Error), which lands users in the exact same boat but with an extra level of error wrapping. If you wanted to leave the enum exactly as-is, it's still the case that thrown Foundation errors and type decoding errors are wrapped up in an extra layer instead of just being thrown directly.
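To make the first option concrete, a sketch (assuming the library is built with library evolution enabled, which @frozen is meant for):

import Foundation

@frozen public enum ReadWriteFileError: Error {
  case fileToReadIsDirectory(fileUrl: URL)
  case fileToReadDoesNotExist(fileUrl: URL)
  case foundationReadFromFileError(fileUrl: URL, error: Error)
  case typeDecodingError(error: Error)
  // Escape hatch for future kinds of failure; clients still end up writing
  // a "none of the above" branch, just against this case.
  case unknown(Error)
}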

I'm not saying it's bad to throw a precise error from a file read function, but for libraries with any concern of stability (source or binary), it can be non-intuitively foot-gunny.

3 Likes