[Pitch N+1] Typed Throws

tera · October 2, 2023, 12:10am

Nobody suggested that. Or do you mean that we should apply the same line of reasoning and get rid of rethrow?

dmt · October 2, 2023, 12:32am

Yep. It's like a thought experiment - I proposed it and reasoned about its uselessness. And yes, all the same reasoning can be applied to rethrows. But I should emphasize that I'm not suggesting we get rid of rethrows, because it's already in the language. I propose the following:

Don't support typed rethrows.
Treat rethrows as if it's always untyped: either Never or any Error.
Discourage people from using rethrows.
Rewrite functions in stdlib that use rethrows to use typed throws with exactly this pattern: func f<E: Error>(g: () throws(E) -> Void) throws(E). Don't ever, ever constrain E to have something like a static factory.
Add a special attribute that will instruct the compiler to emit a type-erased trampoline for ABI compatibility.
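A minimal sketch of the pattern from the list above, assuming a compiler with typed throws (Swift 6 or later). The names `forward` and `ParseError` are made up for illustration and do not come from the proposal or the standard library.

```swift
struct ParseError: Error, Equatable {
    let code: Int
}

// The function throws exactly the error type of its argument, nothing else.
func forward<E: Error>(_ body: () throws(E) -> Void) throws(E) {
    try body()
}

// Non-throwing closure: E is inferred as Never, so the call needs no `try`.
var ran = false
forward { ran = true }

// Typed closure: E is inferred as ParseError, and the catch clause sees that type.
var caught: ParseError?
do {
    try forward { () throws(ParseError) in throw ParseError(code: 1) }
} catch {
    caught = error
}
```

At the call site this behaves like rethrows: whether the call can throw depends only on the closure that is passed in.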

stephencelis · October 2, 2023, 1:06am

As an aside I'd like to point out that a very common error that confuses users is when they wrap a @discardableResult-returning function with a continuation like withAnimation. While an explicit _ = will generally work around the error, I've seen lots of folks seek help in order to find this solution, and a rereturns would have avoided this confusion in the first place.

dmt · October 2, 2023, 1:14am

And by doing so this person made it absolutely non-equivalent to rethrows. Maybe this constraint is actually needed, IDK, but the function is not rethrows anymore.
I made a table.

Typed throws	Is it an equivalent for rethrows?
`func f<E: Error>(g: () throws(E) -> Void) throws(E)`	Yes. It's an equivalent for `func f(g: () throws -> Void) rethrows`
`func f<E: ErrorProtocolThatRequiresInitWithNoParameters>(g: () throws(E) -> Void) throws(E)`	No. Why would someone constrain the error type in the very abstract function `f`, and then think it has the same semantics as `rethrows`?

dmt · October 2, 2023, 1:32am

You mean that "result discardability" is not preserved by the closure return type inference?

var s = Set<Int>()
let f = {
  s.insert(42)
}
f() // Result of call to function returning '(inserted: Bool, memberAfterInsert: Int)' is unused

Yep, but honestly I don't think it should, or at least it should be controllable. I would rather have a compiler argument to ignore @discardableResult.
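For reference, a small sketch of the two workarounds available today: discarding the result explicitly at the call, or giving the closure a Void type so there is nothing to discard. The function name `demo` is only for illustration.

```swift
func demo() -> Int {
    var s = Set<Int>()

    // The closure's inferred return type is the tuple returned by insert(_:),
    // so the result has to be discarded explicitly at the call.
    let insert = { s.insert(42) }
    _ = insert()

    // Alternatively, type the closure as returning Void and discard inside it.
    let insertQuietly: () -> Void = { _ = s.insert(7) }
    insertQuietly()

    return s.count
}
```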

xwu · October 2, 2023, 3:44am

No, again, these two are not semantically equivalent.

In the case spelled throws(E), f may throw when g does not throw: for example, f might throw for every invocation after the first one where g throws. In the case of rethrows, this would be impermissible.
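To make the difference concrete, here is a sketch of a throws(E) function that throws on a call where its argument does not throw, assuming a compiler with typed throws (Swift 6 or later). `StickyRunner` and `Failure` are hypothetical names; the cache lives in a class only to keep the example self-contained.

```swift
struct Failure: Error, Equatable {
    let id: Int
}

// Hypothetical helper: remembers the first error and throws it again on later calls.
final class StickyRunner<E: Error> {
    private var cached: E?

    func run(_ body: () throws(E) -> Void) throws(E) {
        // Once an error has been cached, throw it without calling `body`.
        if let cached { throw cached }
        do {
            try body()
        } catch {
            cached = error   // `error` has static type E here
            throw error
        }
    }
}

let runner = StickyRunner<Failure>()
var results: [Failure] = []

// First call: the closure itself throws.
do { try runner.run { () throws(Failure) in throw Failure(id: 7) } } catch { results.append(error) }

// Second call: the closure cannot throw, yet `run` still does.
do { try runner.run { } } catch { results.append(error) }
```

The signature of `run` is satisfied on both calls, which is exactly what a rethrows function is not allowed to do.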

In my view, at least, there's little utility in debating whether the scenarios in which these two are not equivalent constitute only "weird" cases[*]. This is because the distinction between throws and rethrows has been explicitly and publicly documented ever since Swift 2, when the feature was introduced:

rethrows is identical to throws, except that the function promises to only throw if one of its argument functions throws.

Users are absolutely entitled to rely on the documented semantic guarantees of the language without showing up 165 posts into a Swift Evolution thread years later to explain themselves: this isn't the bar for removing or modifying the semantics of a released feature.

The core of what Doug is talking about, unless I'm mistaken, is eliminating a subset of rethrows usage that was formally not permitted but possible to write. Namely, the same document above formalizes what it means to rethrow as follows:

More formally, a function is rethrowing-only for a function f if:

it is a throwing function parameter of f,

it is a non-throwing function, or

it is implemented within f (i.e. it is either f or a function or closure defined therein) and it does not throw except by either:

calling a function that is rethrowing-only for f or

calling a function that is rethrows, passing only functions that are rethrowing-only for f.

It is an error if a rethrows function is not rethrowing-only for itself.

This formulation, notably, does not permit catching an error, wrapping it, and then throwing, because it is not "throw[ing] by either calling a function that is rethrowing-only for f or calling a function that is rethrows, passing only functions that are rethrowing-only for f."

—
[*] Not that I think it would be weird. For example, in analogous fashion, iterators are guaranteed to return nil for every invocation of next() after the last. If, instead, Swift had designed iterator protocols to require next() to throw if past the end of a sequence, then the iterator for sequence(first:next:) would indeed throw for every iteration after the end without necessarily invoking the closure passed as the argument for next.

ksluder · October 2, 2023, 4:39am

Doesn’t this implementation of Task.runInline() violate that formal definition?

  public static func runInline(_ body: () async throws -> Success) rethrows -> Success {
    return try _runInlineHelper(
      body: {
        do {
          let value = try await body()
          return Result.success(value)
        }
        catch let error {
          return Result.failure(error)
        }
      },
      rescue: { try $0.get() }
    )
  }

Specifically, this function can throw if the Result.get() call in the rescue closure throws. Result.get() is throws, but it’s not an argument of runInline.

Of course, we all know that the get() call here is only going to throw if the invocation of body itself threw.

xwu · October 2, 2023, 5:32am

Good point, but this implementation also wouldn't compile in Swift 2 for lack of await and Result: one certainly needs to read that document mutatis mutandis, as the lawyers might say.

wadetregaskis · October 2, 2023, 5:33am

I'm not sure that's the correct interpretation. Or at least, it's not the only viable interpretation. I find that section of that internal compiler documentation very hard to understand in any case, but as best I can decipher it seems to say a rethrowing function can throw only if a function it calls throws - it does not specifically say the thrown error has to be passed up as-is. I interpret it as alluding only to the timing at which an exception can be thrown, not any restriction on what is thrown.

The actual Swift Language Reference describes rethrows much more clearly and explicitly says catching and throwing a different exception is permitted:

A rethrowing function or method can contain a throw statement only inside a catch clause. This lets you call the throwing function inside a do-catch statement and handle errors in the catch clause by throwing a different error.

I think the actual Language Reference is the authoritative definition of intended functionality, because it's what the users of Swift actually learn from and rely on. Even if somehow its definition of rethrows was technically inaccurate at some point, it's now the definition de facto.
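A short sketch of what that passage of the Language Reference permits: a rethrows function that catches the error from its argument and throws a different one. `withContext`, `Original` and `Wrapped` are names invented for this example.

```swift
struct Original: Error {}
struct Wrapped: Error {
    let underlying: any Error
}

func withContext<T>(_ body: () throws -> T) rethrows -> T {
    do {
        return try body()
    } catch {
        // Allowed: a throw statement inside a catch clause of a rethrows function.
        throw Wrapped(underlying: error)
    }
}

// Non-throwing argument: no `try` needed at the call site.
let value = withContext { 42 }

// Throwing argument: the caller sees Wrapped, not Original.
var sawWrapped = false
do {
    _ = try withContext { () throws -> Int in throw Original() }
} catch is Wrapped {
    sawWrapped = true
} catch {
    sawWrapped = false
}
```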

xwu · October 2, 2023, 5:38am

wadetregaskis · October 2, 2023, 5:47am

Also, looking at the Git history, it appears this behaviour was newly permitted as of Swift 3. In Swift 2 (based on the earlier version of that section of documentation) it was explicitly not permitted to throw a different error. In fact the docs were very clear that you could only pass up errors created elsewhere, you could not in any way create your own. You couldn't even use a do-catch block inside a rethrowing function!

A rethrowing function or method can't directly throw any errors of its own, which means it can't contain a throw statement. It can only propagate errors thrown by the throwing function it takes as a parameter. For example, it is not possible to call the throwing function inside a do-catch block.

So yeah, originally it appears that rethrows truly was a "I just pass through errors untouched" marker.

Typed throws alone cannot technically express that either, as discussed earlier in this thread, although it gets closer.

It appears, from that clearly deliberate change, that there was a need for that capability. It'd be interesting to hear from the author - @Alex_Martini - as to whether they recall the reasoning for the change.

Though in any case, probably a more important gauge of its need would be to survey existing code at scale.

tcldr · October 2, 2023, 2:41pm

At risk of derailing the thread, one further benefit of reducing rethrows to syntactic sugar is the precedent it sets for enhancing other effects clauses:

For example, the idea of a reasync clause has already been raised as a counterpart to rethrows, but with the ideas surrounding typed throws it sheds some new light on what could potentially be achieved.

One of the big things that seems to be coming out of this proposal is just how useful having a generic 'handle' for an effect is when creating composable types like Sequence or AsyncSequence.

It crossed my mind that this might be useful for the async clause, too. Beyond whether something is just asynchronous or not, I've found myself occasionally frustrated that I haven't been able to 'thread through' the asynchronous execution context into third party modules.

However, if the async effects clause offered a generic handle, this becomes possible.

func a1()
// is equivalent to
func a2() async(Never)

func b1() async
// is equivalent to
func b2() async(any Actor)

@MainActor func c1() async
// is equivalent to
func c2() async(MainActor)

Jumhyn · October 2, 2023, 2:47pm

To be clear, in my use of the word 'weird' I didn't mean to dismiss any motivation for the "throw the same type even when input function hasn't actually thrown" behavior (I agree with you that some sort of caching behavior like you mention seems pretty reasonable). A better word would likely be 'obscure' instead, construed as applying narrowly to the hoops implementors are required to jump through to achieve that behavior. It is not something that authors of functions using the 'new rethrows' would stumble on by accident—use of this functionality would have to be very deliberate.

In particular, since rethrows would make the error type parameter anonymous, it seems to me like it would be very difficult to achieve an implementation which recovers an error from a source other than the input function. I believe you'd need to have a local generic function to provide a name for the type, something like:

func throwsOnItsOwn(_ f: () throws -> Void) rethrows {
  func lookupError<E: Error>(_: () throws(E) -> Void) -> E? {
    return errorCache[ObjectIdentifier(E.self)] as? E
  }
  
  if let error = lookupError(f) {
    throw error
  }

  try f()
}

It is arguably desirable to support this pattern where we are permitted to throw an error if the input function would have thrown, and IMO this is a sufficiently obscure construction to strongly discourage anyone from using it unless they really want to.

Of course, even if we model rethrows externally as introducing an error type param, we could always maintain additional checks internally which continue to enforce today's guarantees about when, precisely, an error may be thrown.

One other thought that occurred to me:

Douglas_Gregor:

func map<T, E>(_ body: (Element) throws(E) -> T) throws(E) -> [T]
ABI considerations aside, this change is source-compatible and has the added benefit of working nicely with closures that have typed throws.

While this perhaps is strictly true because map was already generic, introducing a generic parameter to a non-generic function is not a source-compatible change (since it will break any clients currently creating an unapplied reference to the function).

func f(_: () throws -> Void) rethrows {}
func g<E: Error>(_: () throws(E) -> Void) throws(E) {}

let h1 = f // currently ok, decays to '(() throws -> Void) throws -> Void'
let h2 = g // error: generic parameter 'E' could not be inferred?

So it seems like we'd need some additional rules here—maybe it would be sufficient to say that if a generic parameter appears only as an argument to throws clauses then we will default it to any Error absent other type information?

tcldr · October 2, 2023, 3:00pm

For me, the bigger issue is that we're not applying the same strict judgement to rethrows that we do to the typed throws equivalent. For example, it's fairly trivial to 'break-out' of the intended semantics of rethrows, too:

struct MyError: Error {}

func f(_ a: () throws -> Void) rethrows {
  func g(_ b: () throws -> Void) throws {
    throw MyError()
  }
  try g(a)
}

In this example, the supplied closure, a, is never called yet MyError is thrown. This clearly breaks the intended semantics of 'f should throw if and only if a throws'.

EDIT: To be fair, this method does trigger a runtime exception when called with a non-throwing function.

dmt · October 2, 2023, 3:07pm

Jumhyn:

introducing a generic parameter to a non-generic function is not a source-compatible change (since it will break any clients currently creating an unapplied reference to the function).
func f(_: () throws -> Void) rethrows {}
func g<E: Error>(_: () throws(E) -> Void) throws(E) {}

let h1 = f // currently ok, decays to '(() throws -> Void) throws -> Void'
let h2 = g // error: generic parameter 'E' could not be inferred?
So it seems like we'd need some additional rules here—maybe it would be sufficient to say that if a generic parameter appears only as an argument to throws clauses then we will default it to any Error absent other type information?

I think explicit is better than implicit, so, IMO, it'd be better to have a distinct function with the same name, but we would need to make sure it will not introduce ambiguity.

func f(g: () -> Any) -> Any {
  g()
}
func f<T>(g: () -> T) -> T {
  g()
}

func t() {
  let f1 = f(g:) // not ambiguous, f1: (() -> Any) -> Any
  let f2 = f(g:) as (() -> Int) -> Int
}

Jumhyn · October 2, 2023, 3:47pm

That the language currently allows this is (IMO) straightforwardly a bug that should be fixed. Were we proposing specifically allowing your example construction in a rethrows function, I think we'd be applying equal or greater scrutiny.

I don't think we'd need to introduce overloads to achieve this—if we did have a rule that generic error parameters default to any Error I'd of course want it to still be possible to disambiguate via as (() throws(MyError) -> Void) throws(MyError) -> Void.
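A sketch of that kind of disambiguation, written against typed throws as it later shipped (Swift 6 or later); I have not checked it against every toolchain, so treat it as illustrative. `MyError` is a placeholder and `g` is the generic function from the example above, given a body so it can be called.

```swift
struct MyError: Error {}

func g<E: Error>(_ body: () throws(E) -> Void) throws(E) {
    try body()
}

// Pick the erased form explicitly: E is bound to `any Error`.
let erased = g as (() throws -> Void) throws -> Void

// Pick a concrete error type explicitly: E is bound to MyError.
let typed = g as (() throws(MyError) -> Void) throws(MyError) -> Void

var erasedCaught = false
do { try erased { throw MyError() } } catch { erasedCaught = error is MyError }

var typedCaught: MyError?
do { try typed { () throws(MyError) in throw MyError() } } catch { typedCaught = error }
```

A defaulting rule would only decide what a bare `let h2 = g` means; the explicit coercions stay available either way.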

Karl · October 2, 2023, 4:08pm

I'm not really sure exactly what is being discussed here -- the removal of rethrows? Or just general observations that a similar concept may be expressible in another way? I read the discussion, but I still can't tell exactly.

If it is the former, I hope it comes with some very, very compelling reasons for breaking source compatibility. Especially since, as has been noted, this "other way" is not an exact 1:1 replacement for the existing rethrows feature. That means there may be libraries which cannot implement that other way without breaking their APIs.

dmt · October 2, 2023, 4:24pm

Yeah yeah, I understand what you're saying. My point is the opposite: if we have a capability to automatically default <E: Error> to any Error, why don't we have a capability to default <T: WhatEverConstraint> to any WhatEverConstraint ? This would be a crutch.
Also, I still hope someday generic closures will be supported...
So, instead of doing that, we can give library authors a way to move toward typed throws while maintaining source compatibility: two functions, one erased and one typed.

Jumhyn · October 2, 2023, 4:24pm

The proposal is not to remove rethrows, but to simplify the conceptual model by recasting it in terms of the thrown error type:

JacksonUtsch · October 2, 2023, 4:30pm

TL;DR

+1 to typed throws. Unions/anonymous sum types would be great. Preferred <Error> syntax.

My idea of error handling

It seems to me that error handling is nearly half of the equation when it comes to computation. You can either have an operation succeed or fail.

There are degrees of implementing error handling.

• Ignore/(exit with) errors
• Forward errors to a logging system
• Handle some errors
• Handle all errors
• Have no possible failure cases. This could be considered a perfect application and is unlikely to happen.

The last few are not always the best option. In writing a script you may just want to exit and log what happened.

Context is quite important when it comes to errors in a number of ways. Application context can be abstracted into two main types: dependencies and executables.

Dependencies should be expansive: they need room to add new error types and new functionality as libraries improve.

Typed errors can be great for executables. You want your app to be user-friendly when failure cases arise, as they do (bad network, lack of permissions, etc.). This is very difficult to catch when using any Error, as I'm sure many are aware.

In CLI executables you want to know instantly why something failed. Details such as tracing help here.

You might model program context like so:

 enum Program {
   // Should not be exhaustive, to allow for updates: newly found errors, additional functionality.
   case dependency(Dependency)
   // Great candidate for exhaustivity. Predictability is good.
   case executable(Executable)

   enum Dependency {
     case api
     case source
   }
   enum Executable {
     case application
     case cli
   }
 }

 // What error types to use..
 switch program {
   case .dependency(.api):
     // use struct type
   case .dependency(.source):
     // use non-exhaustive enumerable type [ex](https://github.com/rust-lang/rfcs/blob/master/text/2008-non-exhaustive.md#enums-1)
     // This keeps the benefit of type inference and users of dependencies can create warnings for newly introduced cases that arise into the `default` node of a `switch`

   case .executable(.application):
     // use enumerable type
   case .executable(.cli):
     // consider an enumerable type for handling errors or a singular concrete error with tracing such as [anyhow](https://github.com/dtolnay/anyhow) if the program is often failable. The latter is quite nice for development on something you could want to fail if anything hiccups.
 }

A more reasonable direction to support this use case would be to introduce a form of anonymous enum (often called a sum type) into the language itself, where the type A | B can be either an A or B. With such a feature in place, one could express the function above as:

I'm glad a union/anon sum type was briefly discussed. Often a function can throw either of two error types, but you don't want to create a new error type just to allow for this. The two approaches are not equivalent, and currently you cannot model something like FileError | MetadataError.
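For comparison, this is roughly what has to be written without an anonymous sum type: a named wrapper enum, plus a do-catch at every call to convert into it. The sketch assumes typed throws (Swift 6 or later); `LoadError`, `readFile`, `parseMetadata` and `load` are hypothetical, and only the names FileError and MetadataError come from the post.

```swift
struct FileError: Error {}
struct MetadataError: Error {}

// Hand-written stand-in for `FileError | MetadataError`.
enum LoadError: Error {
    case file(FileError)
    case metadata(MetadataError)
}

func readFile(_ path: String) throws(FileError) -> String {
    if path.isEmpty { throw FileError() }
    return "version=1"
}

func parseMetadata(_ text: String) throws(MetadataError) -> Int {
    guard let value = text.split(separator: "=").last.flatMap({ Int($0) }) else {
        throw MetadataError()
    }
    return value
}

func load(_ path: String) throws(LoadError) -> Int {
    let text: String
    // Each call needs its own conversion into the wrapper type.
    do { text = try readFile(path) } catch { throw LoadError.file(error) }
    do { return try parseMetadata(text) } catch { throw LoadError.metadata(error) }
}

var outcome = ""
do {
    outcome = "loaded \(try load(""))"
} catch {
    // `error` is LoadError, so the switch is exhaustive without a default.
    switch error {
    case .file: outcome = "file error"
    case .metadata: outcome = "metadata error"
    }
}
```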

Another suggestion uses angle brackets around the thrown type, i.e.,

The angle brackets seem more intuitive to me and follow precedent. +1

Also, not a huge fan of the implicit do-catch converting to any Error. Seems like this should be explicit to avoid unwanted behavior. Understandable if this is due to backwards compatibility.