Pitch: Fix rethrows checking and add rethrows(unsafe)

Slava_Pestov · February 17, 2021, 8:22pm

Fixing `rethrows` checking and introducing `rethrows(unsafe)`

Proposal: SE-NNNN
Authors: Slava Pestov
Review Manager: TBD
Status: Prototype implemented
Implementation: apple/swift#36007
Bugs: SR-680

During the review process, add the following fields as needed:

Decision Notes: Rationale, Additional Commentary
Previous Revision: 1
Previous Proposal: SE-XXXX

Introduction

This proposal fixes a soundness hole in rethrows checking, and introduces a new rethrows(unsafe) escape hatch for situations where a function has the correct behavior at runtime but the compiler is unable to prove that this is the case.

Swift-evolution thread: Discussion thread topic for that proposal

Motivation

Swift allows you to write a higher-order function which is known to only throw an error if one if the provided function-typed arguments throws an error. This is indicated by writing the rethrows keyword in place of throws before the return value:

extension Optional {
  func map<Result>(_ fn: (Wrapped) throws -> Result) rethrows -> Result {
    guard .some(let wrapped) = self else { return nil }
    return .some(try fn(wrapped))
  }
}

let x: Int? = 123
let y = x.map { $0 * 2 } // no 'try' since nothing can be thrown
let z = try x.map { someThrowingFunc($0) } // 'try' is needed

Today it is also possible to defeat the compiler's rethrows checking
and write a function which always throws even when the passed-in closure
does not throw:

enum MyError : Error {
  case bad
}
func rethrowsViaClosure(_ fn: () throws -> ()) rethrows {
  try fn()
}
func invalidRethrows(_ fn: () throws -> ()) rethrows {
  try rethrowsViaClosure { throw MyError.bad }
}

invalidRethrows() // no 'try', but crashes at runtime

This was originally due to a bug in the implementation, but some projects have come to rely on it to implement functions that still only throw if the passed-in closure throws, but the compiler cannot
prove this.

For example, the implementation of DispatchQueue.sync() in swift-corelibs-libdispatch looks like this:

private func _syncHelper<T>(
    fn: (() -> ()) -> (),
    execute work: () throws -> T,
    rescue: ((Swift.Error) throws -> (T))) rethrows -> T {
  var result: T?
  var error: Swift.Error?
  withoutActuallyEscaping(work) { _work in
    fn {
      do {
        result = try _work()
      } catch let e {
        error = e
      }
    }
  }
  if let e = error {
    return try rescue(e)
  } else {
    return result!
  }
}

public func sync<T>(execute work: () throws -> T) rethrows -> T {
  return try self._syncHelper(fn: sync, execute: work, rescue: { throw $0 })
}

The _syncHelper() function catches any errors thrown by the work parameter inside the inner closure passed to fn, then rethrows the error, if there was one, from outside of the closure by passing it to the rescue closure.

With the proposed language, the compiler now rejects the definition of sync(), since it calls _syncHelper(), which rethrows, with a rescue closure that unconditionally throws.

These situations are rare but do come up, so to handle them, we propose introducing a new rethrows(unsafe) variant of the rethrows keyword:

public func sync<T>(execute work: () throws -> T) rethrows(unsafe) -> T {
  return try self._syncHelper(fn: sync, execute: work, rescue: { throw $0 })
}

From the perspective of the caller, sync() behaves like any other rethrows function; the unsafe aspect is an implementation detail.

Detailed design

Today, a rethrows function must obey the following restrictions:

It cannot throw an error directly.
It can call any one of its throwing closure arguments, with the try keyword required as usual.
It can call any other rethrows function.

The soundness issue is with rule 3, because we do not impose restrictions on what arguments may be passed in to the other rethrows function.

This proposal leaves rules 1 and 2 unchanged, but introduces a new rule 3:

It cannot throw an error directly.
It can call any one of its throwing closure arguments, with the try keyword required as usual.
It can call any other rethrows function, as long as the closures that are passed in to the callee are either among the function's original throwing closure arguments, or another closure that in turn obeys the restrictions of rule 1, 2 and 3.

If the function is declared with the rethrows(unsafe) keyword in place of rethrows, the rules are not enforced, and it is up to the programmer to ensure that the function only throws if one if its original closure arguments throws.

Source compatibility

This proposal breaks source compatibility with any code that previously relied on the soundness hole, either intentionally or unintentionally.

So far, there are three known examples of soundness violations:

The implementations of DispatchQueue.sync() in swift-corelibs-libdispatch and the Darwin Dispatch overlay.
The implementations of IndexSet.filteredIndexSet() in swift-corelibs-foundation and the Darwin Foundation overlay.
The implementation of DatabaseQeueue.read() in the GRDB.swift open source project from Swift's source compatibility suite.

All three can be addressed with the appropriate use of rethrows(unsafe).

Effect on ABI stability

The proposed change to rethrows checking does not change the ABI of existing code. Changing an existing rethrows function to use rethrows(unsafe), or vice versa, is an ABI-preserving change.

Effect on API resilience

The proposed change to rethrows checking does not change the API of existing code. Changing an existing rethrows function to use rethrows(unsafe), or vice versa, is an API-preserving change.

Alternatives considered

We could leave the soundness hole unfixed, but this is suboptimal since users have hit it on accident and reported it as a bug.

We could downgrade violations of the new proposed rules to a warning, preserving source compatibility with code that would otherwise have to use rethrows(unsafe). However, having to change code to use this keyword should be rare in practice, and a warning for a behavior that can crash at runtime will confuse developers.

We could use a separate attribute for rethrows(unsafe) instead of new syntax, for example something like this:

@rethrowsUnsafe func rethrowsUnsafely(_: () throws -> ()) rethrows {}

zwaldowski · February 17, 2021, 9:12pm

I understand where unsafe is coming from in terms of fitting in with the rest of the language and enforcing rules for correctness, but I’d like to dig further in to the notion that these use cases are “unsafe”.

It seems like the sources you link to are trying to use rethrows in a way that is semantically correct for the API consumer, but are hitting a call boundary that doesn't handle Swift error propagation as an implementation detail.

Rather than opting out of the correctness checking, is there a more precise use-case that should be handled here?

I’d be absolutely OK with your pitch becoming syntax as-is, but it would be nice to not add an escape hatch if we didn’t have to.

Chris_Lattner3 · February 17, 2021, 9:13pm

Do we need the equivalent for reasync now that async/await has landed?

Slava_Pestov · February 17, 2021, 9:15pm

You're right that the 'unsafeness' here is an implementation detail; as far as the caller is concerned we just have an ordinary 'rethrows' function. So it's not like 'UnsafePointer' for example.

However I want to push back on the "implementation detail" part. Precisely specifying the rethrows analysis is an important part of the language design and not just the implementation.

Rather than opting out of the correctness checking, is there a more precise use-case that should be handled here?

In all the three examples I mentioned, we're catching an error in an inner closure, then checking for that error at the top-level of the function, in which case we rethrow it. I don't know how to sufficiently formalize the required analysis to handle this, though.

Slava_Pestov · February 17, 2021, 9:18pm

I'm currently working on reasync, which is how I discovered this bug, then found it was a very old known issue

If you're asking specifically whether we need reasync(unsafe), I'm not sure that's possible since reasync is a bit different than rethrows. With rethrows, we use the throwing ABI, and we can call a throwing function from a non-throwing function by ignoring the error result.

With async/reasync the situation is a bit different. There's no efficient way to call an async function with a "fake" async context from a sync function, so reasync will actually be implemented with a combination of inlining and emitting a second "sync-only" entry point for your reasync function.

A reasync(unsafe) function would mean you have a suspension point in the body that the compiler cannot prove is unreachable, so it's not clear how this would fit in with our current implementation strategy for reasync.

zwaldowski · February 17, 2021, 9:35pm

Right. I didn't mean to imply that a fix is a magic new layer of analysis (i.e., a magically-defined additional scope of work for you ).

We have withoutActuallyEscaping(_:), and the async continuations proposal has resume(with result: Result<T, E>). Is there room to represent the exact escape hatch needed for Foundation, Dispatch, Core Data, GRDB to use rethrows without leaving correctness holes in rethrows?

Slava_Pestov · February 17, 2021, 9:42pm

I guess one other alternative I could list in the proposal is to put the escape hatch at the throws site, not on the function declaration. eg something like this:

func doSomethingUnsafely(fn: () throws -> ()) rethrows {
  var e: Error!

  do {
    try fn()
  } catch {
    e = error
  }

  if let e = e {
    throw(unsafe) e
  }
}

Slava_Pestov · February 17, 2021, 9:50pm

It's possible we can handle the escape hatch with some kind of withForwardedThrows builtin:

try withForwardedThrows { handler in
  someFunctionTakingNonThrowingClosure {
    do {
      try fn()
    } catch {
      handler(error)
    }
  }
}

The idea is roughly that if you call handler() at least once from inside the body, the last error you pass in is thrown after we return from withForwardedThrows(). Otherwise we return normally. withForwardedThrows() would itself be rethrows, and the implementation would be a compiler-internal detail.

xAlien95 · February 17, 2021, 9:57pm

Alternatively what about using rethrow instead of throw(unsafe) at the throwing site?

func doSomethingUnsafely(fn: () throws -> ()) rethrows {
  var e: Error!

  do {
    try fn()
  } catch {
    e = error
  }

  if let e = e {
    rethrow e
  }
}

It is simpler to reason about when reading the code, since its goal is to rethrow an error, but it lacks the unsafe bit.

Slava_Pestov · February 17, 2021, 10:03pm

My current thinking is to actually wrap the entire idiom in a new builtin operator, analogous to withoutActuallyEscaping. Basically we want something like this:

/// Calls `fn` with a closure that evaluates `inner`, and saves the
/// result. Then after `fn` returns, forces the result.
///
/// \param fn A function taking a no-parameter, no-result closure as
///           an argument. This function must call its argument at
///           least once.
/// \param inner A function that can throw or return a 'T'.
///
/// \returns Whatever `inner` returned, or throws an error if `inner`
///          throws an error.
///
/// \fatal A fatal trap occurs if `fn` does not invoke its argument
///        at least once.
func withRethrownError<T>(_ fn: (() -> ()) -> (),
                          _ inner: () throws -> T) rethrows -> T {
  var result: Result<T, Error>!

  fn {
    result = Result(catching: inner)
  }

  return try result.get()
}

The idea is that withRethrownError is rethrows(unsafe), but it would be the only place where you'd need rethrows(unsafe) (making it an implementation detail of the compiler; either withRethrownError would be a builtin intrinsic, or it would use an underscored attribute specially for this purpose).

hooman · February 17, 2021, 10:05pm

I prefer withForwardedThrows as it a bit more self-documenting.

Edit: Clarified which one I prefer.

Nevin · February 17, 2021, 10:08pm

At a high-level, I’d like to see a solution that lets _syncHelper completely eliminate the rescue parameter.

• • •

At a nuts-and-bolts level, it is not clear to me that the proposed solution actually addresses the use-case of _syncHelper:

In the example:

The call to _syncHelper does not satisfy the proposed condition 3, because the argument passed as rescue does not meet condition 1.

• • •

I think the “simplest” solution is to say that rethrows(unsafe) should actually lift rule 1, allowing errors to be thrown directly, and placing the burden on the programmer to ensure that any such thrown error is actually a rethrown error, with undefined behavior resulting if it is not.

• • •

A “nicer” solution would involve some sort of purpose-built escape-hatch, so if we can get that working it’s probably the way to go.

Slava_Pestov · February 17, 2021, 10:12pm

That is correct. The call to _syncHelper inside sync type checks today, but would fail to type check with my proposed implemented. That's exactly why we need the escape hatch in the first place. If idioms like the _syncHelper thing weren't used, this proposal would simply be a bug fix.

That's exactly my original proposal.

Yes, my current thinking is to come up with a purpose-built escape hatch; see Pitch: Fix rethrows checking and add rethrows(unsafe) - #10 by Slava_Pestov

xwu · February 17, 2021, 10:31pm

I like the escape hatch as pitched (although I mildly favor the original terminology choice of "unchecked" rather than "unsafe", so long as there's not a memory safety issue I'm not seeing).

I agree with your argument that the unchecked nature of rethrowing is a part of the design of a function to an extent and not just the implementation, and for that reason (and for the reason that it's supposed to be an escape hatch for users currently using rethrows), I think the originally pitched augmentation of rethrows is superior to a with... { } call at the throwing site, which involves more ceremony and buries the issue into the implementation.

Slava_Pestov · February 17, 2021, 10:33pm

My proposal makes rethrows safe, as far as I know. There's no way to throw an error unless it came from one of your closure arguments once we amend the rethrows checking rules as pitched.

Well, we could also have both; a rethrows(unsafe) syntax allowing users that currently rely on the bug to fix their code with a small change, and a new standard library function, if we feel the latter is sufficiently useful. The benefit of using the standard library function is that it is safe -- unlike direct use of rethrows(unsafe), you can't write code that actually throws an error when it should not.

jrose · February 17, 2021, 10:50pm

There is a (theoretical) memory safety issue: if the function throws, the return value is uninitialized. If the caller didn't expect it to throw, it'll read uninitialized memory. I don't remember if this is how any of our ABIs work in practice, though. (Note that this is different from the conventions for NSError, where "there was an error" is encoded in the return value.)

Slava_Pestov · February 17, 2021, 10:53pm

The example in my commit message segfaults:

enum MyError : Error {
  case bad
}

func rethrowsViaClosure(_ fn: () throws -> ()) rethrows {
  try fn()
}

func invalidRethrows(_ fn: () throws -> ()) rethrows {
  try rethrowsViaClosure { throw MyError.bad }
}

invalidRethrows {}

Also even if the uninitialized return value is not an issue, throwing an error that is not expected to be caught ends up leaking the Error existential value, does it not?

jrose · February 17, 2021, 11:12pm

I don't think we've been considering leaks to be "unsafe"; after all, the user can make leaks all the time themselves. (Even Rust doesn't consider leaks to be "unsafe".)

I'm very happy to see this fixed and to get a reasonable proposal fixing it. I prefer throw(unsafe) error (or unsafe throw error) over rethrows(unsafe) because (a) the unsafety isn't really part of the function signature, as mentioned above, and (b) you'd still get rethrows checking elsewhere in the function, which might still be useful. It would be nice™ if there was a way to get only-run-time-checked rethrows (as opposed to totally unsafe), but anything I try to think of there rapidly becomes complicated to implement and complicated to use, which is probably overkill for this fairly rare case.

Slava_Pestov · February 17, 2021, 11:18pm

One shortcoming of that is that it doesn't allow you express the case where the unchecked-throw comes from another throwing call via try, and not a direct throw. We could also have a try(unsafe) or something, but another approach is to add a special form:

func withUnsafeRethrows<T>(_ fn: () throws -> T) rethrows {
  try fn()
}

The idea is that a call to withUnsafeRethrows with any closure would be considered a rethrows call site, even if the closure unconditionally throws. You would use it like this:

withUnsafeRethrows { throw MyError }

or

return withUnsafeRethrows { try result.get() }

or whatever.

David_Ungar2 · February 18, 2021, 12:32am

The unsoundness in the status quo is a great catch! Without diving in, I'll cast a vote for a solution that does not introduce the "unsafe" flavor. Can the existing unsafe uses be rewritten? Or, to be radical, could "rethrows" just be eliminated, and just use "throws" instead? (Just musing out loud...)

Pitch: Fix rethrows checking and add rethrows(unsafe)

Fixing rethrows checking and introducing rethrows(unsafe)

Introduction

Motivation

Detailed design

Source compatibility

Effect on ABI stability

Effect on API resilience

Alternatives considered

Fixing `rethrows` checking and introducing `rethrows(unsafe)`