Pitch: Genericizing over annotations like throws


(Matthew Johnson) #2

@Joe_Groff has suggested in the past something along the lines of an associatedeffect feature that would allow us to abstract over effects. It would look something like this:

protocol Foo {
    associatedeffect barEffects
    func bar() barEffects -> Int
}

This is just off the top of my head, but usage in generic functions might look something like this:

func caller<F: Foo>(f: F) async & F.barEffects -> Int {
    let x = await someAsyncCall()
    // this call probably needs some kind of usage-site annotation along the lines of try and await,
    // but it isn’t clear what syntax to use since the concrete effects are unknown
    return f.bar() + x
}

(Tellow Krinkle) #3

That makes sense, that should cover all the things that I want


(John McCall) #4

As a general matter, that seems reasonable. As a specific matter, while async might logically be an effect like any other, I don't know how possible or sensible it's going to be to abstract over async-ness; it's just way too important to the implementation/semantic model.


(Tellow Krinkle) #5

I don't know how async is planned to be implemented, but here's how I would write a generically async function if async were compiler magic that converted code to our current callback-passing system. In short, an optionally async function would either go async and return nil, or not go async and return its result plus the unused callback (returned in order to support move-only types):

// @once added for supporting move only types
typealias Callback = @once (Output) -> ()
// Assume each function's types could be different but I'm lazy

// (current) always async function
func asyncFunction(args: Args, callback: Callback) {
	let output = doStuff1(args)
	callback(output)
}

// (current) calling an async function from an async function
func outerAsyncFunction(args: Args, callback: Callback) {
	let calculatedStuff = doStuff2(args)
	asyncFunction(args: calculatedStuff) { partial in
		let output = doStuff3(partial)
		callback(output)
	}
}

// (new) optionally async function
func optionallyAsyncFunction(args: Args, callback: Callback) -> (Output, Callback)? {
	let (output, shouldAsync) = doStuff1(args)
	if shouldAsync {
		callback(output)
		return nil
	}
	return (output, callback)
}

// (new) calling an optionally async function from an async function
func outerAsyncFunction(args: Args, callback: Callback) {
	let calculatedStuff = doStuff2(args)
	let result = optionallyAsyncFunction(args: calculatedStuff) { partial in
		let output = doStuff3(partial)
		callback(output)
	}
	if let (partial, passedCallback) = result {
		passedCallback(partial)
	}
}

// (new) calling an optionally async function from an optionally async function
func outerOptionallyAsyncFunction(args: Args, callback: Callback) -> (Output, Callback)? {
	let calculatedStuff = doStuff2(args)
	let result = optionallyAsyncFunction(args: calculatedStuff) { partial in
		// If you async at least once, treat yourself like an async function from now on
		let output = doStuff3(partial)
		callback(output)
	}
	if let (partial, passedCallback) = result {
		// Make sure passedCallback is the same one you passed in, else fatalError
		// Take ownership of stack variables back from passedCallback
		let output = doStuff3(partial)
		return (output, callback)
	}
	else {
		return nil
	}
}

// (new) calling an optionally async function from a non-async function
func outerNonAsyncFunction(args: Args) -> Output {
	let calculatedStuff = doStuff2(args)
	guard let (partial, _) = optionallyAsyncFunction(args: calculatedStuff, callback: { _ in }) else {
		fatalError("optionallyAsyncFunction went async even though it shouldn't have!")
	}
	let output = doStuff3(partial)
	return output
}

Obviously I don't know what we'll end up using as our async system, but I do hope that ability to abstract over async-ness is one of the considerations made when choosing.

Edit: As a side note, an optionally async function might also perform better on something like a buffered stream iterator that only actually goes async once every few thousand calls.


(John McCall) #6

Okay, so basically two separate ABIs for async functions and potentially-async functions, and everything downstream of a potentially-async function call has to be emitted twice, once for if it returns normally and once in a callback. That is a lot of complexity. It also ends up having all the overhead of both conventions, since you still need to heap-allocate the frame whenever a potentially-async function makes a potentially-async call. This can be avoided if the function is statically known to never make such a call, but ordinary async functions can do that, too.

A much simpler way of doing this is to use a single ABI for async functions based around tail calls and callbacks. You can still efficiently implement calls to async-ABI functions that are statically known not to go async by just making a non-tail call and providing a callback that just stashes the return value on the stack.
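The "non-tail call with a stashing callback" idea can be illustrated in plain source-level Swift. This is only a sketch of the shape of the technique, not of any actual ABI; `addOneCallbackStyle` and `callSynchronously` are hypothetical names made up for the illustration.

```swift
// Hypothetical callback-style function. It is assumed to be statically known
// to invoke its completion handler before returning (it never "goes async").
func addOneCallbackStyle(_ x: Int, completion: (Int) -> Void) {
    completion(x + 1)
}

// A synchronous caller makes an ordinary (non-tail) call and passes a callback
// that does nothing but stash the result in a local variable.
func callSynchronously(_ x: Int) -> Int {
    var stashed: Int? = nil
    addOneCallbackStyle(x) { value in
        stashed = value
    }
    // If the callee had actually gone async, nothing would be stashed yet.
    guard let result = stashed else {
        fatalError("callee went async even though it was expected not to")
    }
    return result
}
```

Here `callSynchronously(41)` returns 42 without any heap-allocated frame being needed at the source level, because the callback is non-escaping and only writes to a stack variable.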

In either implementation, a function which abstracts over async-ness is going to be significantly less efficient when dynamically non-async than a non-async function. I think that's unavoidable; anything you do to be lazier about heap-allocating the frame is going to be paid for in other ways.


(Karl) #7

I'm not sure the abstraction over async and throws is all that useful. I can't think of many examples besides Sequence where you might want to blanket over such different behaviours.

I remember a similar discussion a while ago about files and Sequence, and the consensus was that you generally don't want to use generic Sequence/Collection algorithms on things like file-handles or remote directory iterators. Essentially any I/O operation could fail, and it's not worth burdening all generic Sequence code with handling that; especially since most algorithms (e.g. sort) wouldn't perform nearly as efficiently on those kinds of things as they would if you just copied it all to an Array first.

Instead, it's better to define your own protocol - perhaps inspired by Sequence or Collection - which is tailored for your situation.


#8

I recently encountered a scenario where I wanted this sort of abstraction over the throwing-ness of stored properties. I ended up having to entirely duplicate the implementation with one hierarchy of classes that throw and another that don’t. It would be nice to use something like rethrows there.


(Tellow Krinkle) #9

Sort already copies the entire contents of a Sequence to an array first anyways. Admittedly, RandomAccessCollection stuff makes a lot less sense, but most Sequence things would work just as well on a throwing or async collection. A Zip2Sequence of two buffered file readers would be more efficient than if you read the entire files into memory just to iterate over them. A lazy map would work just as well on a throwing or async sequence. So would the Set initializer, which would keep memory usage down in the case where there were a lot of repeats.

As for other protocols, I'm pretty sure any protocol that you would make for a file today would be extendable to a remote file (async) later. TextOutputStream might also want this, since you could output to either a String or a file.
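To make the lazy map point concrete, here is a minimal sketch of a throwing iterator protocol with a lazy map adapter. All names (`ThrowingIterator`, `LazyThrowingMap`, `FlakyCounter`, `ReadFailure`, `collect`) are hypothetical and exist only for this illustration; nothing like them is in the standard library.

```swift
// Hypothetical protocol: like IteratorProtocol, but next() may throw.
protocol ThrowingIterator {
    associatedtype Element
    mutating func next() throws -> Element?
}

// Lazy map adapter: transforms elements one at a time and simply
// propagates whatever error the base iterator throws.
struct LazyThrowingMap<Base: ThrowingIterator, Output>: ThrowingIterator {
    var base: Base
    let transform: (Base.Element) throws -> Output

    mutating func next() throws -> Output? {
        guard let element = try base.next() else { return nil }
        return try transform(element)
    }
}

struct ReadFailure: Error { let position: Int }

// Toy source that yields 1, 2, 3 and throws when it reaches `failAt`.
struct FlakyCounter: ThrowingIterator {
    var current = 0
    let failAt: Int

    mutating func next() throws -> Int? {
        current += 1
        if current == failAt { throw ReadFailure(position: current) }
        return current <= 3 ? current : nil
    }
}

// Drains an iterator into an array, stopping at the first error.
func collect<I: ThrowingIterator>(_ iterator: inout I) throws -> [I.Element] {
    var out: [I.Element] = []
    while let element = try iterator.next() {
        out.append(element)
    }
    return out
}
```

The adapter never buffers more than one element, which is the property that makes it attractive for file-backed sources.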


(Joe Groff) #10

Another interesting future direction we could take to address polymorphism over throws specifically would be to adopt typed throws, and make the error type an independent type argument of all function types. This would mean that a function type that doesn't throw effectively throws Never, and one that throws without specifying a type throws Error by default:

(X, Y, Z) -> W        === (X, Y, Z) throws Never -> W
(X, Y, Z) throws -> W === (X, Y, Z) throws Error -> W

This would allow protocols to be generic over throwing and nonthrowing implementations by making the error a separate associated type:

protocol MyIterator {
  associatedtype Element
  associatedtype Error: Swift.Error

  mutating func next() throws Error -> Element
}

struct NonthrowingImplementation: MyIterator {
  mutating func next() -> Int // conforms with Error == Never
}
struct ThrowingImplementation: MyIterator {
  mutating func next() throws -> Int // conforms with Error == Swift.Error
}
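
The `throws Error` syntax above is a possible future direction, so it can't be compiled as written. As a rough approximation of the same idea, the error type can be carried as an associated type through Result. This is a sketch under that assumption; `ResultIterator`, `Counter`, `AlwaysFails`, `ReadError` and `firstTwo` are hypothetical names.

```swift
// Hypothetical protocol: the failure type is an associated type,
// so "non-throwing" conformers can pick Failure == Never.
protocol ResultIterator {
    associatedtype Element
    associatedtype Failure: Error
    mutating func next() -> Result<Element, Failure>
}

// "Non-throwing" conformer: Failure is inferred as Never.
struct Counter: ResultIterator {
    var n = 0
    mutating func next() -> Result<Int, Never> {
        n += 1
        return .success(n)
    }
}

struct ReadError: Error {}

// "Throwing" conformer: Failure is inferred as ReadError.
struct AlwaysFails: ResultIterator {
    mutating func next() -> Result<Int, ReadError> {
        return .failure(ReadError())
    }
}

// Generic code preserves the conformer's failure type in its own result,
// so a Never-failing input gives a Never-failing output.
func firstTwo<I: ResultIterator>(_ iterator: inout I) -> Result<[I.Element], I.Failure> {
    var out: [I.Element] = []
    for _ in 0..<2 {
        switch iterator.next() {
        case .success(let element): out.append(element)
        case .failure(let error): return .failure(error)
        }
    }
    return .success(out)
}
```

Calling `firstTwo` on a `Counter` yields a `Result<[Int], Never>`, while calling it on `AlwaysFails` yields a `Result<[Int], ReadError>`; the error type flows through the generic function instead of being erased.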

(Matthew Johnson) #11

I’m very much in favor of a design in this direction for typed throws. Further, I think it’s important that throwing any uninhabited type is treated equivalently to throwing Never in terms of needing to use try when invoked and needing to handle errors, etc. I wrote about why in [Discussion] Analysis of the design of typed throws.

This design for typed throws would support abstracting over error types, including whether one throws at all, which is great! That said, I think it would sit nicely alongside a more general effect abstraction feature. They address related (even overlapping) but distinct use cases.


(Adrian Zubarev) #12

That is exactly what my conclusion was after all the debate around typed throws and Result. That would be just perfect if you ask me.


(Ben Rimmington) #13

Another workaround is to use Result as the element of a sequence.

extension FileStream: IteratorProtocol, Sequence {

  public typealias Element = Swift.Result<UInt8, Error>

  public func next() -> Element? {
    guard let byte = UInt8(exactly: fgetc(_stream)) else {
      if feof(_stream) != 0 {
        return nil
      } else {
        return Element.failure(POSIXError())
      }
    }
    return Element.success(byte)
  }
}

(Gwendal Roué) #14

This is true. However, this implements one possible failure mode of failing sequences.

Another interesting failure mode is that the sequence ends on the first iteration failure.

Compare the two snippets below, which show how differently those two kinds of sequences are consumed:

// Mode A: Sequence of Result (eventual error is per-element):
for result in sequence {
    do {
        let element = try result.get()
        // Handle element.
    } catch {
        // Handle element failure.
        // It is possible that iterator produces
        // a success element on next iteration step.
    }
}

// Mode B: Sequence ends on first error:
do {
    while let element = try iterator.next() {
        // Handle element
    }
} catch {
    // Handle sequence failure.
    // The consequences of calling iterator.next() after this failure should
    // be precisely defined (programmer error and trap, or the guarantee
    // that some error would be thrown).
}

Those are important semantic differences. And we like to make semantics very clear, if I interpret SE-0052 correctly.


(Ben Rimmington) #15

@gwendal.roue It might be necessary to retry only after some errors (e.g. EAGAIN, EINTR, ETIMEDOUT).

do {
  for result in fileStream {
    do {
      let byte = try result.get()
      // TODO: Handle success...
    } catch POSIXError.EINTR {
      continue // Retry an interrupted system call.
    } catch {
      throw error // Rethrow to outer `catch` clause.
    }
  }
} catch {
  // TODO: Handle failure...
}

The post-nil guarantee in SE-0052 should be possible if FileStream doesn't conform itself to IteratorProtocol.


(Gwendal Roué) #16

In your specific case, some errors are fatal, some other errors are not.

When you iterate a random number generator, all entropy errors are transient.

And when you iterate an SQLite statement, any error is fatal.

What I was trying to say is that a generalization of throwing sequences may well be delicate to define in a way that is precise, and yet does not leave entire classes of problems abandoned on the roadside.

When you mix iteration and errors, you get multiple possible outcomes.

I'm no functional developer, but... Aren't we trying to combine monads, only to discover that there is no general solution?


(Karl) #17

To be fair, your example shows a protocol which already allows its next() method to throw and simply allows non-throwing methods to witness the requirement. It's convenient, but you don't need typed-throws for that; in fact, I think it already works.

I don't think the solution proposed by OP is all that useful - I think we actually have a better thing, right now (Swift 4.2), but I don't think everybody is aware of it, so let me explain:

You can create a parent protocol, whose requirements can throw, and also a non-throwing refinement. The compiler will already recognise these as being the same, so you don't even need to write a default implementation with the throwing signature (see example). However, generic code will be able to require a non-throwing witness if your algorithms are not fault-tolerant. I think it's a cleaner solution.

We could do something like this in the standard library, except that ABI stability bans re-parenting protocols, and Sequence/IteratorProtocol kind of have knives dangling above their heads already.

protocol MightThrowIterator {
  mutating func next() throws -> Int
}
protocol NoThrowIterator: MightThrowIterator {
  mutating func next() -> Int
}

struct A: MightThrowIterator {
  enum Err: Error { case anError }
  mutating func next() throws -> Int {
    throw Err.anError
  }
}

struct B: NoThrowIterator {
  mutating func next() -> Int {
    return 42
  }
}

func tryIterate<T: MightThrowIterator>(_ val: inout T) {
  do {
    let element = try val.next()
    print(element)
  } catch {
    print(error)
  }
}

func definitelyIterate<T: NoThrowIterator>(_ val: inout T) {
  let element = val.next()
  print(element)
}

func test() {
  var testObjA = A()
  var testObjB = B()
  tryIterate(&testObjA) // prints: 'anError'
  tryIterate(&testObjB) // prints: 42
  // definitelyIterate(&testObjA) // Compile error if uncommented: 'A' does not conform to expected type 'NoThrowIterator'
  definitelyIterate(&testObjB) // prints: 42
}

(Joe Groff) #18

The thing that's added that you can't do now is the ability to abstract over the throw-iness of the conforming type using the associated type. You could think of it as analogous to the difference between an (Any) -> Any function and a (T) -> T function; both take the same set of inputs, but the latter also allows you to preserve the type information coming out of the function instead of losing it going in. The closest thing you can do now is build parallel overloaded protocol hierarchies, as you noted.


(Karl) #19

I'm not sure what you mean - I don't think throw-iness is something you can really abstract over, because throws is a superset of non-throws and the reverse isn't true. At the most abstract level, you will have to assume something might throw. Which is what we have.

I don't actually think typed-throws is that useful in practice. There are a few cases where you can exhaustively know all the errors you will throw, and they will all happen to share a common type which also is narrow enough to be useful, but as soon as things get bigger you'll have to erase to Error. IMO, the only macroscopically-meaningful distinction in such a system is Error vs. Never, which we already have with throws.

What's more, because Swift encourages type-safety and people love exhaustive switching and the feeling of absolute determinism, I think a lot of people would reach for typed-throws when they'd probably be better off without it.


(Joe Groff) #20

For many (though not all) places where rethrows is used with with* { ... }-style scoping operations, it might be worth considering whether coroutines would overall be a better model. A yield-once coroutine, like what we now use for property accesses, can express similar patterns where some setup and teardown needs to happen around a scope, but because the coroutine is a separate context that sits alongside the main context, it doesn't need to pass through the effects of the inner scope like a higher-order function does when calling a closure. If coroutines were exposed as a user-facing feature and adopted in place of higher-order functions, the coroutine forms would be usable in any effect context without needing to complicate the type system to represent the effect propagation.
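
For reference, this is the kind of with*-style scoping function the paragraph above has in mind, written as a higher-order function. It is a minimal sketch; `Resource` and `withOpenResource` are hypothetical names. Note how the signature has to mention `throws`/`rethrows` purely to pass the body's effect through the setup and teardown code.

```swift
// Hypothetical resource with explicit setup and teardown.
final class Resource {
    private(set) var isOpen = false
    func open() { isOpen = true }
    func close() { isOpen = false }
}

// The scoping function itself never throws; it only forwards
// whatever the body does, which is what rethrows expresses.
func withOpenResource<T>(_ resource: Resource,
                         _ body: (Resource) throws -> T) rethrows -> T {
    resource.open()                // setup
    defer { resource.close() }     // teardown runs on both normal and error exit
    return try body(resource)
}
```

A yield-once coroutine would express the same setup/teardown pairing without the body being a closure, which is why it would not need the `rethrows` plumbing shown here.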


(Joe Groff) #21

Think about rethrows. A rethrows method like map works whether it's passed a throwing closure or not, but it doesn't throw itself, so whether a map expression throws depends on whether the closure throws. The discussion here is about enabling that kind of abstraction more generally than rethrows.
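
A small sketch of what that looks like at the call site, using a hand-written rethrows function so the example does not depend on how the standard library declares map. `applyToEach`, `halve` and `OddInput` are hypothetical names.

```swift
struct OddInput: Error { let value: Int }

// Hypothetical rethrows function: it throws only if `transform` throws.
func applyToEach(_ values: [Int],
                 _ transform: (Int) throws -> Int) rethrows -> [Int] {
    var out: [Int] = []
    for value in values {
        out.append(try transform(value))
    }
    return out
}

// A throwing transform: fails on odd input.
func halve(_ x: Int) throws -> Int {
    guard x % 2 == 0 else { throw OddInput(value: x) }
    return x / 2
}

// Passed a non-throwing closure, the call needs no `try`.
let doubled = applyToEach([1, 2, 3]) { $0 * 2 }

// Passed a throwing function, the same call must be marked with `try`.
let halved = try? applyToEach([2, 4, 6], halve)
let failed = try? applyToEach([2, 3], halve)
```

Here `doubled` is `[2, 4, 6]`, `halved` is `[1, 2, 3]`, and `failed` is `nil` because `halve(3)` throws.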