Await/Async, part deux

QuinceyMorris · June 12, 2020, 4:37am

OK, let’s start this over in a new thread. This time, I respectfully request that you respond to what I actually say, not to positions I don’t hold. Also, sorry, this has turned out to be book-length...

We need to start with the concept of a “path of execution”. Consider the following simple example:

	X
	Y
	Z
	print("got here")

You can think of X, Y and Z as placeholders for ordinary Swift statements, or as function calls if you prefer. X to Y to Z to print is a single path of execution, which is to say that Y is executed after X, and Z is executed after Y. It’s a clear and extremely simple model to grasp. Beginners can reason about when the print statement is executed.

Now consider a somewhat more complicated example:

	B {
		C {
			D {
				print("got here")
			}
		}
	}

Assume that B, C, D are functions with a completion handler represented by a closure. That is, each function finishes whatever it is intended to do, however long that takes, then invokes its completion handler exactly once.

This is a very normal asynchronous pattern in Swift. It’s absolutely how we currently want people to write code, and it’s just fine to initiate this sequence from the main thread. There are other, more complicated asynchronous patterns, but this is the one, I’m claiming, that we should be examining first.

In fact, this is our friend, Ol’ Pyramid of Doom. We don’t like Ol’ Pyramid, but there’s actually nothing very frightening happening here. It’s still a single path of execution, B before C before D before print. It’s just written with a lot more syntax.

There is, however, a much bigger problem, and this problem is what actually confuses beginners. There are two paths of execution here. To see that, let’s add some surrounding code:

	A
	B {
		C {
			D {
				print("got here")
			}
		}
	}
	print("but what about me??")

One path is B-C-D-print as before. The other path is A-print. Both paths end in a print statement, and there is no way to predict which print will execute first. That’s what confuses beginners. They tend to assume that “but what about me??” will print after “got here”, often with fatal consequences.
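A minimal sketch of the two paths, assuming a toy event loop in place of real threads or dispatch queues. `ToyLoop`, `step` and `twoPaths` are made-up names for illustration; `step` stands in for B, C and D. In this toy the order is fixed, because queued work only runs once `run()` is called. With real asynchronous functions either print could come first, which is exactly the problem described above.

```swift
// Toy event loop (hypothetical): "asynchronous" work is queued and runs later,
// after the code that scheduled it has moved on.
final class ToyLoop {
    private var pending: [() -> Void] = []
    private(set) var log: [String] = []

    func record(_ message: String) { log.append(message) }
    func later(_ work: @escaping () -> Void) { pending.append(work) }
    func run() {
        while !pending.isEmpty {
            let work = pending.removeFirst()
            work()
        }
    }
}

func twoPaths() -> [String] {
    let loop = ToyLoop()

    // Stand-in for B, C and D: do the "work" later, then call the
    // completion handler exactly once.
    func step(_ name: String, completion: @escaping () -> Void) {
        loop.later {
            loop.record(name)
            completion()
        }
    }

    loop.record("A")
    step("B") {
        step("C") {
            step("D") {
                loop.record("got here")
            }
        }
    }
    // Second path: this line runs as soon as B has been scheduled,
    // not when the pyramid above has finished.
    loop.record("but what about me??")

    loop.run()
    return loop.log
}

print(twoPaths())
```

Reading the returned log top to bottom shows the A-print path finishing before the B-C-D-print path has started.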

As far as I’m concerned, this whole async/await discussion is about allowing programmers to write single paths of execution, for real. As far as I’m concerned, we have the “technology” to do that, if only we can get on with implementing it.

So, let’s reduce the last example to what looks like a single path of execution:

	A
	B 
	C
	D
	print("got here")

(I omitted the other print statement temporarily. It'll be back.) In this form, it won’t work, because B, C and D are “asynchronous”, meaning they do their work after returning to the caller, not before. In order to make this an actual path of execution, we need to make them complete their execution sequentially.

That’s where await comes in. It’s really just a serialization operator:

	A
	await B 
	await C
	await D
	print("got here")

The operator applies only to “asynchronous” functions. The result is an actual path of execution. It’s so easy to reason about, we can decide where we really wanted the other print to happen. For example, here’s maybe what we really wanted:

	A
	await B 
	await C
	await D
	print("but what about me??")
	print("got here")

But, go ahead, put the prints wherever you'd like. It's easy.

OK, now let’s look at the context in which the above sequence might occur. There are basically two possibilities here. The first is a regular function:


func A1() {
    A
    await B 
    await C
    await D
    print("but what about me??")
    print("got here")
}

The other is an “asynchronous” function:

func A2() async {
    A
    await B 
    await C
    await D
    print("but what about me??")
    print("got here")
}

In both cases, await is an execution-path sequencing operator.

In A1, it will actually need to wait for each of B, C and D to complete in order to move on to the next step in the path.
In A2, because of async, it doesn’t actually wait, but chains each step onto the (now implicit) completion handler of the previous asynchronous step.

(Of course, whatever calls A2 will have to await it, which will either wait or chain in the same way, according to context.)
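For comparison, here is a sketch of the A2 shape in the async/await syntax that later shipped in Swift 5.5. The bodies of `B`, `C` and `D` are placeholders (they only yield and return their own name), and the `log` array is an assumption added so the order can be inspected. This sketch covers only the A2 case.

```swift
// Placeholder asynchronous steps (hypothetical bodies).
func B() async -> String { await Task.yield(); return "B" }
func C() async -> String { await Task.yield(); return "C" }
func D() async -> String { await Task.yield(); return "D" }

// The A2 shape: a single path of execution, written top to bottom.
func A2() async -> [String] {
    var log = ["A"]
    let b = await B()   // the next line does not run until B has finished
    log.append(b)
    let c = await C()
    log.append(c)
    let d = await D()
    log.append(d)
    log.append("but what about me??")
    log.append("got here")
    return log
}

// Top-level await needs Swift 5.7 or later (script or main.swift).
let steps = await A2()
print(steps)
```

Each `await` marks a point where the function may be suspended, yet the statements still run strictly in source order.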

This “dual” behavior of await is the heart of the whole proposal. It’s what makes await/async useful, and why await should not be force-restricted to the interior of async functions.

Please notice, in everything I’ve said above — except for the deliberately “broken” example where we couldn’t reason about the relative timing of the two print statements — there is no concurrency. Everything you see in the source code happens sequentially in a single path of execution. That’s the point.

But there’s more. It’s not sufficient that things happen sequentially in a path of execution. We want two additional thread-related constraints:

1. Every step in the path must begin and end execution on the original thread (even if a step has internal implementation details that temporarily switch to other threads).
2. Waiting (as await does in A1) must not block the thread it’s running on. It only blocks its own path of execution.

Constraint #1 means that async/await does not introduce additional thread-unsafety to the thread the execution path started on.

Constraint #2 means async/await is usable in the most likely scenario: the main thread of an app, or any similar single-path thread-like use-case, such as a serial DispatchQueue.

This is important, because initiating asynchronous behavior from a shared serial thread like the main thread is a valuable design pattern. Starting and finishing asynchronous behavior on the main thread provides the easiest thread-safety solution across a wide range of commonly useful scenarios.

AFAIK, the only plausible way of implementing these behaviors is a coroutine, and (no thanks to anything I’ve said or done) that’s what’s actually being proposed, I've been happy to see.

To review:

As far as I’m concerned, the goal here is to add Swift language features to provide a path-of-execution serialization operator, along with a genuine notion of an asynchronous function.

Beyond that:

There is a further discussion which we haven’t even started having yet. There are other asynchronous patterns we might like to provide for. How about, for example, an await operator that operates on “groups” of asynchronous functions (aka concurrency or dispatch groups)? How about an algebra of futures or promises that can be used within the implementations of asynchronous functions to provide more sophisticated usage patterns?

Those things are important, but they are not nearly as important as the basic serialization behavior. That’s where we need to start.

Avi · June 12, 2020, 5:25am

This is beautifully written, and for whatever it's worth, I agree 100%.

pyrtsa · June 12, 2020, 6:29am

A synchronous function hiding asynchronous behaviour like this seems like a trap to me, because at the call site there's nothing to indicate it may take an arbitrarily long time to return. I don't see why we should allow implicit async like so, just like we don't allow throwing functions to omit their throws attribute only to crash the app when an error occurs.

Seems to me we don't lose anything (and it seems more Swifty!) by instead having call-site keywords similar to try and try! for what you're suggesting (probably await and await! then?) and always requiring async function declarations to take the form of your func A2() async.

(Hope I didn't comment on the wrong thing!)

Avi · June 12, 2020, 7:06am

It is not the asynchronous nature which dictates the time it will take, but rather the work being done. You can write entirely synchronous network code (for example), and it will take just as long.

I think you are looking at this from the wrong direction. Swift allows functions to call other functions which may throw. The invocation has to be tagged with some form of try. This is perfectly analogous with using await in the example. In the case of throwing, the outer function is not tagged in any way to indicate that it invokes throwing functions.

And that's the point of your confusion, I believe. A1() is not tagged async because it isn't. That means that a caller of A1() will not yield its execution to other work. In multi-threading terms, this is like calling A1() synchronously, except that the caller doesn't have a choice.

ddddxxx · June 12, 2020, 9:16am

That's not the case. try is only allowed in a throwing context, like this:

func f() throws/rethrows {
    try ...
}

and this:

do {
    try ...
} catch {}

This is not allowed:

func f() {
   try ...
}

Analogously, await should not be allowed in a non-async context.

Avi · June 12, 2020, 9:20am

But this is:

We don't need the rules to be identical because errors and concurrency are not the same. Swift's model requires that error propagation be explicit. With await, there's no context being leaked. Quite the opposite, in fact, as execution does not continue until the wait is over.
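As a reference point for the try rules being compared here, a minimal sketch of the forms Swift accepts today. `Oops`, `risky` and the four wrapper functions are made-up names for illustration.

```swift
struct Oops: Error {}

func risky(_ fail: Bool) throws -> Int {
    if fail { throw Oops() }
    return 42
}

// 1. Inside a throwing function: the error propagates to the caller.
func propagate() throws -> Int {
    return try risky(false)
}

// 2. Inside do/catch in a non-throwing function: the error is handled locally.
func handle() -> Int {
    do {
        return try risky(true)
    } catch {
        return -1
    }
}

// 3. try? and try! are accepted in a non-throwing function without do/catch.
func optional() -> Int? {
    return try? risky(true)    // the error becomes nil
}

func forced() -> Int {
    return try! risky(false)   // would trap at run time if risky threw
}

print(handle(), optional() as Any, forced())
```

A bare `try risky(true)` inside a non-throwing function with no enclosing do/catch is rejected by the compiler, so it is left out of the runnable code.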

Anachron · June 12, 2020, 9:42am

One could do this:

do {
    let x = await foo()
    doSomethingWithX(x)
}
catch { Task<X> in
    Task.resume()
}

and for Task<Void>, one could omit the catch block if there is some special indication that you mean 'just run the damn code, I don't care'.

pyrtsa · June 12, 2020, 9:43am

I'm repeating myself, but IMO allowing this modification of A1 would fit in the spirit of Swift:

func A3() {
    A
    await! B 
    await! C
    await! D
    print("but what about me??")
    print("got here")
}

No one is saying they are, but they do have more in common than one might think.

Avi · June 12, 2020, 9:49am

Could you explain why you think so? An ! following a keyword means that the operation will trap if it fails. How does that apply here?

That may be, but it's their differences that matter here, as we're discussing whether their syntaxes should be divergent or convergent. Throwing an error aborts the current execution path. An awaited async function preserves the chain of execution.

ExFalsoQuodlibet · June 12, 2020, 9:52am

I 100% agree with OP, and I think it's a perfect basis for such discussion. Can I have some minimum context about the "let's start this over" remark?

pyrtsa · June 12, 2020, 9:54am

I can't make sense of this. If await in A1 (or what I and apparently several others have referred to as await!, with nothing but a hand-waving definition) would not block its thread of execution, then where would that thread continue running? Say the main thread gets "parked" while a non-async function calls an async function: where's the continuation point to jump to, and what makes the main thread jump back to this function when it's ready to resume?

For func A2 async I can see how an executor system partly in the language, partly in its stdlib, could turn those into running code with cooperative switching, but A1 is supposed to act as an ordinary function.

pyrtsa · June 12, 2020, 9:56am

If it takes forever to run the async code it will never return.

Anachron · June 12, 2020, 9:58am

Maybe the OP is referring to this:

I proposed a tool to insert effectful syntax in a standardized way and in order to get some attention I used async-await as an example, because it is a feature that has been requested by some. A lot of discussion followed on async-await itself (which is why I eventually changed the title, because that wasn't the heart of the proposal) and apparently there's a lot of confusion around async-await, making the discussion over there somewhat messy.

Avi · June 12, 2020, 9:59am

That's true of any function.

pyrtsa · June 12, 2020, 10:02am

But not of async functions. If one never produces a result, you can still do other things in the program, possibly including cancelling it.

Avi · June 12, 2020, 10:04am

I don't understand what you are getting at.

func f() { while true { } }  // assume loop is not optimized away for some reason

func g() {
  f()
}

This is perfectly valid code today, and there's no sign that f() will never return. Why should we treat async functions differently in this respect?

pyrtsa · June 12, 2020, 10:11am

I'm looking for ways to keep async a well-typed abstraction. If you can have await anywhere without a compiler diagnostic, there's a high chance you actually also wanted the enclosing function to be async and it would be nice if the compiler told you (as is the case with try without an enclosing do or throws).

But there's also the bigger issue in this proposal which I think should be defined well: how should non-async code be put aside in its thread of execution (say, the main thread) and where should that thread continue running in the meanwhile?

Anachron · June 12, 2020, 10:26am

I think reasonable solutions would work with completions. If you hit the keyword 'await' and an async function, you would simply wrap the code thereafter to a completion that is then handed to the async function. In the case that you assign the awaited value to a pre-declared variable (e.g. because you await some bool that you then check in a while clause), there may be a need for some special scoping rule, but I think that can be worked out.
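A rough sketch of that rewrite done by hand, assuming a completion-handler version of foo. The bodies of `foo` and `doSomethingWithX`, and the names `rewritten` and `demo`, are made up for illustration; `foo` calls its handler immediately only to keep the example small.

```swift
// Hypothetical completion-handler version of foo.
func foo(completion: @escaping (Int) -> Void) {
    completion(21)
}

func doSomethingWithX(_ x: Int) -> Int {
    return x * 2
}

// What the author would write under the idea above (not valid Swift here):
//     let x = await foo()
//     store(doSomethingWithX(x))
//
// What the rewrite would produce: everything after the await moves into
// the completion handler that is handed to foo.
func rewritten(store: @escaping (Int) -> Void) {
    foo { x in
        store(doSomethingWithX(x))
    }
}

func demo() -> [Int] {
    var results: [Int] = []
    rewritten { results.append($0) }
    return results
}

print(demo())
```

Because the rest of the function body becomes the handler, the enclosing function itself returns before that code has necessarily run, which is one reason a non-Void return value is awkward here.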

I think though that awaiting in non-async functions should really only be allowed in void functions and I admit that I'm skeptical even then, too.

ddddxxx · June 12, 2020, 10:30am

Only if you explicitly transfer control:

func f() { // non-async context
    RunLoop.current.sync { // async context
        await g() // transfer control to scheduler (current run loop)
    }
}

That's why I don't like await in a non-async context: in that case the scheduler is implicit.
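A sketch of the same idea using the Task API that later shipped with Swift concurrency, as a stand-in for the hypothetical `RunLoop.current.sync` above. The body of `g` and the returned value are assumptions made for illustration.

```swift
// Placeholder asynchronous function (hypothetical body).
func g() async -> Int {
    await Task.yield()
    return 7
}

// Non-async context: f cannot await, but it can hand work to a
// scheduler explicitly by creating a task.
func f() -> Task<Int, Never> {
    return Task {      // async context starts here
        await g()      // suspension point inside the task, not inside f
    }
}

let task = f()                 // f returns immediately
let value = await task.value   // top-level await, Swift 5.7 or later
print(value)
```

Here the hand-off to the scheduler is visible at the `Task { ... }` line, and f itself never waits.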

pyrtsa · June 12, 2020, 10:33am

ddddxxx:

Only if you explicitly transfer control:

func f() { // non-async context
    RunLoop.current.sync { // async context
        await g() // transfer control to scheduler (current run loop)
    }
}

Yes, this I definitely understand. That's what I referred to in my earlier post as the clear case of func A2 async.

But this thread is about the case where a func A1 magically turns into something half-asynchronous-half-not, unless I'm misunderstanding something.