Protocol with async function allows implementation without async?

I was pretty confused when I noticed that, after adding the async keyword to a custom protocol requirement, the compiler didn't complain about the existing implementations missing it.
This code unexpectedly compiles:

protocol P {
    func foo() async
}

class A: P {
    func foo() {}
}

A().foo()

Why is this allowed? After all, overriding a method in a class requires async if the superclass has it, so why not in a protocol? If I have a variable of type P, will calling foo() always behave asynchronously even if the implementation doesn't declare async?

2 Likes

This behaviour is defined by the original proposal: swift-evolution/proposals/0296-async-await.md at main · swiftlang/swift-evolution · GitHub

This behavior follows the subtyping/implicit conversion rule for asynchronous functions, as is precedented by the behavior of throws.


Why shouldn't it be? After all, anything a synchronous function can do, an asynchronous one can do as well; it's the other way around that is a problem.

2 Likes

Because, looking at it and knowing that it's an implementation of a protocol, I get the idea that it's synchronous when the protocol isn't, so I make wrong assumptions and start thinking about the code in the wrong way. Being allowed to override a throws method with one that doesn't throw also confuses me, because again, by looking at the implementation I get the idea that the superclass doesn't throw either.

1 Like

I'm not sure I follow you. If you know this is an implementation of a protocol, you most likely also know how the protocol is declared, right? Yet this is, IMO, one really specific way to consider whether something should be allowed or not. A type that conforms to a protocol can be used outside of the protocol too, after all. And this can be a really handy conversion that avoids introducing unwanted asynchronous code.

Back to the reason why this is allowed: as the proposal states, there is a clear subtype relation between sync and async functions, so they are interchangeable in one direction:

func add(_ a: Int, _ b: Int) -> Int { a + b }
func add(_ a: Int, _ b: Int) async -> Int { a + b }
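
To see the subtype relation directly: a synchronous function value converts implicitly to an async function type, but not the other way around. A minimal sketch (names made up, to be run from an async context):

func addSync(_ a: Int, _ b: Int) -> Int { a + b }

// A synchronous function implicitly converts to an async function type...
let addAsAsync: (Int, Int) async -> Int = addSync

// ...and calling through the async-typed value then requires await,
// even though the underlying implementation never suspends.
let sum = await addAsAsync(1, 2)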

Apart from where the function can be called, nothing has changed: everything you can write in the synchronous function, you can write in the async one, because it's the latter that has the special ability to abandon the thread. As for the semantic difference, when you deal with the protocol (as an existential or a generic constraint), no information is hidden from you: you deal with an async function and your assumptions are correct. For the difference at the implementation site, I don't understand what assumptions can be violated if the protocol declares the function as async.

1 Like

Your explanations certainly make sense. Another reason why this seems unexpected to me is the following. Until now, I saw protocols as a way of enforcing that all implementers have the same method signatures: as soon as I change the name, argument type, or argument count of a method in the protocol declaration, the compiler shows errors at both the call and implementation sites. On the other hand, if I add the async keyword, the compiler only shows errors at the call sites (because they are missing await) but not at the implementation sites.

I guess I will have to get used to the fact that I can rely on the compiler when I change protocol argument types and add or remove arguments, whereas if I make a method asynchronous I will have to manually inspect all implementers and check whether they can be rewritten to make use of asynchronicity. At the moment, adding async to a protocol definition might be, like you said, a handy way of converting synchronous code to asynchronous, but it hides the fact that the automatically converted code might have potential to be optimized even more for asynchronicity, e.g. by running some of its sub-operations in parallel.
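
For example (a sketch with made-up helper functions), an implementation written for asynchronicity could overlap independent sub-operations with async let, which a merely converted implementation would still run one after another:

// Hypothetical sub-operations, purely for illustration.
func loadHeader() async -> String { "header" }
func loadBody() async -> String { "body" }

func loadPage() async -> String {
    // Both child tasks run concurrently; the function suspends only
    // when it awaits their results.
    async let header = loadHeader()
    async let body = loadBody()
    return await header + body
}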

1 Like

The question seems to be: why is asynchrony not a requirement that can be specified by a protocol? In other words, why is a protocol not allowed to say: “you must implement this asynchronously”, instead it can only say: “you must implement this synchronously” or “you are allowed to implement this asynchronously or not”?

It’s a similar question with throwing. Why is a protocol allowed to say “you must implement this without throwing” or “you may implement this with or without throwing” but a protocol cannot say, “you must implement this with throwing”?

Well if asynchrony were a requirement mandated by a protocol, would it be satisfied if you simply added the async keyword to a function, but didn’t change anything else… meaning the function doesn’t await anywhere? If so, what are you gaining by requiring implementers to mark their methods as async but not make use of asynchrony? What you lose is that in code that works directly with a concrete type, it now has to await those calls (and so it can’t call the function in sync code) even though it really doesn’t need to, because there’s no actual suspending in the function.

Same deal with throwing: if a protocol mandated that implementers must mark their functions as “throws”, then implementers that don't need to throw anything still wouldn't throw; they would just mark their functions as throws anyway. Again, what do you gain? What you lose is the ability to call the function without try even though that's unnecessary.

A function declaring throws isn’t a promise it actually will ever throw anything (just as await isn't a promise it will actually suspend). So really the protocol rules are just reflecting the meaning of those keywords: they always mean "this might happen so you need to prepare for it".

Ideally you only want to mark functions as async if they need to suspend, and you only mark functions as throws if they need to throw an error. It doesn’t make sense for a protocol to mandate that a function declare it suspends or throws. What might make sense is a protocol mandates non-functional requirements (NFRs) like “this call doesn’t block for more than x milliseconds”, which depending on what the function is supposed to do, might practically demand that it suspend… or an NFR that a function retrieve something over an unreliable channel (like a network), which practically demands it might throw. It’s a neat idea that one day we could tell the compiler about NFRs like that and have it synthesize the automated tests that prove all our implementations satisfy them, and I think in C++ land this is what contracts are supposed to do (I haven't used them so I'm not really sure, I just get that impression when reading about them).

Since you mentioned potential performance optimizations, maybe this is what you’re really after. So it’s not a matter of enforcing the async keyword but really NFRs like not blocking for too long. If a concrete type can find a way to satisfy that without suspending, then it doesn’t need the async keyword, but the NFR might make that practically impossible. Until there's a way to teach the compiler about those NFRs, you just need to write the tests yourself. So then write a test for your protocol that can run against any concrete implementation that checks it has the performance characteristics you need.
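
For instance (a minimal sketch with a made-up protocol and an arbitrary 50 ms budget), such a check can be written once against the protocol and run against any conformance, whether or not its implementation is marked async:

// Hypothetical protocol whose NFR is "fetch() must return within the budget".
protocol Fetcher {
    func fetch() async -> Int
}

// Reusable test helper: measures one call and enforces the latency budget.
func assertMeetsLatencyBudget(_ fetcher: some Fetcher) async {
    let clock = ContinuousClock()
    let start = clock.now
    _ = await fetcher.fetch()
    let elapsed = clock.now - start
    precondition(elapsed < .milliseconds(50),
                 "fetch() exceeded its latency budget: \(elapsed)")
}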

5 Likes

Exactly. And your explanations make sense as well, both for async and throws.

I vaguely remember that when I discovered by chance that I can implement a throws protocol function without having to declare it as throws, I was at first confused, but quickly accepted it and perhaps even found it cool, because it makes sense that I wouldn't be forced to do so when it doesn't contain any try expression.

For async though, I think the situation is a bit different. Adding throws to a function definition doesn't change the program's behaviour, but adding async does. Even when only considering functions that don't contain any await expression, adding async to their definition can cause them to be run in parallel with other tasks. Perhaps that's why I didn't see the connection with throws when used in a protocol. (Maybe I was even subconsciously looking at async like a final completionHandler argument, which the implementers wouldn't be free to leave out at will.)

But again, with all the explanations you guys gave me, it makes a lot more sense now.

All the explanations make sense, in theory; but in real life they don't help much, especially when you are having a late-night session. :slight_smile:

A protocol is a specification of a public interface.

protocol P {
    func foo() async
}
class A: P {
    ...
    func foo() {}
    ...
}

class A has already promised me that it conforms to protocol P. Why should I bother checking the declaration of foo(), which may be buried 137 lines beneath the first line?

That can be seen as a reason to allow such an omission as well. The protocol only says that it may suspend, but if the implementation doesn't need to do so, why should that be enforced? The need for async can arise simply from async I/O, not driven by any performance reasoning outside of that.

I'd argue that if your implementation is fine in synchronous form despite the protocol declaration, there is no need to go and check every implementation to see whether it needs to execute asynchronously. That looks like premature optimisation to me: you don't know if there is any need to change the behaviour at all.

Further, I'd also argue that if your system needs to ensure that work is done asynchronously to other work, then this should be ensured on your side, not exposed to the clients, because otherwise you have a leaky abstraction: you rely on the implementation side to work correctly, and the implementation side has to rely on knowledge of your internals to provide a correct conformance. Most APIs hide this from the end users, so that they stay in control of where their code runs.

2 Likes

Interestingly there's this difference:

protocol P {
    func foo() throws
    func bar() async
}
class S: P {
    func foo() {} // ✅
    func bar() {} // ✅
}
class C {
    func foo() throws {}
    func bar() async {}
}
class D: C {
    override func foo() {} // ✅
    override func bar() {} // ❌ Method does not override any method from its superclass
}

Whether this disparity is a bug or a feature I don't know.

2 Likes

This is not actually true in regard to async. At least I'm not aware of anything currently in Swift that makes it true.

  1. When you write an async function, the entry and exit boilerplate of the function is (with an exception that I'll mention below) synchronous. That is, until and unless execution hits a suspension point in the function — an await in that function itself — there's no run time difference in behavior from a synchronous function. This was by design.

  2. In Swift concurrency, there's no real need to mark functions async at all. It was a design choice made to ensure that developers aren't accidentally misled about the run time behavior of functions that adopt asynchronicity or concurrency.

    In that regard, async is exactly like throws, in that it's a bit of extra syntactic ceremony that the language imposes on developers for — excuse the phrase — their own protection. Indeed, throws was the design used to justify the adoption of this ceremonious async keyword.

So, Swift has no semantic concept of "async-ness" associated with a function declaration, nor with a protocol requirement. The protocol requirement is really just syntactic: it marks the requirement as allowing and requiring an await at the call site. If the protocol didn't specify it, a calling site couldn't write an await without violating the ceremonial rules. The actual conformance (called function) can be any function with the correct type signature.

The one odd case here is an isolated function in an actor. In that case, [see below] because actors are non-reentrant, execution may have to be suspended before or during [I don't know the exact details] the function entry boilerplate, to prevent it doing anything unsafe in synchronous code before execution hits the function's first explicit suspension point.

This is really just the same as "normal" async functions, except that there's a hidden suspension point that Swift inserted for you. Note that because this is basically compiler shenanigans, the function doesn't need to be marked async explicitly.
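
A small illustration of that hidden suspension point (sketch only): a plain method on an actor has no await in its body and no async keyword, yet cross-actor callers still have to await it, because entering the actor's isolation may itself suspend.

actor Counter {
    var value = 0

    // No suspension point in the body and no explicit async keyword,
    // but callers outside the actor must still write await:
    // execution may suspend while waiting to enter the actor.
    func increment() -> Int {
        value += 1
        return value
    }
}

// From outside the actor, in an async context:
let counter = Counter()
let newValue = await counter.increment()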

Edit: D'oh, I always get this backwards. IAC, it's not the reentrancy that leads to this initial suspension, just the way actors work.

2 Likes

There is a difference: if a nonisolated type conforms to the protocol, and the async function is called from within some actor, it will hop off the actor. Consider the following:

protocol P: Sendable {
    func foo() async
}

struct A: P {
    func foo() {
        MainActor.assertIsolated()  // ok
        print("I am sync and run in actor isolation")
    }
}

struct B: P {
    func foo() async {        
        // Uncomment to see in play
        // The following line will crash at runtime
        // MainActor.assertIsolated()
        print("I am async and hop off an actor.")
    }
}

@MainActor
struct X {
    let p: any P

    func test() async {
        await p.foo()
    }
}

let x1 = X(p: A())
await x1.test()

let x2 = X(p: B())
await x2.test()

There is a significant difference in how this code will run. So async is not just a marker for developers; it has an impact.

2 Likes

Ah, nice! This is a more recent complexification, resulting from Swift's more recent "callee decides the isolation" policy.

I'm not sure, though, why that makes it any more implausible — or undesirable — that a non-async function can satisfy an async protocol requirement. Both of your foo functions can satisfy an await. It's just that in cases like this they do so in different ways.

1 Like

I was asked what evolution proposal defined this behavior. It's actually in SE-0306 Actors:

The second form of permissible cross-actor reference is one that is performed with an asynchronous function invocation. Such asynchronous function invocations are turned into "messages" requesting that the actor execute the corresponding task when it can safely do so. These messages are stored in the actor's "mailbox", and the caller initiating the asynchronous function invocation may be suspended until the actor is able to process the corresponding message in its mailbox. An actor processes the messages in its mailbox one-at-a-time, so that a given actor will never have two concurrently-executing tasks running actor-isolated code.

[my emphasis]

2 Likes

I was just reminded of this earlier today while trying to test something, with the test failing because I relied on the invalid assumption that calling down through async functions is synchronous until it hits a genuine suspension point. My thinking on that is confounded by my C# experience (where you can even call an "async" function from a sync one, because there's really no such thing, you just can't await the result, and it will synchronously execute until something actually suspends).

I definitely find this surprising behavior and am a little skeptical it is the "right" way to do it (I lean toward it being correct that an async function inherits its actor context unless and until you explicitly opt out of that with e.g. Task.detached. After all, if Task { await doX() } inherits the actor context, why wouldn't await doX()?).

However, I also suspect this difference is only "visible" in situations where you're doing something unsafe. In your example you're asserting main actor isolation in a function that is not main actor isolated. In my aforementioned test I was making assumptions about the execution order of things that have no such guarantee... testing concurrency utilities is tricky.

If you rely only on what is safe to rely on (i.e. that code is executing on a specific actor only if it's marked as such, or that X happens after Y only if there's an await between them), the calls jumping back to the default task actor should be a pure implementation detail that "shouldn't matter". It probably helps keep actors better utilized, since if async code isn't marked as actor isolated, it's okay to execute it on other actors, so you might as well free up a specific actor to stay concentrated on what does need to execute on it.

If you want to specifically require in a protocol that a function execute on the default task actor (what requiring the async keyword would actually accomplish), I don't think there's a way to do that now, only because there's no nominal Actor type for that default actor (at least I don't know of one, please correct me if I'm wrong). If there was, adding @DefaultActor (or whatever) to the protocol requirement would do the trick, even for conforming types that can omit the async (and it would still prevent anyone from calling it in non-async code unless it's also similarly isolated). I'm not sure if adding that capability is a good idea, since isn't declaring something can run safely on the default actor saying it can safely run anywhere? Why force any code to run there?

3 Likes

It is commented out and is there for demonstration purposes, to show how async changes a function's behavior.

There is nothing unsafe in the code. There is simply the difference between synchronous and asynchronous functions in Swift, which has an effect on where the code will run and how it will behave.

Swift has a semantics of isolation, so depending on the details an async function can run in an isolation or nonisolated. The latter currently always hops off the actor, and an isolated one immediately switches its context to its declared isolation, so an async function runs in the same isolation as its caller only if it has the same isolation. You can check this behavior with isolation assertions in various cases.

I think you didn't understand the point I was making. In order to demonstrate this difference you had to show code that does something unsafe and would crash if used in a way that's not prevented or annotated from call sites. If it being commented out means it doesn't matter, why not remove it from your example? It's because that wouldn't demonstrate what you want to, right? With it commented out, you don't notice the hop off the actor.

The safe version of this unsafe code:

func setSomeUIState() {
  MainActor.assumeIsolated {
    // Access @MainActor isolated state
  }
}

is this:

@MainActor
func setSomeUIState() {
  // Access @MainActor isolated state
}

Once you do that, adding the async keyword no longer changes "where" it executes. It does still change it from running on the same loop iteration of the main thread to being scheduled for a later iteration. But this is also something you can't safely couple to. Adding async forces the caller to add await, which similarly pushes everything after the await onto a later loop iteration, so the synchronization between different points in that function remains the same. You would only "notice" that the pieces are spread across multiple run loop iterations if you leave invariants in an inconsistent state across the await, but that's always incorrect (and why any request to remove the await keyword from the language indicates a fundamental misunderstanding of cooperative multitasking). If you need to prevent that even when the call is going through a protocol, you can enforce it by removing the async keyword from the requirement. So once again, protocols requiring conformances to not be async makes sense, and Swift supports this, while requiring conformances to be async really doesn't.
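
A tiny sketch of that direction (hypothetical names): dropping async from the requirement guarantees that conformances can be called synchronously, because an async function cannot satisfy a synchronous requirement.

protocol SyncRequirement {
    func bar() // no async: every conformance must be callable synchronously
}

struct Works: SyncRequirement {
    func bar() {} // fine
}

struct DoesNotCompile: SyncRequirement {
    func bar() async {} // error: an async function can't satisfy the synchronous requirement
}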

My point is checking "where" a non-isolated function runs is an implementation detail in the sense that the compiler is free to change it without it affecting the logic of a well-behaved program. With optimizations on, the compiler can also reorder instructions, but only if it is sure that doesn't change the behavior of the program. That a call to await an async function hops off the actor falls under this category: the compiler freely choosing an implementation that still produces the logic of the program. Since the function isn't actor isolated, that is telling the compiler "run this wherever you like, my program's logic is unaffected by that". If you notice this hop across actors in the sense it changes your program's logic (not what you can see in a debugger), you must be doing something unsafe, similar to how you can "notice" instruction reordering or other optimizations if you violate type punning rules.

1 Like

That's the purpose of the demonstration: it allows you to observe the behaviour in an unambiguous way.

These are two different functions, and they differ from my example as well, which, again, has the single purpose of demonstrating that async has an impact on the function's behaviour.


I'm out of this conversation at this point.