Discussion: Unavailability from asynchronous contexts

etcwilde · October 27, 2021, 1:22am

There are APIs that cannot be used with the Swift concurrency model, or should be used with extreme caution. APIs that involve thread-local storage will not behave correctly, and holding onto locks across suspension points can lead to deadlocks and other weird behavior. As a result, developers should be able to specify API that is not available for use from within asynchronous contexts.

I've started thinking about how such an attribute could work, but have come to a few crossroads.
The high-level idea can be summarized with the snippet below:

@available(*: unavailableFromAsync)
func pthread_shenanigans() {
  // Cute concurrency model you have there. 
  // It would be a shame if something were to happen to it...
}

func asyncFunc() async {
  pthread_shenanigans() // error, `pthread_shenanigans()` is not available from an async context
}

My questions arise when we start talking about how this attribute should propagate.
I see three possible roads, though I am open to additional thoughts:

Implicitly Inherited Unavailability
Explicit Unavailability
Thin Unavailability Checking

Each of these has different pros and cons that I discuss below.

Propagation

The attribute needs to propagate through some mechanism in order to be effective. We don't want folks to be able to circumvent this mechanism with a layer of indirection.

func myWrapperFunc() {
  pthread_shenanigans()
}

func asyncFunc2() async {
  myWrapperFunc() // This should still emit an error
}

Implicitly Inherited Unavailability

Pros:

Low developer overhead
Correct

Cons:

Compile-time cost

Being able to implicitly inherit the unavailability attribute results in a walk of each declaration and expression in the function. This is expensive, but is alleviated by two factors. First, we can memoize the results with the requester. Second, we only need to run this check starting at asynchronous functions and public functions.

Consider the following code:

func a() { }
func b() { a() }
func c() { b() }
func d() {
  a()
  b()
}

func e() async {
  d()
  c()
}

The steps the checker will take are as follows:

e: expand
- d: expand
  - a: expand
  - b: expand
    - a: memoized
- c: expand
  - b: memoized

There are no additional async functions in the file to check, so we are done.
There are 5 expansions and 2 memoized results returned.
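
To make the counting concrete, here is a minimal sketch of the memoized walk. It is a toy model over a dictionary-based call graph, not compiler code; `ImplicitChecker`, `usesUnavailableAPI`, and the string-keyed graph are hypothetical names invented for illustration, and the sketch assumes the call graph has no cycles.

```swift
// Hypothetical model of the implicit-inheritance walk; not compiler code.
struct ImplicitChecker {
    let callGraph: [String: [String]]   // caller -> direct callees
    let annotated: Set<String>          // explicitly marked unavailable
    var memo: [String: Bool] = [:]      // cached answer per function
    var expansions = 0
    var memoHits = 0

    /// True if `name` is annotated or transitively calls annotated API.
    /// Assumes the call graph is acyclic; recursion would need extra care.
    mutating func usesUnavailableAPI(_ name: String) -> Bool {
        if let cached = memo[name] {
            memoHits += 1
            return cached
        }
        expansions += 1
        var result = annotated.contains(name)
        let callees = callGraph[name] ?? []
        for callee in callees {
            // Visit every callee (no early exit) so the counts match the walk.
            if usesUnavailableAPI(callee) { result = true }
        }
        memo[name] = result
        return result
    }
}

// Call graph for a() through e() from the example; nothing is annotated.
let exampleGraph: [String: [String]] = [
    "a": [], "b": ["a"], "c": ["b"], "d": ["a", "b"], "e": ["d", "c"],
]
var checker = ImplicitChecker(callGraph: exampleGraph, annotated: [])
let eIsFlagged = checker.usesUnavailableAPI("e")   // e() is the only async root
print(checker.expansions, checker.memoHits, eIsFlagged)   // prints "5 2 false"
```

Passing `annotated: ["a"]` instead would flag `e`, since the unavailability of `a` propagates up through `b`, `c`, and `d`.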

The developer of a framework only needs to annotate the specific API or types that are unavailable, and the compiler propagates the requirement to every function that calls them. With this implementation, myWrapperFunc implicitly inherits the unavailability from an asynchronous context from the call to pthread_shenanigans.

Module Boundaries

The availability from asynchronous contexts would need to be computed and stored in the swift module for every public function and public type to ensure that bad API can't leak across module boundaries.
This can, of course, be done by the compiler automatically, so most developers probably won't notice.

C/Objective-C

While types and functions can explicitly be attributed with an attribute like __attribute__((swift_attr("unavailableFromAsync"))) (the exact spelling is up for discussion), ultimately, like any other bridging to C, there is a level of trust that the library isn't doing anything nefarious, or that it has been appropriately annotated if it does.

We can check that annotated types and APIs are not being used directly. If they are used directly in a Swift function, then the unavailability is propagated implicitly. If the unavailable API is used within another, unannotated API, it is not possible for Swift to look inside the C/Objective-C function for verification.

Availability Behavior

Pros:

Correct

Medium:

Compile-time cost (Cheaper for unavailability checking)

Cons:

Lots of developer overhead

In this model, developers need to explicitly annotate API with the unavailableFromAsync attribute in order to use any other unavailable APIs or types. This may be fairly noisy as we go through and annotate things like pthread_mutex_t, but may not be too bad with sufficient fix-it mashing.

Checking whether a given expression is available in an asynchronous context is trivial with this model. For each function called in a given asynchronous context, check that the unavailableFromAsync attribute is not present.

Ensuring that each function is properly annotated is where the cost comes in.
The explicit-annotation checker needs to ensure that unavailable API is only used from an unavailable context, and emit a diagnostic and fix-it in cases where that does not hold.
Given the code example from above, the explicit annotation checker follows:

a: expand
b: expand
- a: memoized
c: expand
- b: memoized
d: expand
- a: memoized
- b: memoized
e: expand
- d: memoized
- c: memoized

There are 5 function expansions and 6 memoized results returned. The example had no functions that were untouched by the asynchronous function e, so the number of expanded functions is the same as in the implicit case. If additional synchronous functions were declared in the file, the number of expanded functions would increase for the explicit annotation checking where it wouldn't for the implicit availability checking. That all said, the checking for this already exists in the compiler today for ensuring that declarations have the necessary OS availability attributes (in places where fixAvailabilityForDecl gets called). I'm still wrapping my head around how these mechanisms work, so my understanding of the cost model is fuzzy.
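
As a rough illustration of that cost model, the sketch below walks every declaration once and only reads the declared attribute of each direct callee. It is a toy model, not the compiler's availability checker; `FunctionDecl`, `ExplicitCheckResult`, and `checkExplicitAnnotations` are hypothetical names, and here `a` is assumed to be annotated so that some diagnostics appear.

```swift
// Hypothetical model of the explicit-annotation checker; not compiler code.
struct FunctionDecl {
    let name: String
    let isAsync: Bool
    let isAnnotated: Bool   // carries the unavailableFromAsync attribute
    let callees: [String]   // functions called directly in the body
}

struct ExplicitCheckResult {
    var expansions = 0            // function bodies walked
    var lookups = 0               // callee attributes consulted
    var diagnostics: [String] = []
}

func checkExplicitAnnotations(_ decls: [FunctionDecl]) -> ExplicitCheckResult {
    let annotated = Set(decls.filter { $0.isAnnotated }.map { $0.name })
    var result = ExplicitCheckResult()
    for decl in decls {
        result.expansions += 1
        for callee in decl.callees {
            // Only the callee's declared attribute is read, never its body.
            result.lookups += 1
            guard annotated.contains(callee), !decl.isAnnotated else { continue }
            if decl.isAsync {
                result.diagnostics.append(
                    "error: \(callee) is not available from async function \(decl.name)")
            } else {
                result.diagnostics.append(
                    "fix-it: mark \(decl.name) unavailableFromAsync (it calls \(callee))")
            }
        }
    }
    return result
}

// The a() through e() example, with a() assumed to be annotated.
let explicitExample = [
    FunctionDecl(name: "a", isAsync: false, isAnnotated: true, callees: []),
    FunctionDecl(name: "b", isAsync: false, isAnnotated: false, callees: ["a"]),
    FunctionDecl(name: "c", isAsync: false, isAnnotated: false, callees: ["b"]),
    FunctionDecl(name: "d", isAsync: false, isAnnotated: false, callees: ["a", "b"]),
    FunctionDecl(name: "e", isAsync: true, isAnnotated: false, callees: ["d", "c"]),
]
let explicitResult = checkExplicitAnnotations(explicitExample)
print(explicitResult.expansions, explicitResult.lookups)   // prints "5 6"
```

In this run only `b` and `d` receive fix-its, because they call `a` directly. `c` and `e` would be flagged on a later pass, once the fix-its on `b` and `d` have been applied, which is where the developer overhead shows up.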

Module Boundaries

Like with the implicit version, public API would need to be annotated at module boundaries.
This is not an extra burden because the explicit annotation is already required on functions if they are unavailable.

C/Objective-C

Like with the implicit checking, we can ensure that annotated C/Objective-C types are not being used directly from asynchronous contexts, or indirectly through a chain of synchronous calls in Swift, through the use of an unavailableFromAsync swift attribute. The same limitations exist: we are unable to verify that the unavailable APIs are not used within C/Objective-C functions.

Thin Checking

Pros:

Compile-time cost (It's cheap!)
Low developer overhead

Cons:

It's easy to mess up

This is the cheapest option with the least level of protection. The checking only verifies that an asynchronous function isn't directly calling or using an unavailable declaration. This is a simple walk of the expressions and types in the body of the asynchronous function verifying that nothing has the unavailableFromAsync attribute.

This would allow developers to circumvent the protection with a layer of indirection, either by wrapping the unavailable type in a struct of their own, or the function in another synchronous function.
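
A small sketch of the thin check, again as a toy model rather than compiler code, shows both the check and the hole. `ThinDecl` and `thinCheck` are hypothetical names; the declarations mirror pthread_shenanigans, myWrapperFunc, asyncFunc, and asyncFunc2 from the snippets earlier in this post.

```swift
// Hypothetical model of thin checking; not compiler code.
struct ThinDecl {
    let name: String
    let isAsync: Bool
    let isAnnotated: Bool   // carries the unavailableFromAsync attribute
    let callees: [String]   // functions called directly in the body
}

/// Reports async functions that directly call an annotated declaration.
/// Synchronous functions are never inspected, so wrappers go unnoticed.
func thinCheck(_ decls: [ThinDecl]) -> [String] {
    let annotated = Set(decls.filter { $0.isAnnotated }.map { $0.name })
    var errors: [String] = []
    for decl in decls where decl.isAsync {
        for callee in decl.callees where annotated.contains(callee) {
            errors.append("\(decl.name): '\(callee)' is not available from an async context")
        }
    }
    return errors
}

let thinExample = [
    ThinDecl(name: "pthread_shenanigans", isAsync: false, isAnnotated: true, callees: []),
    ThinDecl(name: "myWrapperFunc", isAsync: false, isAnnotated: false,
             callees: ["pthread_shenanigans"]),
    ThinDecl(name: "asyncFunc", isAsync: true, isAnnotated: false,
             callees: ["pthread_shenanigans"]),
    ThinDecl(name: "asyncFunc2", isAsync: true, isAnnotated: false,
             callees: ["myWrapperFunc"]),
]
let thinErrors = thinCheck(thinExample)
print(thinErrors.count)   // prints "1"
```

Only asyncFunc is reported. asyncFunc2 reaches the same unavailable function through myWrapperFunc and passes the check.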

Future

In the future, I would like to propose a mechanism for a weaker form of checking. It is technically possible to use semaphores and locks in the straight-line code between suspension points safely and a model for that would be beneficial. That is not part of this discussion.

xwu · October 27, 2021, 3:06am

What if we spelled this as nonasync and let the rules fall out from there, paralleling async (implying strongly something similar to what you call “availability behavior”)?

beccadax · October 27, 2021, 3:13am

Let's start with:

Propagation

I don't think implicitly-inherited is a reasonable possibility because a caller would not be able to know if its callee is unavailableFromAsync without type-checking the callee's body. Being able to skip this work is one of the linchpins of non-WMO compilation performance; pulling it out could have drastic effects on compilation speed.

I also worry that there might be cycles in the analysis that would make it more difficult to handle than you might at first assume. Memoization is straightforward when you're processing trees, but once functions are no longer marked explicitly, this becomes a graph traversal. I would guess that it's tractable, but it's not easy.

If implicitly inherited is out, that leaves only the explicit propagation and thin checking options. This is a pretty straightforward trade-off between "adds another effect/color" and "misses bugs". I'm going to assume we want to catch more bugs, because the design questions for explicit propagation are a strict superset of the ones for thin checking.

Declarations affected

I actually don't think we'd want to mark types like pthread_mutex_t because it is often possible to build safe operations from unsafe ones. For instance, the primitive pthread_mutex_t operations are unsafe to use with async/await, but this one should be safe (or at least as safe as any lock ever is) since you can't await in the critical section:

func pthread_mutex_with_lock<R>(
    _ mutex: UnsafeMutablePointer<pthread_mutex_t>,
    do criticalSection: () throws -> R    // Note: not async!
) rethrows -> R {
    switch pthread_mutex_lock(mutex) {
    case 0:
        defer { pthread_mutex_unlock(mutex) }
        return try criticalSection()

    case EINVAL:
        preconditionFailure("invalid mutex")
    case EDEADLK:
        fatalError("locking would cause a deadlock")
    case let value:
        fatalError("unknown pthread_mutex_lock(_:) return value \(value)")
    }
}

That suggests we should be marking operations, not types, as unavailableFromAsync. It also suggests we need a way to mark a particular operation as permitted to use unavailableFromAsync operations without making itself unavailableFromAsync to its callers:

@unchecked(unavailableFromAsync)    // or something
func pthread_mutex_with_lock<R>(
    _ mutex: UnsafeMutablePointer<pthread_mutex_t>,
    do criticalSection: () throws -> R
) rethrows -> R { /* ...as before... */ }
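
For a sense of how such a wrapper would be used, here is a minimal sketch. `LockedCounter` and `recordEvent(on:)` are hypothetical names made up for illustration, and the helper is a condensed copy of the one above (with the error cases folded into a single precondition) so that the snippet compiles on its own.

```swift
import Foundation

// Condensed copy of the helper above so this sketch compiles on its own.
func pthread_mutex_with_lock<R>(
    _ mutex: UnsafeMutablePointer<pthread_mutex_t>,
    do criticalSection: () throws -> R    // Note: not async!
) rethrows -> R {
    let status = pthread_mutex_lock(mutex)
    precondition(status == 0, "pthread_mutex_lock failed with status \(status)")
    defer { pthread_mutex_unlock(mutex) }
    return try criticalSection()
}

// Hypothetical lock-protected counter, used only for illustration.
final class LockedCounter: @unchecked Sendable {
    private let mutex = UnsafeMutablePointer<pthread_mutex_t>.allocate(capacity: 1)
    private var value = 0

    init() {
        mutex.initialize(to: pthread_mutex_t())
        let status = pthread_mutex_init(mutex, nil)
        precondition(status == 0, "pthread_mutex_init failed with status \(status)")
    }

    deinit {
        pthread_mutex_destroy(mutex)
        mutex.deinitialize(count: 1)
        mutex.deallocate()
    }

    /// Increments the counter under the lock and returns the new value.
    func increment() -> Int {
        pthread_mutex_with_lock(mutex) {
            value += 1
            return value
        }
    }
}

// The lock is taken and released inside one synchronous call, so no
// suspension point can occur while it is held.
func recordEvent(on counter: LockedCounter) async -> Int {
    counter.increment()
}
```

The async caller never sees the raw lock and unlock operations, which is the property that would let the wrapper stay available from async code even if the primitives were not.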

Feature set

There are a lot of bells and whistles we could add to this. For instance:

Should this feature accept a custom message:? How about renamed:?
Should this feature make the operation deprecated, unavailable, or give both options?
Should this feature support the standard set of platform and version options we expect from @available?
Should this feature allow you to say that an operation is unavailable/deprecated from async before a specific version? For instance, suppose iOS 15 ships with a version of an operation that's compatible with Swift Concurrency, but if you use concurrency in back-deployed scenarios, it's unsafe. This isn't something that existing kinds of availability can express.

For my money, I think message: is a no-brainer (it lets you tell your users what they should do instead), renamed: and a deprecation option are solid contenders, and any sort of versioning is probably overkill. But reasonable engineers may not agree.

Syntax

Finally, we get to the bikeshed color. I think that, if this feature does not have any OS- or version-specific behavior, it probably shouldn't be part of @available. Instead, it's better thought of as something like @discardableResult that specifies diagnostic behavior at the call site.

I'm not going to push for a specific spelling, but keep in mind that I think it should probably support at least an unsafe or unchecked flag and a message: option.

George · October 27, 2021, 5:12am

I've been thinking about this very issue but from a slightly different perspective: blocking APIs. These aren't quite as explicitly problematic as thread-local-accessing-APIs, but can still result in very subtle failures, ranging from performance degradation to resource starvation. I'd bet this is the reason NIO provides NonblockingFileIO, which allows you to schedule totally-could-block API calls on their own threads, thus leaving the primary event loops available for "true" nonblocking async code. My guess is it will be really common for folks to accidentally call subtly blocking APIs from async contexts without a mechanism such as this. @beccadax's suggestions of renamed: seems like an important part of the solution (especially for APIs without similarly-spelt async overloads).

etcwilde · November 11, 2021, 2:11am

Alright, taking into account the feedback here, I have an initial implementation that should be mostly ready: GitHub - etcwilde/swift at ewilde/concurrency/UnavailableFromAsync-attribute.
I still need to write up the full evolution proposal and incorporate a few things.

After various discussions, the performance cost of either form of guaranteed checking is too high, and it would need far more consideration since it adds another color to functions. As a result, I've gone with a thinner model, so it will be possible to circumvent the checking by wrapping unavailable calls in a synchronous closure and calling that. The attribute can only be placed on function declarations and on constructor declarations, but not on destructors since we can't guarantee where those will be called and they should be safe to use from anywhere.

Because the checking is thin, we don't need a way to opt out when the API is actually being used safely; wrapping the unsafe function in another function or closure will suffice. It is up to the API developers to ensure that they are propagating that annotation correctly when working with mutexes, semaphores, and thread-local storage.

The annotation will start as a warning in Swift 5.6(?) and turn into a proper hard error in Swift 6.

In accommodating the discussion, I've made the following changes:

The checking is weak, only looking at the declarations used in async contexts
The annotation can only be used on constructors and function declarations
The spelling is @unavailableFromAsync.

In order to get this implemented fairly quickly, I haven't added any of the bells nor the whistles yet.

Yep, that sounds like a good reason not to go down this route to me.

I went down the @available route to match up with the direction @completionHandlerAsync took.
This is kind of similar to that, except that this attribute is about correctness and should be a hard error, while that one is more of a suggestion. I do like using a different syntax though as it simplifies the implementation a fair bit. Maybe not the right reason, but it works for me.

An optional message: or renamed: probably makes sense in most cases. Most things should be wrappable in something like the pthread_mutex_with_lock function described above, though I'm not sure all of them will be.

Since we aren't doing the strong checking, I don't think we need an unsafe or unchecked flag anymore; wrapping the call in a synchronous closure call will have the same effect as not checking it.

@George, I think we could accommodate this here. While I'm going back-and-forth on whether the goals align better with [Pitch] @completionHandlerAsync attribute - #27 by bnbarham, it might make sense to have that be part of this attribute. A spelling could look something like @unavailableFromAsync(weak, message: "This is slow") for the blocking, but not technically wrong, case, or @unavailableFromAsync(message: "This causes resource starvation") for the "it's wrong, and you'll hurt if you do this" case. The weak form would stay a warning in Swift 6, while the strong form would eventually be an error.

etcwilde · December 4, 2021, 12:06am

George:

I've been thinking about this very issue but from a slightly different perspective: blocking APIs. These aren't quite as explicitly problematic as thread-local-accessing-APIs, but can still result in very subtle failures, ranging from performance degradation to resource starvation. I'd bet this is the reason NIO provides NonblockingFileIO, which allows you to schedule totally-could-block API calls on their own threads, thus leaving the primary event loops available for "true" nonblocking async code. My guess is it will be really common for folks to accidentally call subtly blocking APIs from async contexts without a mechanism such as this. @beccadax's suggestions of renamed: seems like an important part of the solution (especially for APIs without similarly-spelt async overloads).

I've been mulling this comment over for a while now. I'm still not quite convinced that blocking calls are necessarily something you want to ban from asynchronous contexts. It seems like you would be able to wrap the blocking call in a detached task/async-let to avoid blocking the current task if that is necessary. Am I missing something? There is a subtle undertone of the goals of the completionHandlerAsync attribute; [Pitch] @completionHandlerAsync attribute. Right now that only works on functions that have an escaping closure. I wonder if that could be generalized a bit more for the warning case?

ktoso · December 4, 2021, 3:57am

Yeah you’re right that “banning blocking from async contexts” isn’t really meaningful.

What the actual need is, can be expressed as “ban blocking on cooperative executors” such as an NIO event loop, or our global executor in Swift Concurrency.

But the entire purpose of asyncing “away” such work is exactly to offload the blocking to a different thread, and that’s where we’re missing functionality today. You’d indeed want to create a new task for the blocking work, but that task would want to run on an “executor dedicated to blocking things” (IO is one example, but it could also be a “very slow function that does not yield asynchronously” or anything like that).

In today’s Swift Concurrency, “just make a new Task to invoke the blocking code” is meaningless, because they all end up on the same shared pool.

Akka’s success was IMHO in large part due to how easy it is to write an asynchronous system AROUND terrible blocking things, and just give the “terrible blocking dispatcher (executor in our terminology)” to actors who invoke the terrible blocking code.

Long story short: the right thing to do about blocking is “do it on a task that is NOT on the shared pool”. And we’ll be able to do this only once custom executors are a thing.

So: we should not ban calling blocking code in “async context” in general, but if there would be a way to ever detect what kind of executor we’re on that’d be something exciting to issue warnings about — in general this isn’t solvable I think, but in specific contexts where we know we’re on the global pool or on the main actor, we could perhaps issue warnings?

They’re not really actionable “within” swift concurrency until we get custom executors tho. But people can of course make their own dedicated queue or threadpool for the blocking things and run them and complete an unsafe continuation from there etc. But within swift concurrency to solve this we’ll need custom executors.
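
A minimal sketch of that workaround, assuming Dispatch is available: blocking work runs on a dedicated serial queue and resumes a continuation when it finishes. `BlockingWork`, `runBlocking`, and the queue label are hypothetical names made up for illustration, and the sketch uses a checked continuation rather than an unsafe one so that misuse is diagnosed at runtime.

```swift
import Dispatch

// Hypothetical home for a dedicated queue; the label is made up.
enum BlockingWork {
    static let queue = DispatchQueue(label: "example.blocking-work")
}

/// Runs `work` on the dedicated queue and suspends the caller until it
/// finishes, so the blocking call does not occupy a cooperative-pool thread.
func runBlocking<T: Sendable>(_ work: @escaping @Sendable () -> T) async -> T {
    await withCheckedContinuation { continuation in
        BlockingWork.queue.async {
            continuation.resume(returning: work())
        }
    }
}

// Usage from an async context; the closure stands in for a blocking call.
func loadAnswer() async -> Int {
    await runBlocking { 21 * 2 }
}
```

The calling task is suspended rather than blocked while the queue does the work. A serial queue is used here for simplicity; a concurrent queue or a thread pool would follow the same shape.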

etcwilde · December 4, 2021, 5:13am

Hmm, interesting. This is around 500 miles out of scope of this proposal, but I wonder if it would be useful to have an IO global actor (@IOActor anyone?), kind of like @MainActor, but for IO. Since you can annotate API as having to run on a given global actor, you could force IO operations to happen on the IO actor. That would be kind of neat.
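
Declaring such a global actor is already possible today. The sketch below is only an illustration: `IOActor`, `readGreeting`, and `greet` are hypothetical names, and without a custom executor the actor still runs on the shared pool, so it serializes IO calls without moving them to another thread.

```swift
// Hypothetical global actor for IO; it uses the default executor.
@globalActor
actor IOActor {
    static let shared = IOActor()
}

// Stand-in for an IO operation that must run on the IO actor.
@IOActor
func readGreeting() -> String {
    "hello"
}

// Callers outside the actor have to hop onto it with `await`.
func greet() async -> String {
    await readGreeting()
}
```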

ktoso · December 4, 2021, 5:21am

Hah, indeed! We’re still far away from this… but once custom executors enter the picture it’s definitely something to consider, especially if we want to offer IO things.

hassila · December 4, 2021, 6:38am

I wish we also had a true async backend for the kernel syscalls on Darwin (no idea about Windows), like Linux io_uring. Then IO basically isn’t blocking anymore in practice.

Side note on annotation - I always felt @MainActor might want to be spelled @Actor(Main) to be able to extend it for custom executors but no deeper thought than that.

David_Smith · December 4, 2021, 7:11pm

Internally, this is how AsyncBytes works today.

etcwilde · December 6, 2021, 10:09pm

Thank you for the feedback here on the discussion. I think I've compiled it all into a reasonably nice pitch here: Pitch: Unavailability from asynchronous contexts.