When should you mark a function `async`?

taylorswift · May 10, 2023, 12:00am

i have some code that calls system functions like read, readdir, etc. these functions aren’t async, but they are sort of “async-like” in that they can block.

is there any point in marking the API wrapper async, even though no awaits appear inside the function?

tera · May 10, 2023, 1:00am

It makes sense: await / async will hop to a background thread, so if the read / readdir wrapper happens to be called on the main thread, the main thread will not be blocked:

import Foundation

func readAsync() async {
    // Traps if this body ever runs on the main queue.
    dispatchPrecondition(condition: .notOnQueue(.main))
    read(...)
}

@MainActor
func testProc() async {
    print("testProc started")
    dispatchPrecondition(condition: .onQueue(.main))
    await readAsync()
    dispatchPrecondition(condition: .onQueue(.main))
    print("testProc ended")
}

Task {
    dispatchPrecondition(condition: .notOnQueue(.main))
    await testProc()
}
RunLoop.main.run(until: .distantFuture)
print()

BTW, consider using the existing async analogue of read (e.g. fileURL.resourceBytes).
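As a rough sketch of that suggestion: on Apple platforms (macOS 12 / iOS 15 or later, as far as I know) `URL.resourceBytes` exposes a file as an `AsyncSequence` of bytes. The helper name `countBytes` is made up for illustration.

```swift
import Foundation

// Hypothetical helper: counts the bytes in a file without calling read() directly.
func countBytes(at fileURL: URL) async throws -> Int {
    var count = 0
    // Every iteration awaits, so each step is a potential suspension point.
    for try await _ in fileURL.resourceBytes {
        count += 1
    }
    return count
}
```

Because the loop awaits, this wrapper genuinely needs to be `async`, unlike a thin wrapper around a blocking `read`.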

ktoso · May 10, 2023, 4:16am

The real answer for IO is "put such methods on some specific actor, and give it a dedicated executor that is NOT the global cooperative default pool". That gets code blocking on IO off the cooperative pool, where it can easily exhaust the available threads and cause the rest of the system to stall while the IO is blocking all of the (few) available threads. The default pool does not add additional threads to compensate for such behavior (by design).

These methods don't actually have to be marked async, since calling them cross actor will have the implicit hop anyway.

The question in the title is about "when should you mark a function async" and to that the answer is perhaps a bit underwhelming: when it needs to suspend. If it doesn't, probably no need to make it async. (Blocking on e.g. IO is not suspending).
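A minimal sketch of that advice, assuming a toolchain with custom actor executors (SE-0392, Swift 5.9 or later, and a matching OS on Apple platforms). `DispatchQueueExecutor` and `BlockingIO` are hypothetical names, and a serial dispatch queue is only one possible backing for the executor.

```swift
import Foundation

// Hypothetical executor: runs the actor's jobs on a dedicated serial queue
// instead of the global cooperative pool.
final class DispatchQueueExecutor: SerialExecutor {
    let queue: DispatchQueue

    init(label: String) {
        self.queue = DispatchQueue(label: label)
    }

    func enqueue(_ job: consuming ExecutorJob) {
        let unownedJob = UnownedJob(job)
        let executor = asUnownedSerialExecutor()
        queue.async {
            unownedJob.runSynchronously(on: executor)
        }
    }

    func asUnownedSerialExecutor() -> UnownedSerialExecutor {
        UnownedSerialExecutor(ordinary: self)
    }
}

actor BlockingIO {
    private let executor = DispatchQueueExecutor(label: "blocking-io")

    // Tells the runtime to schedule this actor on the custom executor.
    nonisolated var unownedExecutor: UnownedSerialExecutor {
        executor.asUnownedSerialExecutor()
    }

    // Not marked `async`: it may block on IO, but it never suspends.
    func contents(atPath path: String) -> [UInt8]? {
        dispatchPrecondition(condition: .onQueue(executor.queue))
        guard let data = FileManager.default.contents(atPath: path) else { return nil }
        return [UInt8](data)
    }
}
```

Callers outside the actor write `await io.contents(atPath: path)`; the `await` comes from the cross-actor hop, not from an `async` keyword on the method.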

Jon_Shier · May 10, 2023, 4:22am

It would be nice if Swift provided such an abstraction, or if Dispatch was updated to expose the limited width pools that Swift uses behind the scenes. Otherwise we don't seem to have a great way to deal with IO right now.

ktoso · May 10, 2023, 4:47am

Certainly, and we're aware of the general need. Raising an issue on GitHub is helpful if you'd like to do that, though; forum comments just get lost in anecdotes that are harder to turn into scheduled work.

On the server side, NIO offers a pool that is a good candidate to base such an executor on: NIOThreadPool.swift. It's some work, but not too hard.

adamkuipers · May 10, 2023, 7:10pm

I thought a function marked async without a suspension point is never going to suspend. The original async/await proposal says, "asynchronous functions never just spontaneously give up their thread; they only give up their thread when they reach what’s called a suspension point." I read this as saying synchronous and asynchronous functions will behave the same if they only call another synchronous function (e.g., read).

Avi · May 10, 2023, 7:27pm

The primary effect of marking a function as async is that it must be called from an async context. The same function without the attribute can be called from any context. This, I believe, is the point behind Konrad's advice: only make a function async if it has to suspend.
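To make the calling-context difference concrete, here is a small sketch with made-up function names. The two checksum functions have identical bodies; only the `async` keyword differs.

```swift
// Synchronous: callable from both synchronous and asynchronous code.
func checksum(_ bytes: [UInt8]) -> Int {
    bytes.reduce(0) { $0 + Int($1) }
}

// Needlessly async: there is no suspension point inside, yet every caller
// must now be in an async context and write `await`.
func checksumAsync(_ bytes: [UInt8]) async -> Int {
    bytes.reduce(0) { $0 + Int($1) }
}

func syncCaller() -> Int {
    // Calling checksumAsync here would not compile:
    // 'async' call in a function that does not support concurrency
    return checksum([1, 2, 3])
}

func asyncCaller() async -> Int {
    let a = checksum([1, 2, 3])            // no `await` needed
    let b = await checksumAsync([1, 2, 3]) // `await` required
    return a + b
}
```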

taylorswift · May 10, 2023, 8:31pm

well, there is one big reason why you would want to (over)mark certain things as async, because AsyncSequence can throw, while Sequence cannot:

extension FilePath.DirectoryView:AsyncSequence
{
    @inlinable public
    func makeAsyncIterator() -> FilePath.DirectoryView
    {
        self
    }
}
extension FilePath.DirectoryView:AsyncIteratorProtocol
{
    public
    typealias Element = FilePath.Component

    public
    func next() async throws -> FilePath.Component?
    {
        ...

for comparison, the code this FilePath.DirectoryView implementation is based on just swallows all errors.

perhaps this is a silly reason to promote something to async. but there just doesn’t seem to exist a better way to model a sequence of values that can fail in the middle of iteration than AsyncSequence.
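A self-contained sketch of the idea, using a made-up `FlakyEntries` type as a stand-in for a directory listing (it is not the `FilePath.DirectoryView` code above): the iterator throws partway through, and the consumer keeps what it saw before the failure.

```swift
struct EntryError: Error {}

// Hypothetical sequence that fails after yielding `failAfter` elements.
struct FlakyEntries: AsyncSequence, AsyncIteratorProtocol {
    typealias Element = String

    var remaining: [String]
    let failAfter: Int
    var yielded = 0

    func makeAsyncIterator() -> FlakyEntries { self }

    mutating func next() async throws -> String? {
        if yielded == failAfter { throw EntryError() }
        guard !remaining.isEmpty else { return nil }
        yielded += 1
        return remaining.removeFirst()
    }
}

// Collects entries until the sequence ends or throws.
func collect(_ entries: FlakyEntries) async -> (seen: [String], failed: Bool) {
    var seen: [String] = []
    do {
        for try await name in entries { seen.append(name) }
        return (seen, false)
    } catch {
        return (seen, true)
    }
}
```

Nothing here ever suspends, which is exactly the "over-marking" trade-off described above: `async` is being paid for in order to get `throws` on `next()`.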

Nickolas_Pohilets · May 10, 2023, 9:02pm

Don’t default actors use the same global cooperative pool under the hood? They own a queue of jobs, but draining this queue is still scheduled on the generic executor, or not?

taylorswift · May 10, 2023, 9:06pm

what if i just put them on some global actor like @SystemActor?

@globalActor
public
actor SystemActor:GlobalActor
{
    public static
    let shared:SystemActor = .init()

    init()
    {
    }
}
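A sketch of how such a global actor might be used, assuming a simplified copy of the `SystemActor` definition so the snippet stands alone. `listDirectory` and `caller` are hypothetical names; the Foundation call stands in for a blocking system call like `readdir`.

```swift
import Foundation

@globalActor
actor SystemActor: GlobalActor {
    static let shared = SystemActor()
}

// Synchronous and potentially blocking. The actor isolation, not an
// `async` keyword, is what forces callers elsewhere to hop.
@SystemActor
func listDirectory(_ path: String) throws -> [String] {
    try FileManager.default.contentsOfDirectory(atPath: path)
}

func caller() async throws -> [String] {
    // Cross-actor call: `await` is required even though listDirectory is not async.
    try await listDirectory(NSTemporaryDirectory())
}
```

All `@SystemActor` functions are serialized with each other, so at most one of them blocks at a time.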

stephencelis · May 10, 2023, 9:50pm

It's probably worth pushing for fleshing out @rethrows support and extending it to Sequence over using AsyncSequence for this purpose.

taylorswift · May 10, 2023, 10:12pm

well, right now my unsafe stream iterator is bound to a global actor:

extension FilePath.DirectoryIterator.Stream
{
    @SystemActor
    private mutating
    func open() throws -> FilePath.DirectoryPointer?
    {
        ...

so my “safe” iterator interface needs to await on the SystemActor anyway:

extension FilePath.DirectoryIterator:AsyncIteratorProtocol
{
    public
    func next() async throws -> FilePath.Component?
    {
        try await self.stream.next()
    }
}

it would be great for unrelated use-cases to have Sequence support @rethrows. but i can adapt my filesystem-adjacent code to use AsyncSequence today, with a few hours of work. by contrast, pushing swift-evolution legislation flunks a few important criteria for me:

effort (tremendous)
timeliness (likely to take up to a year)
likelihood of success (extremely low, from past experience)

so, AsyncSequence it is.

David_Smith · May 10, 2023, 11:44pm

This prevents it from using all the cooperative pool threads, which is good, but has two notable downsides:

You can't do more than one IO concurrently. This is much less important for most regular files, but can matter a lot for pipes, network IO, and dubiously responsive network filesystems
It still occupies one cooperative pool thread, reducing your available parallelism for CPU work

On the other hand, the custom executor solution needs to manually limit its concurrency to avoid thread explosions (and determining the optimal width to limit to is heavily hardware-dependent in non-obvious ways), and may not play nicely with priority donation.

For the most common scenarios in mobile or desktop apps, I think single-threading regular file IO like this is a completely reasonable default behavior (and in fact it's exactly what I implemented in the initial version of AsyncBytes for file URLs). For server-side situations, other types of IO, or workloads with unusual performance requirements, other strategies may be needed (as we see in NIO).

Making this less situational and easier to get right would be a nice improvement, but is not nearly as simple as it looks at first, second, or third glance.

taylorswift · May 11, 2023, 12:24am

does an actor occupy an entire thread? what happens if there are more actors than there are threads?

David_Smith · May 11, 2023, 12:29am

Only while it's executing something. Once it suspends, something else can use the thread. If there are no free pool threads the work is kept in a priority queue to wait until it has a chance to execute. The reason IO is tricky is that read() doesn't suspend, it just sits there until the kernel gets back to it.
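One stopgap that is not proposed in this thread, but is commonly used, is to run the blocking call on a dispatch queue and bridge back with a continuation, so that the caller suspends instead of pinning a cooperative pool thread. A minimal sketch with a hypothetical helper named `runBlocking`:

```swift
import Dispatch

// Hypothetical helper: runs a blocking closure on the given queue and
// suspends the caller until the closure returns or throws.
func runBlocking<T: Sendable>(
    on queue: DispatchQueue,
    _ work: @escaping @Sendable () throws -> T
) async throws -> T {
    try await withCheckedThrowingContinuation { continuation in
        queue.async {
            // The blocking call sits on the queue's thread, not on a pool thread.
            continuation.resume(with: Result { try work() })
        }
    }
}
```

The blocking call still occupies a thread somewhere. A serial queue caps that at one thread, while a concurrent queue would need its own width limit to avoid the thread explosion mentioned earlier in the thread.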

taylorswift · May 11, 2023, 12:43am

isn’t this unavoidable? that read call needs to execute somewhere, right?

David_Smith · May 11, 2023, 1:53am

Indeed. It's a conundrum, isn't it? Regardless of where we end up long term, we'll need patterns to use until then, so I'm glad other people are thinking about this stuff too.

taylorswift · May 11, 2023, 2:33am

thinking about this more, @SystemActor is probably not the answer because if we have a subprocess and a pipe we’re reading its stdout from, then:

we have a waitpid, which runs on and blocks the SystemActor, and
we have a read from the pipe, which needs to run on the SystemActor, which is currently busy waiting for waitpid to return.
but waitpid is blocked on read, because the subprocess has filled up its stdout buffer, and is blocked on write.
but read is blocked waiting for waitpid to yield the actor. so we have a deadlock.

ughhh.

eskimo · May 11, 2023, 8:19am

It's a conundrum, isn't it?

We just need to rewrite APFS in Swift and then it’s async all the way down!

(-:

Share and Enjoy

Quinn “The Eskimo!” @ DTS @ Apple

tclementdev · May 11, 2023, 8:38am

But due to SE-0338, marking a function as async will cause it to execute somewhere on the concurrency thread pool. This is already not appropriate for IO, and if the caller is running on the main thread (or anywhere outside of the concurrency thread pool, or in an actor), it will have to suspend and await.