i have some code that calls system functions like read, readdir, etc. these functions aren’t async, but they are sort of “async-like” in that they can block.
is there any point in marking the API wrapper async, even though no awaits appear inside the function?
It makes sense: async / await will hop to a background thread, so if the read / readdir wrapper happens to be called on the main thread, the main thread will not be blocked.
The real answer for IO is "put such methods on some specific actor, and give it a dedicated executor that is NOT the global cooperative default pool". That gets code that blocks on IO off the cooperative pool, which it can otherwise easily exhaust, stalling the rest of the system while the IO blocks all of the (few) available threads. The default pool does not add additional threads to compensate for such behavior (by design).
These methods don't actually have to be marked async, since calling them cross actor will have the implicit hop anyway.
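For illustration, here is a minimal sketch of that pattern, assuming Swift 5.9 or later (custom actor executors from SE-0392; on Apple platforms this needs a macOS 14 / iOS 17 deployment target). The names `DispatchQueueExecutor` and `DirectoryIO` are made up for this example, and a serial `DispatchQueue` is only one possible backing for the executor, not the recommended one.

```swift
import Dispatch
import Foundation

// Hypothetical executor: runs actor jobs on a private serial DispatchQueue
// instead of the global cooperative pool.
final class DispatchQueueExecutor: SerialExecutor {
    // DispatchQueue(label:) creates a serial queue, which is what a
    // SerialExecutor needs: one job at a time.
    private let queue: DispatchQueue

    init(label: String) {
        self.queue = DispatchQueue(label: label)
    }

    func enqueue(_ job: consuming ExecutorJob) {
        let unownedJob = UnownedJob(job)
        let executor = asUnownedSerialExecutor()
        queue.async {
            unownedJob.runSynchronously(on: executor)
        }
    }

    func asUnownedSerialExecutor() -> UnownedSerialExecutor {
        UnownedSerialExecutor(ordinary: self)
    }
}

// Hypothetical actor that owns the blocking filesystem calls.
actor DirectoryIO {
    private let executor = DispatchQueueExecutor(label: "example.directory-io")

    // Tells the runtime to run this actor's jobs on our executor.
    nonisolated var unownedExecutor: UnownedSerialExecutor {
        executor.asUnownedSerialExecutor()
    }

    // Not marked async: it never suspends, it only blocks.
    func list(_ path: String) throws -> [String] {
        try FileManager.default.contentsOfDirectory(atPath: path)
    }
}
```

A caller outside the actor still writes `try await io.list(path)`, because the cross-actor call is where the hop happens. The blocking call then occupies the private queue's thread rather than a cooperative pool thread.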
The question in the title is about "when should you mark a function async" and to that the answer is perhaps a bit underwhelming: when it needs to suspend. If it doesn't, probably no need to make it async. (Blocking on e.g. IO is not suspending).
It would be nice if Swift provided such an abstraction, or if Dispatch was updated to expose the limited width pools that Swift uses behind the scenes. Otherwise we don't seem to have a great way to deal with IO right now.
Certainly. And we're aware of the general need. Raising an issue on GitHub is helpful if you'd like to do that, though; forum comments just get lost in anecdotes that are harder to turn into scheduled work.
On the server side, NIO offers a pool that is a good candidate to base such an executor on: NIOThreadPool.swift. It's some work but not too hard.
I thought a function marked async without a suspension point is never going to suspend. The original async/await proposal says, "asynchronous functions never just spontaneously give up their thread; they only give up their thread when they reach what’s called a suspension point." I read this as saying synchronous and asynchronous functions will behave the same if they only call another synchronous function (e.g., read).
The primary effect of marking a function as async is that it must be called from an async context. The same function without the attribute can be called from any context. This, I believe, is the point behind Konrad's advice: only make a function async if it has to suspend.
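A small sketch of that difference, using a hypothetical `listDirectory` wrapper (the name and the use of `FileManager` are just for illustration). Left synchronous, the wrapper is usable from both kinds of context:

```swift
import Foundation

// A plain synchronous wrapper around a blocking filesystem call.
func listDirectory(_ path: String) throws -> [String] {
    try FileManager.default.contentsOfDirectory(atPath: path)
}

// Callable from synchronous code...
func entryCountSync(_ path: String) throws -> Int {
    try listDirectory(path).count
}

// ...and from asynchronous code, with no `await` needed at the call site.
func entryCountAsync(_ path: String) async throws -> Int {
    try listDirectory(path).count
}
```

Had `listDirectory` been marked async, `entryCountSync` would no longer compile without wrapping the call in a Task.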
well, there is one big reason why you would want to (over)mark certain things as async, because AsyncSequence can throw, while Sequence cannot:
extension FilePath.DirectoryView:AsyncSequence
{
    @inlinable public
    func makeAsyncIterator() -> FilePath.DirectoryView
    {
        self
    }
}
extension FilePath.DirectoryView:AsyncIteratorProtocol
{
    public
    typealias Element = FilePath.Component
    public
    func next() async throws -> FilePath.Component?
    {
        ...
for comparison, the code this FilePath.DirectoryView implementation is based on just swallows all errors.
perhaps this is a silly reason to promote something to async. but there just doesn’t seem to exist a better way to model a sequence of values that can fail in the middle of iteration than AsyncSequence.
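to make the "fail in the middle of iteration" part concrete, here is a self-contained sketch with a made-up `Countdown` sequence (nothing to do with the real FilePath.DirectoryView code). it yields a few values and then throws from next(), which a plain Sequence has no way to express:

```swift
// Hypothetical sequence that counts down and throws when it reaches `failAt`.
struct Countdown: AsyncSequence, AsyncIteratorProtocol {
    typealias Element = Int

    struct IterationError: Error {}

    var remaining: Int
    let failAt: Int

    mutating func next() async throws -> Int? {
        if remaining == failAt { throw IterationError() }
        guard remaining > 0 else { return nil }
        defer { remaining -= 1 }
        return remaining
    }

    func makeAsyncIterator() -> Countdown { self }
}

// Collects the values seen before the failure, plus whether a failure occurred.
func collect(_ sequence: Countdown) async -> (values: [Int], failed: Bool) {
    var values: [Int] = []
    do {
        for try await value in sequence {
            values.append(value)
        }
        return (values, false)
    } catch {
        return (values, true)
    }
}
```

with `Countdown(remaining: 3, failAt: 1)`, the loop sees 3 and 2 and then the error surfaces at the `for try await`, instead of being swallowed.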
Don’t default actors use the same global cooperative pool under the hood? They own a queue of jobs, but draining this queue is still scheduled on the generic executor, or not?
it would be great for unrelated use-cases to have Sequence support @rethrows. but i can adapt my filesystem-adjacent code to use AsyncSequence today, with a few hours of work. by contrast, pushing swift-evolution legislation flunks a few important criteria for me:
effort (tremendous)
timeliness (likely to take up to a year)
likelihood of success (extremely low, from past experience)
This prevents it from using all the cooperative pool threads, which is good, but has two notable downsides:
You can't do more than one IO concurrently. This is much less important for most regular files, but can matter a lot for pipes, network IO, and dubiously responsive network filesystems
It still occupies one cooperative pool thread, reducing your available parallelism for CPU work
On the other hand, the custom executor solution needs to manually limit its concurrency to avoid thread explosions (and determining the optimal width to limit to is heavily hardware-dependent in non-obvious ways), and may not play nicely with priority donation.
For the most common scenarios in mobile or desktop apps, I think single-threading regular file IO like this is a completely reasonable default behavior (and in fact it's exactly what I implemented in the initial version of AsyncBytes for file URLs). For server-side situations, other types of IO, or workloads with unusual performance requirements, other strategies may be needed (as we see in NIO).
Making this less situational and easier to get right would be a nice improvement, but is not nearly as simple as it looks at first, second, or third glance.
Only while it's executing something. Once it suspends, something else can use the thread. If there are no free pool threads, the work is kept in a priority queue to wait until it has a chance to execute. The reason IO is tricky is that read() doesn't suspend; it just sits there until the kernel gets back to it.
Indeed. It's a conundrum, isn't it? Regardless of where we end up long term, we'll need patterns to use until then, so I'm glad other people are thinking about this stuff too.
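A rough sketch of the blocking-versus-suspending distinction, with `Thread.sleep` standing in for a blocking `read()` (an assumption made to keep the example self-contained; both function names are made up):

```swift
import Foundation

// Blocks: the calling thread is parked inside the call and cannot run any
// other work until it returns, much like a read() waiting on the kernel.
func blockingPause() {
    Thread.sleep(forTimeInterval: 0.05)
}

// Suspends: at the `await`, the task gives up its thread, so the pool can
// run other jobs on it until the sleep finishes.
func suspendingPause() async throws {
    try await Task.sleep(nanoseconds: 50_000_000)
}
```

If every cooperative pool thread is sitting in something like `blockingPause()`, queued tasks have nowhere to run. With `suspendingPause()`, the threads stay available.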
But due to SE-0338, marking a function as async will cause it to execute somewhere on the concurrency thread pool. This is already not appropriate for IO, and if the caller is running on the main thread (or anywhere outside of the concurrency thread pool, or in an actor), it will have to suspend and await.
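A sketch of how one might observe that hop, assuming the default SE-0338 semantics (newer toolchains have an opt-in mode that changes where nonisolated async functions run, in which case the result differs). All names here are hypothetical, and the thread check goes through a synchronous helper because `Thread.isMainThread` is not meant to be used directly from async code:

```swift
import Foundation

// Synchronous helper so the thread check happens in a synchronous context.
func isOnMainThread() -> Bool {
    Thread.isMainThread
}

// A nonisolated async function with no awaits inside it.
func probe() async -> Bool {
    isOnMainThread()
}

@MainActor
func callFromMainActor() async -> (before: Bool, inside: Bool) {
    let before = isOnMainThread()   // running on the main actor
    let inside = await probe()      // the caller suspends here while probe runs
    return (before, inside)
}
```

Under those semantics, `before` should be true and `inside` false: the body of `probe` ran off the main thread even though it contains no suspension point of its own.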