Contract to not impede progress in concurrency pool

tl;dr

I have always kept (and have advised others to keep) slow, synchronous work out of Swift concurrency. This was explicitly advised in the WWDC 2021 and 2022 videos, as well as in SE-0296.

I keep this sort of work in GCD and bridge back to Swift concurrency with continuations, and that works great. But I wonder if this advice really still holds today (e.g., as of Swift 6.2). I notice (in macOS, at least) that I can slam the Swift concurrency thread pool with .medium or .low priority tasks performing synchronous work, but the main thread remains responsive even if I never Task.yield() in all of that synchronous work. That surprised me.
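(For context, the GCD bridging pattern I'm referring to is roughly the following sketch; the queue label, QoS, and function names are just illustrative:)

```swift
import Dispatch

// A dedicated GCD queue for heavy, synchronous work, so that work
// never occupies a thread from the Swift concurrency pool.
let workQueue = DispatchQueue(label: "com.example.heavy-work",
                              qos: .utility,
                              attributes: .concurrent)

// Bridge back into Swift concurrency with a continuation.
func performSlowWork<T: Sendable>(
    _ work: @escaping @Sendable () -> T
) async -> T {
    await withCheckedContinuation { continuation in
        workQueue.async {
            continuation.resume(returning: work())
        }
    }
}
```

Callers simply `await performSlowWork { … }`, and the heavy computation runs on GCD rather than on the cooperative pool.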


The long version:

SE-0296 warns us about long computations, saying (emphasis added):

The proposal doesn’t define “in a separate context”, but in WWDC 2022’s Visualize and optimize Swift concurrency, they suggest GCD:

WWDC 2021’s Swift concurrency behind the scenes also discusses this contract with the Swift concurrency system: never impede forward progress on a thread:

OK, so this is all fine, and I’ve been doing this for quite a while without incident. It’s ugly to reintroduce GCD into my code, but it works great and the UI remains responsive.

But I’ve written a little test app (included at the end of this question) to explore this behavior, and I notice that although I’m hammering my CPUs with 40 slow, synchronous jobs (on a Mac Studio with 20 cores), my main thread remains responsive (the overlapping Ⓢ signpost “points” just happily tick away even though the device has been slammed):

Note, I used .medium priority for this background task. If I used .high, that did prevent the UI from responding (though, curiously, it was not identified as a hang in Instruments). Still, it looks like slow and synchronous work performed at .medium or .low priority will keep the main thread responsive, which surprised me. I expected that if I tied up the threads, the main thread would be blocked.

So, two questions:

  1. Despite the observations above, I assume it is still advisable to keep very slow, synchronous work (at least work that does not periodically yield to the Swift concurrency system) out of Swift concurrency and bridge it back with a continuation? Or are there Swift changes since 2021/2022 that render this advice moot?

  2. Also, I assume that if I define a custom executor (à la SE-0392) like below, it is exempted from this advice? I.e., I am assuming this “never impede forward progress” advice only applies to tasks running on the concurrency thread pool.

    So the following is fine, right?

    actor Foo {
        private let queue = DispatchSerialQueue(label: "Foo")
    
        nonisolated var unownedExecutor: UnownedSerialExecutor {
            queue.asUnownedSerialExecutor()
        }
    
        // I assume this is fine
    
        func somethingVerySlowAndSynchronous() {…}
    }
    

    I.e., I assume the “never impede forward progress” advice is a concurrency thread pool concern, not a general Swift concurrency concern. Is that right?


It’s not entirely relevant, but here is the code snippet that generated the above Instruments log:

import SwiftUI
import os.log

struct ContentView: View {
    let poi = OSSignposter(subsystem: "Test", category: .pointsOfInterest)

    @State var maxDuration: Duration = .zero
    @State var maxDurationInstant: ContinuousClock.Instant = .now

    var body: some View {
        VStack {
            Text("\(maxDuration.seconds)")
                .monospacedDigit()
        }
        .task {
            try? await tickOnMainThread()
        }
        .task(priority: .medium) {
            await lotsOfBackgroundTasks()
        }
    }

    func tickOnMainThread() async throws {
        var last = ContinuousClock.now
        while !Task.isCancelled {
            try await Task.sleep(for: .seconds(1.0 / 100.0))
            let now = ContinuousClock.now
            let duration = last.duration(to: now)
            poi.emitEvent("tick", "\(duration)")
            if duration > maxDuration || maxDurationInstant.duration(to: now) > .seconds(1) {
                maxDuration = duration
                maxDurationInstant = now
            }
            last = now
        }
    }

    @concurrent
    func lotsOfBackgroundTasks() async {
        await withTaskGroup { group in
            for i in 0 ..< 40 {
                group.addTask(priority: .background) { spin(index: i) }
            }
        }
    }

    // simulating some slow and synchronous calculation

    nonisolated
    func spin(for duration: Duration = .seconds(5), index: Int) {
        poi.withIntervalSignpost(#function, id: poi.makeSignpostID(), "\(index)") {
            let start = ContinuousClock.now
            while start.duration(to: .now) < duration {
                // this is intentionally blank
            }
        }
    }
}

extension Duration {
    var seconds: Double {
        let (seconds, attoseconds) = components
        return Double(seconds) + Double(attoseconds) / 1e18
    }
}

And, FWIW, I know that I could be a good citizen and call Task.yield() within the spin function that is simulating some long process, but I’m deliberately trying to tease out the behavior with this anti-pattern.
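(For what it’s worth, a cooperative variant of spin might look something like the following sketch; the 10 ms chunk size is arbitrary:)

```swift
// A cooperative version of spin: burn CPU in short chunks and
// yield between chunks so other tasks can make forward progress.
nonisolated func cooperativeSpin(for duration: Duration = .seconds(5), index: Int) async {
    let start = ContinuousClock.now
    while start.duration(to: .now) < duration {
        let chunkEnd = ContinuousClock.now.advanced(by: .milliseconds(10))
        while ContinuousClock.now < chunkEnd {
            // intentionally blank, simulating synchronous work
        }
        await Task.yield() // let other tasks run between chunks
    }
}
```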


While I implied that I always use GCD for slow and synchronous work, I must confess that I take a slightly more pragmatic approach, and only go through this GCD rigmarole when really slamming the device. For infrequent, reasonably short, ad hoc requirements (e.g., quickly save a file synchronously, or whatever), I don’t go through all of this. I just make sure to constrain the degree of parallelism so this doesn’t bite me later.
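(By “constrain the degree of parallelism” I mean something like the following sketch, where at most a bounded number of child tasks are in flight at once; the names and the default width of 4 are arbitrary:)

```swift
// Process items with a bounded number of concurrent child tasks.
func process<Item: Sendable>(
    _ items: [Item],
    maxConcurrent: Int = 4,
    _ work: @escaping @Sendable (Item) async -> Void
) async {
    await withTaskGroup(of: Void.self) { group in
        var iterator = items.makeIterator()
        // Seed the group with the first maxConcurrent items…
        for _ in 0 ..< maxConcurrent {
            guard let item = iterator.next() else { break }
            group.addTask { await work(item) }
        }
        // …then add one new child task each time one finishes.
        while await group.next() != nil {
            if let item = iterator.next() {
                group.addTask { await work(item) }
            }
        }
    }
}
```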


I stand by my post here: Is @concurrent now the standard tool for shifting expensive synchronous work off the main actor? - #17 by David_Smith

In short: the contract is extremely narrow and mostly forbids things that people already know not to do (like using DispatchSemaphore to turn asynchronous APIs into synchronous ones). The advice is broader but is very situational and in my opinion is drastically over-applied, perhaps due to people mistakenly believing it’s the contract.
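(For anyone who hasn’t seen it, the forbidden semaphore pattern is something like the following, where `downloadData` is a hypothetical async function. Don’t do this:)

```swift
import Dispatch
import Foundation

// ANTI-PATTERN: blocking a cooperative-pool thread while waiting
// on work that itself needs a pool thread violates the forward
// progress contract and can deadlock if the pool is exhausted.
func fetchSynchronously() -> Data? {
    let semaphore = DispatchSemaphore(value: 0)
    var result: Data?
    Task {
        result = try? await downloadData() // hypothetical async API
        semaphore.signal()
    }
    semaphore.wait() // blocks the current thread until the Task finishes
    return result
}
```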

Also the main thread in an app is never part of the Swift Concurrency pool, so work on the pool will never block it regardless of whether the pool is fully blocked.


Thanks for the feedback.

I’m sympathetic to the position, but I’m hoping that someone from the Swift team can chime in. They have been so very explicit in so many proposals and presentations, and they don’t seem to have provided any counsel to the contrary that I’m aware of, so I hate to jump to conclusions.

I might not use the term “blocked”: perhaps the right word is “starved” or “CPU bound”.

For example, if you change the priority of the 40 jobs in my code snippet from .medium to .high, the main thread completely stops responding until the 40 jobs are done. (It is not identified as a “hang” by Instruments, though.) Sure, that is easily remedied by reducing the degree of parallelism (even to just one less than the number of processors on the device in question) or by reducing their priority, but you can’t just launch a large number of .high priority, slow, synchronous tasks without the main thread becoming completely unresponsive. And this was in a trivial example where I don’t have other async tasks underway: I suspect that you really must be careful about synchronous, massively parallelized algorithms within Swift concurrency.

However, if I use GCD’s concurrentPerform, it will interleave the massively parallelized work with activity on the main thread (even if the GCD work is performed with a wildly inappropriately high QoS, even .userInteractive).
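(The concurrentPerform variant I tested is roughly this sketch, reusing the spin function from the snippet above:)

```swift
// Run the 40 spins via GCD's concurrentPerform instead of a task
// group; GCD caps the parallelism at the available cores, and the
// kernel interleaves the main thread's work with it.
func lotsOfBackgroundTasksViaGCD() async {
    await withCheckedContinuation { continuation in
        DispatchQueue.global(qos: .userInitiated).async {
            DispatchQueue.concurrentPerform(iterations: 40) { i in
                spin(index: i)
            }
            continuation.resume()
        }
    }
}
```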

To be clear, I’ve been a performance specialist on the Swift Standard Library team since 2018, and on the Foundation team for the 9 years before that.

The hang you report is interesting though, that shouldn’t be happening. I can try to take a look tomorrow.


Can you expand on why it shouldn't be happening? As far as I understand, the concurrency thread pool is dynamic up to the number of logical cores on the device. With the main queue, that's logical cores + 1, so, naively, I'd expect filling the concurrency pool to block the main queue if the work is the same priority. Is the main queue supposed to always be higher priority (by default, I didn't think so)? Or is it perhaps a bug in priority handling at the runtime level (I seem to recall similar bugs in the past) or the OS level (less likely)? To me, the simplest solution would be for the concurrency pool maximum to be the logical core count - 1 on platforms with an additional main queue.


My understanding was that the main thread has an implicit higher priority than the threads created by the concurrency thread pool. So that under congestion the main thread is guaranteed to remain responsive.


That’s right, the kernel scheduler should be preemptively switching between the N+1 threads, so the worst case scenario should be degraded responsiveness due to contention, not a hang. I’m wondering if there’s a shared resource in a framework that all the threads are piling up on, I’m going to see if I can find time to make a test project from your code and reproduce the issue myself.


Right. The main thread is a separate thread from the global pool. If that thread isn't blocked, it should be getting scheduled by the kernel with proper respect to its priority, and that priority is higher than anything you can request through a normal API. Exactly how that plays out is obviously up to the kernel — a lot of kernels won't permanently starve lower-priority threads just because there's higher-priority work to do; they just get less and less time the more congested the system is — but still, generally the main thread should be getting all the CPU time it needs.


As far as I remember, most OSes use multi-level round-robin scheduling with priority escalation, so I don't see how, under normal circumstances, the main thread can become completely unresponsive. I assume this is a weird bug somewhere.

XNU uses a system called “priority decay” to do this. The longer a thread remains on-CPU, the lower its priority gets, with a slope that’s a function of both the number of runnable-but-not-running threads system-wide and their priorities. There’s also a special-case priority floor as of iOS 7.1, where no matter how much a user-interactive thread (like the main thread) decays, it will never fall to background priority or below.


Interesting to learn. I was thinking there must be some sort of system where some threads have a minimum time slice duration.

I believe I can reproduce the problem in a test project. If I change .medium to .high, then the first “tick” signpost takes 10 seconds to post on my laptop. Haven’t yet determined why. I have a hypothesis about what's going on, but I'm confirming it with some folks who work on the Concurrency runtime first.


Ok, confirmed my hypothesis: the issue is due to Task.sleep rather unintuitively requiring a thread from the pool even when it's run on the main actor. The main thread itself is actually completely responsive, but the timer gets delayed.

I'll make sure there's a bug tracking improving Task.sleep to avoid this (I think there already is one that will cover this for other reasons, but if I'm mistaken about that I'll file one).

Thanks for the test case! What a curious bug.

[edit] oh and the reason the priority matters is that the thread count cap on the pool is one thread per CPU core per QoS bucket, so when it's .medium the main thread is able to get a pool thread anyway.


All of these excellent comments/observations notwithstanding, I’d like to return to the question of SE-0296, which explicitly advised that when “calling a synchronous function that just does a lot of work” or running an “intense computational loop”, we should “generally run it in a separate context”. Note, it is not referencing things like locks, semaphores, etc., but something as simple as long-running calculations that are unable to periodically yield to the Swift concurrency system.

I’m reading SE-0296 (and the WWDC videos) as advising that we should consider keeping this sort of work out of the Swift concurrency thread pool. (And, again, I’m not suggesting that we have to resort to GCD for every synchronous operation, as that is clearly not the case: The proposal seems to be more pragmatic than that. FWIW, I acknowledge that this advice to not impede forward progress can easily be over-interpreted as a strict, formal contract, which is probably not the case.)

My question is whether something has changed in the Swift concurrency system to invalidate the explicit advice within SE-0296? Or are folks just suggesting that SE-0296 is simply incorrect?


Short answer: If the work degrades the application's performance to an unsatisfactory level, and optimization doesn't alleviate the problem, then yes, off the concurrent thread pool / main thread with it. But as @David_Smith mentioned, this should be a rather rare situation.


I think it's perhaps overstating things, but to be honest the discovery in this thread has me reevaluating my position a little.

Protecting yourself from subtle bugs in the runtime like this wouldn't be necessary in an ideal world of course, but pragmatically "I don't want to have to worry about things like this and I'm willing to pay a little extra memory/complexity/context switch overhead for it" is not unreasonable.

One reason I'm invested in making sure this bug (and any similar ones we can find) gets fixed is because I'd really like to not have that caveat.


I agree with @notthenhk; it feels like you're reading SE-0296 as making a much stronger statement than I think was ever meant. Moving heavy computation off the concurrent thread pool is a tool you can keep in your pocket for when you're having performance problems (or need to work around issues like this), not something you need to be proactively doing.


Thank you so much for clarifying.

FWIW, I would argue that the proposal in fact does say something far stronger than the above (namely, that one should “generally” keep intensive computations out of the concurrency thread pool if the work can’t interleave for “correctness”). And I personally adopt a far more pragmatic position (I only move stuff out of Swift concurrency when doing massively parallelized calculations that could tie up the limited number of threads).

But, again, your clarification is very much appreciated. Thanks!


Does the fact that Swift Concurrency pre-allocates a thread pool mean that we have fewer threads (or CPU cores) available for GCD? My guess is no, but it would be nice to hear confirmation from someone more knowledgeable than me.

It's the same thread pool as GCD; the pool just has a non-overcommit scheduling policy for jobs that are submitted through Swift Concurrency APIs.
