Assuming you need tracking data in the AsyncTask itself, of course. Maybe you don't? It's worth contemplating.
I think we're interested in getting the overhead as low as possible; double-digit percentages would clearly be unacceptable, but I wouldn't say that, e.g., less than 10% is a target in itself. Ideally the overhead would be 0%, but that's unrealistic.
Possibly. The tracking here is mostly for debugging purposes — we want to be able to list the tasks that are still extant, but not necessarily scheduled onto a thread.
I’ve already submitted my formal proposal for the project, but I’ve been refining my thinking based on your feedback here and your earlier comments regarding the design space.
On Storage & ABI:
I initially leaned toward PrivateStorage to keep the tracking intrusive, but your point is well-taken. Moving to an external tracking structure (keyed by task identity) is a much cleaner way to sidestep ABI constraints entirely. I’m also weighing a Swift-side implementation against C++, which you mentioned might be worth thinking about: I’d like to see whether we can get the necessary atomic performance while staying in Swift before committing to a runtime-level C++ solution.
On Implementation & Contention:
To address your concern about "unintentionally serializing" execution through a single lock, I’m looking at a sharded design to distribute the synchronization load. My plan is to treat this as an empirical investigation: I’ll benchmark a sharded mutex approach against a more complex lock-free design (like a per-shard Treiber stack) using AsyncTree and TaskGroups. I want to find the simplest implementation that keeps the "hot path" overhead low before committing to high-complexity atomics.
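To make the sharded-mutex baseline concrete, here is a minimal sketch. It is written in C++ purely for illustration, and the same shape would carry over to a Swift-side version. The names `TaskID`, `ShardedTaskRegistry`, and `exampleUsage` are hypothetical and do not correspond to anything in the runtime; the shard count and the use of `std::hash` are placeholder assumptions that benchmarking would need to validate.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <mutex>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a task's identity (an address or a unique ID).
using TaskID = std::uint64_t;

// Sketch of a sharded registry: each shard has its own mutex, so tasks that
// map to different shards never contend on the same lock.
template <std::size_t ShardCount = 16>
class ShardedTaskRegistry {
  static_assert(ShardCount > 0, "need at least one shard");

  struct Shard {
    std::mutex lock;
    std::unordered_set<TaskID> live;
  };
  std::array<Shard, ShardCount> shards_;

  // Assumption: std::hash spreads IDs well enough. For integers it is often
  // the identity, so aligned addresses would need extra bit mixing here.
  Shard &shardFor(TaskID id) {
    return shards_[std::hash<TaskID>{}(id) % ShardCount];
  }

public:
  // Task creation path. Returns false if the ID was already registered.
  bool insert(TaskID id) {
    Shard &shard = shardFor(id);
    std::lock_guard<std::mutex> guard(shard.lock);
    return shard.live.insert(id).second;
  }

  // Task destruction path. Returns false if the ID was not registered.
  bool remove(TaskID id) {
    Shard &shard = shardFor(id);
    std::lock_guard<std::mutex> guard(shard.lock);
    return shard.live.erase(id) == 1;
  }

  // Debugger-facing listing. Shards are locked one at a time, so the result
  // is not an atomic snapshot across the whole registry.
  std::vector<TaskID> snapshot() {
    std::vector<TaskID> out;
    for (Shard &shard : shards_) {
      std::lock_guard<std::mutex> guard(shard.lock);
      out.insert(out.end(), shard.live.begin(), shard.live.end());
    }
    return out;
  }
};

// Usage sketch: register two tasks, remove one, list what is still extant.
inline std::size_t exampleUsage() {
  ShardedTaskRegistry<4> registry;
  registry.insert(1);
  registry.insert(2);
  registry.remove(1);
  std::size_t remaining = registry.snapshot().size();
  assert(remaining == 1);
  return remaining;
}
```

The point of starting here is that the hot path is one hash plus one uncontended lock in the common case, which gives a simple baseline to compare any lock-free variant against.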
On Removal:
The clarification that real-time precision isn't the priority for debugging is a huge help. That allows for a much more efficient, non-blocking destruction path. If the goal is debuggability, then a "lazy cleanup" or a deferred reclamation strategy is a great trade-off to ensure that a task finishing its work isn't stalled by registry overhead just to check out.
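One possible shape for that deferred removal, again as a hedged C++ sketch rather than a proposal for the actual runtime: a finishing task pushes its ID onto a lock-free "retired" list and returns, and the registry only applies those removals when someone asks for a listing. `LazyTaskRegistry`, `retire`, and `lazyExample` are hypothetical names. The sketch assumes task IDs are not reused before the next sweep (for example a monotonic counter rather than a raw address), and it still allocates a node on the retire path, which a real design would want to measure.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <unordered_set>
#include <vector>

using TaskID = std::uint64_t;  // hypothetical stand-in for task identity

class LazyTaskRegistry {
  struct Retired {
    TaskID id;
    Retired *next;
  };

  std::mutex lock_;                          // guards live_
  std::unordered_set<TaskID> live_;
  std::atomic<Retired *> retired_{nullptr};  // push-only list of pending removals

  // Apply every pending removal. The caller must hold lock_.
  void sweepLocked() {
    // Detach the whole list in one step. Because nodes are only ever pushed
    // or taken all at once, the usual ABA problem of popping does not arise.
    Retired *node = retired_.exchange(nullptr, std::memory_order_acquire);
    while (node != nullptr) {
      live_.erase(node->id);
      Retired *next = node->next;
      delete node;
      node = next;
    }
  }

public:
  ~LazyTaskRegistry() {
    std::lock_guard<std::mutex> guard(lock_);
    sweepLocked();  // free any nodes that were never swept
  }

  // Creation path: still takes the lock in this sketch.
  void insert(TaskID id) {
    std::lock_guard<std::mutex> guard(lock_);
    live_.insert(id);
  }

  // Destruction path: never touches lock_, only pushes onto the retired list.
  void retire(TaskID id) {
    Retired *node = new Retired{id, retired_.load(std::memory_order_relaxed)};
    // On failure, compare_exchange_weak refreshes node->next with the
    // current head, so the loop simply retries.
    while (!retired_.compare_exchange_weak(node->next, node,
                                           std::memory_order_release,
                                           std::memory_order_relaxed)) {
    }
  }

  // Debugger-facing listing: sweep pending removals, then copy what is left.
  std::vector<TaskID> snapshot() {
    std::lock_guard<std::mutex> guard(lock_);
    sweepLocked();
    return std::vector<TaskID>(live_.begin(), live_.end());
  }
};

// Usage sketch: a retired task disappears from the next listing.
inline std::size_t lazyExample() {
  LazyTaskRegistry registry;
  registry.insert(1);
  registry.insert(2);
  registry.retire(1);
  std::size_t remaining = registry.snapshot().size();
  assert(remaining == 1);
  return remaining;
}
```

The trade-off is that the registry can briefly list tasks that have already finished, which seems acceptable if the listing is only consulted for debugging.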
I've also outlined these as my initial "Phase 1" investigation goals in the proposal, and I’m looking forward to sharing some concrete benchmark data if the project moves forward!