Task not released fast enough after its execution

I was trying to perform some Metal rendering into an MTKView on a background thread using async APIs. For that, I’m using a Task inside the view’s draw(_:) method like this:

override func draw(_ rect: CGRect) {
    guard let currentDrawable = self.currentDrawable else { return }
    Task {
        // render into currentDrawable
        currentDrawable.present()
    }
}

But instead of the performance gain I hoped for, I observed a massive drop in FPS. And I also found the reason:
The view has a pool of only 3 drawables. currentDrawable will only return a drawable if one is available in the pool right now; otherwise, it blocks the call and waits until one is free. A drawable is returned to the pool when all references to it are gone.
The task’s operation block captures the drawable in this case, and it seems that the capture is not released fast enough after the task’s execution.
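
For reference, a fully synchronous baseline would keep the drawable’s lifetime scoped to draw(_:), roughly like this (commandQueue is an assumed property of the view, and all encoding is omitted):

// Minimal synchronous sketch: the drawable is only referenced locally and by
// the command buffer, which hands it back to the pool once the frame is presented.
override func draw(_ rect: CGRect) {
    guard let currentDrawable = self.currentDrawable,
          let commandBuffer = commandQueue.makeCommandBuffer() else { return }
    // ... encode render passes into commandBuffer ...
    commandBuffer.present(currentDrawable)
    commandBuffer.commit()
}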

For comparison, I implemented the same using a DispatchQueue, which was much faster. It seems to release its block right after execution:

let drawQueue = DispatchQueue(label: "drawQueue", qos: .userInteractive)

override func draw(_ rect: CGRect) {
    guard let currentDrawable = self.currentDrawable else { return }
    self.drawQueue.async {
        // render into currentDrawable
        currentDrawable.present()
    }
}

What is the reason for the Task being released so late after its execution? Is there a way to speed this up?

Thanks!

Tasks ought to release all of their context as soon as they are done executing (or even beforehand, really, as soon as any particular variable is done being used by the task). For example, this:

class C { deinit { print("gone") } }

@main struct S {
    static func main() async {
        let t: Task<(), Never>

        do {
            let c = C()
            t = Task {
                print(c)
            }
        }

        _ = await t.value
        print("done")
    }
}

prints "gone" before "done". You might check that the tasks are executing at the time you expect; if draw is in a @MainActor context, for instance, the new Task you spawn will inherit that, and then it'll contend with everything else that wants to run on the main thread to be scheduled. What happens if you use Task.detached instead?
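
For example, a minimal sketch of the detached variant (the priority is illustrative, and this assumes the render code doesn’t touch main-actor state):

override func draw(_ rect: CGRect) {
    guard let currentDrawable = self.currentDrawable else { return }
    // Detached: does not inherit the @MainActor context of draw(_:), so the
    // work runs on the global concurrent executor instead of competing with
    // other main-thread work.
    Task.detached(priority: .userInitiated) {
        // render into currentDrawable
        currentDrawable.present()
    }
}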


Somewhat hijacking this comment to say I worry that Task inheriting the actor context is going to cause surprise and confusion for a long time. Especially with @MainActor, I think developers are going to accidentally use Task to start work that they actually want to run off the main thread. It's confusing that child tasks and async let run on the global executor, while Task runs on the inherited actor. The distinction is very easy to forget. I think it would be more natural if running a Task on the same actor were opt-in, via something like Task.attached.
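
A rough illustration of the difference (the function names here are made up for the example):

@MainActor func onUserInput() async {
    Task {
        // Unstructured Task: inherits the surrounding @MainActor context,
        // so this closure runs on the main actor.
    }

    Task.detached {
        // Detached task: does not inherit the actor, so it runs on the
        // global concurrent executor.
    }

    // Child task: the nonisolated async function below also runs off the
    // main actor, on the global concurrent executor.
    async let result = expensiveWork()
    _ = await result
}

func expensiveWork() async -> Int {
    // Some nonisolated async work.
    return 42
}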


Thanks for the responses!

I used @Joe_Groff's example to benchmark the timing between task execution and release, and it seems basically identical to using a DispatchQueue, both with a detached task and when using an actor.

My issue really seems to be specific to my rendering use case. Somehow the drawable is not returned to the pool in time when using the new concurrency APIs, and I couldn't figure out why. When I do the exact same thing with a DispatchQueue, everything is smooth as expected.

I'll keep you posted when I find out what's going on.
Thanks again!

I'm not familiar with MTKView but are you sure you're allowed to leave the draw() method instead of drawing synchronously? The documentation for the currentDrawable property says "The view changes the value of this property only after returning from a drawing function." Maybe you're not getting what you expect because you are working outside of draw()?

I kinda agree with this. The rules about what is inherited and what is not through function calls, child tasks, and unstructured tasks are not super clear or even documented. I'm personally fine with the current API, but the behaviour should be properly documented.

One subtle difference between classic DispatchQueues and Tasks is that tasks run in a cooperative thread pool with a limited number of threads, whereas libdispatch will allocate more threads if it falls behind in serving requests. So it may be possible that, if your present method makes blocking calls or spends a lot of time on the CPU, the cooperative pool is getting starved. If that is the case, then it may make sense to continue dispatching it to a queue.
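
If the blocking work does need to stay on a queue, one way to bridge it back into async code (just a sketch, not tied to the original renderer) is a checked continuation:

let renderQueue = DispatchQueue(label: "renderQueue", qos: .userInteractive)

func runOffCooperativePool(_ work: @escaping () -> Void) async {
    // Run the potentially blocking work on a plain dispatch queue so it
    // cannot starve the cooperative pool, then resume the async caller.
    await withCheckedContinuation { (continuation: CheckedContinuation<Void, Never>) in
        renderQueue.async {
            work()
            continuation.resume()
        }
    }
}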


This is getting into platform-specific advice, but I wouldn’t advise using async or dispatch queues for real-time rendering. A dedicated thread driven by a timing source (such as a display link) is going to be the best solution for a render loop.
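
For example, a minimal sketch of a display-link-driven loop (iOS naming; the class and selector are made up, and the link is added to the main run loop here, but it could just as well drive a dedicated render thread):

import UIKit

final class RenderLoop {
    private var displayLink: CADisplayLink?

    func start() {
        // Fires once per display refresh; missed frames are coalesced.
        let link = CADisplayLink(target: self, selector: #selector(step))
        link.add(to: .main, forMode: .common)
        displayLink = link
    }

    @objc private func step(_ link: CADisplayLink) {
        // Encode and present exactly one frame here.
    }

    func stop() {
        displayLink?.invalidate()
        displayLink = nil
    }
}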


That's good to know! Something like this might be the cause of the problem. Though I measured the execution and release time of the block itself (using the mechanism you posted above) and it's identical in both cases, i.e., the calls inside the operation block should not stall the CPU. But I might be overlooking something.

Yes, I generally agree. I'm developing image-processing apps, though, so most of my rendering work is caused by sporadic user input. That's why I decided on a more push-based approach and only trigger a render when something changes. But maybe a continuous render loop would make more sense here? It's hard to find best practices for this use case, unfortunately.
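
For what it's worth, one way to keep the push-based approach while letting the view coalesce redraws is to pause MTKView's internal loop and mark the view dirty on changes (isPaused and enableSetNeedsDisplay are MTKView properties; metalView and the rest are a sketch with iOS naming):

// Configure the view once (e.g. in the view controller):
metalView.isPaused = true                 // no continuous internal render loop
metalView.enableSetNeedsDisplay = true    // draw only when marked dirty

// Whenever user input changes the image parameters:
func parametersDidChange() {
    // Multiple changes before the next display refresh collapse into one draw(_:).
    metalView.setNeedsDisplay()
}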

I thought so as well at first, but according to the documentation, the present() call is kinda async anyways:

Registers a drawable presentation to occur as soon as possible.

That's why I tried the approach with the DispatchQueue, and it works like a charm: fast rendering while also removing load from the main thread.

One problem with the naive DispatchQueue approach is that if the user does a lot of input very quickly, you’ll try to generate a bunch of intermediate frames, potentially faster than the display can show them.
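
One common way to bound that (sketched here with made-up names; commandQueue is assumed) is to cap the number of in-flight frames with a semaphore sized to the drawable pool and simply drop extra requests:

let inFlightSemaphore = DispatchSemaphore(value: 3)   // matches the drawable pool size

func drawFrame(in view: MTKView) {
    // If every drawable is already in flight, drop this request instead of queuing it.
    guard inFlightSemaphore.wait(timeout: .now()) == .success else { return }

    guard let drawable = view.currentDrawable,
          let commandBuffer = commandQueue.makeCommandBuffer() else {
        inFlightSemaphore.signal()
        return
    }
    // ... encode render passes into commandBuffer ...
    commandBuffer.present(drawable)
    commandBuffer.addCompletedHandler { _ in
        inFlightSemaphore.signal()
    }
    commandBuffer.commit()
}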
