Truly decoupling two parallel async tasks in SwiftUI or: Random snake audio visualization thanks to Core Audio witchcraft

Dear community,

I'm new to Swift and wrote my first app last weekend. I'm an experienced software engineer, so I adapted quickly. I wrote some C code (a low-level audio-visualization implementation, an anachronistic reimplementation of the famous Geiss screensaver/Winamp plugin from 1998) that renders into an RGBA framebuffer. I wanted to display it natively in a macOS app and therefore decided to use Swift. I somewhat regret that choice atm, as I'm really struggling to win control over parallelization.

Basically, I have everything working: audio app capture via an aggregate audio device and an audio tap with Core Audio; SwiftUI with its UI and input fields; C integration of my code with FFI and pointer arithmetic. I don't think I did a bad job at learning Swift in those 2-3 days...

I even implemented the data type conversion to C types and back, the FFT and all that. But when it comes to making Swift decouple two Tasks that have to work in a truly parallel manner, I'm almost breaking my hands ;)

Core Audio shows some insane behaviour in delivering the audio buffers. Sometimes they arrive at a rate of 2 FPS and sometimes at 60 FPS; only god knows when they will arrive. But due to the apparently strict synchronization in SwiftUI, my C code's render function is only called when audio data has arrived, pushing the rendering down to 2 FPS... or up to 60 FPS... depends on witchcraft, I guess ;)

Well, I understand that my description sounds weird, and that's why I prepared a code repo for you to check. By simply checking it out, you'll find a beautiful new open-source music visualization... that runs at snake speed... and you'll be able to reproduce the issue in a matter of seconds.

Issue details: The audioQueue basically receives data at a pace I haven't found a way to control. However, only when data is received does the renderQueue actually call updateData() and let the detached rendering window call my C code and re-render.

A) The primary fix I need is to get Swift to detach the queues. I simply want the renderQueue to re-render at the configured frequency, and not to wait for the audioQueue at all. It should always call updateData() at that pace, pick the latest audio data and that's all.

B) If that's fixed, my C code will finally render at the highest speed it can. I know this is possible, because sometimes Core Audio delivers at 60 FPS and then everything is fine. It's a true Heisenbug. To get back control over that, I'd like to force Core Audio to hand me the buffers at the fastest pace possible, so that the waveform syncs better with the rendering visually...

This is probably where it hangs: the render path is only triggered when the audio callback delivers data, so the two queues are effectively chained together.
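What I want instead is a render loop that ticks on its own and just grabs whatever audio data is newest. A minimal sketch of that idea (illustrative only, not the repo code; the queue names and updateData() are the ones from above, everything else is a placeholder):

    import Dispatch
    import Foundation

    final class DecoupledRenderer {
        private let renderQueue = DispatchQueue(label: "render", qos: .userInteractive)
        private let stateQueue = DispatchQueue(label: "audio.state") // guards the latest buffer
        private var latestAudioBuffer: [Float] = []
        private var timer: DispatchSourceTimer?

        // Called from the Core Audio tap callback, whenever buffers happen to arrive.
        func audioDidArrive(_ samples: [Float]) {
            stateQueue.async { self.latestAudioBuffer = samples }
        }

        // The render loop ticks at its own fixed rate, independent of the audio pace.
        func startRendering(fps: Double = 60) {
            let t = DispatchSource.makeTimerSource(queue: renderQueue)
            t.schedule(deadline: .now(), repeating: 1.0 / fps)
            t.setEventHandler { [weak self] in
                guard let self else { return }
                let snapshot = self.stateQueue.sync { self.latestAudioBuffer }
                self.updateData(snapshot)   // hands the newest samples to the C render code
            }
            t.resume()
            timer = t
        }

        private func updateData(_ samples: [Float]) {
            // ... call into the C FFI render function here ...
        }
    }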

Could you please help me with this? I really did work hard to get all of this done on my own. This project will be open source -- I also have a WebAssembly version working. It's a bit frustrating to struggle so much with native performance when the thing renders at 25 FPS at high resolution in a browser...

https://kyr0.github.io/Milky.js/

How does that make Swift look? ;)

Thank you in advance!
kyr0

I'd not use queues here (of any shape or form) and would decouple mic capturing from everything else. Mic capturing (e.g. done with AudioUnits for cross-platform portability) is done in real time and uses the push model to write into a ring buffer; everything else is non-realtime and uses the pull model to read from that ring buffer. As for where to put the FFT: if it's quick enough it could be on the realtime side (in which case the ring buffer actually becomes a ring buffer of frequencies), otherwise it could be on the non-realtime side, pulling from the ring buffer of audio samples. Draw the diagram first. Good luck, and well done on your first Swift app.
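A rough sketch of the shape I mean (illustrative only; single producer, single consumer; index atomicity here assumes the swift-atomics package):

    import Atomics   // swift-atomics package, assumed as a dependency

    // Single-producer / single-consumer ring buffer of Float samples.
    // The realtime side only writes memory and bumps an index; the reader
    // never blocks the writer.
    final class SampleRingBuffer {
        private let capacity: Int
        private let storage: UnsafeMutablePointer<Float>
        private let writeIndex = ManagedAtomic<Int>(0)
        private let readIndex = ManagedAtomic<Int>(0)

        init(capacity: Int) {
            self.capacity = capacity
            self.storage = .allocate(capacity: capacity)
            self.storage.initialize(repeating: 0, count: capacity)
        }

        deinit { storage.deallocate() }

        // Push model, realtime side: called from the capture callback.
        func write(_ samples: [Float]) {
            var w = writeIndex.load(ordering: .relaxed)
            for s in samples {
                storage[w % capacity] = s
                w += 1
            }
            writeIndex.store(w, ordering: .releasing)
        }

        // Pull model, non-realtime side: drains whatever has arrived since last time.
        func read(into out: inout [Float]) -> Int {
            var r = readIndex.load(ordering: .relaxed)
            let w = writeIndex.load(ordering: .acquiring)
            var count = 0
            while r < w && count < out.count {
                out[count] = storage[r % capacity]
                r += 1
                count += 1
            }
            readIndex.store(r, ordering: .releasing)
            return count
        }
    }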

Hi tera,

thank you for your reply. I'm using audio taps because the visualization uses the output audio stream of another app (the user can select it, Spotify for example, or iTunes) as its input. Microphone recording doesn't support this, AFAIK. Audio taps were introduced in macOS 14.2 with a Core Audio update.

Can I prioritize the Core Audio queues somehow? I suspect they are being de-prioritized, and that's why they randomly deliver buffers slowly, at medium speed, or fast.

The ring buffer idea is interesting. I implemented one for a similar reason in another integration of the same code, but since the syncing/blocking seemed to happen at the language level in Swift, I didn't bother to implement my own here.

Does this implementation look good to you?

That implementation uses an UnsafeMutablePointer, much like my C FFI code does. I was thinking about trying that myself too, but I was a bit too tired and decided to ask here first. Reflecting on it, Swift should definitely not block on direct, pointer-based memory reads, so... it might work. I'll give it a try.

The FFT calculations are already on a global background DispatchQueue and are debounced by skipping every 2nd operation:
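Simplified, it looks like this (a sketch of the pattern, not the exact repo code; performFFT is a stand-in for the actual FFT call):

    import Dispatch

    // Every 2nd audio buffer is skipped, and the FFT work runs on a
    // global background queue.
    final class FFTScheduler {
        private var bufferCounter = 0

        func audioBufferArrived(_ samples: [Float]) {
            bufferCounter += 1
            guard bufferCounter % 2 == 0 else { return }   // debounce: skip every 2nd buffer

            DispatchQueue.global(qos: .background).async {
                let spectrum = performFFT(samples)
                // hand `spectrum` over to the renderer / C side here
                _ = spectrum
            }
        }
    }

    // Placeholder for the real FFT (e.g. Accelerate/vDSP) implementation.
    func performFFT(_ samples: [Float]) -> [Float] { samples }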

Thanks and wish you a great day.


I'd triple-check that: first reading the docs, then googling, then doing tests, and finally, if it's still not clear, asking on the relevant forums (Stack Overflow or the Apple dev forums). This forum is focused on the Swift language itself.

If it's intended to be used in a real-time context (for writing or reading or both), then no, as you can't do anything other than "read memory, write memory, and do math" (look around ~31:00 and ~38:00). Note that if this is about visualising audio in a screen saver or similar, the provider and the consumer don't have to be perfectly synchronised. So long as they are not too far apart in time (say, 50 ms - 100 ms), nobody will notice if the consumer is slower or faster than the producer: in the former case it will miss some samples, in the latter case it will use some samples more than once. And there's a trivial low-tech solution to keep the producer and consumer within 50-100 ms of each other – just make the ring buffer that big.
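For example, at an assumed sample rate of 48 kHz, "that big" is tiny:

    // Keep producer and consumer within ~100 ms of each other simply by
    // sizing the ring buffer to hold that much audio.
    let sampleRate = 48_000.0      // assumed sample rate
    let slack = 0.1                // 100 ms of tolerated drift
    let ringCapacity = Int(sampleRate * slack)   // 4,800 frames per channel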

Implementing true parallelism in Swift seems impossible to me. There is no way to stop Swift from syncing Tasks, variable access, etc. I ended up implementing it in C with double buffering. Now I have fast rendering for the super slow audio waveforms passed down by Swift. That's already half the rent paid... but as the audio buffers are sometimes delivered at a pace of 2 FPS, the image of course still looks totally snake-slow, even though it is rendered at 30 FPS now. So... I guess I have to hack my way through Core Audio to identify this mess of a bottleneck xD

So... I'm pretty convinced I found the root cause of the issue. When I start the program with power plugged in, Swift selects a P (performance) core of my M3 MacBook Air for processing the queue that pulls the audio stream buffers, and it always delivers buffers fast-paced (60 FPS+). Once I unplug and restart the program, it selects an E (efficiency) core, and audio buffers are delivered at 15 FPS. When the battery goes low, buffer delivery drops to 2 FPS. I clearly need to find a way to force Core Audio and my program onto P cores, no matter what.


I'm not familiar with the Core Audio API, but a DispatchQueue's QoS affects thread priority and which CPU cores it runs on. How about changing the .background or unspecified QoS to .userInteractive?
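Something like this (the queue label is just an example):

    // A .userInteractive queue gets the highest QoS, which makes its threads
    // eligible for the performance cores and a higher scheduling priority.
    let audioQueue = DispatchQueue(label: "audio.pull", qos: .userInteractive)
    audioQueue.async {
        // pull the next audio buffers here
    }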

@nukka123 Thank you for your reply. I hadn't pushed my recent code when I posted my last reply, but here you go :) This is what changed:

  • Render loop in C instead of Swift
  • Double-buffered rendering
  • Trying to force high priority on a low level
  • Setting everything to .userInteractive

I still get slow audio buffer updates at times, even when the power cord is attached and macOS is configured to never go into energy-saving mode.

I also tried setting the thread priority via low-level pthread calls, but it didn't help...
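The pthread attempt was along these lines (a simplified sketch; the exact call in the repo may differ):

    import Darwin

    // Executed from inside the audio queue's work item: promote the backing
    // thread's QoS class as far as the userspace API allows.
    let status = pthread_set_qos_class_self_np(QOS_CLASS_USER_INTERACTIVE, 0)
    if status != 0 {
        print("Failed to raise the thread's QoS class, error:", status)
    }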

At this point I'm a bit lost. There is definitely some uncontrollable witchcraft going on in the runtime behind the scenes; something, it seems, a developer has no control over. Is there an API to get hold of references to all of the app's threads, so that I could loop through them regularly and set their priority high?

@mickeyl Thanks for the star on my repo :) I've read that you're a Swift expert. Maybe you have a small tip for me? In return, I'm happy to help with something complicated and web-based or the like, if you ever need anything.

It may be a different problem than yours.

    Audio Provider FPS: 26.53220608096354
    Rendering Call FPS: 28.449460761039138
    Audio Provider FPS: 28.421989191753205
    Rendering Call FPS: 2.8689712535414933
    Audio Provider FPS: 2.8691773226920154
    Rendering Call FPS: 29.013274305675647

As for the FPS drops that this log indicates, reducing the frequency of the print calls may solve the problem.

    // In the audio provider callback: only log when the rate actually drops.
    if fps < 10 {
        print("Audio Provider FPS:", fps)
    }
    // In the rendering call path: same guard, so print() no longer runs every frame.
    if fps < 10 {
        print("Rendering Call FPS:", fps)
    }

I don't have the details, but the internal code of print takes a lock related to OutputStream.
I suspect the wait occurs when the internal buffer is full.

@nukka123 Good idea. Thank you for cloning the repo and setting it up! I appreciate it! Actually, I commented out all logs and the issue remains. What you see with 2.8 FPS is exactly the issue I mean though. Please notice that "Rendering Call FPS" is not the actual rendering FPS.

It's the queue that used to do the rendering before I moved the render loop to C; now it's just the code that calls the actual rendering code with updated audio data.

btw. "Audio Provider FPS" and "Rendering Call FPS" are synced/locked to roughly the same number always because those queues cannot be decoupled in Swift.

So, for the moment, I'm planning to force the OS to move my threads onto performance cores by using system commands; let's see how successful I'll be. If that doesn't show good results, I'm going to rewrite the audio code in footgun C++...

Whoever reads this someday, this is how you do it:

  • Use Metal for fast framebuffer painting instead of NSImage, and use Metal shaders for post-processing effects and upscaling:
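The framebuffer upload part boils down to something like this (a sketch; the draw pass and the post-processing/upscaling shaders are separate):

    import Metal

    // Upload the RGBA framebuffer written by the C renderer into an
    // MTLTexture once per frame; shaders then sample that texture.
    final class FramebufferUploader {
        private let device: MTLDevice
        private let texture: MTLTexture
        private let width: Int
        private let height: Int

        init?(width: Int, height: Int) {
            guard let device = MTLCreateSystemDefaultDevice() else { return nil }
            self.device = device
            self.width = width
            self.height = height

            let desc = MTLTextureDescriptor.texture2DDescriptor(
                pixelFormat: .rgba8Unorm, width: width, height: height, mipmapped: false)
            desc.usage = [.shaderRead]
            guard let texture = device.makeTexture(descriptor: desc) else { return nil }
            self.texture = texture
        }

        // `framebuffer` is the RGBA pixel memory filled by the C render loop.
        func upload(framebuffer: UnsafeRawPointer) {
            let region = MTLRegionMake2D(0, 0, width, height)
            texture.replace(region: region,
                            mipmapLevel: 0,
                            withBytes: framebuffer,
                            bytesPerRow: width * 4)
        }
    }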

DO NOT USE SWIFT FOR ANY REALTIME TASKS. IT'S A FOOTGUN ;)
The witchcraft is neither Core Audio nor the CPU core affinity of the process/thread.
It is the Swift runtime itself, which comes with a scheduler that always prefers stability over performance. You'll constantly have background GC going on, and there is no safe way to implement any realtime algorithm in a truly lock-/sync-free way. This is good for the most part, but terrible for DSP/realtime code.

Here is a release build of the Realtime Music Visualizer for any App's Audio: Tags · kyr0/MilkyApp · GitHub


Well, the good news is that you've discovered the correct answer to your original issue. I sort of feel that the all-caps are a bit over the top; this issue has been discussed multiple times on the Swift forums, so it's pretty well known.

It isn't correct to blame this on Swift alone, though. It is also true that you should not use Obj-C for any real-time stuff like this. The actual rule is that there are certain things your code should not do in real-time processing. The most common (and easiest to overlook) things to avoid are:

  1. Memory allocations
  2. Locks
  3. I/O

Both the Swift runtime and the Obj-C runtime do some of these things, which means those runtimes are to be avoided in real-time code, since your code cannot opt out of those behaviors. C and C++ generally don't do these things, assuming you don't call into libraries which do.
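To make that concrete, a real-time callback has to be reduced to roughly this shape: everything allocated up front, and nothing but memory reads/writes and math inside the hot path. (An illustrative sketch only; and note that even this is not guaranteed allocation- and lock-free in Swift, which is exactly the point above.)

    // All allocation happens at setup time; the callback is restricted to
    // reading memory, writing memory, and doing math.
    final class Visualizer {
        private let scratch: UnsafeMutablePointer<Float>
        private let frameCount: Int

        init(frameCount: Int) {
            self.frameCount = frameCount
            self.scratch = .allocate(capacity: frameCount)   // allocation up front
            self.scratch.initialize(repeating: 0, count: frameCount)
        }

        deinit { scratch.deallocate() }

        // Called on the real-time audio thread.
        func process(_ input: UnsafePointer<Float>) {
            for i in 0..<frameCount {
                scratch[i] = input[i] * 0.5   // read memory, do math, write memory
            }
            // No malloc/free, no locks, no print/file/network I/O in here.
        }
    }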

I don't know what this means, really. Swift (and Obj-C and some other languages) aren't architected for real-time work. That's about as extreme as you can validly get. :)

This is false. There's no GC in Swift or Obj-C. They also don't really do any housekeeping in the "background".


FWIW, Swift recently got @nolocks / @noallocations, features which (once they have truly landed) make Swift safer than C with regard to realtime programming.

Reference counting is GC according to some definitions of GC. So maybe it's just a terminology issue.

This thread was a great read!