Large stack allocations

I'm aware that withUnsafeTemporaryAllocation allocates up to 1024 bytes on the stack. However, I'd like the ability to also allocate 2048 or 4096 bytes on the stack. Is this a good technique?

// In a header file
#include <stdint.h>

struct Slab4096 {
    uint64_t buffer[512];
};

// Swift code
var slab = Slab4096()
withUnsafeMutableBytes(of: &slab) { ... }

I understand that this initializer for Slab4096 sets each buffer element to zero, and disassembly shows that it's done via memset.

Are there other costs to this technique? Could the slab instance be copied around on the stack due to inout semantics or the upcoming stack protection features?



If you're using a C header anyway, you might as well use this tiny C file:

void callit(void (^callback)(void *p)) {
    struct Slab4096 slab;  // deliberately left uninitialized, so no memset
    callback(&slab);
}

to get rid of the memset overhead (assuming you don't mind seeing garbage in the slab's memory).

I'd like to have these two things in withUnsafeTemporaryAllocation or similar API:

  • The ability to pass a parameter that specifies the maximum stack allocation size. The default maximum of 1024 bytes can still overflow the stack (e.g. if the stack is tiny or the function is called recursively); conversely, there are legitimate situations where stack allocations larger than 1 KB are appropriate.
  • The option to opt out of the heap allocation fallback (getting an error back if the stack allocation fails). This would be handy for realtime applications (e.g. in an audio I/O callback, if a bigger allocation fails I'd retry with a smaller buffer size).

What are you really trying to do? "Large stack allocation" isn't always wrong, but it's often a sign that there's a better way to be doing whatever you're doing.


I am calling C functions that write into buffers I provide. I then parse the buffers and initialize various Swift types from them. The buffers are temporary scratch space.

If my Slab4096 technique uses only 4K stack space (and not multiples of that due to Swift calling convention realities), then I'd think it's a significant performance improvement over heap allocation.
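For comparison, here is a minimal sketch of the same scratch-buffer pattern using withUnsafeTemporaryAllocation with a 4096-byte request. The fillScratch function is a hypothetical stand-in for the C function that writes into the buffer, and readChecksum is just an illustrative name. Whether a request of this size ends up on the stack or the heap is decided by the standard library, so this shows the shape of the code rather than a guaranteed stack allocation.

```swift
// Hypothetical stand-in for a C function that writes into a caller-provided
// buffer and returns the number of bytes it wrote.
func fillScratch(_ buffer: UnsafeMutableRawBufferPointer) -> Int {
    let count = min(buffer.count, 16)
    for i in 0..<count {
        buffer[i] = UInt8(i)
    }
    return count
}

// Request 4096 bytes of temporary scratch space, let the "C function" fill it,
// then parse only the bytes that were actually written.
func readChecksum() -> Int {
    return withUnsafeTemporaryAllocation(byteCount: 4096, alignment: 8) { scratch -> Int in
        let written = fillScratch(scratch)
        // Bytes past `written` are uninitialized and must not be read.
        return scratch[..<written].reduce(0) { $0 + Int($1) }
    }
}
```

The buffer is only valid inside the closure, so anything you want to keep has to be copied into a Swift value before returning.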

If you are not calling the function recursively or from multiple threads at the same time, the typical workaround is to allocate the heap storage once, up front, and reuse it.
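A rough sketch of that workaround, assuming a single-threaded, non-reentrant caller. ScratchBuffer and fillScratch are hypothetical names used only for illustration; fillScratch stands in for the C function that writes into the buffer.

```swift
// Scratch space that is heap-allocated once and released when the owner goes away.
// Not safe for reentrant or concurrent use.
final class ScratchBuffer {
    let bytes: UnsafeMutableRawBufferPointer

    init(byteCount: Int) {
        bytes = UnsafeMutableRawBufferPointer.allocate(byteCount: byteCount, alignment: 8)
    }

    deinit {
        bytes.deallocate()
    }
}

// Hypothetical stand-in for a C function that fills a caller-provided buffer
// and returns the number of bytes it wrote.
func fillScratch(_ buffer: UnsafeMutableRawBufferPointer, seed: UInt8) -> Int {
    let count = min(buffer.count, 4)
    for i in 0..<count {
        buffer[i] = seed &+ UInt8(i)
    }
    return count
}

// One allocation, reused across calls.
let scratch = ScratchBuffer(byteCount: 4096)
var totals: [Int] = []
for seed in [UInt8(1), 10] {
    let written = fillScratch(scratch.bytes, seed: seed)
    // Parse only the bytes written by this call; older contents may linger past them.
    totals.append(scratch.bytes[..<written].reduce(0) { $0 + Int($1) })
}
```

The allocation cost is paid once, so the per-call overhead is comparable to a stack buffer, at the price of having to manage the buffer's lifetime and keep callers from overlapping.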