Getting stack size using lldb

I'm trying to calculate the stack size from thread backtrace output. I thought it would be simple: take the addresses of the top and bottom frames and subtract. My laptop has an Intel CPU, so I expect the top frame to have the lower address.

But when I examined the log I found something unexpected: some higher frames have higher addresses than the frames under them (see frames 2 and 3). Is that normal?

(lldb) thread backtrace
  * thread #5, queue = 'com.apple.root.default-qos.cooperative', stop reason = breakpoint 1.1
  * frame #0: 0x000000013dd261e2 acdbCNTests`CEWViewMapIterator.init(map=$s6acdbCN8AllACMapVyAA9DDCEWViewVAA11FundCEWViewVAA9TDCEWViewVAA9FICEWViewVGD @ 0x0000700009020950) at cewViewMapT+iterator.swift:72:9
    frame #1: 0x000000013dd25fe3 acdbCNTests`AllACMap<>.makeIterator(self=acdbCN.AllACMap<acdbCN.DDCEWView, acdbCN.FundCEWView, acdbCN.TDCEWView, acdbCN.FICEWView> @ 0x00007000090209f0) at cewViewMapT+iterator.swift:12:16
    frame #2: 0x00000001423155c5 acdbCNTests`_cews(acData=$s6acdbCN8AllACMapVyAA6DDDataVAA8FundDataVAA6TDDataVAA6FIDataVGD @ 0x0000700009020be0, today=(date = 2023-02-18 16:00:00 UTC), acResolver=acdbCN.EntityCollection @ 0x00006000006027f0) at storeViewsT.swift:31:30
    frame #3: 0x0000000142314f91 acdbCNTests`StoreViews.init(entities=acdbCN.EntityCollection @ 0x0000700009021ad0, acData=$s6acdbCN8AllACMapVyAA6DDDataVAA8FundDataVAA6TDDataVAA6FIDataVGD @ 0x0000700009020d70, today=(date = 2023-02-18 16:00:00 UTC)) at storeViewsT.swift:20:16
    frame #4: 0x000000013db2d50b acdbCNTests`Store._run(action=0x000000013d7d78d0 acdbCNTests`partial apply forwarder for closure #1 (inout acdbCN.EntityCollection, inout acdbCN.ACChangeMap<acdbCN.S1Change>) throws -> () in acdbCNTests.DemoStore._generateSingleExpense(acID: acdbCN.Identifier<acdbCN.DD>, name: Swift.String, expenseMonthInterval: Swift.Int, expenseMonthDay: Swift.Int, expenseAmount: __C.NSDecimal) -> () at <compiler-generated>, self=acdbCN.Store @ 0x0000700009022540) at storeT+run.swift:84:30
    frame #5: 0x000000013db2e495 acdbCNTests`Store.run(checkLimit=0x000000013d7c71b0 acdbCNTests`closure #1 (acdbCN.Store) -> Swift.Bool in default argument 0 of acdbCN.Store.run(checkLimit: (acdbCN.Store) -> Swift.Bool, _: (inout acdbCN.EntityCollection, inout acdbCN.ACChangeMap<acdbCN.S1Change>) throws -> ()) -> acdbCN.ACDBResult at <compiler-generated>, action=0x000000013d7d78d0 acdbCNTests`partial apply forwarder for closure #1 (inout acdbCN.EntityCollection, inout acdbCN.ACChangeMap<acdbCN.S1Change>) throws -> () in acdbCNTests.DemoStore._generateSingleExpense(acID: acdbCN.Identifier<acdbCN.DD>, name: Swift.String, expenseMonthInterval: Swift.Int, expenseMonthDay: Swift.Int, expenseAmount: __C.NSDecimal) -> () at <compiler-generated>, self=acdbCN.Store @ 0x0000700009022540) at storeT+run.swift:110:32
    frame #6: 0x000000013d7cc3ce acdbCNTests`DemoStore.run(action=0x000000013d7d78d0 acdbCNTests`partial apply forwarder for closure #1 (inout acdbCN.EntityCollection, inout acdbCN.ACChangeMap<acdbCN.S1Change>) throws -> () in acdbCNTests.DemoStore._generateSingleExpense(acID: acdbCN.Identifier<acdbCN.DD>, name: Swift.String, expenseMonthInterval: Swift.Int, expenseMonthDay: Swift.Int, expenseAmount: __C.NSDecimal) -> () at <compiler-generated>, self=acdbCNTests.DemoStore @ 0x0000700009022540) at loadDemoData.swift:152:23
    frame #7: 0x000000013d7d0094 acdbCNTests`DemoStore._generateSingleExpense(acID=(uuid = AEEA457D-A12C-4ECE-A046-2BB1A853D139), name="休假", expenseMonthInterval=12, expenseMonthDay=20, expenseAmount=10000.000000
, self=acdbCNTests.DemoStore @ 0x0000700009022540) at loadDemoData.swift:366:13
    frame #8: 0x000000013d7ce8ee acdbCNTests`DemoStore.generateExpenseOnCashAC(self=acdbCNTests.DemoStore @ 0x0000700009022540) at loadDemoData.swift:253:9
    frame #9: 0x000000013d7cbb62 acdbCNTests`DemoStore.generateExpenseAC(self=acdbCNTests.DemoStore @ 0x0000700009022540) at loadDemoData.swift:280:9
    frame #10: 0x000000013d7c8adb acdbCNTests`DemoStore.init() at loadDemoData.swift:142:9
    frame #11: 0x000000013d7c86c7 acdbCNTests`closure #1 in GenerateDemoDataTest.testLoadDemo() at loadDemoData.swift:61:66
    frame #12: 0x000000013d7c7ca3 acdbCNTests`generateStore(filename="abc", createStore=0x000000013d7c8690 acdbCNTests`closure #1 () -> acdbCN.Store in acdbCNTests.GenerateDemoDataTest.testLoadDemo() async -> () at loadDemoData.swift:61) at loadDemoData.swift:18:17
    frame #13: 0x000000013d7c8209 acdbCNTests`asyncGenerateStore(filename="abc", createStore=0x000000013d7c8690 acdbCNTests`closure #1 () -> acdbCN.Store in acdbCNTests.GenerateDemoDataTest.testLoadDemo() async -> () at loadDemoData.swift:61) at loadDemoData.swift:27:5
    frame #14: 0x000000013d7c8430 acdbCNTests`GenerateDemoDataTest.testLoadDemo(self=0x00007fe357f052a0) at loadDemoData.swift:61
    frame #15: 0x000000013d7c8c60 acdbCNTests`@objc closure #1 in GenerateDemoDataTest.testLoadDemo() at <compiler-generated>:0
    frame #16: 0x000000013d7cbf80 acdbCNTests`partial apply for @objc closure #1 in GenerateDemoDataTest.testLoadDemo() at <compiler-generated>:0
    frame #17: 0x000000013d7de300 acdbCNTests`thunk for @escaping @callee_guaranteed @Sendable @async () -> () at <compiler-generated>:0
    frame #18: 0x000000013d7de500 acdbCNTests`partial apply for thunk for @escaping @callee_guaranteed @Sendable @async () -> () at <compiler-generated>:0
    frame #19: 0x000000013d7de3e0 acdbCNTests`specialized thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out A) at <compiler-generated>:0
    frame #20: 0x000000013d7de720 acdbCNTests`partial apply for specialized thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out A) at <compiler-generated>:0

UPDATE: The displayed addresses of frames 2 and 3 are not in the valid range of the stack. So maybe it's just some kind of optimization (frame 2 is a global func call, and frame 3 is an init() call), and LLDB shows the address of the func that's actually called instead of the frame address? If so, I think I can still calculate the stack size by subtracting the top and bottom frame addresses?

The addresses you see in the backtrace are not stack pointers. They are return addresses (for frame #0, the current program counter), that is, pointers into executable code. Try looking them up using the image lookup -a command. There is no reason for them to be in any particular order.

If you want to see stack pointers, you can use settings set frame-format to change what thread backtrace prints. This is what I have in my .lldbinit:

settings set frame-format "frame #${frame.index}: sp=${frame.sp} fp=${frame.fp} pc=${frame.pc}{ ${module.file.basename}{`${function.name-with-args}${function.pc-offset}}}{ at ${line.file.basename}:${line.number}}{${function.is-optimized} [opt]}\n"

Thanks for the explanation! I used your setting in my experiments and it worked well.


A note to myself and other people who may be interested. TL;DR: it's possible to calculate the stack size from thread backtraces, but it's more convenient to use the function @tera suggested. The size doesn't match the vmmap output because the two measure different things.

A diagram to show the difference:

[Image: tmp1]

How I did the experiment

  • I added a sleep(1) to my code to make it run slower.
  • Then I started vmmap in a terminal.
  • Then I ran the code (an XCTest in my case) for a while.
  • Then I set a breakpoint in my code to stop the test and used lldb to gather stack size data.

vmmap command and output

$ while true; do vmmap $(pgrep xctest) 2>/dev/null | grep "^Stack" | grep -v Guard; echo "";  sleep 1; done

This is the output in my experiment when the code hit the breakpoint:

Stack                    700008e1a000-700008e9c000 [  520K     8K     8K     0K] rw-/rwx SM=COW          thread 1
Stack                    700008f20000-700008fa2000 [  520K    40K    40K     8K] rw-/rwx SM=COW          thread 2
Stack                    700008fa3000-700009025000 [  520K   204K   204K     0K] rw-/rwx SM=COW          thread 3
Stack                    7ff7b5f54000-7ff7b6754000 [ 8192K    24K    24K    72K] rw-/rwx SM=COW          thread 0
Stack                             9752K     276K     276K      80K       0K       0K       0K        4

(Ignore the last line. It's a summary.)
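As a rough sketch of how one of these lines can be read programmatically: the second field is the stack's address range in hex, and its length should agree with the first bracketed column (which I'm assuming is the virtual size). The StackRegion type and parseStackRegion function are hypothetical names made up for this illustration.

```swift
// Hypothetical helper type: the address range of one vmmap "Stack" line.
struct StackRegion {
    let start: UInt64
    let end: UInt64
    var virtualSize: UInt64 { end - start }
}

// Hypothetical parser: returns nil for lines that are not
// "Stack <start>-<end> ..." (for example the summary line).
func parseStackRegion(_ line: String) -> StackRegion? {
    let fields = line.split(separator: " ")
    guard fields.count >= 2, fields[0] == "Stack" else { return nil }
    let bounds = fields[1].split(separator: "-")
    guard bounds.count == 2,
          let start = UInt64(bounds[0], radix: 16),
          let end = UInt64(bounds[1], radix: 16),
          end >= start else { return nil }
    return StackRegion(start: start, end: end)
}

// The "thread 3" line copied from the output above.
let line = "Stack                    700008fa3000-700009025000 [  520K   204K   204K     0K] rw-/rwx SM=COW          thread 3"
if let region = parseStackRegion(line) {
    print(region.virtualSize / 1024)  // 520
}
```

The computed 520K matches the first bracketed column for that thread. The 204K figure is a different column and is not derived from the address range.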

In this case thread 3 is the worker thread (more on this later). You may observe that its size keeps increasing while the code runs, until the breakpoint is hit.

Identify the worker thread in vmmap output

A big caveat: both vmmap and Xcode list threads by number. But those numbers are arbitrary and unrelated.

Note that the vmmap output contains each stack's address range. The thread whose stack address range covers the addresses in the lldb thread backtrace output is the worker thread.
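A minimal sketch of that matching step, using the ranges from the vmmap output above and two addresses from the first backtrace in this thread. The owner(of:) function and the stacks table are hypothetical names made up for this illustration.

```swift
// Stack ranges copied from the vmmap output, keyed by vmmap's own thread label.
let stacks: [(label: String, range: Range<UInt64>)] = [
    (label: "thread 1", range: 0x700008e1a000..<0x700008e9c000),
    (label: "thread 2", range: 0x700008f20000..<0x700008fa2000),
    (label: "thread 3", range: 0x700008fa3000..<0x700009025000),
    (label: "thread 0", range: 0x7ff7b5f54000..<0x7ff7b6754000),
]

// Returns the vmmap label of the stack containing `address`, if any.
func owner(of address: UInt64) -> String? {
    stacks.first(where: { $0.range.contains(address) })?.label
}

let argumentAddress: UInt64 = 0x0000700009020950  // `map` argument of frame #0
let codeAddress: UInt64 = 0x000000013dd261e2      // address printed for frame #0

print(owner(of: argumentAddress) ?? "not in any stack")  // thread 3
print(owner(of: codeAddress) ?? "not in any stack")      // not in any stack
```

The second lookup failing is consistent with the earlier answer: the address printed for each frame points into code, not into a stack.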

lldb command and output

Note: it's more convenient to use the function @tera suggested.

Use the setting @mayoff suggested:
(lldb) settings set frame-format "frame #${frame.index}: sp=${frame.sp} fp=${frame.fp} pc=${frame.pc}{ ${module.file.basename}{`${function.name-with-args}${function.pc-offset}}}{ at ${line.file.basename}:${line.number}}{${function.is-optimized} [opt]}\n"

Then run command:

(lldb) thread backtrace

This is the output in my experiment:

(lldb) thread backtrace
  * thread #5, queue = 'com.apple.root.default-qos.cooperative', stop reason = breakpoint 1.1
  * frame #0: sp=0x00007000090205d0 fp=0x0000700009020770 pc=0x000000013dd261e2 acdbCNTests`CEWViewMapIterator.init(map=$s6acdbCN8AllACMapVyAA9DDCEWViewVAA11FundCEWViewVAA9TDCEWViewVAA9FICEWViewVGD @ 0x0000700009020710) + 466 at cewViewMapT+iterator.swift:72
    frame #1: sp=0x0000700009020780 fp=0x0000700009020800 pc=0x000000013dd25fe3 acdbCNTests`AllACMap<>.makeIterator(self=acdbCN.AllACMap<acdbCN.DDCEWView, acdbCN.FundCEWView, acdbCN.TDCEWView, acdbCN.FICEWView> @ 0x00007000090207b0) + 115 at cewViewMapT+iterator.swift:12
    frame #2: sp=0x0000700009020810 fp=0x00007000090209f0 pc=0x00000001423155c5 acdbCNTests`_cews(acData=$s6acdbCN8AllACMapVyAA6DDDataVAA8FundDataVAA6TDDataVAA6FIDataVGD @ 0x00007000090209a0, today=(date = 2023-02-18 16:00:00 UTC), acResolver=acdbCN.EntityCollection @ 0x000060000061f660) + 613 at storeViewsT.swift:31
    frame #3: sp=0x0000700009020a00 fp=0x0000700009020c10 pc=0x0000000142314f91 acdbCNTests`StoreViews.init(entities=acdbCN.EntityCollection @ 0x0000700009021890, acData=$s6acdbCN8AllACMapVyAA6DDDataVAA8FundDataVAA6TDDataVAA6FIDataVGD @ 0x0000700009020b30, today=(date = 2023-02-18 16:00:00 UTC)) + 817 at storeViewsT.swift:20
    frame #4: sp=0x0000700009020c20 fp=0x0000700009021950 pc=0x000000013db2d50b acdbCNTests`Store._run(action=0x000000013d7d7590 acdbCNTests`partial apply forwarder for closure #1 (inout acdbCN.EntityCollection, inout acdbCN.ACChangeMap<acdbCN.S1Change>) throws -> () in acdbCNTests.DemoStore._generateMonthlyExpense(acID: acdbCN.Identifier<acdbCN.DD>, name: Swift.String, expenseMonthDay: Swift.Int, expenseAmount: __C.NSDecimal) -> () at <compiler-generated>, self=acdbCN.Store @ 0x0000700009022540) + 2539 at storeT+run.swift:84
    frame #5: sp=0x0000700009021960 fp=0x0000700009021d80 pc=0x000000013db2e495 acdbCNTests`Store.run(checkLimit=0x000000013d7c71b0 acdbCNTests`closure #1 (acdbCN.Store) -> Swift.Bool in default argument 0 of acdbCN.Store.run(checkLimit: (acdbCN.Store) -> Swift.Bool, _: (inout acdbCN.EntityCollection, inout acdbCN.ACChangeMap<acdbCN.S1Change>) throws -> ()) -> acdbCN.ACDBResult at <compiler-generated>, action=0x000000013d7d7590 acdbCNTests`partial apply forwarder for closure #1 (inout acdbCN.EntityCollection, inout acdbCN.ACChangeMap<acdbCN.S1Change>) throws -> () in acdbCNTests.DemoStore._generateMonthlyExpense(acID: acdbCN.Identifier<acdbCN.DD>, name: Swift.String, expenseMonthDay: Swift.Int, expenseAmount: __C.NSDecimal) -> () at <compiler-generated>, self=acdbCN.Store @ 0x0000700009022540) + 213 at storeT+run.swift:110
    frame #6: sp=0x0000700009021d90 fp=0x0000700009021e40 pc=0x000000013d7cc3ce acdbCNTests`DemoStore.run(action=0x000000013d7d7590 acdbCNTests`partial apply forwarder for closure #1 (inout acdbCN.EntityCollection, inout acdbCN.ACChangeMap<acdbCN.S1Change>) throws -> () in acdbCNTests.DemoStore._generateMonthlyExpense(acID: acdbCN.Identifier<acdbCN.DD>, name: Swift.String, expenseMonthDay: Swift.Int, expenseAmount: __C.NSDecimal) -> () at <compiler-generated>, self=acdbCNTests.DemoStore @ 0x0000700009022540) + 142 at loadDemoData.swift:152
    frame #7: sp=0x0000700009021e50 fp=0x0000700009022110 pc=0x000000013d7cee53 acdbCNTests`DemoStore._generateMonthlyExpense(acID=(uuid = 1BDBBCA7-E453-4839-A3DB-F35B5D24C15C), name="房贷每月还款", expenseMonthDay=25, expenseAmount=8000.000000
, self=acdbCNTests.DemoStore @ 0x0000700009022540) + 1331 at loadDemoData.swift:299
    frame #8: sp=0x0000700009022120 fp=0x00007000090224b0 pc=0x000000013d7d06d1 acdbCNTests`DemoStore.generateMortgageAC(name="中信银行借记卡 (房贷还款)", mortgageMonthDay=25, mortgageAmount=8000.000000
, self=acdbCNTests.DemoStore @ 0x0000700009022540) + 1505 at loadDemoData.swift:269
    frame #9: sp=0x00007000090224c0 fp=0x0000700009022530 pc=0x000000013d7cbc04 acdbCNTests`DemoStore.generateExpenseAC(self=acdbCNTests.DemoStore @ 0x0000700009022540) + 196 at loadDemoData.swift:283
    frame #10: sp=0x0000700009022540 fp=0x0000700009022720 pc=0x000000013d7c8adb acdbCNTests`DemoStore.init() + 235 at loadDemoData.swift:142
    frame #11: sp=0x0000700009022730 fp=0x00007000090228b0 pc=0x000000013d7c86c7 acdbCNTests`closure #1 in GenerateDemoDataTest.testLoadDemo() + 55 at loadDemoData.swift:61
    frame #12: sp=0x00007000090228c0 fp=0x0000700009022c70 pc=0x000000013d7c7ca3 acdbCNTests`generateStore(filename="abc", createStore=0x000000013d7c8690 acdbCNTests`closure #1 () -> acdbCN.Store in acdbCNTests.GenerateDemoDataTest.testLoadDemo() async -> () at loadDemoData.swift:61) + 163 at loadDemoData.swift:18
    frame #13: sp=0x0000700009022c80 fp=0x0000700009022cd0 pc=0x000000013d7c8209 acdbCNTests`asyncGenerateStore(filename="abc", createStore=0x000000013d7c8690 acdbCNTests`closure #1 () -> acdbCN.Store in acdbCNTests.GenerateDemoDataTest.testLoadDemo() async -> () at loadDemoData.swift:61) + 121 at loadDemoData.swift:27
  frame #14: 0x000000013d7c8430 acdbCNTests`(2) await resume partial function for acdbCNTests.GenerateDemoDataTest.testLoadDemo() async -> () at loadDemoData.swift:61
  frame #15: 0x000000013d7c8c60 acdbCNTests`(1) await resume partial function for @objc closure #1 () async -> () in acdbCNTests.GenerateDemoDataTest.testLoadDemo() async -> () at <compiler-generated>
  frame #16: 0x000000013d7cbf80 acdbCNTests`(1) await resume partial function for partial apply forwarder for @objc closure #1 () async -> () in acdbCNTests.GenerateDemoDataTest.testLoadDemo() async -> () at <compiler-generated>
  frame #17: 0x000000013d7de300 acdbCNTests`(1) await resume partial function for reabstraction thunk helper from @escaping @callee_guaranteed @Sendable @async () -> () to @escaping @callee_guaranteed @Sendable @async () -> (@out ()) at <compiler-generated>
  frame #18: 0x000000013d7de500 acdbCNTests`(1) await resume partial function for partial apply forwarder for reabstraction thunk helper from @escaping @callee_guaranteed @Sendable @async () -> () to @escaping @callee_guaranteed @Sendable @async () -> (@out ()) at <compiler-generated>
  frame #19: 0x000000013d7de3e0 acdbCNTests`(1) await resume partial function for generic specialization <serialized, ()> of reabstraction thunk helper <A, B where A: Swift.Sendable, B == Swift.Never> from @escaping @callee_guaranteed @Sendable @async () -> (@out A) to @escaping @callee_guaranteed @async () -> (@out A, @error @owned Swift.Error) at <compiler-generated>
  frame #20: 0x000000013d7de720 acdbCNTests`(1) await resume partial function for partial apply forwarder for generic specialization <serialized, ()> of reabstraction thunk helper <A, B where A: Swift.Sendable, B == Swift.Never> from @escaping @callee_guaranteed @Sendable @async () -> (@out A) to @escaping @callee_guaranteed @async () -> (@out A, @error @owned Swift.Error) at <compiler-generated>

Frames 14-20 are async frames. I believe they live on the heap when suspended. So I'll use frame 13's fp (0x0000700009022cd0) as the stack bottom and frame 0's sp (0x00007000090205d0) as the stack top. The stack size is:

0x0000700009022cd0 - 0x00007000090205d0 = 9984
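The same subtraction as a small Swift sketch, in case you want to check it or repeat it with other frames. The stackSpan function is a hypothetical helper made up for this illustration; the two addresses are taken from the backtrace above.

```swift
// Hypothetical helper: number of bytes between a lower and a higher stack address.
func stackSpan(from lower: UInt64, to higher: UInt64) -> UInt64 {
    precondition(higher >= lower, "on this platform the stack grows toward lower addresses")
    return higher - lower
}

let bottomFP: UInt64 = 0x0000700009022cd0  // fp of frame #13 (deepest non-async frame)
let topSP: UInt64 = 0x00007000090205d0     // sp of frame #0 (current frame)

let inUse = stackSpan(from: topSP, to: bottomFP)
print(inUse)                     // 9984
print(String(inUse, radix: 16))  // 2700
```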

The two sizes don't match

There is a big difference between the two sizes. I'm not 100% sure, but I think the following is a reasonable explanation:

  • When my code stops at the breakpoint, the stack size is less than 10K.

  • But at some earlier point the stack size was 204K. That caused the OS to allocate 204K of physical memory for the stack. That memory is not deallocated when the stack later shrinks. That's why vmmap reported 204K.

If the above understanding is correct, vmmap reports the peak size of the stack. That is very useful when you suspect that code uses more stack space than expected but aren't sure which part of the code is responsible (vmmap can't identify the exact place in the code, but it helps to expose or confirm the issue).

If you suspect a stack overflow is happening and for some reason don't trust the Xcode diagnostic tools to catch it (seems unlikely :thinking:), you can sprinkle this manual "checkStack" call here and there:

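import Foundation

func checkStack(param: Any) {
    // The address of a local variable approximates the current stack pointer.
    var x: UInt8 = 1
    func approximateSP(_ p: UnsafeMutableRawPointer) -> UnsafeMutableRawPointer {
        p
    }
    let sp = approximateSP(&x)
    // "top" is the highest address of the thread's stack; the stack grows down from it.
    let top = pthread_get_stackaddr_np(pthread_self())
    let size = pthread_get_stacksize_np(pthread_self())
    let bottom = top - size
    // Safety margin: 10K or 10% of the stack, whichever is smaller.
    let safetySize = min(10*1024, size / 10)
    let safeBottom = bottom + safetySize // "relatively" safe stack bottom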
    #if ENABLE_PRINTOUT
    print("top    : \(top)")
    print("SP     : \(sp) (approximate)")
    print("safe   : \(safeBottom) (relatively safe bottom)")
    print("bottom : \(bottom)")
    print("size   : \(size)")
    print("used   : \(top - sp) [\(100*(top - sp)/size)%]")
    print("param  : \(param)")
    print()
    #endif
    
    precondition(sp > safeBottom && sp <= top, "stack is about to overflow \(sp - bottom) bytes left, depth: \(param)")
}

func testRecursion(level: Int = 0) {
    checkStack(param: level)
    testRecursion(level: level + 1)
}

testRecursion()

With printout enabled it prints something like:

top    : 0x000000016ddfc000
SP     : 0x000000016dca9c77 (approximate)
safe   : 0x000000016d602800 (relatively safe bottom)
bottom : 0x000000016d600000
size   : 8372224
used   : 1385353 [16%]
param  : 17197

With or without the printout, it checks the available stack space and, if you are about to overflow it, fails with this:

failed: stack is about to overflow 10231 bytes left, depth: 104405

For the safety margin above I'm using the smaller of two numbers, 10K and 10% of the total stack size. Adjust this according to your needs.
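To make the arithmetic concrete, here is a small sketch that redoes the calculation on the numbers from the sample printout, using plain integers instead of pointers. The variable names mirror checkStack, but this snippet is only an illustration and assumes a 64-bit Int.

```swift
// Values copied from the sample printout above.
let top = 0x000000016ddfc000
let bottom = 0x000000016d600000
let sp = 0x000000016dca9c77

let size = top - bottom                     // total stack size in bytes
let safetySize = min(10 * 1024, size / 10)  // 10K wins over 10% for a stack this large
let safeBottom = bottom + safetySize
let used = top - sp

print(size)                           // 8372224
print(safetySize)                     // 10240
print(String(safeBottom, radix: 16))  // 16d602800
print(used, 100 * used / size)        // 1385353 16
```

For a small stack the 10% term would win instead: with the 520K stacks seen in the vmmap output, size / 10 is about 52K, so the margin would still be 10K, and the 10% term only takes over for stacks under 100K.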

Edit: when in lldb you can use the same calls:

(lldb) p pthread_get_stackaddr_np(pthread_self())
(UnsafeMutableRawPointer) $R0 = 0x16d600000
(lldb) p pthread_get_stacksize_np(pthread_self())
(Int) $R1 = 8372224

Thanks, it's a lifesaver. It's equivalent to what I did manually with the lldb output. Two interesting details:

  • The address returned by pthread_get_stackaddr_np() is the same as the fp value of the bottom-most frame (not including async frames) in the thread backtrace output.

  • There is an 8K gap between that address and the stack base address reported by vmmap. I believe it's an area reserved by the OS.

In the typical scenario where a stack overflow is caused by recursion, the function should be sufficient. However, that's not the case in my issue. Below are the characteristics of my issue:

  • It's a stack overflow (I'm very sure now and I'll explain the details in another thread later today or tomorrow).

  • My code doesn't have recursion.

  • The crash occurred in compiler-generated code. For reasons that are unclear, it used up a large amount of stack space. (The value in my code is large, so I'm not saying the crash is a compiler bug; I just didn't expect it, because I wrapped the data in arrays and dictionaries.) I suspect it's related to an enum accessor, but I'm not sure how to prove it yet.

    BTW, this may also explain why Address Sanitizer failed to report the error: perhaps it only instruments user-defined funcs and not compiler-generated code.

The difficulty is that when I started investigating the issue, I knew none of the above (except item 2). Adding your func to my code wouldn't have helped, because the overflow occurred in compiler-generated code. This is a scenario where vmmap is valuable, because it can tell me the peak size of the stack. It's especially useful when the code has the potential to crash but hasn't yet, because it can show that the code uses far more stack than expected, as in the case above.

Caveat: if the stack overflow has already occurred, the size in vmmap is not the size at the point of failure, but the last valid size before the crash.

I'm still investigating the crash issue. I'll update another thread when I have more information. Thanks.