`swift test` segfaults in RtlVirtualUnwind when running the 3rd test

When running tests one at a time (e.g. via `--filter`), every test completes successfully, but when running all tests, the 3rd test case (regardless of the order in which the tests run) segfaults inside ntdll.dll:RtlVirtualUnwind.
I have followed the Swift.org Getting Started guide as far as I can tell, and I see this when running `swift test` from the command line or through Visual Studio Code's SSWG Swift extension v0.2.0. I'm on Windows 10 21H2 (Build 19044.1466).

lldb output:

* thread #7, stop reason = Exception 0xc0000005 encountered at address 0x7ffc344a1548: Access violation reading location 0x00000008
    frame #0: 0x00007ffc344a1548 ntdll.dll`RtlVirtualUnwind + 1896
ntdll.dll`RtlVirtualUnwind:
->  0x7ffc344a1548 <+1896>: movq (%rdx), %rax
    0x7ffc344a154b <+1899>: movq %rax, 0x78(%rsi,%r8,8)
    0x7ffc344a1550 <+1904>: movq 0x10(%r14), %rax
    0x7ffc344a1554 <+1908>: testq %rax, %rax

I think this is likely misleading. In my experience, lldb does not work very well on the native side of the stack, and an invalid memory access within RtlVirtualUnwind itself is rather unlikely. If you are motivated, you should be able to reverse engineer what %rdx is supposed to be at this point. I suspect this is caused by lldb's inability to always reconstruct the stack correctly. Could you try using WinDbg or Visual Studio to debug the process and hopefully get a more accurate stack trace? I would guess that there is a null pointer floating about somewhere in the runtime.

Thank you, that has been helpful!

Sorry about the delay in replying; I feel very much out of my depth here. What I've found is:

I can see the error is happening in swift_task_dealloc. I'm having trouble understanding the output WinDbg is giving me: stepping backwards from the exception shows that rax is 1 after the return from swift_task_getCurrent, but if I step backwards over the function call and then step forwards, rax contains a pointer which looks valid. I had originally expected this to be a thread safety issue, but am now confused.
When stepping backwards into what I expected to be swift_task_getCurrent, WinDbg shows I'm in frame XCTest!$s6XCTest9asyncTestyyyKcxcyyYaKcxcAA0A4CaseCRbzlF; rax seems to have been intentionally set to 1 by ntdll!RtlpFreeHeapInternal, and I can't see anything in swiftCore!swift_unownedCheck or the XCTest function which would restore it.

swift_Concurrency!swift_task_dealloc:
00007ffa`def6bee0 56                     push    rsi
00007ffa`def6bee1 57                     push    rdi
00007ffa`def6bee2 4883ec28               sub     rsp, 28h
00007ffa`def6bee6 4889cf                 mov     rdi, rcx
00007ffa`def6bee9 e852cdffff             call    swift_Concurrency!swift_task_getCurrent (00007ffa`def68c40)
00007ffa`def6beee 4885c0                 test    rax, rax   ; rax is `1` at this point and so `je` is not taken
00007ffa`def6bef1 7409                   je      swift_Concurrency!swift_task_dealloc+0x1c (00007ffa`def6befc)
00007ffa`def6bef3 4889c6                 mov     rsi, rax
00007ffa`def6bef6 4883c660               add     rsi, 60h
00007ffa`def6befa eb28                   jmp     swift_Concurrency!swift_task_dealloc+0x44 (00007ffa`def6bf24)
...
00007ffa`def6bf24 488b06                 mov     rax, qword ptr [rsi] ds:00000000`00000061=????????????????

Edit: I can share the WinDbg .run file if that would be helpful

Hmm, out of curiosity, what version of Swift are you using?

>swift --version
compnerd.org Swift version 5.5.2 (swift-5.5.2-RELEASE)
Target: x86_64-unknown-windows-msvc

Please use 5.5.3. There was a regression in XCTest that was fixed in 5.5.3.

I love easy fixes :tada:
Many thanks

Can confirm that swift test with Swift 5.5.3 is working as expected.

The behavioral description reminded me of the fix that went into 5.5.3 and I suspected that may have been the cause.