How to prevent crash due to Swift Runtime initializing `os_log` on macOS?

I’m developing a Postgres extension with Swift. Overall, using Swift in this context has been a good experience, and once this issue is ironed out I’d be happy to open-source the work I’ve put into making Postgres extensions in Swift usable and performant.

My biggest issue is “random” segfaults that occur after swift::runtime::trace::setupLogs is called on macOS. Depending on the timing (this is a race condition that seems to depend on when the extension code is called / loaded), I get a segfault with traces like this:


Exception Type:    EXC_BAD_ACCESS (SIGSEGV)
Exception Subtype: KERN_INVALID_ADDRESS at 0x000000010520487d
Exception Codes:   0x0000000000000001, 0x000000010520487d

Termination Reason:  Namespace SIGNAL, Code 11, Segmentation fault: 11
Terminating Process: exc handler [36640]


Application Specific Information:
crashed on child side of fork pre-exec


Thread 0 Crashed:
0   libsystem_trace.dylib         	       0x19669d0b4 _os_log_find + 64
1   libsystem_trace.dylib         	       0x19669cdd4 os_log_create + 304
2   libswiftCore.dylib            	       0x1a9a6d42c swift::runtime::trace::setupLogs(void*) + 44
3   libdispatch.dylib             	       0x1967ccaa4 _dispatch_client_callout + 16
4   libdispatch.dylib             	       0x1967b5a40 _dispatch_once_callout + 32
5   libswiftCore.dylib            	       0x1a9e6fba4 swift_conformsToProtocolMaybeInstantiateSuperclasses(swift::TargetMetadata<swift::InProcess> const*, swift::TargetProtocolDescriptor<swift::InProcess> const*, bool) (.cold.4) + 44
6   libswiftCore.dylib            	       0x1a9a652f4 swift_conformsToProtocolMaybeInstantiateSuperclasses(swift::TargetMetadata<swift::InProcess> const*, swift::TargetProtocolDescriptor<swift::InProcess> const*, bool) + 5216
7   libswiftCore.dylib            	       0x1a9a623a8 swift_conformsToProtocolWithExecutionContext + 72
8   libswiftCore.dylib            	       0x1a99f4f84 swift::_conformsToProtocol(swift::OpaqueValue const*, swift::TargetMetadata<swift::InProcess> const*, swift::TargetProtocolDescriptorRef<swift::InProcess>, swift::TargetWitnessTable<swift::InProcess> const**, swift::ConformanceExecutionContext*) + 152
9   libswiftCore.dylib            	       0x1a9a60214 swift::_checkGenericRequirements(__swift::__runtime::llvm::ArrayRef<swift::GenericParamDescriptor>, __swift::__runtime::llvm::ArrayRef<swift::TargetGenericRequirementDescriptor<swift::InProcess>>, __swift::__runtime::llvm::SmallVectorImpl<void const*>&, std::__1::function<void const* (unsigned int, unsigned int)>, std::__1::function<void const* (unsigned int, unsigned int)>, std::__1::function<swift::TargetWitnessTable<swift::InProcess> const* (swift::TargetMetadata<swift::InProcess> const*, unsigned int)>, swift::ConformanceExecutionContext*) + 9584
10  libswiftCore.dylib            	       0x1a9a45b2c _gatherGenericParameters(swift::TargetContextDescriptor<swift::InProcess> const*, __swift::__runtime::llvm::ArrayRef<swift::MetadataPackOrValue>, swift::TargetMetadata<swift::InProcess> const*, __swift::__runtime::llvm::SmallVectorImpl<unsigned int>&, __swift::__runtime::llvm::SmallVectorImpl<void const*>&, swift::Demangle::__runtime::Demangler&) + 2408
11  libswiftCore.dylib            	       0x1a9a53bb4 (anonymous namespace)::DecodedMetadataBuilder::createBoundGenericType(swift::TargetContextDescriptor<swift::InProcess> const*, __swift::__runtime::llvm::ArrayRef<swift::MetadataPackOrValue>, swift::MetadataPackOrValue) const + 264
12  libswiftCore.dylib            	       0x1a9a4fd98 swift::Demangle::__runtime::TypeDecoder<(anonymous namespace)::DecodedMetadataBuilder>::decodeMangledType(swift::Demangle::__runtime::Node*, unsigned int, bool) + 9016
13  libswiftCore.dylib            	       0x1a9a480cc swift_getTypeByMangledNodeImpl(swift::MetadataRequest, swift::Demangle::__runtime::Demangler&, swift::Demangle::__runtime::Node*, void const* const*, std::__1::function<void const* (unsigned int, unsigned int)>, std::__1::function<swift::TargetWitnessTable<swift::InProcess> const* (swift::TargetMetadata<swift::InProcess> const*, unsigned int)>) + 880
14  libswiftCore.dylib            	       0x1a9a439dc swift_getTypeByMangledNode + 368
15  libswiftCore.dylib            	       0x1a9a48b60 swift_getTypeByMangledNameImpl(swift::MetadataRequest, __swift::__runtime::llvm::StringRef, void const* const*, std::__1::function<void const* (unsigned int, unsigned int)>, std::__1::function<swift::TargetWitnessTable<swift::InProcess> const* (swift::TargetMetadata<swift::InProcess> const*, unsigned int)>) + 1204
16  libswiftCore.dylib            	       0x1a9a41520 swift_getTypeByMangledName + 368
17  libswiftCore.dylib            	       0x1a9a41a60 swift_getTypeByMangledNameInContextImpl(char const*, unsigned long, swift::TargetContextDescriptor<swift::InProcess> const*, void const* const*) + 192
18  libPerformanceAnalysis.dylib  	       0x10575ee04 __swift_instantiateConcreteTypeFromMangledName + 52

This crash does not occur on Linux because swift::runtime::trace::setupLogs is not called there.

For some backstory: Postgres seems to work by forking worker processes but never running exec in those processes. They simply run to completion (or crash with a segfault, as the case may be). Apparently this puts us into dangerous “async signal unsafe” territory.

The issue is: I have no idea what that territory means or implies. I just want to write some Swift code that is called by postgres. I don’t want tracing or logs from Swift’s runtime.

Other than rebuilding the Swift compiler myself with SWIFT_STDLIB_TRACING disabled (see swift/stdlib/public/runtime/Tracing.cpp at 9be05988fbf2f086758a507037c99cf281092da3 · swiftlang/swift · GitHub), is there a “userland” workaround for this issue?

It has really drained my productivity and proven very difficult to track down and debug – the limited clarity I’m able to discuss here was gained after weeks of “random” crashes that were unexplainable. I also don’t know a huge amount about postgres’ internals and feel I shouldn’t need to, to get some compiled Swift code running.

Would it help to use Embedded Swift, for example?

Apparently this puts us into dangerous “async signal unsafe” territory.

Indeed. This is extremely problematic on macOS. The traditional Unix fork model fundamentally conflicts with Mach, so you’re likely to get into trouble if you use any Apple framework from that context. Heck, even POSIX APIs are tricky, although Apple usually does enough to keep those working.

I’ve talked about this extensively on Apple Developer Forums. A good place to start is this thread, specifically this post and this post.

is there a “userland” workaround for this issue?

I very much doubt it. The only winning move is not to play.

I think it’d be worthwhile filing a bug requesting such an option though. If you do, please post your bug number, just for the record.

Would it help to use Embedded Swift, for example?

I don’t know enough about how Embedded Swift runs on macOS to answer that. But you could try testing this yourself, by building an example and seeing whether it imports the system runtime or not.

Share and Enjoy

Quinn “The Eskimo!” @ DTS @ Apple

Thank you @eskimo for your support now and over the years. Your contribution is much appreciated.

Here’s the thing: the suggestion from the thread you linked is to stick with POSIX APIs in the case of fork. “I” am! If I could prevent any Apple APIs from being called or even linked, I would, but apparently I can’t.

It’s Swift itself that is doing something fundamentally problematic here for reasons unknown to me.

I will file a bug requesting the aforementioned userland workaround. It will be tricky though, because an environment variable might not actually suffice here (let’s see).

I’ve filed a feature request for this here @eskimo

Are you able to use posix_spawn() instead?

It’s not me forking anything, it’s postgresql itself.

If I could avoid forking, I would. If I could avoid calling Apple APIs, I would do that too.

But I do think that Swift calling async-signal-unsafe functions at seemingly non-deterministic times on macOS is the greater issue here.

fork() is effectively unsupported on macOS. The only safe thing you can do on macOS after calling fork() is to immediately call execve(). This is not a Swift-specific issue, so the Swift forums may not be the best place to discuss it.

Looking at the stack trace you've shared, I'm a little confused as it is incomplete. I'd like to know what is calling __swift_instantiateConcreteTypeFromMangledName(). Are you able to share more of that stack trace?

When you fork() a process, only the calling thread is cloned to the child process, so another thread can't be the one triggering this call into Swift. If you (or, yes, PostgreSQL) are then doing some work on that thread before calling execve(), and that work involves calling into non-PostgreSQL code such as a callback/hook function, then it's already undefined behavior on macOS regardless of what programming language the callback is implemented in.
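
For readers following along, the fork-then-exec pattern described here can be sketched in C roughly as follows. This is a minimal sketch, not PostgreSQL's code: `fork_then_exec` is a hypothetical helper name, and the only assumption is that the child does nothing between fork() and execve() other than call _exit() if the exec fails.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: fork, then immediately replace the child image
 * with the program at `path`. Returns the child's exit status, or -1. */
int fork_then_exec(const char *path, char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;                  /* fork failed */
    }
    if (pid == 0) {
        /* Child: no callbacks, no allocation, just exec. */
        execve(path, argv, environ);
        _exit(127);                 /* only reached if execve failed */
    }
    /* Parent: wait for the child and report how it exited. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Any extension code, in whatever language, would then run in the freshly exec'd process rather than in the forked copy.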

Is postgresql single threaded? AFAIK fork-without-immediate-exec is (only) fine in single threaded programs.

If it's not single-threaded, it should not work (reliably) regardless of the language... but I do wonder: if this extension were written in C (or Objective-C), would you hit similar issues in practice?

Geordie_J wrote:

“I” am!

Yep. Understood. The bind is:

  • You can’t use the system runtime because it calls os.log and there’s no way to prevent that [1].
  • There’s no easy path to building and statically linking the runtime on macOS [2].

I’ve filed a feature request for this here

Thanks. I think that’s the only way out of this tangle.

tera wrote:

AFAIK fork-without-immediate-exec is (only) fine in single threaded programs.

In theory you can support fork-without-exec in a multi-threaded program using pthread_atfork (see its man page). Doing that, however, is very tricky.
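
To make the shape of that API concrete, here is a minimal C sketch of pthread_atfork guarding a single lock. Everything other than the POSIX calls is hypothetical (`state_lock`, the handler names, and the two helpers), and it deliberately covers only one mutex that the program itself owns. It cannot protect locks hidden inside system libraries, which is a large part of why the approach is tricky.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical lock guarding some shared state in the parent process. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;

/* prepare: runs in the parent just before fork(). Taking the lock here
 * means no other thread can hold it mid-update when the fork happens. */
static void before_fork(void) { pthread_mutex_lock(&state_lock); }

/* parent: runs in the parent after fork(); release the lock again. */
static void after_fork_parent(void) { pthread_mutex_unlock(&state_lock); }

/* child: runs in the child after fork(). The child's only thread is a
 * copy of the thread that took the lock, so it releases it as well. */
static void after_fork_child(void) { pthread_mutex_unlock(&state_lock); }

/* Register the three handlers; returns 0 on success. */
int install_fork_handlers(void) {
    return pthread_atfork(before_fork, after_fork_parent, after_fork_child);
}

/* Fork and have the child confirm the lock is usable.
 * Returns the child's exit status (0 if it could take the lock), or -1. */
int fork_and_check_lock(void) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        int rc = pthread_mutex_trylock(&state_lock);
        _exit(rc == 0 ? 0 : 1);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Every lock the child might touch needs the same treatment, in a consistent order, for this to hold up.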

Share and Enjoy

Quinn “The Eskimo!” @ DTS @ Apple

[1] I presume. I haven’t checked that myself, but I trust that you did your due diligence on that front.

[2] Because doing that would cause problems if other Swift code is running in your process, and if you’re calling Apple frameworks you can’t be sure that’s not the case.

I didn’t share more of the stack trace because I didn’t think it was important. The main thing is that some innocuous code calls into the Runtime, which initializes the logger. But I’ll share more of it here:

Application Specific Information:
crashed on child side of fork pre-exec


Thread 0 Crashed:
0   libsystem_trace.dylib         	       0x19ea1d0b4 _os_log_find + 64
1   libsystem_trace.dylib         	       0x19ea1cdd4 os_log_create + 304
2   libswiftCore.dylib            	       0x1b1ded42c swift::runtime::trace::setupLogs(void*) + 44
3   libdispatch.dylib             	       0x19eb4caa4 _dispatch_client_callout + 16
4   libdispatch.dylib             	       0x19eb35a40 _dispatch_once_callout + 32
5   libswiftCore.dylib            	       0x1b21efba4 swift_conformsToProtocolMaybeInstantiateSuperclasses(swift::TargetMetadata<swift::InProcess> const*, swift::TargetProtocolDescriptor<swift::InProcess> const*, bool) (.cold.4) + 44
6   libswiftCore.dylib            	       0x1b1de52f4 swift_conformsToProtocolMaybeInstantiateSuperclasses(swift::TargetMetadata<swift::InProcess> const*, swift::TargetProtocolDescriptor<swift::InProcess> const*, bool) + 5216
7   libswiftCore.dylib            	       0x1b1de23a8 swift_conformsToProtocolWithExecutionContext + 72
8   libswiftCore.dylib            	       0x1b1d74f84 swift::_conformsToProtocol(swift::OpaqueValue const*, swift::TargetMetadata<swift::InProcess> const*, swift::TargetProtocolDescriptorRef<swift::InProcess>, swift::TargetWitnessTable<swift::InProcess> const**, swift::ConformanceExecutionContext*) + 152
9   libswiftCore.dylib            	       0x1b1de0214 swift::_checkGenericRequirements(__swift::__runtime::llvm::ArrayRef<swift::GenericParamDescriptor>, __swift::__runtime::llvm::ArrayRef<swift::TargetGenericRequirementDescriptor<swift::InProcess>>, __swift::__runtime::llvm::SmallVectorImpl<void const*>&, std::__1::function<void const* (unsigned int, unsigned int)>, std::__1::function<void const* (unsigned int, unsigned int)>, std::__1::function<swift::TargetWitnessTable<swift::InProcess> const* (swift::TargetMetadata<swift::InProcess> const*, unsigned int)>, swift::ConformanceExecutionContext*) + 9584
10  libswiftCore.dylib            	       0x1b1dc5b2c _gatherGenericParameters(swift::TargetContextDescriptor<swift::InProcess> const*, __swift::__runtime::llvm::ArrayRef<swift::MetadataPackOrValue>, swift::TargetMetadata<swift::InProcess> const*, __swift::__runtime::llvm::SmallVectorImpl<unsigned int>&, __swift::__runtime::llvm::SmallVectorImpl<void const*>&, swift::Demangle::__runtime::Demangler&) + 2408
11  libswiftCore.dylib            	       0x1b1dd3bb4 (anonymous namespace)::DecodedMetadataBuilder::createBoundGenericType(swift::TargetContextDescriptor<swift::InProcess> const*, __swift::__runtime::llvm::ArrayRef<swift::MetadataPackOrValue>, swift::MetadataPackOrValue) const + 264
12  libswiftCore.dylib            	       0x1b1dcfd98 swift::Demangle::__runtime::TypeDecoder<(anonymous namespace)::DecodedMetadataBuilder>::decodeMangledType(swift::Demangle::__runtime::Node*, unsigned int, bool) + 9016
13  libswiftCore.dylib            	       0x1b1dc80cc swift_getTypeByMangledNodeImpl(swift::MetadataRequest, swift::Demangle::__runtime::Demangler&, swift::Demangle::__runtime::Node*, void const* const*, std::__1::function<void const* (unsigned int, unsigned int)>, std::__1::function<swift::TargetWitnessTable<swift::InProcess> const* (swift::TargetMetadata<swift::InProcess> const*, unsigned int)>) + 880
14  libswiftCore.dylib            	       0x1b1dc39dc swift_getTypeByMangledNode + 368
15  libswiftCore.dylib            	       0x1b1dc8b60 swift_getTypeByMangledNameImpl(swift::MetadataRequest, __swift::__runtime::llvm::StringRef, void const* const*, std::__1::function<void const* (unsigned int, unsigned int)>, std::__1::function<swift::TargetWitnessTable<swift::InProcess> const* (swift::TargetMetadata<swift::InProcess> const*, unsigned int)>) + 1204
16  libswiftCore.dylib            	       0x1b1dc1520 swift_getTypeByMangledName + 368
17  libswiftCore.dylib            	       0x1b1dc1a60 swift_getTypeByMangledNameInContextImpl(char const*, unsigned long, swift::TargetContextDescriptor<swift::InProcess> const*, void const* const*) + 192
18  libPerformanceAnalysis.dylib  	       0x1057eee04 __swift_instantiateConcreteTypeFromMangledName + 52
19  libPerformanceAnalysis.dylib  	       0x1057f7898 specialized _NativeSet.copyAndResize(capacity:) + 64
20  libPerformanceAnalysis.dylib  	       0x1057f727c specialized _NativeSet.insertNew(_:at:isUnique:) + 116
21  libPerformanceAnalysis.dylib  	       0x1057f6bf4 specialized Set._Variant.insert(_:) + 388
22  libPerformanceAnalysis.dylib  	       0x1057f5a54 specialized Set.union<A>(_:) + 216
23  libPerformanceAnalysis.dylib  	       0x105800114 ... + 8 (....swift:103) [inlined]
24  libPerformanceAnalysis.dylib  	       0x105800114 myCode() + 1128 

The fork manpage on macOS 26 states the following:

There are limits to what you can do in the child process. To be totally safe you should restrict yourself to only executing async-signal safe operations until such time as one of the exec functions is called. All APIs, including global data symbols, in any framework or library should be assumed to be unsafe after a fork() unless explicitly documented to be safe or async-signal safe. If you need to use these frameworks in the child process, you must exec. In this situation it is reasonable to exec yourself.

Running some numeric calculations in Swift (or especially in C) should not violate this. Indeed, postgres has been set up in this way for decades.

If the Runtime did not call into os_log then there would be no issue. I’m not doing threading, calling into any external APIs, etc. from my own code. I’m just using the Swift standard library in basic ways (I want it to be as basic as possible for performance reasons).

I’ve started experimenting with Embedded Swift, but there are challenges integrating it with the C code that interfaces with Postgres. I suspect this is the only possible “sanctioned” approach to avoiding Swift Runtime issues, but it’s harder than I expected: despite not calling any external APIs, Embedded Swift on macOS still tries to import Darwin and a bunch of other frameworks that should not be needed.

The runtime itself is not safe to use after fork. This has also been true of the Objective-C and CoreFoundation runtimes since Mac OS X 10.0. The CF runtime will hard crash similarly to libSystem if you call into it after a fork.

I don’t know what to say to this. Are you suggesting that the Embedded runtime is also unsafe to use?

While I understand that we have to make sweeping “safe” vs. “unsafe” statements because more nuance is complicated, this doesn’t seem to bode well for Swift as a systems language.

The problem is primarily with fork(), not with the Swift runtime. As has been stated previously: on macOS, there is really no safe action you can take after calling fork() except to call execve().

I don’t know much about Embedded Swift. I’m referring to the normal Swift runtime.

Honestly, Postgres is playing with fire on Darwin. They should be re-execing (or posix_spawning) with additional command-line arguments to indicate which “mode” the child process should behave in.
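
As a rough illustration of the posix_spawn route, here is a minimal C sketch. The helper name `spawn_and_wait` is hypothetical, and the idea of passing a flag such as `--mode=worker` is only an assumed way to tell the new process what role to play; it is not something Postgres does.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: start `path` as a fresh process (no
 * fork-without-exec window) and return its exit status, or -1 on error. */
int spawn_and_wait(const char *path, char *const argv[]) {
    pid_t pid;
    /* posix_spawn returns 0 on success and an errno value on failure. */
    int rc = posix_spawn(&pid, path, NULL, NULL, argv, environ);
    if (rc != 0) {
        return -1;
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A server using this approach would pass its own executable path plus the mode argument in `argv`, and branch on that argument early in `main`.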

You might need to do some things to prep for the exec, like closing file handles which mustn’t be inherited by the child process. There are sometimes alternative ways of doing this which can be done before the fork, but I’m not going to say they’re always viable.
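
One such before-the-fork alternative, sketched here as an assumption about what is meant, is to mark descriptors close-on-exec so the kernel closes them when the child execs. `set_cloexec` is a hypothetical helper name; the fcntl calls are standard POSIX.

```c
#include <fcntl.h>

/* Hypothetical helper: mark `fd` close-on-exec so it is not inherited
 * across exec. Returns 0 on success, -1 on error. */
int set_cloexec(int fd) {
    int flags = fcntl(fd, F_GETFD);     /* read current descriptor flags */
    if (flags < 0) {
        return -1;
    }
    /* Preserve any existing flags and add FD_CLOEXEC. */
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}
```

This only helps for descriptors the program knows about; handles opened by libraries behind its back are the case where it may not be viable.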

Can someone give an indication regarding Embedded Swift? Is it worth even trying to get this working, or is it also considered a non-starter on macOS? If not, I will just switch to Linux.

Embedded Swift has a smaller runtime footprint, but you are still going to run into the fundamental problem that it is unsafe to do almost anything other than call execve() in this context. Even with Embedded Swift, calling more Swift code may lead to calls to malloc() which is not async-signal-safe.

The problem is not Swift-specific. Any programming language, even C, will have this problem. After you call fork(), you can't just go off and do arbitrary things.

So the answer is "no, that's still leading to undefined behavior."

As far as I'm aware, the same fundamental problems exist on Linux and any other UNIX-like platform (although they no doubt manifest differently than on macOS.) Unless your process has exactly one thread and the total state of the program before calling fork() is controlled by the caller of fork(), you cannot guarantee that the system will be in a consistent state in the child process, hence why the only safe thing to do is to call execve().
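
For contrast, here is a minimal C sketch of a forked child that stays within POSIX's list of async-signal-safe functions (close, write, _exit) and never touches malloc or stdio. `fork_child_reports` is a hypothetical helper name. The sketch shows what the POSIX rule permits; it makes no claim that this much is guaranteed to be safe on macOS, which is the point under dispute in this thread.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that reports back over a pipe using
 * only async-signal-safe calls. Returns bytes read into `out`, or -1. */
ssize_t fork_child_reports(char *out, size_t cap) {
    int fds[2];
    if (pipe(fds) != 0) {
        return -1;
    }
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: no malloc, no printf, no locks. */
        static const char msg[] = "ok";
        close(fds[0]);
        ssize_t w = write(fds[1], msg, sizeof msg - 1);
        _exit(w == (ssize_t)(sizeof msg - 1) ? 0 : 1);
    }
    /* Parent: read the child's message, then reap it. */
    close(fds[1]);
    ssize_t n = read(fds[0], out, cap);
    close(fds[0]);
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        return -1;
    }
    return n;
}
```

Anything beyond this, such as a Swift `Set.union` that may allocate or trigger runtime initialization, falls outside what that list covers.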

The issue of “async-signal-safe code” exists, but this is where it becomes a language issue. POSIX specifies that ~~malloc and free~~ certain functions are async-signal-safe. These functions are part of the C standard library, so it’s reasonable to consider this a language-level concern even if the stdlib and compiler are technically separate products.

Clearly the Linux ecosystem cares about making it possible for forked children to do useful work or else Postgres wouldn’t rely on this pattern. Though for what it’s worth, fork itself is required to be async-signal-safe; glibc’s implementation on Linux wasn’t until 2021.

Respectfully, POSIX does not specify this. Do you have a reference?