How to troubleshoot upgrading Docker and Swift deployments?

I have my own blog engine, written in Swift using Vapor. I'm trying to update it to the latest versions of its dependencies, including Swift itself. Last night I was able to update it to use Swift 6.2 (shipping with Xcode 26.2), and bumped my Dockerfile from Swift 5.3.3 to 6.2. It all worked locally on my Mac, in both Xcode and Docker.

I then built the container, pushed it to my package registry on GitHub, and pulled it down to my live Digital Ocean server. This is where disaster hit: the container started exiting with the error `tools_maverick_1 exited with code 132`. After extensive discussion with Claude Code and Codex, it seems this was due to my server's CPU flags not including something called AVX-512 (I fully admit I'm way out of my depth here). Here is the PR with all my changes.

I later recalled that I had done something similar with another Vapor website that I run, so I pointed Claude at its Dockerfile, and it was able to make some changes to my blog engine's Dockerfile that got the app up and running (I think a big key was bundling the Swift stdlib statically). The changes to the Dockerfile are in this PR.
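For anyone following along, the static-stdlib change boils down to one flag on the `swift build` step. A minimal sketch of the relevant part of a Vapor-style multi-stage Dockerfile (the image tags, paths, and binary name here are illustrative, not taken from the PR):

```dockerfile
# Build stage: compile with the Swift stdlib linked statically, so the
# runtime image doesn't need a matching Swift runtime installed.
FROM swift:6.2 AS build
WORKDIR /app
COPY . .
RUN swift build -c release --static-swift-stdlib

# Run stage: a slim base image is enough once the stdlib is bundled.
FROM ubuntu:24.04
WORKDIR /app
COPY --from=build /app/.build/release/App /app/App
ENTRYPOINT ["/app/App"]
CMD ["serve", "--hostname", "0.0.0.0", "--port", "8080"]
```

Statically bundling the stdlib removes the dependency on whatever Swift runtime libraries happen to be present in the runtime image, which is one common source of "works locally, crashes on the server" mismatches.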

However, now when I run the app I'm getting a crash that actually gives me a stack trace (here's the trace, which I can make neither heads nor tails of). For what it's worth, Claude suggests the problem may be a limit on the number of threads in Docker and that I should add ulimits to my docker-compose file. I'm skeptical, to say the least.
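For reference, the change Claude is proposing would look something like this in a docker-compose file (the image name and limit values are illustrative; `nproc` governs how many processes/threads the container's user may create):

```yaml
services:
  app:
    image: ghcr.io/example/blog:latest   # illustrative image name
    ulimits:
      nproc: 65535        # max processes/threads for the container's user
      nofile:
        soft: 65535       # max open file descriptors
        hard: 65535
```

That said, default limits are usually far above what a Vapor app needs at startup, which seems like a fair reason for skepticism.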

So this leaves me with a few questions:

  • Where did I go wrong in this process? Did I pick the wrong Docker image to run my Swift on? Is there a guide to picking the "right one"?
  • Are the CPU flag changes documented anywhere? Is this something a server developer running Docker would know to look out for? My day job is app building and not server running, so it's all new to me.
  • Is there a way I can make my new stack trace readable so I can take action on it? I can't even tell if it's in my code or not.

The stack trace shows that you are hitting this precondition:

Thank you for that info! Is there a way to symbolicate the call stack above it so I can know what’s calling the code where this precondition lives? Because I’m betting it’s in Vapor somewhere, and not something in my blog engine package – I just don’t know what might be the root cause here.

I think it is fully symbolicated already. Reformatted with an LLM:

ThreadOpsPosix.run(handle:args:)
└─ NIOThread.spawnAndRun(name:body:)
   └─ MultiThreadedEventLoopGroup.setupThreadAndEventLoop(name:uniqueID:parentGroup:selectorFactory:initializer:metricsDelegate:)
      └─ MultiThreadedEventLoopGroup.setupThreadAndEventLoop(name:uniqueID:parentGroup:selectorFactory:initializer:metricsDelegate:) [system/<stdin>]
         └─ closure #1 in MultiThreadedEventLoopGroup.init(threadInitializers:canBeShutDown:threadNamePrefix:metricsDelegate:selectorFactory:)
            └─ partial apply for closure #1 in MultiThreadedEventLoopGroup.init(threadInitializers:canBeShutDown:threadNamePrefix:metricsDelegate:selectorFactory:)
               └─ Collection.map(_:)
                  └─ MultiThreadedEventLoopGroup.init(threadInitializers:canBeShutDown:threadNamePrefix:metricsDelegate:selectorFactory:)
                     └─ MultiThreadedEventLoopGroup._makePerpetualGroup(threadNamePrefix:numberOfThreads:)
                        └─ MultiThreadedEventLoopGroup.__allocating_init(numberOfThreads:canBeShutDown:threadNamePrefix:metricsDelegate:selectorFactory:)
                           └─ MultiThreadedEventLoopGroup.__allocating_init(threadInitializers:canBeShutDown:threadNamePrefix:metricsDelegate:selectorFactory:)
                              └─ closure #1 in singletonMTELG initialization
                                 └─ one-time initialization function for singletonMTELG
                                    └─ swift::threading_impl::once_slow(...)
                                       └─ singletonMTELG.unsafeMutableAddressor
                                          └─ NIOSingletons.posixEventLoopGroup.getter
                                             └─ MultiThreadedEventLoopGroup.singleton.getter
                                                └─ EventLoopGroup<>.singletonMultiThreadedEventLoopGroup.getter
                                                   └─ default argument 1 of Application.init(_:_:)
                                                      └─ Maverick_main

So it goes from your main into Vapor's Application.init, which then tries to set up NIO's default EventLoopGroup and fails because it cannot create any threads. Hm, weird. Just for fun, can you try this on a fresh droplet/VM?


Indeed, this is a thread-creation failure, which is odd.

I assume you're not accidentally starting lots of Vapor Applications?

Unfortunately, due to a Swift bug, the precondition failure message doesn't get printed in release mode.

CC @lukasa: maybe worth changing this (and others) to a fatalError instead, to get at least the errno value. That said, it'll likely be in the register dump.


The only registers that may contain errno values are rax with 1 (EPERM) and r10 with 8 (ENOEXEC).

So it's either EPERM which is a legit errno code from pthread_create:

EPERM  No permission to set the scheduling policy and parameters
       specified in attr.

Or it's no longer in the registers. It can't be ENOEXEC; that makes little sense.

But on Linux, NIO doesn't request any special scheduling policies or attrs...

I ended up going the route of making a new server and setting it up (a process I had not yet automated, so that was a big part of my day), and updating some other pieces of my Vapor configuration. Thankfully now my server is online and my blog is back.

Here’s the PR with the changes that got me up and running: [CI] Updating Docker to get running on the server by jsorge · Pull Request #14 · taphouseio/maverick · GitHub. This weekend has been such a whirlwind that I honestly don’t know why this wasn’t working on my old server but is working on my new one. If I remember correctly, switching the Dockerfile build to include the Swift standard library statically made a big difference (I adapted the Dockerfile from the Vapor app I use for a different website).

My original intuition was that the thread mishandling could have been in my configure.swift file, in how I scheduled repeating work to create my RSS & JSON feeds, but I don’t think that was it. I think the static bundling is probably the thing that did the trick.

Honestly, the thread-handling issue seems highly unlikely to be the result of anything in your code. NIO was trying to create its threads very early in the lifetime of your process, and still is: at the point of creation of the Application object. This uses a singleton ELG, so you aren’t creating a hilarious number of threads: you’ll be creating one per core on your system. In the absolute worst case that would be a couple hundred, but it’s not likely you’re renting a DO VM that large.

Any change after Application.init() or Application.make() cannot influence this crash directly because those happen after we try to create all the threads, not before.

This is definitely extremely weird. If you’re motivated to help out and can produce a version of this code that consistently fails in a DO droplet, I’d love to have the commit hash, Docker image, and DO droplet configuration you’re using, so I can investigate further.