Subprocess I/O hanging on Linux

Hey folks, I've been testing some code that runs a subprocess and attempts to read stdout/stderr back into Swift asynchronously, but I'm getting hangs 100% of the time under Linux.

I've written up a bug report and example code here: Async Process IO hangs 100% of the time on Linux · Issue #5197 · swiftlang/swift-corelibs-foundation · GitHub

I see this space is pretty active, so I thought I'd open a discussion here too. Other people have seen similar issues[1]. In my case, though, I'm also seeing the hang on GitHub Actions, which I believe runs in a VM rather than in a container (at least for the raw execution steps), and I can also reproduce the issue in OrbStack, which that thread points to as a workaround.

I've created a relatively minimal reproduction, with code similar to what's in my project, available here: GitHub - danpalmer/swift-linux-issue: Reproducing an issue in Swift on Linux.

# Works as expected
$ swift run

# Hangs
$ docker run --rm -v $(pwd):/code -w /code swift:6.0 swift run

Are there any known issues or edge cases in the Linux run loop implementation, or socket implementation, that might be causing this to happen? Does anyone know of better/less error-prone ways to implement subprocess communication like this?

Thanks all!


[1] RunLoop.run hang in containerized Linux


Without looking too deeply: your reproduction is still fairly large. A much more minimal repro case would be more useful, one that clearly shows the problem lies in the libraries and isn't, say, some sort of deadlock that relies on unrelated platform behavior.

Thanks for taking a first look. I've done another pass on the reproduction and removed stderr to simplify it, since it isn't needed to show the issue. I think it's now nearly as simple as it can be while still demonstrating the problem. The framework doesn't provide async reading from processes in this way, so some amount of boilerplate is needed.

I've updated the reproduction with another version that uses an existing open source package, one that has been fairly battle-tested on macOS at this point. The basic implementation may be subtly incorrect, but so far it exhibits the same behaviour as the more complete one.

Your raw_impl case seems to work okay for me. Does that shift the blame to whatever this library is that you are using, rather than the standard libraries?

I've updated the GitHub issue with some more detail. In summary:

I think there are two issues here. The first was previously described in [SR-12080] readabilityHandler on pipe that is Process.standardError sometimes doesn't get called on EOF · Issue #3275 · swiftlang/swift-corelibs-foundation · GitHub, and it can be worked around (my basic reproduction is vulnerable to this one, the libraries aren't). The second is a separate issue where the main thread appears not to respond to SIGCHLD signals correctly (multiple libraries appear vulnerable to this one).
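For anyone following along, here is a minimal sketch of one way to avoid depending on the readabilityHandler EOF callback at all: do a blocking read to EOF on a background queue and bridge the result into async code. This is an illustration under assumptions, not the workaround from the linked issue or the code from my repository. The helper name captureStdout is hypothetical, and it deliberately does not wait for the child to exit or collect its exit status, so it says nothing about the second (SIGCHLD) issue.

```swift
import Foundation

/// Hypothetical helper for illustration only (not from the reproduction repo).
/// Runs an executable and returns everything it wrote to stdout.
func captureStdout(_ path: String, _ arguments: [String]) async throws -> Data {
    try await withCheckedThrowingContinuation { continuation in
        // Do the blocking read off the Swift concurrency thread pool.
        DispatchQueue.global().async {
            do {
                let process = Process()
                let pipe = Pipe()
                process.executableURL = URL(fileURLWithPath: path)
                process.arguments = arguments
                process.standardOutput = pipe
                try process.run()

                // Blocks until the child closes its end of the pipe (EOF),
                // so no readabilityHandler callback is involved.
                // Exit status collection is intentionally left out here.
                let data = try pipe.fileHandleForReading.readToEnd() ?? Data()
                continuation.resume(returning: data)
            } catch {
                continuation.resume(throwing: error)
            }
        }
    }
}
```

Usage would look something like `let data = try await captureStdout("/bin/echo", ["hello"])`. The trade-off is that output only becomes available once the child closes stdout, so this doesn't help if you need to stream output incrementally.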

I'm refocusing on the second issue, because the first is long-standing and has workarounds, while the second appears to be the main blocker for my application.

See the GitHub issue for links to an strace, new code sample, etc.


This still appears to be an issue with the latest Swift 6.2.1 container, as I’m experiencing something similar. On macOS everything runs fine, but in a Linux container (Debian 12 in my case) it hangs until killed.

In doing some additional digging, Cursor says that the issue is related to using .standardOutput and/or .standardError in the call to run, like so:

let result = try await run(
    executable,
    arguments: arguments,
    output: .standardOutput,
    error: .standardError
)

As a workaround, I have been able to avoid the hang by using an overload of run that takes a closure with streams for stdout and stderr, like so:

let result = try await run(executable, arguments: arguments) { _, _, stdout, stderr in
    try await withThrowingTaskGroup(of: Void.self) { group in
        group.addTask {
            for try await line in stdout.lines() {
                try FileHandle.standardOutput.write(contentsOf: Data(line.utf8))
            }
        }
        group.addTask {
            for try await line in stderr.lines() {
                try FileHandle.standardError.write(contentsOf: Data(line.utf8))
            }
        }
        try await group.waitForAll()
    }
}

It’s kind of hacky, and at times it seems that not all of the output from the executable is printed. However, it doesn’t hang (at least for me), so I’m able to proceed with my work until a formal solution is implemented. Hope this helps.