[Pitch] Swift Subprocess

There will be special handling for Windows path lookup. Subprocess.Environment also support raw bytes so you can pass in Data directly.

I agree with @FranzBusch that Subprocess doesn't belong to System because it's meant to be cross platform type (which is exactly what Foundation aims to provide). However, I can see ProcessIdentifier and Signal to be in System. I'll talk to the system folks to see if that makes sense.

2 Likes

Thanks for bringing this up and I'll def keep this in mind. The reason I didn't add the "escape hatch" for Linux is because there is a chance that we won't be able to use posix_spawn on Linux because it's relatively new to Glibc and may not have WASM support.

This is definitely a goal for this project. However, I'd also want to give users the ability to make platform specific, low level changes to how process is run (hence the "escape hatches"). This means we will have platform specific APIs (see PlatformOptions) and I think that's okay.

This is a great point. I'll work with the Windows team to come up with a more platform neutral name.

Unfortunately PATH searching has security implications and may not be acceptable in some applications. Subprocess.Executable.at() is designed to allow developers to say "use this path directly without attempting to search". One can argue Subprocess should only use FilePath and always treat is as "full path", but that seems less ergonomic for the scripting case (imagining you have to explicitly call .findExecutableInEnvironmentPath() when you just want to execute ls)

Subprocess.Executable is designed to be the middle ground between the two (somewhat opposite) use cases. Depending on your use case you either use .at() or .named().

2 Likes

This was in one of my earlier designs and I moved away from it for one reason: we need to perform cleanups. Spawning a process involves opening file descriptors and other resources that needs to be released once the subprocess finishes running. If Subprocess is returned to you directly, there won't be a way for Subprocess to close all those file descriptors after you are done streaming, and asking users to explicitly calling close seems like a big foot gun.

Because of this reason, if you want to stream (as in take an AsyncSequence) the output you must use the closure approach. This way Subprocess can perform cleanup after your closure returns.

The other solution is to use the non-closure approach, which will capture the standard output for you and close the file descriptor before returning.

You can stream the subprocess output with the "closure" approach:

let result = try await Subprocess.run(
    executing: .named("llm-tool"),
    arguments: ["--prompt", userPrompt]
) { subprocess in
    for try await streamingResponse in subprocess.standardOutput! {
         handleResponse(streamingResponse)
    }
    return ...
}

Why can't an implicit close be permitted when the Subprocess deinits?

Having an explicit terminate() function (or similar) is fine too, for cases where that's more convenient than letting the instance fall out of scope.

It's true that users might stash a Subprocess somewhere persistent unnecessarily, but that's true of anything. I don't think Subprocess needs to be particularly defensive about that. Most uses (for the API style I'm advocating) would be contained to a single function and so deinit-on-scope-termination is perfectly fine.

Yes; why not?

(I've never much liked thread / task / similar APIs that make you explicitly call some parameterless start() method - it seems pointless and forces the object to be stateful)

You mean if the Subprocess deinits because it naturally goes out of scope? Why wouldn't you just kill it then? The caller has indicated that they don't need it anymore.

If they do want it to live, they can await its exit. If they still care about it (e.g. it's a background upload via curl), they need to check its exit status anyway.

When the Subprocess goes away, or the caller explicitly indicates they should be closed (a common requirement for stdin to the subprocess, as many programs need to see EOF on stdin to conclude). Seems no different to the closure case, really?

5 Likes

Data would still need the backing created and represented right? This feels like you would end up with quite a few conversions between encoding spaces. Would it be possible to consider something which would wrap the type to avoid the constant conversions?

For two major reasons:

  1. deinit must succeed while resource cleanup can fail. For example FileDescriptor.close() throws, what should we do if it fails in deinit?
  2. In order for Subprocess to even have a deinit we would have to make it ~Copyable. While this seems like a natural fit for Subprocess, in my experiments ~Copyable are severely limited in what they can do today (for example, it can't be used as a primary constraint on a protocol, and you cannot pass them to a structured child task for now). I don't think its right time for Subprocess to adopt ~Copyable just yet.

IMO expecting developers to remember to call terminate() is a big foot gun and I want to eliminate this kind of design at least for general use cases (I made a exception for this for StandardInputWriter because I almost consider it an "escape hatch")

That would mean you can no longer "save" the configuration of a task somewhere and spawn it later. One might argue this is a rare use case but it's still a valid use case none the less (and supported by Process today). With the current design you would just call the static run method, which let you have the convenience of not needing to create an object if you don't want to.

Killing a process is a somewhat complicated and can definitely error out. It really isn't something we should be doing in deinit IMO.

2 Likes

Overall functionality seems ok. Though I never used any fancy features, so I don't even understand some of the mentioned stuff.

Tomorrow (or the day after) I will post some more thoughts. But it will be waaaaay longer, so it has to wait. Today I would propose some quality of life improvements.

Arguments init with String

In Python I can use shlex.split (docs, source) to split str into a list[str]:

# Real life example. This is a single line!
# I don't even know how to split this thing!
command = f"{FFMPEG} -n -loglevel quiet -hide_banner -headers \"User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3730.3 Safari/537.36?\r\nAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7?\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8?\r\nAccept-Encoding: gzip, deflate?\r\nAccept-Language: en-us,en;q=0.5?\r\n\"\" -i \"{video_url}\" -codec copy -f mp4 file:\"{path}\""
# Split it into a list.
args = shlex.split(command)
process = Popen(args)

In Swift we could name it Arguments(split: String).

let args = Arguments(split: "-n -loglevel quiet -hide_banner -i \"\(url)\" -codec copy -f mp4 file:"\(outputFile)")

You know how sometimes Swift compiler shows a 50 line swiftc invocation? It is not possible to split it by hand. With this init you can just paste it.

Combined stdout & stderr

Python has this nifty feature where you can combine stdout and stderr by setting stderr to a special subprocess.STDOUT value (docs, source):

If you wish to capture and combine both streams into one, use stdout=PIPE and stderr=STDOUT instead of capture_output.

For me this is a real use case. I have seen way too many programs using stderr for what should be stdout. Is there even any difference? <-- glibc

Log close errors

From the linked repository (I know that this is a proof of concept, but it is a nice base for discussions; btw. I removed a lot of code):

extension Subprocess {
  public struct Configuration: Sendable {
    public func run<R>(…) async throws -> Result<R> {
      @Sendable func cleanup() throws {
        // Clean up
        try process.executionInput.closeAll()
        try process.executionOutput.closeAll()
        try process.executionError.closeAll()
      }

      group.addTask {
        do {
          let result = try await body(process, .init(fileDescriptor: writeFd))
          try cleanup()
          return .workBody(result)
        } catch {
          try cleanup()
          throw error
        }
      }
    }
  }

  internal enum ExecutionInput {
    internal func closeAll() throws {
      switch self {
      case .noInput(let readFd):
        try readFd.close()
      case .customWrite(let readFd, let writeFd):
        try readFd.close()
        try writeFd.close()
      case .fileDescriptor(let fd):
        try fd.close()
      }
    }
  }
}

If try process.executionInput.closeAll() throws then executionOutput and executionError will not get closed? Will they silently leave dangling file descriptors?

There is a limit on how many file descriptors you can open. While closing error is not a big problem I would still log it somewhere and try to close the other ones. You may add closing errors to the Result:

var stdinClosingError: Error?
var stdoutClosingError: Error?
var stderrClosingError: Error?

It is not pretty, but the user can check them and log somewhere. If the process crashes then I would definitely be interested in them.

Name

Shortcut for setting the process name. True nonsense, I know…

Imagine having 20 processes named ffmpeg , and you have to 1 kill of them. The only way to find which one is to see which files have they open. This is a use case that I have, but I do not think it is popular, so feel free to ignore it.

3 Likes

The process name is derived from argv[0] on POSIX platforms, and I don't think this is as much a nonsense as you think.

There are real use case for having control of argv[0], and swift toolchain developers are probably aware of it, as they are heavily relying on argv[0] value in there driver (as did clang before them).

Being able to invoke for instance swift-frontend or swift-package with a specific process name without having to create a symlink first would be very helpful.

4 Likes

This is a great idea and I did consider this in one of my earlier explorations. I decided to defer this feature for future because it turned out to be a "hard problem" (and this proposal is pretty complex as it is) -- even shlex applies different rules on Windows vs bash, and as a cross-platform library ideally Foundation should abstract away these differences. I will add a section in Future Directions for this feature.

Another great suggestion! Let me think about this one. I think maybe we can add a new case to RedirectedOutputMethod.

Good catch! This is more or less a bug in the sample code. Subprocess is designed in the way that all errors, including close errors are thrown with .run(). You are actually forced to catch these errors because .run() throws.
(FYI this is why Subprocess collects the output into Data for the non-clousure runs -- we need to be able to collect the data, then clouse all file descriptors, before returning from .run())

Haha good news! Subprocess already supports this:

let ffmpeg = try await Subprocess.run(
    executing: .named("ffmpeg"),
    arguments: .init(["arg1"], executablePathOverride: "/path/to/new-ffmpeg")
)

Thie code above will send /path/to/new-ffmpeg as arg0 to the Subprocess.

3 Likes

I'm really happy about the provision of this library.

I've tried a bunch of other process APIs and found that this one didn't get in my way like all the others. It even allowed me to do some things that I've been unable to get to work with all the others—and these things are one-/two-liners in e.g. Python :sweat_smile:.

Overall the API felt pretty natural.

Some things I noted while using the PoC branch (not all necessarily API feedback, but thought I'd mention in case it is unsolvable with the proposed API).

Subprocess.sendSignal(_:) unilaterally sends signal to the process group

The implementation of this API calls kill(2) with -pid which indicates that the signal should be sent to all processes in the process group.

Specifically, if this API is called from a parent process running in Xcode, the underlying kill(2) call returns ESRCH. The same program has no such issues when run on the command line.

I'm not expert enough to comment with certainty, and this might have something to do with how Xcode spins up the process for debugging, but in any event, it made me question the default behaviour.

Especially since the Subprocess.PlatformOptions has both a createProcessGroup and a createSession option, it made me wonder if either:

  1. The implementation could use this information to determine whether to negate the PID for the kill(2) call; or
  2. The API could take a boolean parameter: sendSignal(_:toProcessGroup:).

Run unilaterally closes the FDs

Because .readFrom(_:) and .writeTo(_:) both take FileDescriptor I tab-completed my way into a footgun, by passing .standardInput, .standardOutput, and .standardOutput to input, output, and error respectively. This of course was silly for me to do, because they got closed after the process exited. What I needed to do was use try .standardInput.duplicate() etc.

It made me wonder whether it might be convenient to either:

  1. Make the closing optional, e.g. .readFrom(_:closeOnExit:); or
  2. Provide a convenience to hint folks: .readFrom(duplicating:); or
  3. Don't expose this API in terms of FileDescriptor but instead some wrapped variant that explicitly spells out the semantics of ownership of the FD.

Task gets blocked in read(2) even if process has exited

In my example I was calling a process that might print to standard output. Depending on how you interact with it, it might exit (cleanly) without printing anything.

In the case where the process does print something to standard output, everything works as expected—that is, the call to try await Subprocess.run(...) completes—but if the process does not print anything, then run(...) never returns.

It looks like it's stuck in a blocking read. I toyed around with some task cancellation handlers but it wasn't something I could get to work.

Cancellation conveniences

Maybe I'm holding it wrong, but if the task in which the call to run is cancelled, should we attempt to do any signalling of the process? IIUC this is not happening by default.

I realised that I can do this with the closure-based API and manually wrapping a cancellation handler, but I wondered if we wanted to expose some API for this, e.g. a parameter to run like signalOnTaskCancellation: Signal or some kind of graceful termination hook on task cancellation.

Asymmetric overrides for Subprocess.Arguments.init(_:executablePathOverride:)

The PoC contains two of these:

  1. public init(_ array: [String], executablePathOverride: String)
  2. public init(_ array: [Data], executablePathOverride: Data? = nil)

For me, because (1) and (2) differ in the optionality in the second parameter, I was getting some overload resolution annoyances when I tried to pass an array of Strings.

3 Likes

Possibly. I was imprecise in paraphrasing Michael's suggestion ("more like an actor"), but I don't even think that was intended as a definite proposal.

Indeed. However, the thing that is unclear to me is the extent to which process-launching interfaces actually diverge. Is it possible that the System implementation would essentially be identical across the major platforms?

I think we need to know that as part of figuring out where the line between System and Foundation is in this instance.

1 Like

Both are great suggestions. Thank you!

You are correct that Subprocess doesn't automatically cancel the child process, and I'm not sure if we should. I'll put this under future considerations since IMO it needs some additional thoughts to get it right and this proposal is big enough as it is.

Thanks for pointing this out. I will update.

I don't think we can leave this for future considerations. Every API needs to respect task cancellation otherwise your process might get stuck because we are using a Subprocess somewhere that just never ends. I would start with just sending a SIGKILL to the subprocess. This will make sure that we are shutting down the subprocess and the user code that might run in the closure will also get the cancellation.

In the future, we can then revisit this and see if we need the cancellation behaviour customisable; however, in my opinion this is something that the language should solve rather than every individual API. A concept like cancellation shields would allow users to opt-out of automatic cancellation propagation in parts of the application where they want to handle it differently.

13 Likes

I fully agree with your statement about task cancellation. Without it, this proposal seems incomplete to me.

Won't killing the subprocess only handle one level of subprocess? It seems to me subprocesses of the cancelled/killed subprocess will continue to run. To handle cancellation properly shouldn't the whole tree of subprocesses be cancelled/killed?

1 Like

I'm not sure if this is true and let me elaborate on why I wasn't sure if we should kill the child process:

When launching a subprocess, what's the "task" that the parent process runs? Conceptually, we'd like to think the "task" is the subprocess itself, which is a valid mental model. However, it's important to note a subtle nuance here: this API, along with its ancestor NSTask (aka Process), launches a process rather than thread. This child process does not share any memory space nor execution stack with the parent process, and it is therefore outside the scope and mandate of Swift's Structured Concurrency on principle. This means technically the cancellation of a task does not imply cancellation of a process.

Another (more practical) reason for my hesitation to cancel the child process is its implication on scripting -- Terminating the child process when the parent task ends means there would be no way to have a script spawn a long-running process and then exit before that process ends. While this concern may be deemed less relevant now that Swift supports top-level async, it remains a consideration.

With all that being said, I do agree with Franz that sending SIGKILL to the child process is the most reasonable approach for now. I'll update the proposal.

3 Likes