[Review, 3rd] SF-0007: Subprocess

LiarPrincess · March 5, 2025, 11:05pm

1 day past the review deadline, but I'm not getting paid for writing this anyway (as in: I using my own free time), so I do not need to stress about it.

Other packages

Do you remember what was wrong with the other packages? It would be nice if the proposal included sentences like:

There is a potential problem X which this proposal solves by doing Y [and the user is required to do Z].

Sometimes I go back to the proposals months after they were accepted, mostly when I start using the proposed feature. Having a reminder why a certain decision was made, without going to the forum, would be helpful.

`posix_spawnattr_t etc.`

public struct PlatformOptions: Sendable, Hashable {
  public var preSpawnProcessConfigurator: (
      @Sendable (
          inout posix_spawnattr_t?,
          inout posix_spawn_file_actions_t?
      ) throws -> Void
  )? = nil
}

I agree with others that exposing the posix_spawnattr_t/posix_spawn_file_actions_t may not be the best idea.

For years I have been using Hexagonal architecture (“Ports and adapters” thingie): you do not allow dependencies into the “core” business logic, so for every dependency you create an abstraction layer.

Subprocess lives at the edge of the system, so if we wanted to follow the “Hexagonal architecture” we would wrap the posix_spawnattr_t/posix_spawn_file_actions_t. This way we can change the Subprocess internals without changing the business logic.

That said posix_spawnattr_t/posix_spawn_file_actions_t are fairly big and difficult to wrap.

`waitid(P_ALL)`

In Linux implementation we have:

var siginfo = siginfo_t()
guard waitid(P_ALL, id_t(0), &siginfo, WEXITED) == 0 else {
    return
}

waitid documentation says: If idtype is P_ALL, waitid() shall wait for any children and id is ignored.

How this will work if I already use another Subprocess like solution? What is the migration path? For example swift-corelibs-foundation/Process has:

  repeat {
#if CYGWIN
      waitResult = waitpid( process.processIdentifier, exitCodePtrWrapper, 0)
#else
      waitResult = waitpid( process.processIdentifier, &exitCode, 0)
#endif
  } while ( (waitResult == -1) && (errno == EINTR) )

Which wait[p]id will trigger for the process started with Foundation.Process? The Subprocess -> waitid or Foundation.Process->waitpid?

Does anybody know? I have never tested this. I always avoided global solutions and P_ALL seems like one.

Pid reuse

I mentioned pid reuse long time ago, it seems like we kind of lost it at some point. The current implementation:

extension Execution {
  private func isAlive() -> Bool {
    return kill(self.processIdentifier.value, 0) == 0
  }
}

I'm not sure that is is enough. Personally I would store Bool in a property after the monitorProcessTermination finishes. If termination sequence was delayed (or somehow run again) it would do nothing, even in there is a new process with the same pid.

In theory this is still a race condition, but even the current implementation has a race condition: process exits after isAlive() but before we send signal. IIRC this is not a problem on Linux because pids are reused “going up”, so we would have to cycle all of the pids to get back at the one we had.

Async IO

In theory on Linux you can use a single epoll for everything.

When reading from the pipe (I simplified things, to make it more readable):

Create continuation
epoll determines that the file is readable -> resume continuation
Read

When writing:

Create continuation
epoll determines that the file is writable -> resume continuation
Write
Repeat until all bytes are written

Note that epoll will return “writable” as long as the pipe is not full, for example:

we want to write 4 bytes
pipe buffer is 16 bytes
pipe already contains 15 bytes
epoll determines it as writable - we write 1 byte, and try again with the remaining 3 bytes.

With non-blocking IO we can do a preemptive read/write. In the worst case nothing will happen. We (obviously) split our data into a PIPE_BUF sized chunks.

We can add another file from the subprocess thread, because epoll_wait documentation states:

While one thread is blocked in a call to epoll_wait(), it is
possible for another thread to add a file descriptor to the
waited-upon epoll instance. If the new file descriptor becomes
ready, it will cause the epoll_wait() call to unblock.

This is where things get more interesting. As far as I know you can use pidfile in epoll. This replaces the “wait” part of waitpid:

epoll waits for the pidfile to know when the child exists
then waitpid to reap the process

This puts us in a fairy () interesting place, as in a way Subprocess becomes an empty shell:

waitpid is mostly done by epoll
async reading/writing is also done by epoll

This archives the goal of being 100% Swift concurrency neutral - we do not block global concurrent executor. Also, we only need 1 epoll regardless of the number of child processes, so the whole solution scales in the same way as epoll.

In the pitch thread (1 year ago) I mentioned that this is one of the strategies used by Python. Their documentation states:

Child watcher implementation using Linux's pid file descriptors.

This child watcher polls process file descriptors (pidfds) to await child
process termination. In some respects, PidfdChildWatcher is a "Goldilocks"
child watcher implementation. It doesn't require signals or threads, doesn't
interfere with any processes launched outside the event loop, and scales
linearly with the number of subprocesses launched by the event loop. The
main disadvantage is that pidfds are specific to Linux, and only work on
recent (5.3+) kernels.

This is for epoll, I don't know about the kqueue or Windows. Those OS will use their own solutions. I have never used io_uring.

It would be nice if Swift had some abstraction over different poll implementations, similar to Rust/tokio-rs/mio.

Read all

If I wanted to capture the whole output how would I do that? Probably StringOutput(limit: Int, encoding: Encoding.Type), but what limit should I use?

DispatchIO.read(offset:length:queue:ioHandler:) says SIZE_MAX, but DispatchIO is an implementation detail.

Btw. shouldn't “read all” be the default? Otherwise we run into a risk of:

During development/testing I use small file - this works
In production I suddenly receive a big file - this will read only the first 128kb

This can be prevented by carefully reading the docs, but there are a lot of docs to read.

SubprocessFoundation

Can't this be replaced by #if canImport(Foundation)? Then we would have only 1 module.

Documentation

If we look at the very 1st proposed API:

/// Run a executable with given parameters and a custom closure
/// to manage the running subprocess' lifetime and its IOs.
public func run<…>(
  _ executable: Executable,
  arguments: Arguments = [],
  environment: Environment = .inherit,
  workingDirectory: FilePath? = nil,
  platformOptions: PlatformOptions = PlatformOptions(),
  input: Input = .none,
  output: Output = .string,
  error: Error = .discarded
) async throws -> CollectedResult<Output, Error>

Where is the “custom closure”? This method has the same documentation as: body: (@escaping (Execution<Output, Error>) async throws -> Result).

There were so many changes to the proposal, that I think it needs an editor - somebody who would proof read the whole thing (just like in newspapers).

Complicated input

I feel like IO became quite complicated, for input we have (this is a copy from the proposal to show of the volume of the new API):

public final class Pipe: Sendable {
  public init() {}
}

public protocol InputProtocol: Sendable {
  func readFileDescriptor() throws -> FileDescriptor?
  func writeFileDescriptor() throws -> FileDescriptor?
  func closeReadFileDescriptor() throws
  func closeWriteFileDescriptor() throws
  func write(into writeFileDescriptor: FileDescriptor) async throws
}

public protocol ManagedInputProtocol: InputProtocol {
  var pipe: Pipe { get }
}

public final class NoInput: InputProtocol { }
public final class FileDescriptorInput: InputProtocol { }
public final class StringInput<InputString, Encoding>: ManagedInputProtocol { }
public final class ArrayInput: ManagedInputProtocol { }

/// A concrete `Input` type for subprocess that indicates that
/// the Subprocess should read its input from `StandardInputWriter`.
public struct CustomWriteInput: ManagedInputProtocol { }

extension InputProtocol where Self == NoInput {
  public static var none: Self { get }
}

extension InputProtocol where Self == FileDescriptorInput {
  public static func fileDescriptor(_ fd: FileDescriptor, closeAfterSpawningProcess: Bool) -> Self
}

extension InputProtocol {
  public static func array(_ array: [UInt8]) -> Self where Self == ArrayInput
  public static func string<InputString>(_ string: InputString) -> Self where Self == StringInput<InputString, UTF8>
  public static func string<InputString, Encoding>(_ string: InputString, using encoding: Encoding.Type) -> Self where Self == StringInput<InputString, Encoding>
}

// SubprocessFoundation

public final class DataInput: ManagedInputProtocol { }

public struct DataSequenceInput<InputSequence>: ManagedInputProtocol where InputSequence.Element == Data { }
public struct DataAsyncSequenceInput<InputSequence>: ManagedInputProtocol where InputSequence.Element == Data { }

extension InputProtocol {
  public static func data(_ data: Data) -> Self where Self == DataInput
  public static func sequence<InputSequence>(_ sequence: InputSequence) -> Self where Self == DataSequenceInput<InputSequence>
  public static func sequence<InputSequence>(_ asyncSequence: InputSequence) -> Self where Self == DataAsyncSequenceInput<InputSequence>
}

(That was only the input part!)

Why Encoding is a generic argument to StringInput? Can't it be a private property? In which scenario(s) does it make a difference? Do we actually get anything out of it?

Pipe has only a init. This is weird.

Pipe has the same name as Foundation.Pipe, so in 99% of the cases users would have to write it as Subprocess.Pipe.

Just by looking at the proposal the ManagedInputProtocol makes no sense. We do not gain anything from conforming to it. The whole ManagedOutputProtocol seems under-described in the proposal. The code makes it more clear:

extension ManagedInputProtocol {
  public func readFileDescriptor() throws -> FileDescriptor? {
    return try self.pipe.readFileDescriptor(creatingIfNeeded: true)
  }

  public func writeFileDescriptor() throws -> FileDescriptor? {
    return try self.pipe.writeFileDescriptor(creatingIfNeeded: true)
  }

  public func closeReadFileDescriptor() throws {
    try self.pipe.closeReadFileDescriptor()
  }

  public func closeWriteFileDescriptor() throws {
    try self.pipe.closeWriteFileDescriptor()
  }
}

Why NoInput, FileDescriptorInput, StringInput, ArrayInput are final class? They should be struct.

I do not like “Custom” in class/function names as it does not tell anything to the user. What is so “custom” about CustomWriteInput? Also, the documentation is a 1 sentence that contains “input/that/subprocess” 2 times each:

/// A concrete `Input` type for subprocess that indicates that
/// the Subprocess should read its input from `StandardInputWriter`.
public struct CustomWriteInput: ManagedInputProtocol { }

Btw. So many symbols! And the output is largely the same which makes the whole IO huge.

Can you generate docs for this module? Over the years I discovered that there is correlation between the size of the documentation and the amount of people who actually read it.

I think that the proposal overestimates how much time people spend reading the docs, and it would be hugely beneficial if we removed some things.

NoInput

Why do we need NoInput? Can't we use nil? This is the default, so most of the time users will never write .none. NoInput just clutters the namespace.

// Before
_ = try await run(
  .name("cat"),
  input: .none // This is the default, but I made it explicit
)

// After
_ = try await run(
  .name("cat"),
  input: nil // Still the default
)

Changes:

Make input argument optional with nil as default.
Make NoInput private
Use let realInput = inputFromArgument ?? NoInput()

Remove ManagedInputProtocol (controversial one)

In proposal we have:

public final class Pipe: Sendable {
  public init() {}
}

/// `ManagedInputProtocol` is managed by `Subprocess` and
/// utilizes its `Pipe` type to facilitate input writing.
/// Developers have the option to implement custom
/// input types by conforming to `ManagedInputProtocol`
/// and implementing the `write(into:)` method.
public protocol ManagedInputProtocol: InputProtocol {
  var pipe: Pipe { get }
}

extension ManagedInputProtocol {
  public func readFileDescriptor() throws -> FileDescriptor? {
    return try self.pipe.readFileDescriptor(creatingIfNeeded true)
  }
  public func writeFileDescriptor() throws -> FileDescriptor? {
    return try self.pipe.writeFileDescriptor(creatingIfNeeded true)
  }
  public func closeReadFileDescriptor() throws {
     try self.pipe.closeReadFileDescriptor()
  }
  public func closeWriteFileDescriptor() throws {
    try self.pipe.closeWriteFileDescriptor()
  }
}

Pipe is empty and ManagedInputProtocol is a convenience mixin that acts as a proxy.

ManagedInputProtocol can be confused with InputProtocol, at least I was confused when reading the proposal.
ManagedInputProtocol.XXX just calls pipe.XXX.
In total ManagedInputProtocol saves users from writing exactly 4 lines of code.

This protocol is not pulling its weight.

Changes:

Make those 4 methods on Pipe public.
Make ManagedInputProtocol internal for StringInput and friends.

After those changes InputProtocol is the only InputProtocol we export -> no confusion here. I think it is cleaner and maps to a simpler mental model.

Complicated output

Same as input, but slightly “noisier”. In the proposal we have:

@available(macOS 9999, *) // Span equivelent
public protocol ManagedOutputProtocol: OutputProtocol {
  var pipe: Pipe { get }
  func output(from span: RawSpan) throws -> OutputType
  /// The max amount of data to collect for this output.
  var maxSize: Int { get }
}

@available(macOS 9999, *) // Span equivelent
public struct StringOutput<Encoding: Unicode.Encoding>: ManagedOutputProtocol { }

StringOutput has to export public let maxSize. I think that maxSize should only be present in the initializer and stored in a private property. Is there any scenario where it would be useful later?

maxSize does not have a unit. The standard naming in Swift for a size/length is count, for example Array.count. Maybe [max]byteCount would be better?

Btw2. the naming is inconsistent, we have maxSize, but in OutputProtocol we have limit.

@available(macOS 9999, *) // Span equivelent
extension OutputProtocol where Self == BytesOutput {
  public static func bytes(limit: Int) -> Self
}

Btw3. I'm not sure what is the convention here, but usually method names are verbs/actions, so OutputProtocol.bytes is slightly weird. In defense of the proposal there is something like that already in Swift - probably in async URLSession or AsyncSequence.

Same naming remark for OutputProtocol - I added comments in the code below:

public protocol OutputProtocol: Sendable {
  // This should be 'create[Read/Write]FileDescriptor'.
  // 'readFileDescriptor' suggests 'FileDescriptor.read' not creating it.
  func readFileDescriptor() throws -> FileDescriptor?
  func writeFileDescriptor() throws -> FileDescriptor?

  // Those are good.
  func consumeReadFileDescriptor() -> FileDescriptor?
  func closeReadFileDescriptor() throws
  func closeWriteFileDescriptor() throws
  func captureOutput() async throws -> OutputType
}

Btw4. in which case I would have a readFileDescriptor without the writeFileDescriptor? We pass no write handle to the child, but we still expect to read something. I have not thought about this, but the API allows me to do this, so I'm wondering what would be the meaning of that. Are all 4 ((read, write)/(read, nil)/(nil, write)/(nil, nil)) combinations supported? If some combination is not supported then the code should not compile.