[Pitch] Subprocess 1.0

Hi all,

Last year, we released Subprocess as a public beta to gather API feedback. I'm happy to announce that the beta period has concluded, and we are now ready for the 1.0 release review. This pitch describes all changes to the Subprocess public API surface since the initial review.

You can find the full pitch here. The main branch for swift-subprocess has already been updated to reflect the API changes mentioned here.


Subprocess 1.0 Update

Introduction

We introduced Subprocess with SF-0007 and shipped swift-subprocess as a public beta in Spring 2025. Since then, we've received a considerable amount of feedback from the community and updated the Subprocess API to address it. This proposal covers all API changes made since SF-0007 and proposes them for inclusion in the Subprocess 1.0 API.

Rename ExecutionResult and CollectedResult

We propose renaming ExecutionResult to ExecutionOutcome to avoid confusion with Swift's Result type (since neither type is a Result). ExecutionOutcome indicates that we are presenting the outcome (termination status) of the child process. We also want to add the missing Sendable conformances to ExecutionOutcome, since the wrapped value already has to be Sendable.

Similarly, we propose renaming CollectedResult to ExecutionRecord. ExecutionRecord indicates that we are presenting the recorded data of the child process.

Remove .standardOutput and .standardError Properties from Execution

We propose moving .standardOutput and .standardError properties from Execution to the run() closure parameter alongside Execution:

// Before
public func run<Result>(
    ...
    body: (Execution) async throws -> Result
) async throws -> ExecutionOutcome<Result> { ... }

let result = try await run(...) { execution in
    for try await item in execution.standardOutput { ... }
}

// After
public func run<Result>(
    ...
    body: (Execution, StandardInputWriter, AsyncBufferSequence, AsyncBufferSequence) async throws -> Result
) async throws -> ExecutionOutcome<Result> { ... }

let result = try await run(...) { execution, standardInput, standardOutput, standardError in
    for try await item in standardOutput { ... }
}

This change eliminates the need for Atomic and AtomicBox within Execution, and more importantly, makes these two variables semantically closer to the correct mental model of how output streaming works.

In the original design, since .standardOutput and .standardError are properties on Execution, it creates the illusion that you can repeatedly call these properties and create different output streams:

let result = try await run(...) { execution in
    for try await item1 in execution.standardOutput { ... }

    // Example of undefined behavior
    for try await item2 in execution.standardOutput { ... }
}

However, this is not the case. Once you create an AsyncBufferSequence by calling execution.standardOutput, the returned sequence effectively "owns" the underlying OS pipe used to read data from. This means calling execution.standardOutput multiple times is undefined behavior, and we had to use an internal Atomic value to guard against it.

The new design eliminates this problem entirely by "promoting" the output AsyncBufferSequences to be sibling parameters of Execution. Since by definition you can't "get" a parameter passed to a closure multiple times, this eliminates the possibility of the aforementioned undefined behavior. This change also simplified Execution's design by making it non-generic.

The full list of closure-based run() overloads is shown below:

(Omitted due to character limit. Read the complete pitch here)

Introduce AsyncBufferSequence

One "side effect" of moving .standardOutput and .standardError to closure parameters is that we can no longer use some AsyncSequence as their type. Therefore, we propose exposing the previously internal AsyncBufferSequence as the concrete streaming type.

/// An asynchronous sequence of buffers used to stream output from subprocess.
public struct AsyncBufferSequence: AsyncSequence, Sendable {
    /// The failure type for the asynchronous sequence.
    public typealias Failure = any Swift.Error
    /// The element type for the asynchronous sequence.
    public typealias Element = Buffer

    /// Iterator for `AsyncBufferSequence`.
    public struct Iterator: AsyncIteratorProtocol {
        /// The element type for the iterator.
        public typealias Element = Buffer

        /// Retrieve the next buffer in the sequence, or `nil` if
        /// the sequence has ended.
        public mutating func next() async throws -> Buffer?
    }

    /// Creates an iterator for this asynchronous sequence.
    public func makeAsyncIterator() -> Iterator
}

@available(*, unavailable)
extension AsyncBufferSequence.Iterator: Sendable {}

Introduce preferredBufferSize Parameter

We propose adding a preferredBufferSize: Int? parameter to the run() overloads whose execution body closure allows standard output and/or error streaming. By default, Subprocess chooses the platform page size as the buffer size when creating AsyncBufferSequence for output streaming. This default buffer size, while striking a sensible balance between responsiveness and performance, might not be suitable for all use cases. In particular, when the child process output is sparse, Subprocess might appear stuck because it's waiting for the child process to write more bytes while the child process might be expecting more input. preferredBufferSize allows developers to choose the buffer size most suitable for their particular scenario.

public func run<Result, Input: InputProtocol, Error: ErrorOutputProtocol>(
    _ executable: Executable,
    arguments: Arguments = [],
    ...
    preferredBufferSize: Int? = nil,
    isolation: isolated (any Actor)? = #isolation,
    body: ((Execution, AsyncBufferSequence) async throws -> Result)
) async throws -> ExecutionOutcome<Result> where Error.OutputType == Void

Introduce AsyncBufferSequence.LineSequence

The original proposal only included a way to stream a list of Buffers. This makes streaming text difficult since naively converting each Buffer to String may not always succeed if the Buffer happens to break within a grapheme cluster. Since streaming text is one of the most common use cases for Subprocess, we propose introducing a new AsyncBufferSequence.LineSequence specifically designed to parse and partition an asynchronous sequence of buffers into text lines. Developers can optionally specify a String encoding and a BufferingPolicy to control how LineSequence handles the exhaustion of a buffer’s capacity.

extension AsyncBufferSequence {
    /// Line sequence parses and splits an asynchronous sequence of buffers into lines.
    /// The following list of Unicode characters are considered as paragraph separators (new lines):
    /// ```
    /// LF:	Line Feed, U+000A
    /// VT:	Vertical Tab, U+000B
    /// FF:	Form Feed, U+000C
    /// CR:	Carriage Return, U+000D
    /// CR+LF:	CR (U+000D) followed by LF (U+000A)
    /// NEL:	Next Line, U+0085
    /// LS:	Line Separator, U+2028
    /// PS:	Paragraph Separator, U+2029
    /// ```
    /// These newline characters are not included in the lines returned,
    /// similar to how `.split(separator:)` works.
    ///
    /// `LineSequence` is the preferred method to convert `Buffer` to `String`
    public struct LineSequence<Encoding: _UnicodeEncoding>: AsyncSequence, Sendable {
        /// The element type for the asynchronous sequence.
        public typealias Element = String

        /// The iterator for line sequence.
        public struct AsyncIterator: AsyncIteratorProtocol {
            /// The element type for this Iterator.
            public typealias Element = String

            /// Retrieves the next line, or returns nil if the sequence ends.
            public mutating func next() async throws -> String?
        }

        /// Creates an iterator for this line sequence.
        public func makeAsyncIterator() -> AsyncIterator
    }
}

@available(*, unavailable)
extension AsyncBufferSequence.LineSequence.AsyncIterator: Sendable {}

extension AsyncBufferSequence.LineSequence {
    /// A strategy that handles the exhaustion of a buffer’s capacity.
    public enum BufferingPolicy: Sendable {
        /// Continue to add to the buffer, without imposing a limit
        /// on the number of buffered elements (line length).
        case unbounded
        /// Impose a max buffer size (line length) limit.
        /// Subprocess **will throw an error** if the number of buffered
        /// elements (line length) exceeds the limit
        case maxLineLength(Int)
    }
}

extension AsyncBufferSequence {
    /// Creates a line sequence to iterate through this `AsyncBufferSequence` line by line with a default 128k max line length and UTF8 encoding
    public func lines() -> LineSequence<UTF8>

    /// Creates a line sequence to iterate through an `AsyncBufferSequence` line by line.
    /// - Parameters:
    ///   - encoding: The target encoding to encode Strings to
    ///   - bufferingPolicy: How should back-pressure be handled
    /// - Returns: A `LineSequence` to iterate through this `AsyncBufferSequence` line by line
    public func lines<Encoding: _UnicodeEncoding>(
        encoding: Encoding.Type,
        bufferingPolicy: LineSequence<Encoding>.BufferingPolicy = .maxLineLength(128 * 1024)
    ) -> LineSequence<Encoding>
}

LineSequence is created by calling .lines() on AsyncBufferSequence.

// Monitor Nginx log via `tail -f`
async let monitorResult = try await Subprocess.run(
    .path("/usr/bin/tail"),
    arguments: ["-f", "/path/to/nginx.log"]
) { execution, standardOutput in
    for try await line in standardOutput.lines() {
        // Parse the log text line by line
        if line.contains("500") {
            // Oh no, 500 error
        }
    }
}
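To illustrate why converting each buffer to a String on its own is fragile, here is a minimal, self-contained sketch that does not use Subprocess at all. The `chunks` array and the `splitLines` helper are hypothetical stand-ins for the buffers a pipe read might deliver, and the helper only splits on LF, unlike the full separator list above.

```swift
// Two chunks, as a pipe read might deliver them: the 2-byte UTF-8
// encoding of "é" (0xC3 0xA9) is split across the chunk boundary.
let chunks: [[UInt8]] = [
    [0x68, 0xC3],                    // "h" + first byte of "é"
    [0xA9, 0x6C, 0x6C, 0x6F, 0x0A],  // second byte of "é" + "llo\n"
]

// Naive approach: decode each chunk independently. The incomplete
// sequences are repaired with U+FFFD replacement characters.
let naive = chunks.map { String(decoding: $0, as: UTF8.self) }

// Line-oriented approach (hypothetical helper): accumulate bytes
// across chunks and decode only once a full line is available.
func splitLines(_ chunks: [[UInt8]]) -> [String] {
    var pending: [UInt8] = []
    var lines: [String] = []
    for chunk in chunks {
        for byte in chunk {
            if byte == 0x0A {
                lines.append(String(decoding: pending, as: UTF8.self))
                pending.removeAll(keepingCapacity: true)
            } else {
                pending.append(byte)
            }
        }
    }
    // Flush a trailing line that has no terminating newline.
    if !pending.isEmpty {
        lines.append(String(decoding: pending, as: UTF8.self))
    }
    return lines
}

let lines = splitLines(chunks)
```

In this sketch the naive conversion corrupts the text at the chunk boundary, while the line-oriented version recovers the original line intact.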

Introduce Environment.Key

Environment keys have different case sensitivity requirements on different platforms. For example, keys are case-insensitive on Windows and case-sensitive on other platforms. We propose replacing raw String environment keys with a dedicated Environment.Key type. Environment.Key is designed to correctly respect each platform's case sensitivity requirements; it is also ExpressibleByStringLiteral for easy initialization.

extension Environment {
    /// A key used to access values in an ``Environment``.
    ///
    /// This type respects the compiled platform's case sensitivity requirements.
    public struct Key: Codable, Hashable, ExpressibleByStringLiteral, Sendable {
        public var rawValue: String
    }
}

extension Environment.Key: CodingKeyRepresentable, Comparable, RawRepresentable, CustomStringConvertible { }
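As a rough illustration of the case-insensitive behavior described for Windows, the sketch below defines a hypothetical `CaseInsensitiveKey` type. It is not the actual Environment.Key implementation; it only shows one way a key can compare and hash without regard to case while keeping the original spelling.

```swift
// Hypothetical stand-in for a case-insensitive environment key.
struct CaseInsensitiveKey: Hashable, ExpressibleByStringLiteral {
    // The spelling the key was created with is preserved.
    var rawValue: String

    init(_ rawValue: String) { self.rawValue = rawValue }
    init(stringLiteral value: String) { self.rawValue = value }

    // Equality ignores case...
    static func == (lhs: Self, rhs: Self) -> Bool {
        lhs.rawValue.lowercased() == rhs.rawValue.lowercased()
    }

    // ...and hashing must agree with equality.
    func hash(into hasher: inout Hasher) {
        hasher.combine(rawValue.lowercased())
    }
}

let env: [CaseInsensitiveKey: String] = ["Path": "C:\\Windows"]
let value = env["PATH"]  // found despite the different casing
```

A case-sensitive platform would instead compare and hash `rawValue` directly, so "Path" and "PATH" would be distinct keys.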

Introduce CombinedErrorOutput and ErrorOutputProtocol

Merging standard output and standard error into one stream — like shell redirection 2>&1 — is a common use case for Subprocess. We propose introducing a new concrete CombinedErrorOutput type that merges the standard error and standard output streams.

The original design uses one protocol, OutputProtocol, to define the child process's standard output and standard error behavior. This worked because up until now, all concrete output types could be used for either output or error. CombinedErrorOutput, as its name implies, can only be used with standard error to combine it with standard output. Consequently, we expanded the OutputProtocol hierarchy by introducing a new ErrorOutputProtocol. ErrorOutputProtocol conforms to OutputProtocol and introduces no new requirements. Only CombinedErrorOutput conforms to ErrorOutputProtocol.

/// Error output protocol specifies the set of methods that a type must implement to
/// serve as the error output target for a subprocess.
///
/// Instead of developing custom implementations of `ErrorOutputProtocol`, use the
/// default implementations provided by the `Subprocess` library to specify the
/// output handling requirements.
public protocol ErrorOutputProtocol: OutputProtocol {}

/// A concrete error output type for subprocesses that combines the standard error
/// output with the standard output stream.
///
/// When `CombinedErrorOutput` is used as the error output for a subprocess, both
/// standard output and standard error from the child process are merged into a
/// single output stream. This is equivalent to using shell redirection like `2>&1`.
///
/// This output type is useful when you want to capture or redirect both output
/// streams together, making it possible to process all subprocess output as a unified
/// stream rather than handling standard output and standard error separately.
public struct CombinedErrorOutput: ErrorOutputProtocol {
    public typealias OutputType = Void
}

extension ErrorOutputProtocol where Self == CombinedErrorOutput {
    /// Creates an error output that combines standard error with standard output.
    ///
    /// When using `combinedWithOutput`, both standard output and standard error from
    /// the child process are merged into a single output stream. This is equivalent
    /// to using shell redirection like `2>&1`.
    ///
    /// This is useful when you want to capture or redirect both output streams
    /// together, making it possible to process all subprocess output as a unified
    /// stream rather than handling standard output and standard error separately
    ///
    /// - Returns: A `CombinedErrorOutput` instance that merges standard error
    ///   with standard output.
    public static var combinedWithOutput: Self
}

You can use CombinedErrorOutput like this:

let result = try await run(
    .path("/bin/sh"),
    arguments: ["-c", "echo Hello Stdout; echo Hello Stderr 1>&2"],
    output: .string(limit: 1024),
    error: .combinedWithOutput
)

result.standardOutput will contain Hello Stdout\nHello Stderr\n.
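For comparison, the same `2>&1` effect can be sketched with Foundation's `Process` by pointing both streams at one `Pipe`. This is a different API from Subprocess and is shown only to illustrate the merged stream; it assumes `/bin/sh` exists, so it applies to Unix-like systems.

```swift
import Foundation

let process = Process()
process.executableURL = URL(fileURLWithPath: "/bin/sh")
process.arguments = ["-c", "echo Hello Stdout; echo Hello Stderr 1>&2"]

// Route both standard output and standard error into the same pipe.
let pipe = Pipe()
process.standardOutput = pipe
process.standardError = pipe

try process.run()
// Read until the child closes its end of the pipe, then reap it.
let data = pipe.fileHandleForReading.readDataToEndOfFile()
process.waitUntilExit()

let combined = String(decoding: data, as: UTF8.self)
```

Because both `echo` commands run sequentially in the same shell and write to the same pipe, the two lines arrive in the order they were written.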

Remove runDetached API

runDetached() was initially pitched as an "escape hatch" for spawning processes synchronously on systems where concurrency might not be available. Consequently, runDetached() doesn't perform any async IO or async process state monitoring; instead, it acts as a convenient wrapper around posix_spawn and simply returns the child process ID to the caller.

While this design works conceptually, in practice we found that it's impossible to safely vend this API due to PID reuse. Specifically, on Windows a PID does NOT have the concept of wait() and reaping — the PID can be reused as soon as the process terminates. This creates a TOCTOU race condition: the PID may not be valid by the time runDetached() returns. Rather than designing an elaborate workaround for these race conditions, we elected to simply remove the runDetached API since it was never a core part of Subprocess.

Expand Platform-Specific ProcessIdentifier on Windows and Linux

To address the potential TOCTOU issue with PIDs described above, we propose exposing platform-specific process file descriptors via ProcessIdentifier on Windows and Linux:

// For Linux, Android, and FreeBSD
public struct ProcessIdentifier: Sendable, Hashable {
    /// The platform specific process identifier value
    public let value: pid_t

    #if os(Linux) || os(Android) || os(FreeBSD)
    /// The process file descriptor for the running execution. For example, pidfd on Linux
    public let processDescriptor: CInt
    #endif
}

// For Windows
public struct ProcessIdentifier: Sendable, Hashable {
    /// Windows specific process identifier value
    public let value: DWORD
    /// Process handle for current execution.
    ///
    /// `HANDLE` is imported as `UnsafeMutableRawPointer`, which is not
    /// `Sendable`. However, a Windows `HANDLE` is an opaque kernel object
    /// identifier; it is never dereferenced as a pointer in user space.
    /// Copying the value across threads is equivalent to copying an integer,
    /// and the kernel serializes access to the underlying object. Because
    /// this is an immutable `let`, there is no data race on the value itself,
    /// making `nonisolated(unsafe)` safe here.
    public nonisolated(unsafe) let processDescriptor: HANDLE
    /// Main thread handle for current execution.
    ///
    /// `HANDLE` is imported as `UnsafeMutableRawPointer`, which is not
    /// `Sendable`. However, a Windows `HANDLE` is an opaque kernel object
    /// identifier; it is never dereferenced as a pointer in user space.
    /// Copying the value across threads is equivalent to copying an integer,
    /// and the kernel serializes access to the underlying object. Because
    /// this is an immutable `let`, there is no data race on the value itself,
    /// making `nonisolated(unsafe)` safe here.
    public nonisolated(unsafe) let threadHandle: HANDLE
}

According to Linux documentation:

Even if the child has already terminated by the time of the pidfd_open() call, its PID will not have been recycled and the returned file descriptor will refer to the resulting zombie process.

We recommend using this property instead of the raw PID value due to its safety guarantees.

Expand FileDescriptorOutput

We propose expanding FileDescriptorOutput with two additional static properties, .standardOutput and .standardError, that redirect the child process's output to the parent process's standard output or standard error. This is useful when you want to follow along with the process output rather than capturing it.

extension OutputProtocol where Self == FileDescriptorOutput {
    /// Create a Subprocess output that writes output to the standard output of
    /// the current process.
    ///
    /// The file descriptor isn't closed afterwards.
    public static var standardOutput: Self

    /// Create a Subprocess output that writes output to the standard error of
    /// the current process.
    ///
    /// The file descriptor isn't closed afterwards.
    public static var standardError: Self
}

Redesign TerminationStatus on Windows

The original TerminationStatus included two cases: .exited() and .unhandledException(). While these two cases make sense on Unix systems — where wait(2) returns a packed bitfield that distinguishes normal exits from unhandled signals — they do not translate well to Windows. Windows's GetExitCodeProcess() returns a single DWORD value, making it impossible to reliably distinguish between a normal exit code and an unhandled exception code.

We propose two changes to TerminationStatus:

  1. Remove .unhandledException() on Windows, since TerminationStatus cannot reliably determine whether the exit code represents a normal exit or an unhandled exception.
  2. Rename .unhandledException() to .signaled() on Unix systems, since the underlying mechanism is signal delivery, not exception handling.

/// An exit status of a subprocess.
public enum TerminationStatus: Sendable, Hashable {
    #if os(Windows)
    /// The type of the status code.
    public typealias Code = DWORD
    #else
    /// The type of the status code.
    public typealias Code = CInt
    #endif

    /// The subprocess exited with the given code.
    case exited(Code)

    #if !os(Windows)
    /// The subprocess was terminated by the given signal.
    case signaled(Code)
    #endif

    /// Whether the current TerminationStatus is successful.
    public var isSuccess: Bool
}
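The "packed bitfield" returned by wait(2) can be sketched as follows. This assumes the encoding commonly used on Linux and macOS (low 7 bits carry the terminating signal, the next byte carries the exit code); `SketchStatus` and `decode(waitStatus:)` are hypothetical names, and real code should rely on the platform's WIFEXITED, WEXITSTATUS, and WTERMSIG macros rather than hand-rolled masks.

```swift
// Hypothetical mirror of the two Unix cases, for illustration only.
enum SketchStatus: Equatable {
    case exited(Int32)
    case signaled(Int32)
}

// Decode a raw wait status, assuming the common Linux/macOS layout.
// Returns nil for stopped/continued statuses, which are not terminations.
func decode(waitStatus: Int32) -> SketchStatus? {
    let low = waitStatus & 0x7f
    if low == 0 {
        // Normal exit: the exit code lives in bits 8...15.
        return .exited((waitStatus >> 8) & 0xff)
    }
    if low != 0x7f {
        // Terminated by a signal: the low 7 bits hold the signal number.
        return .signaled(low)
    }
    return nil
}

let normalExit = decode(waitStatus: 1 << 8)   // exit code 1
let killed = decode(waitStatus: 9)            // signal 9 (SIGKILL)
let stopped = decode(waitStatus: 0x137f)      // stopped, not terminated
```

A single DWORD exit code, as on Windows, carries no such tag bits, which is why the two cases cannot be told apart there.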

Drop Swift 6.1 Support

Subprocess was designed from the start to use Span as the performant currency type for file IO. At the same time, we wanted to support Swift 6.1 when we launched the public beta so more developers could try it out. This resulted in some shims and workarounds for Swift 6.1 when Span was not available.

As we prepare for the 1.0 release, we want to remove these workarounds from the official API since Swift 6.2 has been available for more than a year now. Our plan is to drop Swift 6.1 support on main and future releases while tagging a "final version" of Subprocess that supports Swift 6.1 for developers that need it.

This change removes the SubprocessSpan trait and the following workarounds:

public protocol OutputProtocol {
    ...
-    /// Convert the output from buffer to expected output type
-    func output(from buffer: some Sequence<UInt8>) throws(SubprocessError) -> OutputType
}

Error Overhaul

SubprocessError in the original proposal has two shortcomings:

  1. SubprocessError.Code was an opaque Int value. Developers had to "remember" what different numeric values represent.
  2. Subprocess didn't formalize how errors are thrown or how they should be handled, leaving developers to figure it out on their own.

We propose a new design for SubprocessError to address these issues and also provide guidance on how errors should be handled:

/// Error thrown from Subprocess. `SubprocessError` may wrap an
/// `underlyingError` to represent what caused this error
public struct SubprocessError: Swift.Error, Sendable, Hashable {
    #if os(Windows)
    public typealias UnderlyingError = WindowsError
    #else
    public typealias UnderlyingError = Errno
    #endif

    /// The error code of this error
    public let code: SubprocessError.Code
    /// The underlying error that caused this error
    public let underlyingError: UnderlyingError?
}

extension SubprocessError {
    /// A SubprocessError Code
    public struct Code: Hashable, Sendable { }
}

extension SubprocessError.Code {
    /// Error code indicating process spawning failed
    public static var spawnFailed: Self
    /// Error code indicating target executable is not found
    public static var executableNotFound: Self
    /// Error code indicating working directory is not valid or subprocess
    /// failed to change working directory when spawning child process
    public static var failedToChangeWorkingDirectory: Self
    /// Error code indicating subprocess has failed to monitor the exit status of child process.
    public static var failedToMonitorProcess: Self

    /// Error code indicating subprocess failed to read data from the child process
    public static var failedToReadFromSubprocess: Self
    /// Error code indicating subprocess failed to write data to the child process
    public static var failedToWriteToSubprocess: Self
    /// Error code indicating child process output has exceeded the set limit
    public static var outputLimitExceeded: Self
    /// Error code indicating platform specific AsyncIO failed
    public static var asyncIOFailed: Self

    /// Error code indicating subprocess failed to control the child process such as
    /// sending signal and terminating process
    public static var processControlFailed: Self
}

#if os(Windows)
extension SubprocessError {
    /// An error that represents a Windows error code returned by `GetLastError`
    public struct WindowsError: Error, RawRepresentable, Hashable {
        public let rawValue: DWORD

        public init(rawValue: DWORD)
    }
}
#endif

In the new design, we exposed static properties on SubprocessError.Code to represent different error codes. Developers can now check their error code against this list instead of relying on an Int.
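The "struct with static properties" pattern can be sketched in isolation like this. `SketchError`, its two codes, and `describe(_:)` are hypothetical and only mirror the shape of the proposed API; the private integer is an assumed storage detail, not something the pitch specifies.

```swift
// Hypothetical error type mirroring the shape of the proposed design.
struct SketchError: Error, Hashable {
    struct Code: Hashable, Sendable {
        // Storage is hidden, so callers can only use the named codes.
        private let value: Int
        private init(_ value: Int) { self.value = value }

        static var spawnFailed: Self { .init(0) }
        static var executableNotFound: Self { .init(1) }
    }

    let code: Code
}

struct MyError: Error {}

func describe(_ error: any Error) -> String {
    guard let sketchError = error as? SketchError else {
        return "error thrown by the caller's closure"
    }
    // Codes are matched by name. A `default` is required because
    // `Code` is a struct, not an enum, so the switch is not exhaustive.
    switch sketchError.code {
    case .spawnFailed: return "spawn failed"
    case .executableNotFound: return "executable not found"
    default: return "other subprocess error"
    }
}
```

One consequence of this shape is that new codes can be added later without breaking existing `switch` statements, since callers already have to provide a `default` branch.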

We also formalized Subprocess's error throwing behavior: Subprocess now only throws SubprocessError internally, since most internal functions now use typed throws. The only exception is that developers can throw any Error from within the execution body closure or .preSpawnProcessConfigurator. With this newly defined behavior, we recommend writing Subprocess error handling code as follows:

do {
    let result = try await run(...) { execution in
        // Developers could throw any error from this closure
        throw MyError()
        ...
        throw MyOtherError()
    }
} catch let subprocessError as SubprocessError {
    // Handle errors thrown from within Subprocess itself.
    // These errors usually indicate some issue with the environment
    // or a bug within Subprocess itself.
    switch subprocessError.code {
    case .spawnFailed:
    ...
    }
} catch let myError as MyError {
    // Handle custom errors thrown from the closure
} catch let myOtherError as MyOtherError {
    // Handle custom errors thrown from the closure
}

17 Likes

ExecutionOutcome sounds a little funny to my ear while ExecutionResult sounds exactly right. I understand the concern. To resolve this, does it make sense to express execution results in the form of a Swift Result?

1 Like

To be honest, I agree with you that I don't really like the name ExecutionOutcome. Unfortunately, we can't represent ExecutionOutcome as a Result because it contains more information such as the termination status and the return value of your closure beyond a simple success or failure.

With that said, I’m definitely all ears if anyone has any other suggestions!

1 Like

To be honest, I don't think Swift.Result has enough significance in the language to warrant it usurping an entire set of good possible names that might want to use the very general term "result" as a suffix.

If you look at Result's usage within the core libraries, it's not that much:

  • No other standard library APIs use it, it only exists to support itself there
  • _Concurrency has a few APIs on continuation types that use it, but only as a means to translate them to throwing async patterns
  • From a quick scan, all of Foundation only has something like 7 references to Result

Indeed, I would take the opposite stance: having a name that shares the word "result" doesn't have to imply that it's exactly a "union of either successful outcome or failure status" as Swift.Result is, but it's helpful to relate that this is a very similar concept—it's a high-level combination of "outputs" and "status".

14 Likes

What benefits does this explicit type provide compared to using a tuple?

To reiterate the request in "[Feature] null-delimited iteration" (Issue #212, swiftlang/swift-subprocess on GitHub): tools often use null-delimited records rather than lines. This avoids entire categories of parsing problems, and it’s a much simpler implementation (than tracking the many meanings of “line”). It would be wonderful as a convenience, and immediately useful, e.g., for parsing git history.

Would you please consider including that feature?

1 Like

I agree with you. The more I think about it, the more I prefer the original ExecutionResult over ExecutionOutcome, because Result, as a generic term, is exactly what it is. Outcome does seem a bit forced.

What about CollectedResult? Does the community like CollectedResult more than ExecutionRecord?

5 Likes

Both "result" types are generic, so we can't use a tuple. Also, we want these types to conform to CustomStringConvertible and friends, which also can't be done on a tuple.

In general, tuples are for transient, local, ad-hoc groupings in short-lived scopes, and I don't think either ExecutionOutcome or ExecutionRecord fits within this category.

2 Likes

@wes1, thanks for the request! This is a great idea and we do want to support splitting by null bytes. This feature was originally planned for post-1.0 (we already have a few post-1.0 features lined up), but I can take a look and see if I can sneak it in before the actual review period so it can be included in the 1.0 tag.

2 Likes

I find ExecutionRecord quite nice — it describes the type well (slightly better than CollectedResult in my opinion) and doesn’t seem awkward or forced.

1 Like

I could see one, which is the second†? Simplifying:

enum TerminationStatus {
    case exited(CInt)
}
struct ExecutionOutcome<Result> {
    let terminationStatus: TerminationStatus
    let value: Result
}
func run<Result>(body: (() -> Result)) -> ExecutionOutcome<Result> { ... }

† Edit: probably you meant "ExecutionRecord" – I didn't realise that's a "result" type (the name didn't help).

I think we could, for example:

func run<T>(body: () -> T) -> (result: T, status: TerminationStatus) {
    (body(), .exited(42))
}

The natural follow-up is – why do we want those types to conform to CustomStringConvertible and friends?

As for tuples vs structs: Swift / Apple APIs do use tuples in similar spots – addingReportingOverflow, Mirror.children, this particular one looks strikingly similar (simplifying):

func URLSession.data(for request: URLRequest) throws -> (Data, URLResponse)

– so I'm curious what makes this case different. To me the dedicated struct adds ceremony without an obvious benefit, but I might be missing context.

Overall, the whole thing feels a bit over-engineered to me, but maybe that's just me.


PS. I am talking about ExecutionOutcome type here. The more beefy ExecutionRecord makes total sense.

1 Like

Having been down this road before with my own experiments, I get the need for the type - and I struggled with a reasonable name. I have a slight preference for CollectedResult.

1 Like

Ah, good point!

This is a great example, and IMO it's in the same category as Set's insert:

@discardableResult
mutating func insert(_ newMember: Element) -> (inserted: Bool, memberAfterInsert: Element)

Both these APIs are designed with the intent for you to destruct the fields, and in most cases you only care about some of these fields.

let (resultData, _) = try await URLSession.shared.data(...)

var set: Set = [1]
let (inserted, _) = set.insert(2)

I don't think ExecutionOutcome quite fits this description, even though it currently only has two fields.

The other (and IMO the most important) reason to use a struct over a simple tuple is extensibility. Even though ExecutionOutcome only has two fields now, it might grow in the future, and we might add methods to it. In fact, there is already a requested feature that will add more fields to ExecutionOutcome. A tuple is a structural type, so we can't add fields to it without breaking source compatibility, and we can't add methods to it at all; a struct is a nominal type, so we can do both.

3 Likes

Good reasoning, thanks.

What concrete types are currently stored (or expected to be stored) in ExecutionOutcome.Result? Is it likely that the vast majority of use cases (say, 99%) will simply use a String, or is that assumption off base?

1 Like

I wouldn't say this assumption is "off base", but I don't think there should be any assumptions about what the type of Result would be. Developers could decide to return some String, return their custom type, or return a simple Bool to indicate whether the operation was successful. There isn't a single "right" or even "recommended" pattern for writing the execution closure so it could be anything.

2 Likes

Thanks for running another pitch with the updated API. I just read through your changes since the initial review and played around with the package again. Overall, the latest API looks great, but I had a few comments and questions while looking over it again.

Rename ExecutionResult and CollectedResult

I agree with the feedback from @allevato that using "result" here is okay and does not lead to confusion with Swift.Result.

Adopt NonisolatedNonsendingByDefault and remove isolation parameters

The latest API contains a few methods that still have an isolation parameter. We should adopt NonisolatedNonsendingByDefault instead and drop all the isolation parameters in favor of this new language feature. This will also make sure that the various body closures are properly nonisolated(nonsending).

AsyncBufferSequence

I am a bit torn on this since the AsyncSequence uses Buffer as its Element which is a Copyable type that either stores DispatchData or [UInt8] internally. It feels like ideally the async sequence would just return Span<UInt8> or RawSpans but the AsyncSequence protocol doesn't allow this right now. We ran into the same problems in our new HTTP APIs and introduced a new AsyncStreaming module that introduces new AsyncReader and AsyncWriter types that allow the usage of spans.

I understand that the Subprocess work predates most of this, and we want to ship a 1.0.0 here. I am wondering what our forward evolution story is going to be if either AsyncSequence starts to support ~Copyable types and elements or we get new reader/writer types. Are we going to end up with even more run overloads since we won't be able to evolve the AsyncBufferSequence to become ~Copyable and have other elements than Buffer?

AsyncBufferSequence.LineSequence

I was wondering whether this belongs in swift-async-algorithms instead of swift-subprocess. This is really a generally useful transform of an async sequence of bytes.

Expand Platform-Specific ProcessIdentifier on Windows and Linux

Should public let processDescriptor: CInt use a FileDescriptor type instead of CInt here?
Should we introduce a Windows.Handle type in swift-system so we can avoid using HANDLE in the public API?

Expand FileDescriptorOutput

I find that this run(.name("ls"), output: .standardOutput) reads a bit confusing, and it isn't entirely clear that .standardOutput means the current process’s standard output. Should we give this a more descriptive name such as .currentStandardOutput instead?

StandardInputWriter actor

While this isn't mentioned in the changes, I reviewed the public API on the main branch and saw that the StandardInputWriter type is an actor. This seems potentially problematic since actors come with additional overhead. Furthermore, it makes the writer Sendable, which means we tolerate concurrent writes. That is generally speaking problematic, and I normally recommend that writers are either Sendable and ~Copyable or ~Sendable and Copyable.
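One way to read the "Sendable and ~Copyable" recommendation is sketched below. ExclusiveWriter and demo() are hypothetical names used only for illustration; this is not how Subprocess's StandardInputWriter is implemented. The point is that a noncopyable value can be handed to another task but never duplicated, so concurrent writes are ruled out at compile time without an actor.

```swift
// Hypothetical writer, for illustration only; not the Subprocess API.
// Sendable lets the value move to another task, while ~Copyable guarantees
// there is only ever one owner, so two tasks cannot write concurrently.
struct ExclusiveWriter: ~Copyable, Sendable {
    private var buffer: [UInt8] = []

    init() {}

    // Requires exclusive (mutable) access to the single owner.
    mutating func write(_ bytes: [UInt8]) {
        buffer.append(contentsOf: bytes)
    }

    // Consumes the writer and returns everything that was written.
    consuming func finish() -> [UInt8] {
        buffer
    }
}

func demo() -> [UInt8] {
    var writer = ExclusiveWriter()
    writer.write(Array("hello ".utf8))
    writer.write(Array("world".utf8))
    // `let other = writer` would move the writer, not duplicate it.
    return writer.finish()
}
```

The usage is wrapped in a function because noncopyable global variables cannot be consumed.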

Input and Output protocols

I know that a lot of different approaches for modeling the various inputs and outputs were tried since the initial prototypes. I would love to understand why we ended up with the input and output protocols now? It looks like we don't really expect users to conform their types to those protocols and are rather using them to provide the nice dot-shorthand-syntax when specifying the input. Furthermore, a few of the provided types are actually fatalErroring in their implementation since they get special treatment in the implementation.

Have we explored using a concrete type for these various inputs instead? Something like this:

// API sketch (declarations only, bodies omitted)
public struct SubprocessInput {
    public static var none: Self
    public static func fileDescriptor(
        _ fd: FileDescriptor,
        closeAfterSpawningProcess: Bool
    ) -> Self
    public static var standardInput: Self
    public static func array(
        _ array: [UInt8]
    ) -> Self
    // More ...
}

This should produce the same API ergonomics unless I am missing something and avoids exposing the protocol.
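To illustrate the ergonomics claim, here is a minimal, self-contained sketch. Everything in it is hypothetical: the Storage enum, the summary property, and describeRun(input:) are stand-ins invented for the example, and none of this is the actual Subprocess API. It only shows that static members on a concrete struct give the same dot-shorthand at the call site as the protocol-based design.

```swift
// Hypothetical concrete input type; the names are illustrative and are
// not the actual Subprocess API.
struct SubprocessInput {
    private enum Storage {
        case none
        case standardInput
        case bytes([UInt8])
    }

    private let storage: Storage

    private init(_ storage: Storage) {
        self.storage = storage
    }

    static var none: Self { Self(.none) }
    static var standardInput: Self { Self(.standardInput) }
    static func array(_ array: [UInt8]) -> Self { Self(.bytes(array)) }

    // Human-readable description of the chosen input.
    var summary: String {
        switch storage {
        case .none: return "no input"
        case .standardInput: return "inherit standard input"
        case .bytes(let bytes): return "\(bytes.count) bytes"
        }
    }
}

// Hypothetical stand-in for run(); it only reports which input was chosen.
func describeRun(input: SubprocessInput = .none) -> String {
    "input: \(input.summary)"
}

// Call sites use the same leading-dot syntax as with the protocols.
print(describeRun())
print(describeRun(input: .standardInput))
print(describeRun(input: .array([1, 2, 3])))
```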


I agree. Also note that Foundation on Apple platforms already has this functionality as AsyncSequence.lines, but as far as I can tell this never made it to other platforms.

(The documentation for AsyncSequence.lines falsely claims that it's part of the Swift module, but I just confirmed that you need to import Foundation to make it available on Apple platforms. And it's not available at all in a Swift 6.3 Linux container.)

AsyncSequence.lines swallows blank lines?

Sidenote: I discovered in my quick test that the AsyncSequence.lines implementation in Foundation seems to swallow blank lines, i.e. multiple consecutive 0x0A (LF) bytes are treated as a single newline. That's not the behavior I expected, and it isn't the behavior I'd expect from a corresponding Subprocess API.

Here's my test program. It feeds UTF-8 bytes into an AsyncStream and then iterates over stream.lines and prints the lines back out:

import Foundation

@main
struct AsyncLineSequenceDemo {
    static func main() async {
        let inputText = """
            import Observation
            
            @MainActor
            @Observable
            final class ViewModel {
                var value: Int = 0
            }
            """

        let (stream, sink) = AsyncStream.makeStream(of: UInt8.self)
        for byte in inputText.utf8 {
            sink.yield(byte)
        }
        sink.finish()

        var lineCounter = 1
        for await line in stream.lines {
            print("\(lineCounter): \(line)")
            lineCounter += 1
        }
    }
}

This is the output when run on macOS 26.4 with Swift 6.3 (from Xcode 26.4). Note the missing blank line at line 2 compared to the input text.

1: import Observation
2: @MainActor
3: @Observable
4: final class ViewModel {
5:     var value: Int = 0
6: }
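For comparison, here is a minimal sketch of the behavior described above as expected: splitting on LF while keeping empty lines. splitLinesKeepingBlanks is a hypothetical helper written for this illustration; it is neither Foundation's implementation nor Subprocess's LineSequence, and it only handles LF as a separator.

```swift
/// Splits UTF-8 bytes on LF (0x0A) and keeps empty lines.
/// Hypothetical helper for illustration; not part of Foundation or Subprocess.
func splitLinesKeepingBlanks(_ bytes: some Sequence<UInt8>) -> [String] {
    var lines: [String] = []
    var current: [UInt8] = []
    for byte in bytes {
        if byte == 0x0A {
            // Every LF ends a line, even if the line is empty.
            lines.append(String(decoding: current, as: UTF8.self))
            current.removeAll(keepingCapacity: true)
        } else {
            current.append(byte)
        }
    }
    // Emit a trailing line that is not terminated by LF.
    if !current.isEmpty {
        lines.append(String(decoding: current, as: UTF8.self))
    }
    return lines
}

let lines = splitLinesKeepingBlanks("import Observation\n\n@MainActor".utf8)
// lines == ["import Observation", "", "@MainActor"]
for (index, line) in lines.enumerated() {
    print("\(index + 1): \(line)")
}
```

With this splitter, the blank line shows up as an empty second element instead of being dropped.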

Ah good suggestion thank you!

Thank you for pointing out the "difficult" situation subprocess is currently in. I would love to adopt the new streaming API, but we really could not wait longer for the 1.0 tag since we are already way past the original planned date.

My plan for adopting the new streaming API is to introduce new stream() functions (peers to the run() functions) that specifically use the new AsyncStreaming mechanism. We can then decide in the future when, or whether, to deprecate the original closure-based run() methods in favor of the stream() ones.

I agree in general. However, for performance reasons, LineSequence needs to operate on Subprocess's own Buffer type, so it can't be moved to swift-async-algorithms. I think this is something we should consider when we move to the new AsyncStreaming API.

I don't think so because ProcessIdentifier itself is already a "wrapper" type on the raw value. In addition, processDescriptor isn't really a "file descriptor" (you don't read from it). The reason we exposed this value is for you to pass it to other system functions. It seems silly to wrap it in a FileDescriptor just to unwrap it every time you use it.

Perhaps, but that's outside of scope for this proposal. I also think it's "silly" to wrap them in a type when the expected usage is for you to pass the raw value to other system functions.

Ahh good point! Will do.

Sounds good.

This is such a core design for Subprocess that I don't think it's appropriate to change it now, since the original proposal has already been accepted. This round of pitch and review is really for the new additions, not for going back and changing the accepted design.

This part is no longer true. We do support custom Input and Output types. You can conform to those protocols just like the built-in ones.

Could you file a feedback about this? As you pointed out, AsyncSequence.lines is a Foundation (not SwiftFoundation) concept.

Meanwhile I've added tests to make sure Subprocess' version behaves as expected.


I'm sorry, but I won’t file a feedback with Apple. I have decided not to engage with Apple's bug reporting process because I find it quite hostile to external developers.

Thank you!
