[Review] SF-0037: Subprocess 1.0

Hello community,

Review of SF-0037: Subprocess 1.0 begins now and runs through April 20th, 2026.

Since the public beta launched last year, Charles has iterated on the Subprocess API based on community feedback. This proposal captures the decisions he made and proposes for the 1.0 release. For this review, I'd like to focus specifically on these changes rather than revisiting the previous design discussions.

All review feedback should be either on this forum thread or, if you would like to keep your feedback private, directly to me as the review manager by email or DM. When contacting the review manager directly, please put "SF-0037" in the subject line.

Trying it out

Check out the package.

What goes into a review?

The goal of the review process is to improve the proposal under review
through constructive criticism and, eventually, determine the direction of
Foundation. When writing your review, here are some questions you might want to
answer:

  • Have you used the package? What did you do and how did it work?
  • If you have used other languages or libraries with a similar
    feature, how do you feel that this proposal compares to those?

More information about the Foundation review process is available at

swift-foundation/CONTRIBUTING.md at main · swiftlang/swift-foundation · GitHub

Thank you,

Tina L
Review Manager

8 Likes

A bit surprised by the lack of comments here so I'll add one: +1!

I'm certainly not an advanced user and since I primarily target macOS I can't speak to the cross-platform aspects of the pitch but compared to the old Process APIs this is awesome!!

4 Likes

-1. There are a number of other things I would like to discuss, but I'll focus on one of the most relevant issues:

Currently, the main API is a global run() function, which:

  • Flies straight in the face of Swift API design conventions. Spawning subprocesses is important but also a fairly niche activity. It's also something that should be very obvious in code, and calling just run(...) conveys the total opposite, making it less visible. Global, free functions should be reserved for very commonly used primitives where any syntax noise is a real issue, e.g. print(...).
  • Is annoying in practice: plenty of code has a func run(), and if you're in that scope, there's ambiguity between the global run and self.run.
  • Makes testing the code very awkward. Swift doesn't support conforming a module to a protocol, so it's unnecessarily hard to fake out code using swift-subprocess. Most people expect that they can put such a type behind a protocol and create a fake implementation; Subprocess as it stands makes this impossible.

The bottom line is that the suggested module takes huge API design liberties that are completely unfounded. It would improve both usability and readability if we just followed the standard pattern of putting this functionality into a type.


Separately from this, the issue tracker still has a number of open issues which point to underlying implementation problems.

The other discussions around latency vs. throughput of the streaming also make it feel like the streaming apparatus isn't quite as solid as it should be yet. The main reason I'm bringing this up is that the API contains preferredBufferSize, which is not something I want to see. This should be automatic and probably adaptive by default (see e.g. how SwiftNIO and AsyncProcess handle this: they have an adaptive buffer-size prediction mechanism that power users can fine-tune if need be).

14 Likes

This topic was discussed extensively during the original review. With respect, characterizing the design as flying "in the face of everything Swift API design", or appealing to "a decade of precedent," as another commenter put it in the original review thread, is not a technical argument; it is an appeal to convention. I understand this API design broke some of those conventions, but we gave technical reasons for why we made that conscious decision.

I also want to push back on the comparison to print(). As we covered in the original review thread, a free-standing function in a package is fundamentally different from one in the standard library. You do not get run() automatically. You must explicitly write import Subprocess, which clearly communicates intent. Moreover, you can already qualify the call as Subprocess.run() if you prefer. I recognize the general concern about namespace pollution, but no one has yet demonstrated a concrete case from the beta testing period where this design creates an actual ambiguity that cannot be trivially resolved.

I understand and agree with the concern about testability. However, I do not believe it is Subprocess's responsibility to solve that problem. Subprocess has a single, well-scoped job: providing the primitives needed to spawn child processes. It is the client's responsibility to wrap those primitives in whatever abstraction suits their testing strategy, such as a protocol or a wrapper type. This is no different from how consumers of FileManager routinely introduce their own protocol for testing purposes; we would not expect FileManager itself to be a protocol just to accommodate that pattern.
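A minimal sketch of the client-side wrapping described here: the app defines a small protocol around the one operation it needs, with a production conformance that would forward to Subprocess.run and a fake for tests. All names here (CommandRunning, FakeRunner, GitClient) are hypothetical, and the real API is async, which is elided for brevity.

```swift
// Client-owned abstraction; the Subprocess module never needs to know about it.
protocol CommandRunning {
    func run(_ executable: String, arguments: [String]) throws -> String
}

// In production this would forward to Subprocess.run(...) (omitted so the
// sketch stays self-contained).
// struct SubprocessRunner: CommandRunning { ... }

// Test double returning canned output per executable name.
struct FakeRunner: CommandRunning {
    var cannedOutput: [String: String]
    func run(_ executable: String, arguments: [String]) throws -> String {
        cannedOutput[executable] ?? ""
    }
}

// Code under test depends only on the protocol, never on the module.
struct GitClient {
    let runner: any CommandRunning
    func currentBranch() throws -> String {
        try runner.run("git", arguments: ["branch", "--show-current"])
    }
}
```

This mirrors the FileManager pattern mentioned above: the abstraction lives with the client and covers exactly the surface that client actually uses.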

"Unfounded" implies the design decisions were made arbitrarily, which is not the case. We provided clear, documented reasoning for each choice during the review. I remain open to reconsidering, but that requires a concrete technical argument against the reasoning we already laid out — not a restatement of the conclusion that the design does not fit convention.

5 Likes

I agree that we haven't found any example that creates ambiguity, but I would argue the reason for this is mostly that almost no package adds global methods. I understand and have read through all the arguments, but I think what we haven't considered is how this plays out a year or two from now. Subprocess is setting a precedent here, and we can anticipate other packages following it. While it's true that it requires an import to get access to the run method, many files have many imports. Especially consider the often-discussed scripting use case, where one might end up importing a significant amount of the ecosystem in a single file.

I can easily imagine a world where more packages adopt this global run pattern, and you end up in a file, type run, and get mixed code completion suggestions from the various imported modules. This will most likely not result in ambiguity, but it will result in a potentially subpar developer experience. I say potentially here because we haven't proven this yet, but I could imagine something along those lines happening.

5 Likes

It seems a bit of a stretch to call this a "global run pattern". It's the name of a specific API in this package, not a pattern for other packages to follow. A package that decided "I'm going to create a global function named run just because this other package does so" would be making a decision based on an improper rationale. We can't protect people from making bad decisions for their APIs.

I see a lot of talk of "global namespace 'pollution'" as a tautologically bad thing, but I think the better question is why the run function is being called "pollution" in the first place, and whether the negative connotations of that word are deserved.

Most of the common conflicts can be dealt with simply by fully-qualifying the name, which is what folks who don't like the global function are effectively suggesting by asking for it to be wrapped in some type. But you can do that yourself if that's your preferred style or if you wish to avoid issues in the future.

There are two tricky cases I can imagine: one is where you're importing Subprocess and calling unqualified run, and then you import a second package in the same file that also defines a global run. That becomes ambiguous. However, there's a trivial workaround for that that doesn't even require you to update all of your call sites:

import Subprocess
import func Subprocess.run  // <-- prioritize this for unqualified lookup
import OtherRunDeclaringModule

// many uses of unqualified run()

You're right that multiple global run functions wouldn't cause an ambiguity unless they happened to have the exact same signature as one from Subprocess, which I think is extremely unlikely given the types involved. So that just leaves the autocomplete concern. How many run functions in an autocomplete list do you think is the threshold for when we should consider it a subpar experience? How many do you think would actually occur in practice? And if a particular developer ended up in this situation, what's stopping them from preemptively writing Subprocess. to get what they want faster?

The arguments against run all seem to be based on hypothetical futures, and while it's important that we consider how the language might be used in the future, I don't think we should shy away from establishing certain shapes for core APIs just because a particular name might be appealing to others in the future. The escape hatches exist and don't cost much to use.

5 Likes

I'd like to add a few notes which I've been wondering about the current proposed API.

Firstly the choice of a global function. I think others have already discussed well why a function named run in the global namespace might not fare well in the long run (sorry!).

But what's worse from purely an API user's point of view is that there's not just a single run function: there are 16 overloads of it! I think this makes the API hard to navigate and compiler errors harder to understand.

Why do we need the closures to take the execution, stdin, stdout, and stderr as separate parameters, some of which are omitted when irrelevant? If the closure instead took just one (non-Copyable) argument, possibly generic over the stdin, stdout, and stderr types, we might only need one run overload instead of all the combinations.

Then there are the overloads which return an ExecutionRecord<Output, Error> vs. an ExecutionOutcome<Result>. The first one captures the stdout and stderr of the process, along with its process ID, and both structs contain a stored property terminationStatus. But isn't the first one then essentially just an ExecutionOutcome<(ProcessIdentifier, Output.OutputType, Error.OutputType)>? So we could in fact always return the same type from all run functions; the default would just capture all the outputs. I think ProcessTermination<Result> could be a better name for it.

Instead of a tuple, we might prefer a named struct for that default Result type though, if only so we can mark it as Sendable.
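A hypothetical sketch of the single result type suggested above. All names here (ProcessTermination, CollectedResult, TerminationStatus, ProcessIdentifier) are illustrative stand-ins, not the package's actual API.

```swift
// Minimal stand-ins for the package's status and PID types.
enum TerminationStatus: Equatable, Sendable {
    case exited(Int32)
    case unhandledException(Int32)
}

struct ProcessIdentifier: Equatable, Sendable {
    let value: Int32
}

// One generic type returned by every run overload; only Result varies.
struct ProcessTermination<Result: Sendable>: Sendable {
    let terminationStatus: TerminationStatus
    let result: Result
}

// The "capture everything" default payload as a named struct rather than a
// tuple, so the whole type can be marked Sendable.
struct CollectedResult<Output: Sendable, Failure: Sendable>: Sendable {
    let processIdentifier: ProcessIdentifier
    let standardOutput: Output
    let standardError: Failure
}
```

Under this shape, the default overloads would return ProcessTermination<CollectedResult<...>>, while closure-taking overloads would return ProcessTermination of whatever the closure produces.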

Which brings me to the next question: why do we even need Sendable in the API? Don't we have exclusive ownership of the values in question in all cases, meaning we could relax it and use sending and nonisolated(nonsending) in the interface instead?

I'm fine with the name run, but as an alternative, try await Subprocess(executable, arguments).run(...) wouldn't seem too bad to me, where Subprocess is of course what you currently call Configuration, with the run method made public and reasonable defaults for its input, output, error, and body parameters.

Lastly, I still think it would be very nice if Executable were ExpressibleByStringLiteral so that writing just "ls" or "/bin/ls" would do the right thing!
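A sketch of how that could work, using a stand-in Executable type (the real type and its factory method names are assumptions here): a literal containing a path separator is treated as a concrete path, and anything else as a command name to resolve against $PATH.

```swift
// Stand-in for the package's Executable type; name(_:) / path(_:) mirror
// the common "look up in $PATH" vs. "use this exact file" distinction.
struct Executable: Equatable {
    enum Storage: Equatable {
        case name(String)   // resolved against $PATH, e.g. "ls"
        case path(String)   // used verbatim, e.g. "/bin/ls"
    }
    let storage: Storage

    static func name(_ n: String) -> Executable { Executable(storage: .name(n)) }
    static func path(_ p: String) -> Executable { Executable(storage: .path(p)) }
}

extension Executable: ExpressibleByStringLiteral {
    init(stringLiteral value: String) {
        // A path separator implies a concrete location; otherwise treat the
        // literal as a command name.
        self = value.contains("/") ? .path(value) : .name(value)
    }
}

// Both literals now "do the right thing":
let byName: Executable = "ls"
let byPath: Executable = "/bin/ls"
```

One design wrinkle worth noting: the literal form silently encodes the name-vs-path decision, which is exactly the kind of ambiguity the explicit factory methods avoid.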

All that said, while I really appreciate how the library makes subprocess execution fairly ergonomic already, I think the design space has more to explore still before it's a good time for v1.0.

2 Likes

Copying my suggestion from the original thread.

Consider the following "indirect" design for the protocols:

enum InputMethod {
    case pipe((StandardInputWriter) async throws -> Void)
    case fileDescriptor(FileDescriptor, closeAfterSpawning: Bool)
    case devNull
}

public protocol InputProtocol: Sendable {
    var inputMethod: InputMethod { get }
}
enum OutputMethod<T: Sendable> {
    case collect(maxSize: Int, (RawSpan) throws -> T)
    case fileDescriptor(FileDescriptor, closeAfterSpawning: Bool)
    case stream
    case discard
}

public protocol OutputProtocol: Sendable {
    associatedtype OutputType: Sendable
    var outputMethod: OutputMethod<OutputType> { get }
}

The conformances:

struct NoInput: InputProtocol {
    var inputMethod: InputMethod { .devNull }
}
struct FileDescriptorInput: InputProtocol {
    let fd: FileDescriptor
    let closeAfterSpawning: Bool
    var inputMethod: InputMethod {
        .fileDescriptor(fd, closeAfterSpawning: closeAfterSpawning)
    }
}
struct StringInput: InputProtocol {
    let string: String
    var inputMethod: InputMethod {
        .pipe { writer in
            try await writer.write(string.utf8)
            try await writer.finish()
        }
    }
}
struct ArrayInput: InputProtocol {
    let array: [UInt8]
    var inputMethod: InputMethod {
        .pipe { writer in
            try await writer.write(array)
            try await writer.finish()
        }
    }
}
struct DiscardedOutput: OutputProtocol {
    typealias OutputType = Void
    var outputMethod: OutputMethod<Void> { .discard }
}
struct FileDescriptorOutput: OutputProtocol {
    let fd: FileDescriptor
    let closeAfterSpawning: Bool
    typealias OutputType = Void
    var outputMethod: OutputMethod<Void> {
        .fileDescriptor(fd, closeAfterSpawning: closeAfterSpawning)
    }
}
struct StringOutput: OutputProtocol {
    let limit: Int
    let encoding: any Encoding.Type
    typealias OutputType = String
    var outputMethod: OutputMethod<String> {
        .collect(maxSize: limit) { String(decoding: $0, as: encoding) }
    }
}
struct BytesOutput: OutputProtocol {
    let limit: Int
    typealias OutputType = [UInt8]
    var outputMethod: OutputMethod<[UInt8]> {
        .collect(maxSize: limit) { Array($0) }
    }
}
struct SequenceOutput: OutputProtocol {
    typealias OutputType = Void
    var outputMethod: OutputMethod<Void> { .stream }
}

Changes to the run method:

switch input.inputMethod {
case .devNull:
    // redirect to /dev/null
case .fileDescriptor(let fd, let close):
    // use fd directly
case .pipe(let writeFn):
    // create pipe, call writeFn(writer)
}

This makes me far more concerned about the free-function design. If I am trying to call a sibling run method defined in my own codebase, I have to find it among 16 other unrelated functions in the autocomplete menu? That seems egregious.

Given that the motivation for a top-level run function is scripting, I think we should be honest about the current state of Swift as a scripting language: it’s not great. As has been noted frequently on this forum, top level code is quirky and often buggy.

Rather than add these 16 overloads to the global namespace of any file that imports Subprocess, maybe a future effort to holistically improve the Swift scripting experience could re-expose the static Subprocess.run methods as free functions via a dedicated Scripting module. But for now, it seems unnecessary and incongruent with existing practice to expose the entry point to this functionality with free functions.

4 Likes

I agree that at this point it is all hypothetical, and I tried to make this clear in my previous reply. However, one thing that we need to do when designing new fundamental APIs is to look around the corner and anticipate potential problems down the road. The reason why I am personally very nervous about run in particular is the name itself. We have many run methods in the ecosystem (GRPCClient, GRPCServer, Hummingbird, ValkeyClient), and a package called ServiceLifecycle that created a Service protocol for essentially a run method. While those are all scoped to a type, most of them could have been written as global methods and used ClosureService to hook them into a ServiceGroup. I would be less nervous if we named the run method in Subprocess something like runSubprocess.

I just want to make it clear that I don't see this as a blocker personally, but I could see that this will set a precedent for more global run methods and that it might require tooling changes down the line to improve the developer experience when multiple imported modules have global run methods, such as changes to how code completion shows results from different modules.

The review period ends April 20th as planned. All feedback will be considered by the workgroup in our decision. Charles may choose to respond to specific technical points before then.

I'd also like to remind everyone here this review is focused on the changes made since the 0.1 beta. If you have concerns about aspects of the API that were established in 0.1, please clarify whether your feedback relates to something that has changed or could still reasonably change for 1.0. Constructive, specific feedback from those who have tried the package is especially appreciated.

1 Like

I understand and agree with the concern about testability. However, I do not believe it is Subprocess's responsibility to solve that problem.

A good library wants to be a good citizen.

Making testing hard and adding a three-letter global function with a non-specific name such as run() to the global namespace is, in my opinion, just not even trying to be a good citizen. Testability in particular should be on the mind of every library author.

Finally, breaking with precedent this thoroughly should come with very strong arguments for why it is necessary. I don't think it was clearly explained where the necessity to break API-design precedent comes from.

13 Likes

A global run() does not automatically make code untestable. Hard-coded concrete dependencies make code untestable. Whether that dependency is a function or type is secondary.

1 Like

Although I'm against run() as a global function, I have to agree here. You can easily make your Subprocess interaction testable, you just can't use a 1:1 protocol replacement. Personally, those sorts of "dependencies" are the worst way to abstract anything for testing. But even in that case, I also agree frameworks should provide testing hooks as public API. In Subprocess' case, it would be something underneath the run layer, above the OS interactions.
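To illustrate the point that a free function is still testable: a function is a value in Swift, so the dependency can be injected and swapped in tests without any protocol at all. The Deployer example and the runProcess signature below are hypothetical simplifications.

```swift
struct Deployer {
    // In production, pass the real function (e.g. a closure calling
    // Subprocess.run); in tests, pass a stub.
    var runProcess: (String, [String]) throws -> Int32

    func deploy() throws -> Bool {
        try runProcess("kubectl", ["apply", "-f", "deploy.yaml"]) == 0
    }
}

// In a test: record invocations and return a canned exit code.
final class Recorder {
    var calls: [(String, [String])] = []
}
let recorder = Recorder()
let deployer = Deployer(runProcess: { exe, args in
    recorder.calls.append((exe, args))
    return 0
})
```

Whether this closure-injection style is nicer than a protocol is a matter of taste, but it shows the global function itself is not what blocks testing.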

I just noticed another (I think major) problem with the global run(...) function. I have a module which (transitively) depends on a bunch of packages and I created a fresh file which has import Subprocess (and nothing else).

Then I created a new function and started to type run. Xcode then started to give me completions for all sorts of run functions, even ones that were in non-exported targets. The biggest issue, however, was that all the run(...) completions that actually came from Subprocess were unrecognisable. Nothing in them said the word "process" or "subprocess" or similar. They often have pretty generic-sounding signatures such as

run(Configuration, input: Input, error: Error, preferredBufferSize: Int?, isolation: (any Actor)?, body: (Execution, AsyncBufferSequence) -> Result)

Worst of all, there isn't just one that I'm supposed to find, there's a whole plethora of functions. I think this was also pointed out above:

If I am trying to call a sibling run method defined in my own codebase, I have to find it among 16 other unrelated functions in the autocomplete menu? That seems egregious.

I agree.

And yes, I know that I can type Subprocess.run which gives me better quality suggestions and actually makes it obvious to a reviewer what I'm actually doing here too. Good-quality auto-complete & clarity to a reviewer sound great and a type would provide these at all times.

You can easily make your Subprocess interaction testable, you just can't use a 1:1 protocol replacement.

Yes, this is of course correct. There are definitely other (and often better) ways of doing so. But it's a thing that you can do with a type that you can't do with a module. And it seems to be a fairly popular choice too. Why would we give up the option of doing this, especially if that happens to also give us other, large benefits such as discoverability (in auto-complete), reviewability, even grep-ability is something that I personally like.

3 Likes

I think it's an odd prioritization to treat code completion logic as some immutable thing that we're stuck with and have to design our APIs around. Since SourceKit is just a tool and it's part of the toolchain, certainly we can just improve it instead of engaging in code-completion-driven API design, right?

If completions aren't doing sufficient ranking and/or grouping of suggestions based on what you have defined in your own module vs. what you've directly imported vs. what is transitively visible through non-exported modules, then that is a very worthy problem that we should solve. It's not something specific to a function named run.

2 Likes

IMO a proposal that's severely limited without tooling improvements should then propose a plan to improve such tooling. Or at least engage in a discussion about short-term improvements/workarounds etc.

Tooling-driven API design makes 100% sense to me. If tooling deficiencies significantly limit an API, then tooling improvements should be designed and/or land first before considering such API.

For example, we know that operator overloads are super problematic for type inference. Can type inference be improved as a result? Maybe. Would an API that doesn't take this into account be successful in the ecosystem? Would many people use a library that takes ages to compile because its design puts additional (and avoidable via different design approach) load on the type checker? I don't expect so.

2 Likes

That's not quite the same thing. The problems that come from having a large number of operator overloads are fundamental limitations of what the language allows as valid code and how the type checker is designed, and that leads to code that takes too long to compile or fails to compile entirely. Poorly ranked code completions obviously provide a worse developer experience, but they don't cause compiler performance issues and they're significantly easier to address than type checker performance, so it's not the same class of problem.

We certainly do consider the example you gave when we review new APIs for the standard library. For example, look at how long it took count(where:) to get from proposal acceptance to actual implementation. So if someone has data that would show that the overload set of Subprocess.run somehow causes type checker performance issues, that would be something that should definitely be raised now.

I certainly won't disagree that it would be good to see more discussion of improving the IDE space here, and using proposals like this one to guide those improvements. That's a far more useful outcome, not just for this proposal, but for the language in general.

Compiler performance issues were provided only as an illustration of where tooling absolutely does impact API design. Code completion issues are not in the same class of problem, but they are still a significant limitation with usability and adoption impact.

1 Like

I think it's an odd prioritization to treat code completion logic as some immutable thing that we're stuck with and have to design our APIs around. Since SourceKit is just a tool and it's part of the toolchain, certainly we can just improve it instead of engaging in code-completion-driven API design, right?

Yes. But even if those completely irrelevant run functions didn't show up, I still would have no idea which one I should pick. Subprocess's run functions are all very similar and very opaque, frequently differing only a few parameters in.

By 'opaque' I mean that they all use very general, non-specific type names like Configuration, Input, Error, Int, Result, AsyncBufferSequence. Nothing in

run(Configuration, input: Input, error: Error, preferredBufferSize: Int?, isolation: (any Actor)?, body: (Execution, AsyncBufferSequence) -> Result)

tells me that we're launching a process and I do think that that's a problem.

This of course becomes worse if you import other modules. For example, in an XCTest test case, if I type run( and complete, the only suggestion I get is XCTest.run(), which happens to be available as self.run() in any XCTest test case and therefore also as plain run(), because Swift allows implicit self. :neutral_face: Yes, we can fix this, but all it means is that there's now another run function.


I also did a little bit of real-world testing and filed a few issues, some of them with API/usability impact: issues 245, 246, 247, 248, 249 and 250.

3 Likes