[Review, 3rd] SF-0007: Subprocess

johannesweiss · February 26, 2025, 4:25pm

You'd open the pidfd after fork but before execve and you know there's no race because you have a guarantee that you're the only pid.

Yes, alas, posix_spawn has a bunch of weaknesses that are hard to overcome (another one is getting ENOENT doesn't tell you if it's the working dir or the executable that was absent :|).

ksluder · February 26, 2025, 4:30pm

Importantly, this presumes the child creates the pidfd, which probably means the parent has to create a close-on-exec Unix domain socket for the child to send the pidfd back to the parent.

Karl · February 26, 2025, 4:40pm

If you would like to discuss objective design qualities (and limitations, surely!), then I have a simple challenge:

Give some examples of how function name collisions in the global scope are significantly worse than any other name collisions in the global scope.

It may seem like something to take for granted, right? Oh, of course global functions are bad! But I'm going to push back on that notion even harder - I think that's one of many out-of-date programming dogmas established by developers limited to 1980s tooling. In C, every function was in the global scope - rather than string.count, you'd have a global function for literally everything, with awful names such as strlen.

That kind of thinking does not apply nearly as strongly to modern programming languages, where we are able to establish hierarchies in select places of the API. Types can be global, or they can be nested to form a hierarchy. We don't have this same kind of allergic reaction to un-nested types. We don't call un-nested types "pollution of the global namespace".

Moreover, there are global functions which we use all the time in Swift without issue, such as print. Who has ever had a problem with print? Can you imagine how tedious it would be if we went with a hierarchical solution there? e.g Console.log(...)?

There are other things, too, such as the Task { ... } initialiser with its side effects and discardable result, which effectively serve as global functions. For some reason that doesn't trigger the haters, but arguably it should irk them even more.

Use of the global scope is not the same as "pollution" of the global scope. In the C era, you could reasonably call things like strlen pollution. I would not call print or Task pollution.

Ultimately I think if you are arguing that a large part of the language (global functions) are effectively unusable, the onus in a good faith debate should be on you, to back up that assertion. You state it as though it were fact, but I do not agree with it and its flaws do not seem so drastic to me.

Since type names are also in the global scope, we can use them as an example. If I declare a type named URL, it is given preference in its own module even in files where I also import Foundation. If I want a Foundation URL, I need to qualify it with Foundation.URL. I tested it, and exactly the same applies to functions in the global scope.

In third-party modules, files which do not import my library do not see my URL type. If they import both my library and Foundation, and they use URL, then they must qualify whether they want MyLib.URL or Foundation.URL. Again, I tested it, and precisely the same applies to functions in the global scope.

When I actually examine the evidence, the fear of global functions certainly looks overblown. That is not to say that they should be used everywhere, of course - as with anything, they should be used tastefully, but they do appear more difficult to disambiguate than anything else at the global scope -- and there is plenty of stuff in the global scope that nobody ever complains about!

Max_Desiatov · February 26, 2025, 4:48pm

I regularly have problems with print, especially thanks to the fact that implicit self is allowed. Good luck figuring out what went wrong when a print method is defined on a type, but you intend to call global print from a different method of the same type.

BTW, this is not a problem with JavaScript since you must call methods through this.. There this.fetch and fetch are unambiguously different, which is not the case for Swift.

I don't think plainly dismissing regular complaints from different users is productive for the discussion.

Quite the opposite, since the default design approach so far has been to avoid global functions. Swift standard library was riddled with free functions before protocol extensions. Since then the overall direction was to reduce the number of those and to namespace as much as possible. Remember when map and filter were free global functions? Stuff like print is one of the few free functions that remained for whatever reason. Expanding that number without addressing naming collision and discoverability problems would not help at all.

It's weird to argue that reversing the direction is now somehow the default assumption, and the onus is on those sticking to the established design approach.

Karl · February 26, 2025, 4:56pm

Certainly collisions can happen, as they can with types.

I have regular issues with type name collisions (SwiftUI's Section is particularly annoying), which have also been difficult for other developers to debug ("huh? Section isn't compatible with Section?"). This can be particularly annoying when the initialiser signatures are compatible, as the wrong type gets propagated far from where it was created.

OK, man, I tried to engage in a good faith debate with you, explained all of my arguments in detail and told you exactly what you'd need to change my mind. But you're going to start with that kind of stuff, and I don't have the time to play these kind of games.

davedelong · February 26, 2025, 5:07pm

I don't want global free functions because I can't discover them with autocomplete. With my cursor on a blank line, hitting escape brings up an autocomplete suggestion of thousands of things that provide no contextual help about what I should do. The only way to get the tools to be helpful in this situation is that I must put the module name first: Subprocess. <escape> to get a filtered list of relevant things.

So if that's going to be essentially required anyway, we should just go with that and remove the ambiguity and potential collisions with self-less method calls.

allevato · February 26, 2025, 5:21pm

Certainly the autocomplete concern applies to types in a module as well, no? If your starting point is "I want to know what APIs this module provides" with no other context, you still must either view the entire list from an unqualified cursor position, or type Subprocess.. To say that autocomplete would be easier if the method was attached to the type is to presume prior knowledge of the name of that type, which isn't an equal comparison.

Max_Desiatov · February 26, 2025, 5:48pm

One more case to point out is that ParsableCommand from ArgumentParser requires run method to be implemented. Given that it is quite common to spawn processes from CLI utilities implemented with ArgumentParser, users will have to spell it as Subprocess.run or Foundation.run anyway to disambiguate from a method implementing ParsableCommand requirement. Out of those two qualified names, I'd prefer Subprocess.run as more specific and discoverable than Foundation.run.

johannesweiss · February 26, 2025, 6:09pm

We're getting into the weeds here but that's not actually necessary: the parent and the child would however have to communicate and synchronise for a tiny little while. The child can wait for the parent to have created the pidfd before execve'ing (by waiting for a close/write/something on a file descriptor the parent inherited into the child).

icharleshu · February 26, 2025, 6:43pm

Can you elaborate what's the issue with using module names to disambiguate? Why have a Subprocess struct namespace under the existing Subprocess module namespace? If the concern here is discoverability and autocomplete, there's nothing stopping you from writing Subprocess.run. I disagree that free standing functions are polluting the global namespace because you only get them if you explicitly import Subprocess and they can be distinguished by the module name.

While I don't disagree, I think it's more important that we keep this name short for Scripting case (which is also one of the reasons why we are proposing dropping the Subprocess namespace). I understand many folks here are looking at this package from server/app development point of view, but this package also aims to improve scripting in Swift (however little it could). One of the biggest pain point for scripting in Swift is launching process takes too many boilerplate code. While resolvedFromPath might be more clear, I believe it's more important to keep it concise since this option is exclusively used by scripting.

Because Windows doesn't use Signals so the teardown sequence will look different for Windows anyway even if we support it. I'll add a section in future direction.

What's wrong with "forcing" Subprocess to use posix_spawn or fork/exec? Subprocess is entirely depend on these platform spawning primitives. If they change one day, Subprocess will need to be updated anyway so why not exposing the "implementation details" as "escape hatches" so if Subprocess doesn't provide high level API at least you can drop down to low level?

IMO extensibility is way more important than "hiding implementation details" because it enables more use case.

But that's not the case today. FileDescriptor itself doesn't have any async creation so it's not really Subprocess' role to create async file descriptors. Further more, InputProtocol was designed specially to be NOT async to support runDetached (which is not async).

Could you elaborate on what would this subject look like?

This is an intentional choice. IMO it's more important to allow Subprocess to be more extensible and allow escape hatches than hiding all the platform specific details. All the "leaks" mentioned so far follows progressive disclosure philosophy because developers won't have to deal with any of the platform details untiles they are specifically seeking advanced custom behavior.

Unfortunately that is the case. Are you suggesting to always use SystemPackage since System is not available on Linux?

davedelong · February 26, 2025, 7:22pm

A decade of precedence.

I can't think of another API like this were using the module name to disambiguate is a feature. Every single time I've had to do it since Swift was announced, it's because the thing I want to use has been shadowed by other stuff.

So… is it a goal of this proposal to overturn almost eleven years worth of learned behavior that module disambiguation is now desirable, and not just a limitation of the compiler?

ksluder · February 26, 2025, 7:27pm

It feels weighted far too heavily toward making single-line demos look “cleaner”, which is itself dubious because not all single-line Swift files exist in the same niche as shell scripts. Execution is just a more awkward synonym for Subprocess; if it were named Subprocess, the logical and precedent-following choice would be to consider static func run() to be a factory function for Subprocess instances.

Karl · February 26, 2025, 7:49pm

Right - whenever you start from the global scope, whether you are looking for a type or looking for a function, there is no context for autocomplete to suggest anything.

Is print undiscoverable? Is Task { ... } undiscoverable? Maybe at first, but you very quickly get used to it, and once you do, the benefits of making it short and globally available far outweigh the occasional collisions.

So, if I may, when I read this, it feels like I can summarise it by saying: "it's tradition". That's pretty much where the argument starts and ends.

And that's just not very compelling. We used to have more ambition with Swift, but in so many cases on the forums in recent weeks, I've seen what I can only describe as a kind of tunnel vision - an unwillingness or neglect to challenge preconceived notions, to think outside the box and create something original.

I think we should instead applaud developers who have the courage to look past the constraints of tradition, and we should always be sniffing around the edges of the box looking for fresh ideas. I'd like to see Foundation take more risks, think further outside the box, and really embrace the creative freedom that Swift gives them.

It was said earlier in the thread that "discoverability, disambiguation, and testability" are lessened with the use of global functions. In fact, none of these alleged flaws stands up to scrutiny. And I'm sorry to call you out, but your reply just captured this sentiment so perfectly and came at the exact right time - when the rationale doesn't hold up, but people still seem so steadfast in their opinions, it really feels more like an emotional attachment -- something subjective, which as @Max_Desiatov warned us has no place in our design considerations.

davedelong · February 26, 2025, 7:55pm

You claim the rationale doesn't hold up, and I wholly disagree. All API design is subjective. What is "familiar" is subjective. What is "readable" is subjective. What is "comprehensible" and "testable" and "discoverable" and "fluent" are all subjective.

When I design APIs, and what I look for in APIs, are ones that fit in with the patterns that I've been using to that point. I want the APIs to follow the same flows as others, so that the ways I use them are similar. For me (and I suspect for others) that means that an "appeal to tradition" isn't a fallacy, but a fundamental argument.

I don't see how this has anything to do with "ambition" or "courage" in Swift. I'm glad to see this API getting proposed because I desperately want it. I hate using Foundation.Process and its myriad of sharp corners. I'm trying to make this the best API I think it can be, and for me that heavily implies following established convention and precedent, and not doing global functions. That's why I'm looking for compelling arguments to overturn that precedent, and so far I haven't heard any.

Karl · February 26, 2025, 8:18pm

Yeah this is not entirely the primary concern for me. I look for APIs that model the problem in a natural way for that problem. If there are synergies with existing concepts (such as Collection), they should use them, but if the design becomes too artificial - too removed from the essence of what it is trying to model - you often find yourself with awkward impedence mismatches, hidden or verbose APIs, strange edge-cases, etc.

In my design philosophy, it is important to always centre the design on what you are trying to model.

In the case of things like accessing data from the network or running a process, these are capabilities provided to you by the environment; the platform on which you are running. Therefore, it seems natural to me that it is modelled as a service at the highest possible scope of the program.

Again, Javascript does this a lot. For example, the global window.document represents an associated HTML document, but it can be null when you run that code in an offscreen context, such as in Node.js. To me, this way of modelling the problem seems very natural - the document is a property of the environment which loaded my program, and it is exposed at the highest scope as any other variable/property would be. It would be possible to write it without a global - maybe something like Window.current.document or even Window.currentDocument or Document.current, but to me that doesn't quite capture the nature of it as well.

davedelong · February 26, 2025, 8:31pm

OK, so to use that argument…

Spawning a process is not a fundamental intrinsic capability provided by the environment. As I see it, you're issuing a request to another system to "please take all of these parameterizations and execute something" and technically the details of that are opaque to you.

Is it executing in a sandbox? On the same machine? In a virtualized environment? In a container? On a different machine? Is it not getting executed at all and instead you're getting a replayed result?

Since all of the answers to these are "well, it depends on something else", I don't see how this capability rises to the level of being a global function.

Karl · February 26, 2025, 8:52pm

It is a fundamental intrinsic capability provided by some environments. The vast majority of environments on which Swift currently runs, in fact.

The details are not opaque whatsoever, I don't follow any of the rest. This API spawns a platform subprocess, and even has a PlatformOptions type to capture the details of how your platform launches processes. It models that capability provided by those environments.

Anyway, I think I've made the point well enough. I can accept that some people really just don't like it.

Max_Desiatov · February 26, 2025, 9:02pm

The problem is that module disambiguation is implicit. This leads to unintended and confusing naming collisions due to shadowing with poor diagnostics. At the same time, disambiguation with a static function in a struct namespace is explicit and predictable. Consider this code:

import ArgumentParser
import Foundation

struct Echo: AsyncParsableCommand {
    func run() async throws {
        try await run("echo foo")
    }
}

This won't compile as is. It won't even tell me "hey, did you mean to call Foundation.run"? It will provide a red herring error diagnostic message saying "an argument was provided to function run that expects no arguments".

Now I had enough of these problems with print methods and misleading diagnostics before, so maybe I won't spend too much time on this. Eventually I will go back and spell it as Foundation.run and move on. But what about users who see this error for the first time?

That's in addition to the fact that Foundation.run naming is not clear enough on its own. What exactly are we running there? A process? A task? A request? A command? Or are we certain that this API will always stay in a separate Subprocess module for more clear spelling and discovery? The confusing error diagnostic point still stands.

Now if run were a static function on Subprocess, there's no way to misspell it, everyone is expected to prefix static function calls on types with a type name:

import ArgumentParser
import Foundation

struct Echo: AsyncParsableCommand {
    func run() async throws {
        try await Subprocess.run("echo foo")
    }
}

Not only this follows an established precedent, this is explicit, clear, and beginner-friendly, while ensuring that no confusing diagnostic messages due to unintended shadowing are produced.

pyrtsa · February 26, 2025, 9:04pm

A tiny little idea which might have been discussed in the past but I don't see it mentioned in the proposal (among alternatives or future directions)…

I think the call sites would benefit from conforming Executable to ExpressibleByStringLiteral such that "ls" would be equivalent to .name("ls") in a context expecting an Executable value.

johannesweiss · February 26, 2025, 11:14pm

icharleshu:

johannesweiss:

I dislike the name .name here. It's not that there is a specially "named" program ls, what it actually means is that the environment variable $PATH is traversed front-to-back. So maybe run(.resolvedFromPath("ls"))

While I don't disagree, I think it's more important that we keep this name short for Scripting case (which is also one of the reasons why we are proposing dropping the Subprocess namespace). I understand many folks here are looking at this package from server/app development point of view, but this package also aims to improve scripting in Swift (however little it could). One of the biggest pain point for scripting in Swift is launching process takes too many boilerplate code. While resolvedFromPath might be more clear, I believe it's more important to keep it concise since this option is exclusively used by scripting.

I'm sure we could find a name that suits our needs for security, scriptability and clarity.

To start with I'd focus on getting the core abstractions right. I don't think we should over-index on the scripting use case. For Subprocess to become a viable shell scripting alternative we'll need a lot more things such as pipelines (like grep ^foo /some/file | cut -d: -f2) & easy output redirections anyway. I really doubt .name("ls") vs. resolvedFromPath("ls") or resolved(path: "ls") will be the make or break for Swift as a shell scripting language. And if I'm wrong: A more concise one is very easy to add later, much harder to take away a confusing one.

This feels like we're leaving out a core piece that's pretty crucial when it comes to Structured Concurrency for a future direction, no? Windows has its own way to stop child processes and I'd love to explore if we can create something that works for all platforms, with or without UNIX signals. Not every Swift programmer is a UNIX and Windows expert, so having to learn special UNIX and Windows constructs for something as simple as spawning a process will just lead to unportable programs and lots of #ifs.

I think expressing something like 'try to stop this program in a "soft" way but if it hasn't disappeared within 1s, kill it in a "hard" way' should hopefully be expressible in a platform-independent way.

Subprocess today already has multiple options on UNIX: fork+exec, posix_spawn and on Linux soon io_uring_spawn.

It's very conceivable that Subprocess will change the underlying mechanism on at least some platforms -- potentially depending on the underlying libc/kernel versions. And potentially depending on what features the user actually selected.

That's because the capabilities/availability of these depend on the libc & kernel versions. So I don't think we should strive to create a user-facing API that entirely leaks what's used to implement unless absolutely necessary.

But of course, if there's some special case that's only satisfiable with one particular underlying mechanism, then I think it's fine to add a little platform-specific escape hatch if it's clear enough that this will make the code unportable.

Yes, it absolutely is. For example, if you're using DispatchIO or SwiftNIO or anything else that can do async I/O else to pump bytes, you cannot just synchronously close a file descriptor. You have to deregister & close and that's asynchronous. In SwiftNIO you call try await channel.close(), in DispatchIO you call .close() but then you have to wait for the cleanupHandler to run before you can close the underlying fd:

The docs have to say

It is a programmer error for you to modify the file descriptor while the channel owns it.

And channel ownership ends with the cleanupHandler. The same applies to DispatchSources and pretty much anything built on io_uring, epoll, kqueue, ...

Of course, for example if we had this straw-man error

public struct ProcessSpawningError: Error & Sendable {
    public struct Subject: Sendable {
        public static let executable: Subject = ...
        public static let workingDirectory: Subject = ...
        [...]
    }

    public struct Code: Sendable {
        public static let notFound: Code = ..
        public static let permissionDenied: Code = ...
        [...]
    }

    var subject: Subject
    var code: Code 

    var underlyingError: any Error
}

if the executable wasn't found, I'd get ProcessSpawningError(code: .notFound, subject: .executable, underlyingError: ...) but if the working directory wasn't found, it'd be ProcessSpawningError(code: .notFound, subject: .workingDirectory, underlyingError: ...).

But there are many other equally valid approaches to model this:

Have executableNotFound and workingDirectoryNotFound etc codes
Have notFound(Subject) and permissionDenied(Subject)

I honestly don't mind the details too much. The crucial thing is that we can figure out what wasn't found.

Extensibility and abstraction aren't mutually exclusive. And again, I don't mind escape hatches if exist to make hard things possible. But currently it appears like the majority of programs will have to use at least one of the escape hatches. So we're not necessarily making easy things easy.

I'm all for advanced custom behaviour and progressive disclosure but we shouldn't reach for platform-specific implementations and leak implementation details for situations that can be handled perfectly fine in a platform-independent way. Extensibility specifically is very hard if you prematurely paint yourself into a corner by leaking a lot of the implementation details. That will make it much harder (or impossible) to change anything later.

To start with, I was curious what's being proposed. The answer to this question is really important to determine how usable this will be on the various platforms. Ideally, there would be one FilePath that works on all platforms without any #if dances. FWIW, the same problem is holding up removing the underbar from _NIOFileSystem and committing to its API.