Swift Concurrency Case Study: Shwift

George · May 6, 2021, 11:20pm

Hi everyone! I've been working on a shell scripting framework that utilizes the incoming Swift concurrency features. I'm really happy with how it is coming along, and am excited to use it to replace bash scripts in some of my other projects, as well as avoid using python for this kind of thing. While it needs a few more (already planned) features in the Swift compiler to become truly useful, working on it was an interesting case study in using the new concurrency APIs. If you are interested in taking a look at the code, you can find it here:

In the meantime, I've kept some notes about my experiences with the new concurrency APIs (and some more general observations) which might be interesting to the community. You should read this as one developer's experience using (possibly misusing) the new concurrency features in a single project rather than commentary on the current or planned state of these features in general.

In No Particular Order...

The proliferation of `try` and `await`

I did write try and await a ton to make this work. Initially it really tripped me up that async throws is in the opposite order from try await, but towards the end my brain figured it out!
Much of the excess try awaiting was due to autoclosures not working (yet), but there are a few other cases where we could probably do better.
Most notably, it would be great to be able to leave these indicators off of single-expression closures which are being passed to expressions already being awaited on. We don't currently do this for try and I haven't seen this discussed before, but there is an analogy to be made with the already-optional return in these cases.
Also, I'm assuming this is already on folks' radar, but I had to write await twice when I iterated over an AsyncSequence returned from an async getter.
An exception for top-level code has also been discussed, though I see this code mostly residing in the main function of a swift-argument-parser ParsableCommand, so I'm not sure how much that will help.
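To make the two ordering points above concrete, here is a minimal sketch. The `LogFile` type, its `lines` getter, and `collect` are hypothetical names invented for illustration and are not part of Shwift. It shows `async throws` in a declaration versus `try await` at the call site, and the double `await` needed when iterating an AsyncSequence that comes from an async getter.

```swift
enum LogError: Error { case empty }

// Hypothetical type, used only to illustrate the keyword ordering.
struct LogFile {
  let contents: [String]

  // Declaration order: `async` comes before `throws`.
  var lines: AsyncStream<String> {
    get async throws {
      if contents.isEmpty { throw LogError.empty }
      return AsyncStream { continuation in
        for line in contents { continuation.yield(line) }
        continuation.finish()
      }
    }
  }
}

func collect(_ file: LogFile) async throws -> [String] {
  var result: [String] = []
  // Call-site order: `try` comes before `await`.
  // The first `await` belongs to the iteration, the second to the async getter.
  for await line in try await file.lines {
    result.append(line)
  }
  return result
}
```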

Binding a task-local value is illegal within the body of a withTaskGroup invocation

I ran into this error in my first attempt at this code (shell.subshell sets a task local variable):

github.com/GeorgeLyon/Shwift/blob/8fb57012a1555cdd62243ac6504018f9b6fcddd5/Sources/Shwift/Pipe.swift#L22-L45

          return try await withThrowingTaskGroup(of: T?.self) { group in
            group.spawn {
              try await sourceOutput.closeAfter {
                _ = try await shell.subshell(
                  standardOutput: FileDescriptor(pipe.fileHandleForWriting),
                  operation: source)
                return nil
              }
            }
            group.spawn {
              try await shell.subshell(
                standardInput: destinationInput,
                operation: destination)
            }
            for try await result in group {
              guard let result = result else {
                /// We only care about `destination`
                continue
              }
              group.cancelAll()

This file has been truncated.

Here, I only care about the result of the second group.spawn, and initially I tried to just put lines 32-34 of Pipe.swift (the second shell.subshell call) directly in the withThrowingTaskGroup body so I could get at the result. I'm sure there is a good reason this is disallowed (and the error message was so long it was cut off in Xcode :-) but I still haven't quite internalized the reasoning.

Not being able to throw from defer resulted in some extra boilerplate

I ended up having to write things like this:

github.com/GeorgeLyon/Shwift/blob/8fb57012a1555cdd62243ac6504018f9b6fcddd5/Sources/Shwift/NIO.swift#L17-L27

          func closeAfter<T>(_ body: () async throws -> T) async throws -> T {
            let result: T
            do {
              result = try await body()
            } catch {
              try? await close().get()
              throw error
            }
            try await close().get()
            return result
          }

I expect most packages to just implement stuff like this so it probably won't be too painful.
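As a rough sketch of how a package might generalize this pattern, the helper below takes the cleanup step as a parameter. `withCleanup`, `BodyFailed`, and `demo` are hypothetical names, not Shwift or NIO APIs. The cleanup always runs; if the body throws, the body's error wins and any cleanup error is dropped.

```swift
struct BodyFailed: Error {}

// Hypothetical helper: runs `cleanup` after `body`, whether or not `body` throws.
func withCleanup<T>(
  _ body: () async throws -> T,
  cleanup: () async throws -> Void
) async throws -> T {
  let result: T
  do {
    result = try await body()
  } catch {
    // The body's error takes priority, so a cleanup failure is ignored here.
    try? await cleanup()
    throw error
  }
  // On success, a cleanup failure is reported to the caller.
  try await cleanup()
  return result
}

func demo() async throws -> [String] {
  var events: [String] = []

  // Success path: body runs, then cleanup.
  try await withCleanup({ events.append("body ok") },
                        cleanup: { events.append("cleanup") })

  // Failure path: cleanup still runs, then the body's error propagates.
  do {
    try await withCleanup({
      events.append("body failed")
      throw BodyFailed()
    }, cleanup: { events.append("cleanup") })
  } catch {
    events.append("caught")
  }
  return events
}
```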

String Interpolation Hack

This doesn't have anything to do with concurrency, but I used single-case enums to enable two different semantics for Swift string interpolation (Value and Name here are enums with a single case each: value and name, respectively). In this case, using .name enables extra parameters to be provided to the interpolation. It works well, but feels a little hacky, and I'm not sure what a better approach would be:

github.com/GeorgeLyon/Shwift/blob/5cd82238b9736ec3b9b41bd673ee51d98983f969/Sources/Shwift/Named Argument Style.swift#L103-L118

          public mutating func appendInterpolation(_ value: Value) {
            fragments.append(.value)
          }

          /**
           An interpolation which is replaced with the name of the argument. This method also allows the caller to specify conversions. For example, the interpolation `\(.name, convertingTo: .uppercase, termSeparator: "_")` will convert argument names like `cmakeBuildType` to `CMAKE_BUILD_TYPE`
           - parameter targetCase: If specified, performs a case conversion on the argument name
           - parameter termSeparator: If specified, separate camel-case components of the argument name will be joined with this separator
           */
          public mutating func appendInterpolation(
            _ name: Name,
            convertingTo targetCase: Case? = nil,
            termSeparator: String = "")
          {
            fragments.append(.name(targetCase, separator: termSeparator))
          }
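For readers who want to try the trick in isolation, here is a self-contained sketch of it. `Template`, `Fragment`, `render`, and the `uppercased:` parameter are simplified stand-ins invented for this example, not the actual Shwift types. Because `.value` only exists on `Value` and `.name` only exists on `Name`, each interpolation resolves to exactly one `appendInterpolation` overload.

```swift
// Simplified, hypothetical model of the single-case-enum interpolation trick.
struct Template: ExpressibleByStringInterpolation {
  enum Value { case value }
  enum Name { case name }

  enum Fragment: Equatable {
    case literal(String)
    case value
    case name(uppercased: Bool)
  }

  var fragments: [Fragment]

  init(stringLiteral value: String) {
    fragments = [.literal(value)]
  }

  init(stringInterpolation: StringInterpolation) {
    fragments = stringInterpolation.fragments
  }

  struct StringInterpolation: StringInterpolationProtocol {
    var fragments: [Fragment] = []

    init(literalCapacity: Int, interpolationCount: Int) {}

    mutating func appendLiteral(_ literal: String) {
      if !literal.isEmpty { fragments.append(.literal(literal)) }
    }

    // Selected by `\(.value)`.
    mutating func appendInterpolation(_ value: Value) {
      fragments.append(.value)
    }

    // Selected by `\(.name)`; only this overload accepts extra parameters.
    mutating func appendInterpolation(_ name: Name, uppercased: Bool = false) {
      fragments.append(.name(uppercased: uppercased))
    }
  }

  // Substitutes a concrete argument name and value into the template.
  func render(name: String, value: String) -> String {
    fragments.map { fragment -> String in
      switch fragment {
      case .literal(let text): return text
      case .value: return value
      case .name(let uppercased): return uppercased ? name.uppercased() : name
      }
    }.joined()
  }
}

let style: Template = "--\(.name, uppercased: true)=\(.value)"
```

Calling `style.render(name: "mode", value: "debug")` substitutes the name and value into the template.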

Joe_Groff · May 6, 2021, 11:34pm

Thanks for the feedback! This sort of one-off variable binding is the kind of thing we intend async let to eventually be useful for. It looks like you could write this as:

@discardableResult
public func pipe<T> (
  of source: @escaping () async throws -> Shell.Invocation,
  to destination: @escaping () async throws -> T
) async throws -> T {
  let shell = Shell.taskLocal
  let pipe = Pipe()
  let destinationInput = FileDescriptor(pipe.fileHandleForReading)
  let sourceOutput = FileDescriptor(pipe.fileHandleForWriting)
  
  async let closeAfter = sourceOutput.closeAfter {
    try await shell.subshell(
      standardOutput: FileDescriptor(pipe.fileHandleForWriting),
      operation: source)
  }
  async let result = shell.subshell(
    standardInput: destinationInput,
    operation: destination)

  return try await result // will cancel closeAfter implicitly
}

once async let is implemented.
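For anyone who wants to see the implicit cancellation on its own, here is a small sketch that does not depend on Shwift. The function names are made up for illustration. An `async let` child that is never awaited is cancelled when its scope exits, and the scope waits for it to finish before returning.

```swift
// Hypothetical long-running side task; Task.sleep throws when the task is cancelled.
func slowSideTask() async -> String {
  do {
    try await Task.sleep(nanoseconds: 5_000_000_000)
    return "finished"
  } catch {
    return "cancelled"
  }
}

func answer() async -> Int { 42 }

func firstResult() async -> Int {
  async let side = slowSideTask()  // never awaited below
  async let result = answer()
  // Returning here leaves the scope: `side` is cancelled and then awaited implicitly.
  return await result
}
```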

ktoso · May 6, 2021, 11:35pm

Right yes, that's illegal and what you do now is correct.

Whoa that's good feedback thanks, we should make sure it shows up nicely in Xcode.

The message is indeed very long, as it attempts to give a detailed recipe for what to replace the wrong code pattern with:

error: task-local: detected illegal task-local value binding at %.*s:%d.
Task-local values must only be set in a structured-context, such as:
around any (synchronous or asynchronous function invocation),
around an 'async let' declaration, or around a 'with(Throwing)TaskGroup(...){ ... }'
invocation. Notably, binding a task-local value is illegal *within the body*
of a withTaskGroup invocation.

The following example is illegal:
    await withTaskGroup(...) { group in 
        await <task-local>.withValue(1234) {
            group.spawn { ... }
        }
    }

And should be replaced by, either: setting the value for the entire group:

    // bind task-local for all tasks spawned within the group
    await <task-local>.withValue(1234) {
        await withTaskGroup(...) { group in
            group.spawn { ... }
        }
    }

or, inside the specific task-group child task:

    // bind-task-local for only specific child-task
    await withTaskGroup(...) { group in
        group.spawn {
            await <task-local>.withValue(1234) {
                ... 
            }
        }

        group.spawn { ... }
    }

https://github.com/apple/swift/blob/main/stdlib/public/Concurrency/TaskLocal.cpp#L116-L150

Hope this seems reasonable.

For this specific use case, as @Joe_Groff mentioned, async let would indeed be nicer.

George · May 6, 2021, 11:40pm

Yup, async let seems like the right thing here.

Solely out of curiosity, the thing I'm still trying to wrap my head around is what exactly a structured-context is, and why the body of a withTaskGroup invocation is not a structured-context.

ktoso · May 6, 2021, 11:44pm

The issue is that the task created by group.spawn, by design, "escapes" the "scope":

withTaskGroup ... { // group scope
  { // some scope
    group.spawn { ... }  // make the task
  } 
  // structured rules would normally imply that the task must complete by now,
  // but it does not, which is exactly the purpose of the task group:
  // to give this flexibility; sadly, this means that the code between
  // group.spawn and group.next is somewhat unstructured

  group.next() // consume that task
} // group scope

so the task group is scoped by the withTaskGroup scope; however, inside it there are some unstructured things: namely, the fact that the spawned task "escapes" (in quotes since it does not really, it never escapes the group after all) and is only collected later, on next() or at the group scope exit.

So it is properly structured to either scope around the entire group, or inside a specific group child task, but it is not structured to scope around the group.spawn itself.

--

// We technically could make it work there, but at great implementation cost so we decided to not take the hit for now.
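The two structured placements can be sketched as a runnable example. `Context.label` and `labels()` are hypothetical names for illustration, and the sketch uses `addTask`, the name `group.spawn` was later given in the released API. One binding wraps the entire group and the other sits inside a single child task.

```swift
enum Context {
  // Hypothetical task-local value used only for this illustration.
  @TaskLocal static var label: String = "default"
}

func labels() async -> [String] {
  // Legal: bind around the entire group, so every child task inherits "group".
  await Context.$label.withValue("group") {
    await withTaskGroup(of: String.self, returning: [String].self) { group in
      // This child simply reads the inherited binding.
      group.addTask { Context.label }

      // Legal: bind inside one specific child task.
      group.addTask {
        await Context.$label.withValue("child") { Context.label }
      }

      var seen: [String] = []
      for await label in group { seen.append(label) }
      // Sort because child tasks may complete in either order.
      return seen.sorted()
    }
  }
}
```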

Joe_Groff · May 6, 2021, 11:47pm

It is still within the realm of possibility that the compiler could understand that the end of the task group's scope is a barrier for when all of the spawn-ed child tasks must end. We could teach the compiler to understand this at a later point, though for now, it uses its generic checking for @Sendable closures.

George · May 6, 2021, 11:51pm

Thanks, this explains it perfectly. For the record, I agree this isn't a big deal, I just wanted to understand.

George · December 5, 2021, 12:56am

I've gotten Shwift to the point that it is usable in real projects, and I'm currently using it for ancillary tools in a not-yet-open-source project of mine, which is exciting! Let me know if you take it for a spin and have notes :)

Not sure if it is interesting to anyone, but I had to work around a couple issues:

Firstly, operators with async autoclosure arguments still cannot be used, making pipes difficult to implement. Instead, I went with a slightly-more-verbose approach where things that might be piped have a second declaration which is an @_disfavoredOverload which returns a simple type wrapping a closure (1, 2). This unfortunately makes the autoclosure argument @escaping, but does create the desired UX where echo("Foo") echoes "Foo", but the echo in echo("Foo") | sed("s/Foo/Bar/") redirects to sed's standard input.
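A heavily simplified, synchronous sketch of the overload trick follows. `Command`, `uppercase()`, and this `|` overload are hypothetical stand-ins rather than the Shwift implementation, and `@_disfavoredOverload` is an underscored attribute, so it is not a stable language feature. On its own, `echo` resolves to the favored overload that runs immediately; as an operand of `|`, only the disfavored overload type-checks, so the wrapper is used instead.

```swift
// Hypothetical wrapper: maps standard input to standard output.
struct Command {
  let run: (String) -> String
}

// Favored overload: "runs" immediately and returns what it would print.
@discardableResult
func echo(_ text: String) -> String { text }

// Disfavored overload: defers the work so it can participate in a pipe.
@_disfavoredOverload
func echo(_ text: String) -> Command { Command { _ in text } }

// Hypothetical filter standing in for something like sed.
func uppercase() -> Command { Command { $0.uppercased() } }

// Pipes the source's output into the destination's input.
func | (source: Command, destination: Command) -> String {
  destination.run(source.run(""))
}

let direct = echo("Foo")               // favored overload, a String
let piped = echo("Foo") | uppercase()  // disfavored overload, a Command
```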

The second issue was much trickier to diagnose and fix, and boils down to a race condition in Foundation.Process on Linux. To work around this, I needed to reimplement process launching using clone. I filed the details in this bug: [SR-15471] Race condition when launching Process on Linux can leak file descriptors (Issue #3946, apple/swift-corelibs-foundation).

philipturner · December 5, 2021, 1:17am

This isn't really related to the discussion here, but I have never used any language other than Swift for scripting. I made a Swift build script for concatenating Metal files for ARHeadsetKit, instead of using bash. I also taught someone just starting out with programming to use Swift instead of PowerShell and Bash. Now he's using Swift for self-motivated projects like exploring sorting algorithms (@cody-ferguson on GitHub).

Also, I'm trying to resurrect Swift for TensorFlow, partially because I've gotten tired of how difficult Python is to configure in an IDE context (see link below). I think your repo is interesting and incredible work!

@George not many people have commented on my post yet, but here it is: Swift for TensorFlow Resurrection: Differentiation running on iOS

philipturner · December 20, 2021, 1:58am

@George I did a lot of Swift and Shell interop to side-load Swift on Google Colab: Swift for TensorFlow Resurrection: Swift running on Colab again, which might interest you.
