Reporting progress on an async function

bjhomer · August 26, 2024, 6:07pm

As I've been converting some of our existing code to use async functions, I find that I often want to be able to show some sort of progress in the UI. I've played with a few patterns for doing this, but I'm curious if anyone has better ideas.

Currently, I'm doing this by returning an async stream of progress updates, instead of a single value. Something like this:

// Without progress reporting
func downloadLargeFile() async -> URL


// With progress reporting
enum DownloadLargeFileProgress {
  case connecting
  case downloadProgress(fraction: Double)
  case completed(URL)
}

func downloadLargeFile() -> AsyncStream<DownloadLargeFileProgress>

// usage:
var downloadedURL: URL!
for await progress in downloadLargeFile() {
  switch progress {
    case .connecting:
      break // update the UI accordingly
    case .downloadProgress(let fraction):
      _ = fraction // update the UI accordingly
    case .completed(let url):
      downloadedURL = url
  }
}

This works, but having to assign to downloadedURL in order to escape a value from the for loop feels kinda weird. Is there a better pattern?

bjhomer · August 26, 2024, 6:17pm

This pattern (returning an async stream) also has the downside that the implementation of downloadLargeFile() must return synchronously, which means that the async work is probably deferred to a separate unstructured Task internally. I'd love a way to avoid that.
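To make that concern concrete, here is a minimal sketch of how a stream-returning downloadLargeFile() tends to look internally. Only the enum comes from the first post; the simulated download loop, the step count, and the /tmp/large-file path are assumptions made for illustration.

```swift
import Foundation

enum DownloadLargeFileProgress: Sendable {
  case connecting
  case downloadProgress(fraction: Double)
  case completed(URL)
}

// Hypothetical stand-in for the real download; it only simulates work.
func downloadLargeFile() -> AsyncStream<DownloadLargeFileProgress> {
  AsyncStream { continuation in
    // The function has to return synchronously, so the async work
    // is pushed into an unstructured Task.
    let task = Task {
      continuation.yield(.connecting)
      for step in 1...4 {
        try? await Task.sleep(nanoseconds: 1_000_000)
        if Task.isCancelled { break }
        continuation.yield(.downloadProgress(fraction: Double(step) / 4))
      }
      if !Task.isCancelled {
        continuation.yield(.completed(URL(fileURLWithPath: "/tmp/large-file")))
      }
      continuation.finish()
    }
    // Cancellation is not propagated automatically to the inner task,
    // so it is forwarded by hand when the consumer stops iterating.
    continuation.onTermination = { _ in task.cancel() }
  }
}
```

The manual onTermination hook is the cost of leaving structured concurrency here: without it, the inner task would keep running after the caller gave up on the stream.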

vns · August 26, 2024, 6:38pm

You still can pass a closure:

func downloadLargeFile(
    progress: @Sendable (_ fraction: Double) -> Void
) async -> URL {
}

I think this is still a valid approach. I'm not sure about isolation; @Sendable @isolated(any) would probably be better, but it depends on the details. So far it has caused me more trouble than benefit, but maybe that's just because I don't understand it properly.

This could probably be combined with a stream as well (which might solve the isolation question?):

func downloadLargeFile(
    progressStream: @Sendable (AsyncStream<Double>) -> Void
) async -> URL {
}
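For reference, the first (plain closure) variant could be filled in roughly like this. It is only a sketch: the steps parameter, the simulated loop, and the returned path are assumptions, not part of the suggestion above.

```swift
import Foundation

// Hypothetical stand-in: simulates a download in `steps` chunks and
// reports the completed fraction after each one.
func downloadLargeFile(
  steps: Int = 4,
  progress: @Sendable (_ fraction: Double) -> Void
) async -> URL {
  for step in 1...steps {
    await Task.yield() // placeholder for real async work
    progress(Double(step) / Double(steps))
  }
  return URL(fileURLWithPath: "/tmp/large-file")
}

// usage: the result comes back as a normal return value,
// so nothing has to escape from a loop.
let url = await downloadLargeFile { fraction in
  print("downloaded \(Int(fraction * 100))%")
}
print("saved to", url.path)
```

Because the closure is synchronous and @Sendable, it cannot touch main-actor UI state directly, which is where the isolation question comes in.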

ExFalsoQuodlibet · August 27, 2024, 2:02pm

What about

func downloadLargeFile() -> (
  progress: AsyncStream<DownloadLargeFileProgress>,
  getURL: () async -> URL
) {
 ...
}

bjhomer · August 27, 2024, 2:48pm

I don't think passing in an AsyncStream would work, because it doesn't give anyone a way to insert items into the stream. You could probably have the caller pass in an AsyncStream.Continuation, I guess.

ExFalsoQuodlibet · August 27, 2024, 2:52pm

Check the function definition: I'm returning an AsyncStream, not passing it in.

bjhomer · August 27, 2024, 2:53pm

Oh, I see! Yes, that's an interesting option.
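One way the tuple-returning shape might be implemented, as a rough sketch. It assumes a plain Double as the progress element instead of DownloadLargeFileProgress to keep things short, and the simulated work and file path are made up.

```swift
import Foundation

// Hypothetical implementation of the tuple-returning variant.
func downloadLargeFile() -> (
  progress: AsyncStream<Double>,
  getURL: @Sendable () async -> URL
) {
  let (stream, continuation) = AsyncStream<Double>.makeStream()

  // A single unstructured task does the work. The stream observes its
  // progress and getURL awaits its final value.
  let task = Task { () -> URL in
    defer { continuation.finish() }
    for step in 1...4 {
      await Task.yield() // placeholder for real async work
      continuation.yield(Double(step) / 4)
    }
    return URL(fileURLWithPath: "/tmp/large-file")
  }

  return (stream, { await task.value })
}

// usage
let download = downloadLargeFile()
for await fraction in download.progress {
  print("progress:", fraction)
}
let downloadedURL = await download.getURL()
print("saved to", downloadedURL.path)
```

This removes the need to assign out of the loop, but it still relies on an unstructured Task internally.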

ole · August 27, 2024, 4:56pm

When Swift Concurrency came out in 2021, I dabbled with building a generically usable [NS]Progress-like abstraction for Swift Concurrency. The basic idea was as follows:

1. The caller (i.e. the subsystem that wants to display the progress) creates a progress object and registers itself as an observer of that object. The progress object could be an actor or a Sendable class, depending on what's best for performance.
2. The caller injects the progress object as a task-local value and then calls the function(s) that perform the actual work.
3. Any function that wants to do progress reporting can inspect its task-local values to see whether a progress object is set for the current task tree. If so, it can use the progress object to report its progress.
4. Any progress update via the progress object automatically notifies the observer, e.g. via a closure (or maybe the progress object itself would be @Observable).
Crucially, the progress-reporting function in step (3) would have the ability to divide its work into subtasks, each with their own progress reporting. This could be done by arranging progress objects in a tree where children contribute to the progress of their parent. Again, very similar to how Progress in Foundation works. This should fit really well with structured concurrency (task groups and async let).
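The steps above can be sketched in a few lines. This is not the prototype mentioned in this post; ProgressReporter, ProgressContext, and processItems are hypothetical names, and the sketch leaves out the tree of child progresses.

```swift
// Hypothetical progress object (step 1). The observer is just a closure.
final class ProgressReporter: Sendable {
  private let onUpdate: @Sendable (Double) -> Void

  init(onUpdate: @escaping @Sendable (Double) -> Void) {
    self.onUpdate = onUpdate
  }

  // Step 4: every update notifies the observer.
  func report(fraction: Double) { onUpdate(fraction) }
}

enum ProgressContext {
  // nil means nobody is listening, so callees can skip the work.
  @TaskLocal static var reporter: ProgressReporter?
}

// Step 3: the callee looks for a reporter in its task-local values.
func processItems(_ items: [Int]) async -> Int {
  var sum = 0
  for (index, item) in items.enumerated() {
    await Task.yield() // placeholder for real async work
    sum += item
    ProgressContext.reporter?.report(
      fraction: Double(index + 1) / Double(items.count)
    )
  }
  return sum
}

// Step 2: the caller injects the reporter and calls the worker.
let reporter = ProgressReporter { print("progress:", $0) }
let total = await ProgressContext.$reporter.withValue(reporter) {
  await processItems([1, 2, 3, 4])
}
print("total:", total)
```

Task-local values are inherited by child tasks created with async let or a task group, which is what makes the approach line up with structured concurrency.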

I have a basic version of this working, but never used it in production, never evaluated its performance, never wrote about it and haven't touched the code in years.

Here's a short video of my little prototype app. It creates a task group with a number of child tasks, all of which have their individual progress and contribute to the overall progress: AsyncProgress.mp4

I kind of liked it in my little prototype, but I'm ultimately not sure it's a good idea. It has pretty much the same pros and cons as Progress in Foundation:

Pros:

The tree structure of parent progress and child progresses is attractive. The math for aggregating multiple child progresses into one parent progress can happen behind the scenes.
APIs don't have to change. Because the progress object is passed implicitly via task-locals, you don't have to pass it explicitly via parameters.
By creating a progress object and injecting it into task-locals, the caller can decide whether it's interested in progress reporting. If not, the callees don't have to do the work.

Cons:

Passing values implicitly via task-locals also means they're not easily visible in the code. Callees that want to do progress reporting must actively know to look for these progress objects in task-local values. (It's possible this is one reason that NSProgress never got any meaningful adoption (I think) in the Apple dev community. You can't use something if you don't know it's there.)
Implementation-wise, it's a heavyweight solution that's built for an ideal world where every expensive function magically knows that it's expected to do progress reporting via this mechanism. To gain any traction in the ecosystem, it would have to come from the platform vendor, and even then it's not guaranteed developers will adopt it (see again Progress).
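As a rough illustration of the aggregation math mentioned under Pros, here is a sketch of a progress tree where each parent reports the weighted mean of its children. ProgressNode and its weights are hypothetical, and the class is deliberately not thread-safe; it only shows the arithmetic.

```swift
// Hypothetical tree node; not thread-safe, meant only to show the math.
final class ProgressNode {
  private var children: [(node: ProgressNode, weight: Double)] = []

  // Only meaningful for leaves (nodes without children).
  var ownFraction: Double = 0

  func addChild(weight: Double = 1) -> ProgressNode {
    let child = ProgressNode()
    children.append((node: child, weight: weight))
    return child
  }

  // Leaves report their own (clamped) fraction. Parents report the
  // weighted mean of their children's fractions, recursively.
  var fraction: Double {
    guard !children.isEmpty else { return min(max(ownFraction, 0), 1) }
    let totalWeight = children.reduce(0.0) { $0 + $1.weight }
    guard totalWeight > 0 else { return 0 }
    let weightedSum = children.reduce(0.0) { $0 + $1.node.fraction * $1.weight }
    return weightedSum / totalWeight
  }
}

let root = ProgressNode()
let download = root.addChild(weight: 3) // assumed to be 3x the work of unpacking
let unpack = root.addChild(weight: 1)
download.ownFraction = 0.5
unpack.ownFraction = 1.0
print(root.fraction) // (0.5 * 3 + 1.0 * 1) / 4 = 0.625
```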

bjhomer · August 27, 2024, 5:10pm

Oh, I like the @TaskLocal progress idea; that's really nice.

dehesa · September 1, 2024, 9:00am

That is a beautiful idea. I will take a stab at it. I think a nice API could be:

public func withTaskProgression<Success, Failure: Error>(
  operation: () async throws(Failure) -> Success,
  progression: (TaskProgression) -> Void
) async throws(Failure) -> Success

Internally, we can leverage @TaskLocal as @ole suggested and also support a tree of child progresses

CharlesS · September 1, 2024, 2:54pm

I made a Swift Concurrency version of CSProgress a while back, as well. Supports a full tree structure just like NSProgress, but should perform a lot better: GitHub - CharlesJS/CSProgress at concurrency

dehesa · September 2, 2024, 12:08am

I was aware of your CSProgress @CharlesS, nice job. It is definitely an enhancement over Foundation's regular Progress; however, I would like to explore the concept a bit more, focusing on the following topics:

Use @TaskLocal as @ole suggested.
Use the new affordances for concurrency and typed throwing provided in Swift 6.
Have no locks or actor serialization (abuse structured concurrency to ensure serialization).
I dislike the Progress API. I'd like to try a different syntax (e.g. not specifying the total unit count from the get-go and letting it grow as needed).
Don't consume/compute resources if the user is not interested in progress information.
Explore the ~Escapable concept.

CharlesS · September 2, 2024, 12:15am

I dislike the @TaskLocal idea myself, because it makes progress invisible, just like NSProgress does with its invisible "current progress" objects. Less "magic" is better, in my view. Avoiding this also makes APIs clearer—quick, which Foundation APIs support NSProgress reporting, and which ones don't? I certainly don't remember off the top of my head, and I bet you don't either. You have to just know, and if the documentation neglects to mention it, you have to test it empirically.

dehesa · September 2, 2024, 12:45am

I believe I managed to get something similar to what I was proposing above with this gist. It is still very raw, the API needs more work, and there are some TODOs and an @unchecked Sendable that I would like to get rid of, but it works.

To track progression information, a user would reach for withTaskProgression(operation:progress:). The operation argument contains the async throwing operation to track, and the progress closure immediately communicates any change in progression information (similar to withTaskCancellationHandler(operation:onCancel:)).

let names = try await withTaskProgression {
  let numbers = try await generateRandomNumbers(count: numNumbers)
  return try await generateNames(from: numbers)
} progress: { info in
  // info.status tells you whether the process is 'ongoing', has finished, or has failed.
  // info.children tells you the state of any spawned child processes (it is recursive).
}

That is everything the user would need to do to receive progress information. The function supports sync and async functions and sync and async sequences.

For this to work out of the box, those functions would need to internally call one of the methods on the task-local Task.unsafeProgress, such as:

Task.unsafeProgress?.progressed()

Since we cannot expect all async operations to conform to that, we can easily retrofit them ourselves. For example:

try await withTaskProgression {
  try await Task.sleep(for: .seconds(1))
  Task.unsafeProgress?.progressed()
  try await Task.sleep(for: .seconds(1))
  Task.unsafeProgress?.progressed()

  for await value in myAsyncSequence {
    // Do something with value
    Task.unsafeProgress?.progressed()
  }
} progress: {
  print($0)
}

Successful termination and failure are automatically handled. This mini-library also supports recursive progresses, in case one wants to divide a task into discrete groups (which can have their own subgroups, etc.).

Next steps for this are:

Remove @unchecked Sendable (I don't yet fully grok the isolation boundaries and sending. In the process of experimenting I already found a compiler bug).
Use ~Escapable for the TaskProgress to achieve zero computation if the user is not interested in a given piece of information.
Iterate over the API names (naming is hard).
Find out why typed throws is failing with @TaskLocal's withValue(_:operation:).

dehesa · September 2, 2024, 1:00am

100% in agreement with you. This issue paired with KVO and poor performance of locks made NSProgress unusable for me at the time. By the way, I did enjoy your rant here

I still think it has merits. It is true that it is invisible (as Task.isCancelled is, too). However, I believe the intended usage of Swift concurrency differs from the earlier ObjC mechanisms with their endless chains of callbacks. With Swift concurrency we expect to have linear and explicit async calls. For example:

func computeSomething() async throws {
  try await operation1()
  try await operation2()
  for await value in operation3Sequence {
    // Do something with the value
  }
  try await operation4()
  await withTaskGroup(of: Void.self) { group in
    group.addTask { /* ... */ }
    group.addTask { /* ... */ }
  }
}

All those operations occur in the same function context. Even if some of the operations don't report progression information the way we want, we can easily retrofit it. It is true that it wouldn't be as granular, but it would probably be good enough for most cases. Moreover, given the nature of progress tracking, developers are probably most interested in async sequences, which are the easiest to retrofit for progression tracking.
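As a small illustration of how cheap that retrofit can be for async sequences, here is a sketch of a pass-through helper built on the standard library's map. The reportingProgress name is hypothetical; in the mini-library above, the callback would presumably forward to something like Task.unsafeProgress?.progressed().

```swift
extension AsyncSequence {
  // Hypothetical helper: calls `onElement` once per element and passes
  // the element through unchanged.
  func reportingProgress(
    _ onElement: @escaping @Sendable () -> Void
  ) -> AsyncMapSequence<Self, Element> {
    map { element in
      onElement()
      return element
    }
  }
}

// usage with a stand-in sequence
let numbers = AsyncStream<Int> { continuation in
  for value in 1...3 { continuation.yield(value) }
  continuation.finish()
}

var sum = 0
for await value in numbers.reportingProgress({ print("one more element done") }) {
  sum += value
}
print(sum) // 6
```

The loop body stays untouched; only the sequence expression changes.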

ph1ps · January 26, 2025, 3:12pm

I think this thread has some really good ideas about this problem. I tried building a general solution, which I've been using in a project of mine. For now that project uses it exclusively for URLSession progress tracking, so my usage sample size is quite small.

I am almost happy with the shape of this. It does not make use of task-locals, because I've found that they don't play well with "older" APIs that don't participate in structured concurrency and therefore don't inherit task-local values.

public struct TaskProgress: Sendable {
  
  public struct Units: Sendable {
    public let completedUnitCount: Int
    public let totalUnitCount: Int
  }
  
  let continuation: AsyncStream<Units>.Continuation
  
  public func update(completedUnitCount: Int, totalUnitCount: Int) {
    continuation.yield(.init(completedUnitCount: completedUnitCount, totalUnitCount: totalUnitCount))
  }
}

enum TaskProgressResult<R, E> where R: Sendable, E: Error {
  case operationResult(Result<R, E>)
  case streamFinished
}

public func withTaskProgress<R, E>(
  isolation: isolated (any Actor)? = #isolation,
  operation: @escaping @isolated(any) @Sendable (TaskProgress) async throws(E) -> R,
  onProgress: @escaping @isolated(any) @Sendable (TaskProgress.Units) -> Void
) async throws(E) -> R where R: Sendable {
  
  let result = await withTaskGroup(
    of: TaskProgressResult<R, E>.self,
    returning: Result<R, E>.self
  ) { taskGroup in
    
    let (stream, continuation) = AsyncStream<TaskProgress.Units>.makeStream()
    
    taskGroup.addTask {
      defer { continuation.finish() }
      do {
        let progress = TaskProgress(continuation: continuation)
        return try await .operationResult(.success(operation(progress)))
      } catch {
        return .operationResult(.failure(error as! E))
      }
    }
    
    taskGroup.addTask {
      for await units in stream {
        await onProgress(units)
      }
      return .streamFinished
    }
    
    for await result in taskGroup {
      switch result {
      case .operationResult(let result):
        return result
      case .streamFinished:
        continue
      }
    }
    
    preconditionFailure("Invalid state")
  }
  
  return try result.get()
}

Things I am not happy with, yet:

The closures are not actually escaping because TaskGroups are structured primitives. However, withoutActuallyEscaping is not working with @isolated(any) currently. So I had to make a tradeoff to either support @isolated(any) or have non-escaping closures.
I had to force cast the error, because the compiler somehow loses the error's type information when nesting the closure one level deeper.
I had to mark the closures @Sendable even though I think sending should be enough. Not sure if this will work at some point though.
await onProgress(units) should not actually be awaiting. The child task should actually be isolated to the isolation of the closure. However, I don't know how you would do this with the current isolation control, and I think this will not work until closure isolation control has landed.

I am not sure if it makes sense to extract this into a package yet because of those problems. But also, as I previously mentioned, I do not have a big enough sample size of usages yet, so I'd be glad if someone tries this and has feedback on how this API could be improved.

For the curious, this is what I've used for URLSession:

import Foundation
import os
final class BytesReceivedDelegate: NSObject, URLSessionTaskDelegate {
  
  let progress: TaskProgress
  let observation = OSAllocatedUnfairLock<NSKeyValueObservation?>(initialState: nil)
  
  init(progress: TaskProgress) {
    self.progress = progress
  }
  
  func urlSession(_ session: URLSession, didCreateTask task: URLSessionTask) {
    observation.withLock { observation in
      observation = task.observe(\.countOfBytesReceived) { [weak self] task, _ in
        self?.progress.update(completedUnitCount: Int(task.countOfBytesReceived), totalUnitCount: Int(task.countOfBytesExpectedToReceive))
      }
    }
  }
}

let (url, response) = try await withTaskProgress { progress in
  try await URLSession.shared.download(for: URLRequest(url: downloadUrl), delegate: BytesReceivedDelegate(progress: progress))
} onProgress: { units in
  print("progress", Double(units.completedUnitCount) / Double(units.totalUnitCount))
}
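One thing to watch for in the onProgress closure: countOfBytesExpectedToReceive is NSURLSessionTransferSizeUnknown (-1) when the server does not announce a length, so the division can produce a meaningless fraction. A possible guard, sketched on a stand-in type that mirrors TaskProgress.Units; the fractionCompleted property is an assumption and not part of the code above.

```swift
// Stand-in mirroring the shape of TaskProgress.Units from the post above.
struct Units: Sendable {
  let completedUnitCount: Int
  let totalUnitCount: Int
}

extension Units {
  // nil means the total is unknown, so the UI can fall back to an
  // indeterminate indicator instead of showing a bogus percentage.
  var fractionCompleted: Double? {
    guard totalUnitCount > 0 else { return nil }
    let raw = Double(completedUnitCount) / Double(totalUnitCount)
    return min(max(raw, 0), 1)
  }
}

let known = Units(completedUnitCount: 512, totalUnitCount: 2048)
let unknown = Units(completedUnitCount: 512, totalUnitCount: -1)

if let fraction = known.fractionCompleted {
  print("progress", fraction) // fraction is 0.25 here
}
if unknown.fractionCompleted == nil {
  print("progress unknown")
}
```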