The correct (and elegant) way to call async functions in Swift 6 and more

Disclaimer: I'm a recovering developer, new to Swift and macOS development, coming from C/C++ on Linux and Windows. I have a working knowledge of how threads work.

I'm writing a sample macOS app in Swift 6 and SwiftUI to familiarize myself with the language and macOS. My goal for the app is to do one thing only: generate video thumbnails from a given folder and display them in a view.

After some research, it appears that the async version of generateBestRepresentationForRequest (api link) is the way to generate thumbnails. However, it's confusing to me how Swift 6 guards against data races and how its concurrency model works.

I have included pseudocode below that outlines the code I have (Gemini AI assist and other tools helped :slight_smile:).

Can you please review the code? Specifically, I need help writing syntactically correct, idiomatic Swift 6 concurrency code in func generateThumbnails() to accomplish the following:

  • async calls to generateBestRepresentationForRequest. My understanding is that the async call returns to its caller thread immediately. If that is correct, I presume it's safe to call this API from the main UI thread.
  • I expect that whenever a thumbnail becomes available, the worker thread (apart from Main UI thread) will call the "callback function" that's somehow "registered" with the async function. Need some help here for code example.
  • In such "callback function", I expect to create a new PhotoAsset object based on the thumbnail data, and then safely insert that new object to the main UI's photoAssets[ ] array. This insertion per my understanding must happen in MainUI thread. If so can you please give code example?
  • In my mental model, this app at runtime will have 2 threads: a Main UI thread that updates the UI when the @State variable changes, and a worker thread that gets "unblocked" whenever a thumbnail becomes available.

Sorry for the long post with very loaded questions. I'd much appreciate your feedback and help.

Thanks! See below the pseudo code.

// represents a single thumbnail
struct PhotoAsset: Identifiable {
  let id: String
  let url: URL
  let fileName: String
  var thumbnail: Data? //png format raw data
}

struct ContentView: View {
  @State private var isLoading = false
  @State private var photoAssets: [PhotoAsset] = []
  @State private var loadedFolderURL: URL?
  
  var body: some View {
    VStack {    /*Later: add code to display the photoAssets */   }
    .padding()
    .navigationTitle("Photo2")
    .toolbar {
      ToolbarItem {
        Button("Select Video Folder") {
          let openPanel = NSOpenPanel()
          if openPanel.runModal() == .OK {
           // Task { //I doubt we need Task here but I can be wrong
              generateThumbnails(from: openPanel.url)
       //     }
          }
        }
        .disabled(isLoading) // Disable the button while loading
      }
    }
  } //end var body
 
  func generateThumbnails(from folderURL: URL) {
    loadedFolderURL = folderURL // Store for reloading
    photoAssets.removeAll() // Clear previous results from the view

    isLoading = true
    do {
      let fileManager = FileManager.default
      let directoryURL = folderURL

      // Check if the directory exists and is accessible
      // ....
      guard let enumerator = fileManager.enumerator(at: directoryURL,
                                                    includingPropertiesForKeys: resourceKeys,
                                                    options: [.skipsHiddenFiles, .skipsSubdirectoryDescendants, .skipsPackageDescendants],
                                                    errorHandler: { (url, error) -> Bool in
        print("Error enumerating '\(url.path)': \(error)")
        return true // Continue enumeration even if there are errors with some files.
      }) else {
        //show error and alert
        print("Error: enumerator is nil")
        return
      }

      for case let fileURL as URL in enumerator {
        //check if the file exists and that it is not a directory
        //...
        // Trying to use the async version of generateBestRepresentation
        // Need help here to complete the sample code with a "callback function"
        let request = QLThumbnailGenerator.Request(fileAt: fileURL,
                                                   size: CGSize(width: 256, height: 256),
                                                   scale: NSScreen.main?.backingScaleFactor ?? 1,
                                                   representationTypes: .thumbnail)
        try QLThumbnailGenerator.shared.generateBestRepresentation(for: request) async
        // .....
      }

    } catch {
      print("Error listing files: \(error)")
      isLoading = false
      // Show an error to the user.
    }
  }


}//end struct ContentView


I think a better mental model of Swift structured concurrency for someone coming from C/C++ (like myself) is:

  • There is a certain number of thread-safe message queues pre-created for the app; presumably the number of queues equals the number of CPU cores
  • All asynchronous calls are handled via these message queues; this, coupled with sendability of data types, allows Swift to guarantee serialization of certain things and thus also guarantee there will be no data races and no deadlocks
  • Each asynchronous call (await someFunc()) creates a continuation that will be executed on the same message queue as the calling site; where the function itself will be executed depends on how it's defined
  • True parallelism in your app can be created using either detached tasks or actors: an actor (or a detached task) picks one of the message queues and sticks to it
  • By default things are executed on MainActor whose message queue runs on the main thread

I hope I'm not very wrong in the above assumptions.

Given the above, I suppose you can show the thumbnails as they arrive, i.e. you don't need to wait until everything is ready, and you probably want to generate thumbnails in parallel with each other. So I think the best approach would be to spawn a task for each job. (Alternatively, if you needed all the results before displaying them, you'd use a task group.)

Therefore:

for case let fileURL as URL in enumerator {
	Task {
		let request = QLThumbnailGenerator.Request(fileAt: fileURL, size: CGSize(width: 256, height: 256), scale: NSScreen.main?.backingScaleFactor ?? 1, representationTypes: .thumbnail)
		let thumbnail = try await QLThumbnailGenerator.shared.generateBestRepresentation(for: request)
		// macOS: encode the CGImage as PNG (uiImage and pngData() are UIKit-only)
		let pngData = NSBitmapImageRep(cgImage: thumbnail.cgImage).representation(using: .png, properties: [:])
		photoAssets.append(.init(id: UUID().uuidString, url: fileURL, fileName: fileURL.lastPathComponent, thumbnail: pngData))
	}
}

Notice that it is safe to append an object to your photoAssets array here: the code after the await is executed on the same message queue (MainActor) as the rest of your view code.

Two important caveats here:

  1. The call to generateBestRepresentation() may throw an error, but because nothing reads the result of the task, the error will be silently lost.

I suggest that you think of a way of displaying errors first before implementing error handling. For example, you may display a special image (say :warning:) in place of each thumbnail that failed to generate. I suppose your PhotoAsset structure may have an additional property to reflect an error condition as one possibility.
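To make the "additional property" idea concrete, here is a minimal sketch, not a definitive design. ThumbnailState and loadAsset are names I made up, and the makeThumbnail closure is only a stand-in for the real thumbnail call, so the sketch can run without QuickLook. The point is that the error is caught inside the async function and stored in the asset instead of being lost with the task.

```swift
import Foundation

// Hypothetical replacement for the optional `thumbnail` property:
// every asset ends up either loaded or failed.
enum ThumbnailState: Sendable {
    case loaded(Data)      // PNG data
    case failed(String)    // message to show next to a warning image
}

struct PhotoAsset: Identifiable, Sendable {
    let id: String
    let url: URL
    let fileName: String
    var thumbnail: ThumbnailState
}

// `makeThumbnail` stands in for the real thumbnail API (an assumption of this sketch).
// The function itself never throws: a failure becomes part of the returned asset.
func loadAsset(for url: URL,
               makeThumbnail: @Sendable (URL) async throws -> Data) async -> PhotoAsset {
    let state: ThumbnailState
    do {
        let data = try await makeThumbnail(url)
        state = .loaded(data)
    } catch {
        state = .failed(error.localizedDescription)
    }
    return PhotoAsset(id: UUID().uuidString,
                      url: url,
                      fileName: url.lastPathComponent,
                      thumbnail: state)
}
```

The view can then switch over asset.thumbnail and draw either the image or the warning placeholder.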

  2. You need to take into account a potentially very large number of files to be processed.

This unfortunately will make your code a bit more complex than what's shown above, as you'd need to limit the number of tasks running at any given time: you'd spawn N tasks first, and the remaining jobs would wait until earlier tasks finish and a slot becomes available. In multithreading you'd call this a thread pool; in Swift it would be a task pool. Right now I'm not aware of any readily available API for this pattern in Swift, so I presume you'd need to implement it yourself.
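For what it's worth, one way such a task pool could be sketched is with a task group: start the first N jobs, then start one more each time a job finishes. This is only an illustration under my own assumptions. mapWithLimit is a hypothetical helper (not a standard library function), and work stands in for the thumbnail call.

```swift
// Runs `work` over `items` with at most `limit` child tasks in flight.
// Results come back in completion order, not input order.
func mapWithLimit<Input: Sendable, Output: Sendable>(
    _ items: [Input],
    limit: Int,
    work: @escaping @Sendable (Input) async -> Output
) async -> [Output] {
    precondition(limit > 0, "limit must be positive")
    return await withTaskGroup(of: Output.self) { group in
        var results: [Output] = []
        results.reserveCapacity(items.count)
        var iterator = items.makeIterator()

        // Start the first `limit` jobs.
        for _ in 0..<limit {
            guard let item = iterator.next() else { break }
            group.addTask { await work(item) }
        }
        // Each time a job finishes, keep its result and start the next job.
        while let finished = await group.next() {
            results.append(finished)
            if let item = iterator.next() {
                group.addTask { await work(item) }
            }
        }
        return results
    }
}
```

Called as await mapWithLimit(fileURLs, limit: 4) { url in ... }, it never has more than four jobs running at once. A good value for limit is something you would have to measure.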

Hope this helps!

P.S. From a multithreading point of view, the tasks that you spawn are all executed on MainActor; you do this in the hope that the actual thumbnail generation is executed on different threads and is therefore efficient enough. I don't know how this system API is implemented, but that is most likely the case, i.e. the generation itself is truly parallel and therefore it makes sense to spawn multiple tasks.

In Swift, an async function does not return immediately. For example, when you have:

await someAsyncFunction()
someSyncFunction()

The second line won't be executed until someAsyncFunction() finishes. What await does is allow the current thread to suspend the current Task and switch to other Tasks. At least in the current Swift version, when you invoke a nonisolated async function without passing any isolation parameter, it is offloaded to the global executor. Although the current task cannot continue, the current thread is free to do other work. So although your reasoning is not quite accurate, it is indeed OK to call it from the main thread (at least in the current Swift version). However, this behaviour may change in a future version if this proposal is implemented. It specifies that an async function inherits the isolation of its caller by default. In that case, if you still invoke generateBestRepresentationForRequest directly from the main thread, it will inherit the MainActor isolation and continue running on the main thread.
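A tiny sketch of that ordering, with made-up names (slowDouble and demo are not real APIs): the function pauses at the await, and the statement after it only runs once the awaited call has produced its value.

```swift
// Hypothetical stand-in for any async API.
func slowDouble(_ n: Int) async -> Int {
    // Suspends this task for about 10 ms; the thread is free to run other tasks meanwhile.
    try? await Task.sleep(nanoseconds: 10_000_000)
    return n * 2
}

func demo() async -> [String] {
    var events: [String] = []
    events.append("before await")
    let value = await slowDouble(21)      // `demo` is suspended here until the call finishes
    events.append("after await: \(value)")
    return events
}
```

So await is not a "fire and forget" call with a callback: the code after it plays the role of the callback.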

In your code, the generateThumbnails function is not an async function, while generateBestRepresentation is. This is not allowed in Swift: you can only invoke an async function from inside an async context. There are a few options if you want each iteration of the for loop to run concurrently:

  • Wrap the body of the for loop with Task {}. Use this one if generateThumbnails does not need to wait until all the thumbnails are loaded.
    for case let fileURL as URL in enumerator {
      Task {
         let request = ....
         try await QLThumbnailGenerator.shared.generateBestRepresentation(for: request)
      }
    }
    
  • Wrap the for loop in a TaskGroup. You also need to mark the generateThumbnails function as async and wrap its invocation in the Button's closure with Task {}. Use this one if generateThumbnails needs to wait until all the thumbnails are loaded.
    func generateThumbnails(from folderURL: URL) async {
      ...
      try await withThrowingTaskGroup(of: Void.self) { group in
        for case let fileURL as URL in enumerator {
          group.addTask {
            let request = ....
            try await QLThumbnailGenerator.shared.generateBestRepresentation(for: request)
          }
        }
        try await group.waitForAll()
      }
      ...
    }
    
    Note that if you target macOS 14.0 or higher, you can use withThrowingDiscardingTaskGroup instead of withThrowingTaskGroup and remove the group.waitForAll() call.
    Personally, I recommend the second method, since it lets the caller catch errors thrown by the child tasks directly.

Follow up post from OP.

First, thanks to everyone who chimed in; much appreciated.

To clarify, my goal is to keep the UI responsive and to update it continually as the thumbnails get generated. I have revised the code and it now compiles with Swift 6, but it still has some issues. I'd much appreciate your review and help! (github)

  1. I tested it with a local network location which has >1,000 video files. The UI is updated as thumbnails get generated, but at the same time it does not respond to input such as scrolling, and the pointer is shown as busy. Is there a way to improve this?

  2. I use Task.detached {} to wrap the async system API call. Is there a way to get notified when the Task queue is empty, so that I can set isLoading accordingly?

You're likely overloading the fixed-width thread pool Swift uses to run all concurrency work. With thousands of thumbnails being generated, there's not enough capacity left for other work to properly resume or suspend, which leads to the UI issues you see. Unfortunately, Swift doesn't let you move the work to a separate thread pool unless you do it yourself, for example by wrapping DispatchQueue.concurrentPerform, which would help. If you want to stay within Swift Concurrency, you can try inserting await Task.yield() at the end of every bit of work (or batch of work) to give the concurrency system some time to process other work, but you'll want to benchmark how yielding after every thumbnail, or after a particular number of thumbnails, affects your overall performance.
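As a rough sketch of the yielding idea, assuming the work can be expressed as a synchronous closure per item: mapYielding is a hypothetical helper name, and batchSize is the knob you would benchmark.

```swift
// Processes `items` in order and yields to the scheduler after every `batchSize` items.
func mapYielding<T: Sendable, R: Sendable>(
    _ items: [T],
    batchSize: Int,
    work: @Sendable (T) -> R
) async -> [R] {
    precondition(batchSize > 0, "batchSize must be positive")
    var results: [R] = []
    results.reserveCapacity(items.count)
    for (index, item) in items.enumerated() {
        results.append(work(item))        // one (possibly slow) unit of work
        if (index + 1) % batchSize == 0 {
            await Task.yield()            // give other tasks a chance to run
        }
    }
    return results
}
```

With batchSize set to 1 it yields after every item; larger values yield less often, which trades responsiveness for throughput.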

This is definitely an area I'd like to see improved but I lost the pitch I was writing. It's probably a good followup to the recent default executor proposals.

Edit: found my draft pitch, [Prepitch]: Scoped, Fixed Width Concurrency Executors

I think that when you create too many tasks that run blocking code on the global concurrent executor (the default executor), they overwhelm all the available threads. Generally the best fix is to avoid creating that many tasks in the first place by loading thumbnails lazily (only load a thumbnail when it needs to appear on screen). If loading all the thumbnails really is required, consider limiting the number of tasks created, or moving the work to a custom DispatchQueue. If you target macOS 15, a custom TaskExecutor is also an option.
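To illustrate the custom DispatchQueue option, here is a minimal sketch. runBlocking is a helper name I made up, and the queue label in the usage note is arbitrary. The blocking closure runs on the queue you pass in, and the calling task is suspended through a continuation until it finishes, so the blocking work does not occupy a thread of the cooperative pool.

```swift
import Foundation

// Runs a blocking closure on `queue` and suspends the caller until it completes.
func runBlocking<T: Sendable>(
    on queue: DispatchQueue,
    _ work: @escaping @Sendable () throws -> T
) async throws -> T {
    try await withCheckedThrowingContinuation { continuation in
        queue.async {
            // Resume exactly once, with either the value or the error.
            do {
                continuation.resume(returning: try work())
            } catch {
                continuation.resume(throwing: error)
            }
        }
    }
}
```

Usage might look like let queue = DispatchQueue(label: "example.thumbnails", qos: .utility) followed by try await runBlocking(on: queue) { ... }. A queue created this way is serial, so jobs run one at a time; creating it with attributes: .concurrent would allow them to overlap.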

There are currently no APIs for accessing the Task queue, though you can implement this yourself by keeping those Tasks in an array. However, I would recommend using a TaskGroup for that, since it provides more APIs for working with child tasks (e.g. the waitForAll() method waits until all the child tasks have finished).
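Here is a minimal sketch of how that could drive isLoading. ThumbnailLoader is a hypothetical type, and render stands in for the real thumbnail call; strings are used instead of image data to keep it short. It consumes results with for await instead of waitForAll(), so each result can be shown as soon as it is ready. Either way, the group does not return until every child task has finished, which is the moment to reset the flag.

```swift
@MainActor
final class ThumbnailLoader {
    private(set) var isLoading = false
    private(set) var results: [String] = []

    // `render` is a stand-in for the real thumbnail API (an assumption of this sketch).
    func load(_ names: [String],
              render: @escaping @Sendable (String) async -> String) async {
        isLoading = true
        await withTaskGroup(of: String.self) { group in
            for name in names {
                group.addTask { await render(name) }   // child tasks run off the main actor
            }
            // Back on the main actor: append each result as soon as it arrives.
            for await result in group {
                results.append(result)
            }
        }
        // Reached only after all child tasks have completed.
        isLoading = false
    }
}
```

In a SwiftUI app you would likely make this an observable model object so the view updates when results or isLoading change.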