Replace threadsafe var with new concurrency model (Actor?)

I have established code that makes use of a threadsafe var, to avoid data races.

class ThreadsafeVar<T>
{
  private var value = Optional<T>.none
  
  func callAsFunction() -> T?
  {
    queue.sync(flags: .barrier) { [unowned self] in value }
  }
  
  func set(_ value: T?)
  {
    queue.async(flags: .barrier) { [unowned self] in self.value = value }
  }
}

This is called from a "normal" var that is part of a common protocol for a number of different implementations…

  private var urls = ThreadsafeVar<[URL]>()

  var fileURLs: [URL]
  {
    get
    {
      urls()?.sorted(using: .finder, pathComponent: .fullPath) ?? []
    }
    set { }
  }

In this particular example, the setter is not required as the private var is updated as the result of an asynchronous callback from a metadata query - hence the need for concurrency management on the private var.

So, I have created an Actor to replace the "old-fashioned" ThreadsafeVar…

actor UrlManager
{
  private var value = [URL]()

  func callAsFunction() -> [URL]
  {
    value
  }
  
  func set(_ value: [URL])
  {
    self.value = value
  }
}

I can certainly use await to set the value but I am having problems finding the correct syntax for returning the value from, what would be, the asynchronous call to the actor.

Can anyone help please?

I don’t really understand actors yet, but if these functions are asynchronous, don’t they need to be marked async, or is that assumed in an actor’s methods?

From what I have read, async is implicit on actor methods

I think this is a matter of using the right tool for the job and to me the right tool here is a lock. Using an actor to perform such trivial work would not be very efficient and would turn all access to asynchronous operations (meaning all calling codes have to be async as well). Note that your existing code of using a queue barrier and async in the setter is also less efficient than a lock. You can simply use NSLock.

2 Likes

The old way could use locks as mentioned:

Code using locks
class ThreadsafeVar<T> {
    private var val: T
    private let lock = NSLock()
    
    init(_ v: T) {
        val = v
    }
    
    var value: T {
        get {
            lock.lock()
            let result = val
            lock.unlock()
            return result
        }
        set {
            lock.lock()
            val = newValue
            lock.unlock()
        }
    }
}

It will be faster compared to queues, and perhaps compared to actors (unverified assumption). Still faster will be the code that uses atomics (but that's applicable only for small primitive types like Int).

Note that protecting the variable like this might not be enough as the following example shows:

let balance = ThreadsafeVar(100)

func test() {
    balance.value -= 100

    // equivalent code:
    var v = balance.value
    // ← here other code may change balance
    v -= 100
    // ← here other code may change balance
    balance.value = v // this write will override other code changes
}

In other cases you'd need to atomically change 2 or more variables.


Having said that, you may still implement it using actors. It would be interesting to see how the two implementations compare (e.g. performance wise or in other regards).

Does it have a viral effect on the code base? yes. Is this a bad thing? no.

Personally I think yes it is a very bad thing because async code is much harder to get right than synchronous code. Every await invites the possibility of code being called in an interleaved fashion, it's not easy to try and come up with all the possible code logic issues that might be and find appropriate fixes. And it is also very difficult to be confident that a particular piece of async code has no such latent issues. In my opinion this is going to hit a lot of developers in a very hard way (hard to reproduce and understand bugs due to code races).

Actors are designed to eliminate data races! When using actors: "if the code compiles it is data race free". (I know we are not completely there yet but we are very close.)

It doesn't eliminate logic races at all. See the "eliminate data races" wwdc22 session where Doug shows what happens with the coconuts.

i.e. you're only good if you execute all your code synchronously (be it in an actor or elsewhere) but as soon as you have an "await" you can have logic issues by code executing while awaiting.

That example is equivalent to the above "balance.value -= 100", which is based on locks... so for these "logic races" actors are no better and no worse than other methods. In fact slightly better as you are alerted of all such "places of attention" via the presence of await keyword.

1 Like

What I meant is that awaits can cause new logic issues that weren't there before, for example you don't even need to be multi threaded to have logic issues because of interleaved code execution. This is not something that can happen with synchronous code.

When executing in the context of a thread/queue/actor, having to call async functions would add a lot more unsafety to the code compared to calling only synchronous functions.

This just gets more and more confusing.

Let me state what the situation is…

Unlike a normal Folder; a Smart folder relies on NSMetadataQueryDidFinishGathering and NSMetadataQueryDidUpdate notifications, sent from the default NotificationCenter.

The handling closure receives a notification object, which contains an array of all URLs that match the query's predicate. It is these URLs that need to be passed to a "buffer", to be returned by the fileURLs property on demand from a CollectionView.

The updating notifications only happen as a result of a user manually deleting or adding files to a folder, so it's hardly a time-critical operation.

Once the notification handler has assigned the changed URLs to the "buffer", it then posts a message to the default NotificationCenter, which is caught by the UI and provokes a refresh of the CollectionView, which then reads the "buffer" via the fileURLs property, in the same way as it would read a static list of URLs in a normal folder.

in other words, the rest of the app's code should know absolutely nothing about how the URLs are supplied, hence the protocol.

The code I have for the ThreadsafeVar works fine but, with all the hype about actors and isolation, I was trying to see if I could make it work in this situation.

Or is all this new stuff a solution looking for a problem? :wink:

Actors are a tool in your toolbox, it is not the solution to every problem. The hype is real (like with many new tech such as this one) and causes to overuse them (in my opinion) without regards to the cost. About your specific problem, to be honest I don't understand why you need thread safety at all, it seems to me Spotlight does its thing in the background and can simply call you back with the results on the main thread. Or am I missing something that requires you to explicitly run code on a background thread?

The update notifications are received, and the array is built, on a background thread but the code that reads the URLs comes from the main thread.

If a user is scrolling the CollectionView immediately after a file gets added or removed from the folder on disk, there is a chance (which I have had happen) that the incorrect URLs get requested when the update happens when the items in the CollectionView have changed.

Looks like you can receive those notifications on the main queue and get rid of your ThreadsafeVar wrapper.

        NotificationCenter.default.addObserver(forName: Notification.Name("xxx"), object: nil, queue: .main) { note in
            dispatchPrecondition(.onQueue(.main))
            change your items here unprotected, no fear
        }

Personally I would first try to receive the notifications directly on the main thread as it would make things a lot easier and updating the array with the results would be fast enough. If it actually causes issues like a lagging GUI, you could also do whatever you need from your background thread then periodically dispatch to the main thread so the array is only touched from there. Locking a shared array as you are doing would also work, seems more complicated but you know your use case better than me, maybe it works best in your situation.

Hmmm. Interesting but…

Quite a lot of processing has to go on once the notification has been received - reconciling the list of URLs now in the folder against those already there and determining the indexPaths of which ones have been removed or added. This is why I felt it is preferable to do all this "grunt work" on a background thread before simply signalling the UI to update.

1 Like

Added, removed, moved, updated, yeah. That's quite error prone to do right in general case when all these things can happen simultaneously. This is easier with SwiftUI, and if you are limited to UIKit - you may try its iOS 13+ diffable data source API that performs this heavy lifting for you. Interestingly its main apply(snapshot) call is not limited to the main thread (contrary to the classic reloadRows / reloadData):

Yes, you can do that work on a background queue and update UI on the main thread (or even background thread as discussed above). However, you don't have to protect that data if you only read / modify it on a single background queue... Example:

        let queue = OperationQueue()
        queue.maxConcurrentOperationCount = 1
        ...
        NotificationCenter.default.addObserver(forName: Notification.Name(""), object: nil, queue: queue) { n in
            dispatchPrecondition(condition: .notOnQueue(.main))
            // we are in background
            urls = ... // change urls here. single queue access → can be unprotected
            // you may apply diffable snapshot here (see above)
            
            let urlsCopy = urls // grab a copy. single queue access → can be unprotected

            DispatchQueue.main.async {
                ... urlsCopy // use urlsCopy here
                // you may apply diffable snapshot here (see above)
                // or you may apply insert / delete / move / modify updates here manually
            }
        }

I am talking about a macOS app, not iOS, and I certainly wouldn't be attempting to use SwiftUI for such complex app.

Thank you for pointing out the NSCollectionViewDiffableDataSource class. This wasn't available a couple of years ago, when I started on this app. It may make reconciling the changes to the old array easier, but I am still going to be left with having to assign and read the new array across thread boundaries in order for the collection view to be able to access it from the main thread after updating it on the background thread - unless I have misunderstood how the new datasource works.

Replying to my own post…

It would seem that the new NSCollectionViewDiffableDataSource requires a determinate number of sections, defined in an enum. If this is really the case, it is unusable for my app as the number of sections is determined at runtime.

Up to you of course. I found SwiftUI code about 3 to 4 times shorter than the equivalent AppKit / UIKit code, and for me shorter / easier / more understandable code helps especially if the app is complex.

No, that's just the particular example code you found. Section or Item can be anything of your choice so long they are Hashable. e.g. a String. Or a struct. In the following partial example I am using a struct for Section:

Partial Example
struct Row: Hashable {
    let string: String
}

struct Section: Hashable {
    let header: String
    let rows: [Row]
}

typealias Snapshot = NSDiffableDataSourceSnapshot<Section, Row>
typealias Datasource = NSCollectionViewDiffableDataSource<Section, Row>

class Model {
    var sections: [Section] = []
    
    func createSnapshot() -> Snapshot {
        var snapshot = Snapshot()
        snapshot.appendSections(sections)
        sections.forEach { section in
            snapshot.appendItems(section.rows, toSection: section)
        }
    }
    
    private var calledOnMainQueue: Bool?
    
    func applySnapshot(dataSource: Datasource, snapshot: Snapshot) {
        // can be on background or main queue, but be consistent
        if calledOnMainQueue == nil {
            calledOnMainQueue = Thread.isMainThread
        }
        dispatchPrecondition(condition: calledOnMainQueue! ? .onQueue(.main) : .notOnQueue(.main))
        dataSource.apply(snapshot, animatingDifferences: true)
    }
}

Whether section contains items themselves or just the key using which you can obtain the items for the section is up to you. Ditto for the Item (row) - can be the "whole thing" or just the key.

You may find this full example applicable. It is for iOS but there's virtually no difference (just a change from UIDiffableDataSource to NSDiffableDataSource and UITableView / UICollectionView to NSTableView / NSCollectionView.