Actors that serialise file access

jmjauer · August 8, 2023, 12:46pm

I wonder if actors are a good choice to protect resources from concurrent access, for example a directory of files. In the past I have implemented this scenario using dispatch queues. What are the pros and cons of using an actor as a funnel point for blocking file access APIs?

sspringer · August 8, 2023, 5:37pm

If you use async calls, your environment also has to be async. So async calls are “infectious” in this sense, which is very OK if you are in an async environment anyway. So in order to support both async and non-async environments, dispatch queues should be a good choice and I use them for exactly this use case.

(I generally try to support both types of environments where appropriate, e.g. by implementing a property named async which gives you a version of some object that functions in an async environment but actually does the same thing.)
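To make the "async view of the same object" idea concrete, here is a rough sketch. Everything in it is made up for illustration (`QueueBackedLog`, `unsafeAppend`, the nested `Async` type); it only assumes Foundation and Dispatch. The synchronous API and the async view both funnel through the same serial queue, and the async version uses a continuation so the blocking file work runs on the dispatch queue rather than on the caller's thread.

```swift
import Foundation
import Dispatch

/// Hypothetical log file wrapper: one serial queue guards all access to `url`.
final class QueueBackedLog: @unchecked Sendable {
    private let queue = DispatchQueue(label: "example.queue-backed-log")
    private let url: URL

    init(url: URL) { self.url = url }

    /// Blocking read-modify-write. Must only be called on `queue`.
    private func unsafeAppend(_ line: String) throws {
        let old = (try? String(contentsOf: url, encoding: .utf8)) ?? ""
        try (old + line + "\n").write(to: url, atomically: true, encoding: .utf8)
    }

    /// Entry point for non-async callers; blocks until the write is done.
    func append(_ line: String) throws {
        try queue.sync { try unsafeAppend(line) }
    }

    /// Async view of the same object; it does the same thing via the same queue.
    var `async`: Async { Async(log: self) }

    struct Async: Sendable {
        let log: QueueBackedLog

        func append(_ line: String) async throws {
            try await withCheckedThrowingContinuation { (continuation: CheckedContinuation<Void, Error>) in
                log.queue.async {
                    // Runs on the serial queue, then resumes the awaiting task.
                    continuation.resume(with: Result { try log.unsafeAppend(line) })
                }
            }
        }
    }
}
```

Synchronous code would call `try log.append("x")`, async code `try await log.async.append("x")`. Note that the synchronous `append` must not be called from work that is already running on the same queue, or it will deadlock.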

…What do others think about this topic?

tera · August 8, 2023, 5:57pm

What are those operations specifically? Creating / deleting files and folders? Reading from / writing to existing files? Do you want to protect yourself from a situation where another app, say, creates a new file while you are iterating your directory, or reads from a file that your app wants to delete?

jmjauer · August 8, 2023, 6:04pm

Just the typical CRUD operations in the file system. I want to protect the integrity of these files by serialising access to them. An actor won't help to serialise access from other applications because it's a different process - but that's not the problem I want to solve.

EDIT: Sequence of CRUD operations that shouldn't be interrupted.

jmjauer · August 8, 2023, 6:10pm

I just need to serialise synchronous calls in my case. Due to actor re-entrancy, call order is not guaranteed when using async methods within actor methods.
My question is more about whether it is a good idea to use an actor to serialize access not to its properties, but to a file system resource.

QuinceyMorris · August 8, 2023, 7:54pm

Actors won't help you here. Even aside from actor reentrancy considerations, there is no guarantee of order of execution of actor methods called from sites outside the actor.

CRUD operations are already thread-safe. (It'd be a fairly disappointing file system if they weren't.) Since actors don't "serialize" anything in the sense of executing methods in the order they were called, you don't need an actor.

What you may need is a FIFO queue, and that's the benefit that a (serial) DispatchQueue solution brings to the party.
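A minimal sketch of the serial-queue approach, in case it helps. The names `SerialFileStore`, `perform` and `appendLine` are invented for this example. The point is only that a whole sequence of file operations is handed to the queue as one closure, so it cannot interleave with other sequences submitted the same way.

```swift
import Foundation
import Dispatch

/// Hypothetical helper: every file operation on `directory` goes through one serial queue.
final class SerialFileStore {
    private let queue = DispatchQueue(label: "example.serial-file-store")
    private let directory: URL

    init(directory: URL) { self.directory = directory }

    /// Runs `body` on the serial queue and blocks the caller until it has finished.
    func perform<T>(_ body: (URL) throws -> T) rethrows -> T {
        try queue.sync { try body(directory) }
    }
}

/// Read-modify-write as one unit with respect to other `perform` calls.
func appendLine(_ line: String, to name: String, in store: SerialFileStore) throws {
    try store.perform { directory in
        let url = directory.appendingPathComponent(name)
        let old = (try? String(contentsOf: url, encoding: .utf8)) ?? ""
        try (old + line + "\n").write(to: url, atomically: true, encoding: .utf8)
    }
}
```

This only serialises callers inside the same process, and calling `perform` from inside another `perform` closure on the same store would deadlock because `sync` waits on the queue it is already running on.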

Now, if you're talking about making sequences of operations effectively "atomic" (e.g. you aren't allowed to mutate a directory while someone is enumerating it), then you have some mutable state that an actor can protect. That's at a higher level of abstraction than CRUD, I think. In that regard, I think @tera's questions are more relevant here than you might think.

taylorswift · August 8, 2023, 8:03pm

jmjauer · August 8, 2023, 8:25pm

You're right, I wasn't precise enough. What I am interested in is a synchronous and uninterrupted sequence of CRUD operations.

jmjauer · August 8, 2023, 8:47pm

There is pretty much the answer in this thread: Actors (without custom executors) use the default global cooperative thread pool. So blocking calls within the actor methods would block one of the few available threads. This is certainly not ideal, and dispatch queues don't have this problem.

David_Smith · August 8, 2023, 9:14pm

I mean… dispatch queues do have that problem, they just respond to it in a different way up to a limit.

dmt · August 8, 2023, 9:23pm

If you're considering a lot of concurrent I/O traffic, I guess you should take a look at POSIX AIO for a truly asynchronous API. (Perhaps someone has already made a wrapper library.)

tera · August 9, 2023, 12:46am

About aio on macOS.

tera · August 9, 2023, 1:01am

I think it's ok to use actors if you expose your higher level API to work with atomic operation sequences, e.g. like so (a quick & dirty example):

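actor Logger {
    // Shared instance used by the static convenience method below.
    static let shared = Logger()

    // open(), read(), write(_:) and close() stand for blocking file helpers that are not shown.
    // There is no await between them, so the sequence is not interleaved with other calls on this actor.
    func log(_ string: String) async {
        open()
        let text = read()
        write(text + string)
        close()
    }

    // Fire-and-forget entry point for synchronous callers.
    static func log(_ string: String) {
        Task {
            await Self.shared.log(string)
        }
    }
}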

tclementdev · August 9, 2023, 10:18am

I don't think this Logger example would be recommended because 1) it participates in starving the thread pool and 2) order is not guaranteed.

doozMen · January 22, 2024, 11:16am

I'm still interested in this topic, because I'd like an easy way to guarantee ordering when writing to a file. I was made aware that this would need a change so that custom actors can support ordering. But I'm not aware of the current status. Does anybody know?

jmjauer · January 22, 2024, 2:04pm

If all file operations in a single actor method are synchronous, then the order of these calls is always guaranteed (since there is no suspension point). Only async calls can suspend and change execution order due to actor reentrancy. At least that is my understanding of actor reentrancy.

ktoso · January 22, 2024, 2:08pm

Technically speaking even that is not guaranteed.

Swift's actors are just not FIFO today. If a high priority task arrives and others are normal, it may get to execute before the others.

It was designed this way in order to serve high priority work as soon as possible, and even to allow “skip the work, we no longer need it!” messages to jump to the front of the queue, etc.

But yes, it means we just don’t — in the general sense of the word — have FIFO in actors today.

If all your work has the same priority, and all of this work has no suspension points then yes — you’d get FIFO behaviors, but it’s somewhat brittle.

I do think the requests for doing something better here have been heard, but so far priority was to get the isolation model without holes in Swift 6.

jmjauer · January 22, 2024, 2:10pm

Quoting ktoso: "Technically speaking even that is not guaranteed. Swift's actors are just not FIFO today. If a high priority task arrives and others are normal, it may get to execute before the others."

But when the actor method is started, it will run completely and uninterrupted, right? Only the order of the actor method calls is not guaranteed - or am I wrong?

ktoso · January 22, 2024, 2:12pm

In that sense yes. But it’s not quite right to say it is “FIFO”, since the last task to arrive, with the highest priority, may run next, before existing tasks in the queue.

Just something to be aware of.

Robust solutions that are always FIFO will specifically be using your own message queue, or synchronous methods on a custom executor that won’t do such reordering (that’s a trick to consider actually, as only default actors can do this escalation today).
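As an illustration of the "own message queue" option, here is a small sketch built on `AsyncStream`. `FIFOJobQueue` and its methods are hypothetical names, and `AsyncStream.makeStream(of:)` assumes Swift 5.9 or later. Jobs are synchronous closures consumed by a single task, so each one runs to completion before the next starts, in the order `submit` was called.

```swift
import Foundation

/// Hypothetical FIFO funnel: one consumer task runs submitted jobs one at a time.
final class FIFOJobQueue: Sendable {
    typealias Job = @Sendable () -> Void

    private let continuation: AsyncStream<Job>.Continuation

    init() {
        let (stream, continuation) = AsyncStream.makeStream(of: Job.self)
        self.continuation = continuation
        // Single consumer: jobs are taken from the stream's buffer in arrival order.
        Task.detached {
            for await job in stream {
                job()
            }
        }
    }

    /// Enqueues a job; returns immediately without waiting for it to run.
    func submit(_ job: @escaping Job) {
        continuation.yield(job)
    }

    /// Ends the stream; the consumer task exits after draining buffered jobs.
    func finish() {
        continuation.finish()
    }
}
```

The consumer task here still runs on the cooperative pool, so long blocking file calls inside a job would occupy one of its threads. That is the same concern raised earlier in the thread, and this sketch only addresses ordering.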

jmjauer · January 22, 2024, 2:19pm

So an actor with only synchronous methods and a custom FIFO executor is FIFO, right?
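For reference, the setup being asked about could look roughly like this. It is only a sketch: `FileFunnel` is a made-up name, and it assumes a toolchain and platform where `DispatchSerialQueue` conforms to `SerialExecutor` (Swift 5.9, macOS 14 / iOS 17 or later). Whether this gives strict FIFO for calls arriving from different tasks is exactly the open question, so the code should not be read as an answer to it.

```swift
import Foundation
import Dispatch

/// Hypothetical actor whose jobs run on a serial dispatch queue instead of the default executor.
@available(macOS 14.0, iOS 17.0, tvOS 17.0, watchOS 10.0, *)
actor FileFunnel {
    private let queue = DispatchSerialQueue(label: "example.file-funnel")
    private let url: URL

    init(url: URL) { self.url = url }

    /// Tells the runtime to execute this actor's jobs on `queue`.
    nonisolated var unownedExecutor: UnownedSerialExecutor {
        queue.asUnownedSerialExecutor()
    }

    /// Synchronous body: there is no await inside, so no suspension point once it has started.
    func append(_ line: String) throws {
        let old = (try? String(contentsOf: url, encoding: .utf8)) ?? ""
        try (old + line + "\n").write(to: url, atomically: true, encoding: .utf8)
    }
}
```

Callers still write `try await funnel.append("x")`; the blocking file calls then happen on the dispatch queue's thread rather than on the cooperative pool.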