Allowing for Arbitrary actors

CanTheAlmighty · May 22, 2025, 6:03pm

I've created a framework within my app to handle storage using SQLite. So far, I've kept this library free from any actor or threading assumptions, as I want the application to decide where it should run.

The problem with this is that it is really easy to end up calling the library from another thread or actor isolation, and SQLite does not tolerate this. The problem is especially noticeable because the library adds extensions to Modellable objects, which can hide the intended thread.

struct User: Modellable {
   // ...
}

let user = User(name: "Freddy")
user.save() // Function provided by the library, this could be called from anywhere

The simple solution would be to enforce @MainActor all over the framework, but that would prevent the implementor of the library from using it on another thread dedicated only to storage (e.g. @StorageActor).

I was wondering if it's possible to let the library accept an "arbitrary actor" to use for isolation, leaving it as some sort of typealias that the client using the library is free to define. For example (pseudo-Swift):

// In the framework
extension Modellable {
    @AliasActor func save() { ... }
}

// In the application that uses the framework
typealias Framework.AliasActor = MainActor // or some other actor

John_McCall · May 22, 2025, 6:29pm

Should the objects just be non-Sendable?

rayx · May 23, 2025, 2:11am

CanTheAlmighty:

I was wondering if it's possible to let the library accept an "arbitrary actor" to use for isolation, leaving it as some sort of typealias that the client using the library is free to define. For example (pseudo-Swift):
// In the framework
extension Modellable {
    @AliasActor func save() { ... }
}

// In the application that uses the framework
typealias Framework.AliasActor = MainActor // or some other actor

Assuming the methods in your framework are all synchronous, isn't it just the default behavior of nonisolated? What doesn't work for you?
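The two replies above point at the same idea: a synchronous, nonisolated method runs on whatever thread or actor calls it, and a non-Sendable type cannot be shared across isolation domains when strict concurrency checking is on. A minimal sketch of that, where `Store`, `AppModel`, and `StorageActor` are hypothetical names invented for illustration and not part of the framework being discussed:

```swift
// Hypothetical library type: a class that is deliberately NOT Sendable.
final class Store {
    private var names: [String] = []

    // Synchronous and nonisolated: runs on whatever actor or thread calls it.
    func save(_ name: String) { names.append(name) }

    var count: Int { names.count }
}

// Client option 1: keep the store on the main actor.
@MainActor
final class AppModel {
    let store = Store()
    func addUser(_ name: String) { store.save(name) }
}

// Client option 2: confine the store to a dedicated actor (name is illustrative).
actor StorageActor {
    private let store = Store()

    func addUser(_ name: String) -> Int {
        store.save(name)
        return store.count
    }
}
```

In both options the library itself names no actor. The client decides where the `Store` lives, and the compiler is expected to reject attempts to pass it to another isolation domain.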

CanTheAlmighty · May 23, 2025, 3:48pm

rayx:

Assuming the methods in your framework are all synchronous, isn't it just the default behavior of nonisolated? What doesn't work for you?

I will edit the main post, but SQLite does not allow accessing the database connection from a different thread, so everything needs to happen on a single thread. That is why actor isolation would work really well here, as long as the implementor of the library can define the actor, with MainActor as the default.

nkbelov · May 23, 2025, 6:29pm

You might want to consider creating an actor with a custom executor that would own and guarantee a specific thread (this is essentially what MainActor does). You could also let clients pass that thread on init:

actor MySQLiteActor {
    // MySingleThreadExecutor is a custom SerialExecutor that runs every job on one thread.
    private let executor: MySingleThreadExecutor

    nonisolated var unownedExecutor: UnownedSerialExecutor {
        executor.asUnownedSerialExecutor()
    }

    init(_ thread: Thread) {
        self.executor = MySingleThreadExecutor(thread)
    }
}
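One possible shape for such an executor is sketched below. It is an assumption-laden illustration, not the code from the post: `MySingleThreadExecutor` here creates and owns its thread instead of accepting an existing `Thread`, the `threadName` parameter and `currentThreadName()` method are invented for the example, and the executor is never shut down. It relies on the `SerialExecutor` and `ExecutorJob` APIs from Swift 5.9 or later.

```swift
import Foundation

// Hypothetical executor: runs every job on one dedicated thread that it owns.
final class MySingleThreadExecutor: SerialExecutor, @unchecked Sendable {
    private let condition = NSCondition()
    private var pending: [UnownedJob] = []   // guarded by `condition`

    init(threadName: String) {
        // The thread's closure retains the executor, so it lives for the whole process.
        let thread = Thread { self.drain() }
        thread.name = threadName
        thread.start()
    }

    // Called by the Swift runtime whenever the actor has work to run.
    func enqueue(_ job: consuming ExecutorJob) {
        let unownedJob = UnownedJob(job)
        condition.lock()
        pending.append(unownedJob)
        condition.signal()
        condition.unlock()
    }

    // Runs on the dedicated thread: wait for a job, then execute it.
    private func drain() {
        while true {
            condition.lock()
            while pending.isEmpty { condition.wait() }
            let job = pending.removeFirst()
            condition.unlock()
            job.runSynchronously(on: asUnownedSerialExecutor())
        }
    }
}

actor MySQLiteActor {
    private let executor: MySingleThreadExecutor

    nonisolated var unownedExecutor: UnownedSerialExecutor {
        executor.asUnownedSerialExecutor()
    }

    init(executor: MySingleThreadExecutor) {
        self.executor = executor
    }

    // Synchronous actor method, so it runs on the executor's thread.
    func currentThreadName() -> String? { Thread.current.name }
}
```

With this in place, every isolated method of `MySQLiteActor` should run on the same named thread, which is the guarantee a strictly thread-bound resource would need.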

ktoso · May 23, 2025, 8:06pm

AFAIR the SQLite requirement was just about "no concurrent access" and not "specifically the same thread", in which case just wrapping all accesses with one specific actor would be sufficient.

Can you double check and confirm or deny that it definitely has to be one specific thread consistently?

If so, you'd be able to do this using a custom actor executor, yes. But I suspect that's not the actual requirement, and it's just saying that it doesn't perform any synchronization and you have to do it yourself, which any actor would sufficiently guarantee.
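If "no concurrent access" is indeed the only requirement, wrapping the connection in one actor could look like the sketch below. It assumes the `SQLite3` module is importable (as it is on Apple platforms) and uses only `sqlite3_open`, `sqlite3_exec`, and `sqlite3_close`; the `Database` and `SQLiteError` names are hypothetical.

```swift
import SQLite3  // assumed available, as on Apple platforms

struct SQLiteError: Error { let code: Int32 }

// One actor owns the connection. Calls through it are serialized,
// though not necessarily executed on the same thread each time.
actor Database {
    private var handle: OpaquePointer?

    init(path: String) throws {
        var opened: OpaquePointer?
        let rc = sqlite3_open(path, &opened)
        guard rc == SQLITE_OK else {
            sqlite3_close(opened)   // SQLite asks callers to close even after a failed open
            throw SQLiteError(code: rc)
        }
        handle = opened
    }

    func execute(_ sql: String) throws {
        let rc = sqlite3_exec(handle, sql, nil, nil, nil)
        guard rc == SQLITE_OK else { throw SQLiteError(code: rc) }
    }

    func close() {
        sqlite3_close(handle)
        handle = nil
    }
}
```

Callers reach the connection only through `await`, so two calls can never overlap, but the actor's default executor may run them on different threads.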

Malien · May 25, 2025, 10:40am

SQLite supports three different threading modes:

Single-thread. In this mode, all mutexes are disabled and SQLite is unsafe to use in more than a single thread at once.

Multi-thread. In this mode, SQLite can be safely used by multiple threads provided that no single database connection nor any object derived from database connection, such as a prepared statement, is used in two or more threads at the same time.

Serialized. In serialized mode, API calls to affect or use any SQLite database connection or any object derived from such a database connection can be made safely from multiple threads. The effect on an individual object is the same as if the API calls had all been made in the same order from a single thread. The name "serialized" arises from the fact that SQLite uses mutexes to serialize access to each object.

The threading mode can be selected at compile-time (when the SQLite library is being compiled from source code) or at start-time (when the application that intends to use SQLite is initializing) or at run-time (when a new SQLite database connection is being created). Generally speaking, run-time overrides start-time and start-time overrides compile-time. Except, single-thread mode cannot be overridden once selected.
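For reference, the run-time selection described in the quote is made through flags passed to `sqlite3_open_v2`. The sketch below assumes the `SQLite3` module is importable (as on Apple platforms); the helper function name is hypothetical.

```swift
import SQLite3  // assumed available, as on Apple platforms

// Hypothetical helper: opens an in-memory database requesting multi-thread mode.
func openMultiThreadConnection() -> (status: Int32, handle: OpaquePointer?) {
    var handle: OpaquePointer?
    // SQLITE_OPEN_NOMUTEX requests multi-thread mode for this connection;
    // SQLITE_OPEN_FULLMUTEX would request serialized mode instead.
    let flags = SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE | SQLITE_OPEN_NOMUTEX
    let status = sqlite3_open_v2(":memory:", &handle, flags, nil)
    return (status, handle)
}

// sqlite3_threadsafe() returns 0 only when SQLite was compiled without mutexes,
// in which case single-thread mode cannot be overridden.
let compiledWithMutexes = sqlite3_threadsafe() != 0

let opened = openMultiThreadConnection()
print("compiled with mutexes:", compiledWithMutexes, "open status:", opened.status)
sqlite3_close(opened.handle)
```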

I also thought that it had to be the exact same thread and was looking towards custom executors. From looking at the "unsafe to use in two or more threads at once", I hope that the "at once" here means "it can be transferred between threads, just don't call functions on it concurrently". That falls exactly in line with Swift's Sendable.

In my own lil' SQLite library I open the connection in "multi-threaded" mode and mark the connection struct as non-Sendable (a.k.a. bring your own synchronization), or use a Pool actor that yields a ~~single-threaded~~ non-Sendable connection. Of course you can serialize all accesses on a single actor, but SQLite (especially in WAL mode) can achieve much better performance by having multiple connections in use concurrently^[1].

I've spent a lot of time agonizing over integrating SQLite's synchronization into Swift's concurrency, and settled on this design:

let pool = Pool(filename: ":memory:")

//                      v Might execute on whatever thread there is
func handleRequest() async throws {
  try await pool.acquire { conn in
    try conn.execute(...)
    // Nope! "conn" has to be reinitialized before closure exit
    global.smuggleConnection = conn
  }
}

I might be able to achieve this:

let pool = Pool(filename: ":memory:")

func handleRequest() async throws {
  let conn = try await pool.acquire()
  try conn.execute(...)
}

But that would either require async deinit of the conn to send the connection back to the pool, or just use mutexes instead of actor isolation in Pool^[2].
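The mutex-based alternative could be sketched roughly as follows. Everything here is hypothetical and is not the design from the library described above: `Connection` is a stand-in that only records statements, `Lease` is an invented wrapper whose synchronous `deinit` hands the connection back, and `acquire()` throws when the pool is empty instead of waiting.

```swift
import Foundation

// Hypothetical stand-in for a non-Sendable SQLite connection wrapper.
final class Connection {
    private(set) var log: [String] = []
    func execute(_ sql: String) { log.append(sql) }
}

enum PoolError: Error { case exhausted }

// The lock, not actor isolation, guards `idle`; hence `@unchecked Sendable`.
final class Pool: @unchecked Sendable {
    private let lock = NSLock()
    private var idle: [Connection]

    init(size: Int) {
        idle = (0..<size).map { _ in Connection() }
    }

    var idleCount: Int {
        lock.lock()
        defer { lock.unlock() }
        return idle.count
    }

    // Synchronous, so no `await` is needed to borrow a connection.
    func acquire() throws -> Lease {
        lock.lock()
        defer { lock.unlock() }
        guard let conn = idle.popLast() else { throw PoolError.exhausted }
        return Lease(connection: conn, pool: self)
    }

    fileprivate func giveBack(_ conn: Connection) {
        lock.lock()
        idle.append(conn)
        lock.unlock()
    }
}

// Non-Sendable lease: returns its connection to the pool when it is destroyed.
final class Lease {
    let connection: Connection
    private let pool: Pool

    fileprivate init(connection: Connection, pool: Pool) {
        self.connection = connection
        self.pool = pool
    }

    deinit { pool.giveBack(connection) }
}
```

Because returning a connection only takes a lock, the ordinary synchronous `deinit` is enough. Nothing in this sketch stops a caller from storing a `Lease` somewhere long-lived, which is the hoarding problem mentioned in the second footnote.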

Terribly sorry for steering the discussion towards an API that is orthogonal to the OP's.

[1] Also, transactions become much trickier, since there can be only one transaction in flight in this scenario. There are so many footguns to be fired once you want to achieve safer transaction APIs. That's why I just don't allow them in my version of SharedConnection.
[2] Also, ~Escapable would be super nice to prevent users from hoarding connections in something like a global array, which would exhaust the connection pool.

rayx · May 26, 2025, 6:21am

I have little experience with SQLite, but I find this an interesting case. The SQLite documentation defines threading models for typical scenarios in C programming, but doesn't explicitly cover how Swift actors work.

Below is my understanding of the three modes in the SQLite documentation (please correct me if I'm wrong):

Single-thread mode: everything should be performed in a single thread.
Multi-thread mode: each thread must have its own database connection. It's OK to perform operations concurrently (I suppose it's up to the user to synchronize them?).
Serialized mode: different threads can share a database connection. It's OK to perform operations concurrently, but they are synchronized (serialized) internally.

Let's assume we don't use a custom executor, so a Swift actor isn't associated with a fixed thread. Serialized mode should work perfectly with an actor. But what about Multi-thread mode? I'd guess it's OK too, because although the underlying thread may change, only one thread accesses the connection at a time (I assume SQLite doesn't use thread-local variables). In this vein, I'd think it's even OK to use Single-thread mode if the wrapper framework makes sure the user can only create one database connection.

BTW, I did a quick search in the GRDB code. I think it's hardcoded to use the default Serialized mode (though I'm not 100% sure, because I don't understand the code comment below. Is it a typo, or am I misunderstanding it?).

    /// - Note: Only the multi-thread mode (`SQLITE_OPEN_NOMUTEX`) is currently
    /// supported, since all <doc:DatabaseConnections> access SQLite connections
    /// through a `SerializedDatabase`.
    var threadingMode = Database.ThreadingMode.default