Actors, Threading and SQLite

Hello, I just wanted to share the concurrency patterns used in GRDB, the SQLite wrapper. They have successfully supported the development of many apps.

  1. All writes are serialized.

    // ONE write made of TWO steps
    try dbPool.write { db in
        try db.execute(sql: "INSERT INTO player ...")
        let count = try Int.fetchOne(db, sql: "SELECT COUNT(*) FROM player")!
        assert(count > 0) // Will never fail
    }
    

    Serializing all writes prevents the app from conflicting with itself with SQLITE_BUSY (5).

    The application developer can write subsequent related statements, such as the insert and the count above, while being sure no other thread in the app can break reasonable assumptions.

  2. All writes are wrapped in a transaction by default.

    // ONE write made of TWO steps
    try dbPool.write { db in
        try db.execute(sql: "INSERT INTO debit ...")
        try db.execute(sql: "INSERT INTO credit ...")
    }
    

    Explicit transaction handling is possible, but by default all writes are wrapped in a transaction. This avoids two kinds of bugs:

    1. Committing partial and inconsistent database states, should one step fail.
    2. Letting concurrent reads see partial updates, even if no step fails (see below).
  3. All reads are isolated.

    // ONE read made of TWO steps
    try dbPool.read { db in
        let count1 = try Int.fetchOne(db, sql: "SELECT COUNT(*) FROM player")!
        let count2 = try Int.fetchOne(db, sql: "SELECT COUNT(*) FROM player")!
        assert(count1 == count2) // Will never fail
    }
    

    By isolating all reads, the application developer can write subsequent related statements, such as the two counts above, while being sure no other thread in the app can break reasonable assumptions.

    Isolation can be achieved in two ways: either by serializing all reads and writes, or by wrapping all reads in a transaction, when one wants to run parallel reads and writes on a WAL database (this is called snapshot isolation).

  4. It is possible to perform a read from the last committed state of the database that does not block concurrent writes.

    This is an advanced optimization that is more a building block for high-level tools than an application tool. Basically, when you perform a write and then a read that depends on that write, you may want to allow concurrent writes during this dependent read. For example, when you observe the database, you may want to fetch freshly modified values without blocking concurrent writes. Such an optimization requires keeping the writer locked until a reader has acquired snapshot isolation (this is usually quite fast).

    In the sample code below, the count and the insertion run concurrently, maybe in parallel. Yet the assertion always succeeds (even if the count happens to run after the insertion).

    try dbPool.writeWithoutTransaction { db in
        // Delete all players
        try db.inTransaction {
            try db.execute(sql: "DELETE FROM player")
            return .commit
        }
        
        // Count players concurrently, 
        // from the last committed state of the database
        dbPool.spawnConcurrentRead { dbResult in
            do {
                let db = try dbResult.get()
                let count = try Int.fetchOne(db, sql: "SELECT COUNT(*) FROM player")!
                assert(count == 0) // Will never fail
            } catch { ...   }
        }
        
        // Insert a player
        try db.execute(sql: "INSERT INTO player ...")
    }
    

Absolutely none of the above behaviors are SQLite defaults:

  • Writes are not serialized by default (you'll just get an SQLITE_BUSY error)
  • Writes are not wrapped in a transaction by default (so you can end up with inconsistent database states, or with concurrent reads that see unwanted intermediate values).
  • Reads are not isolated by default (they can see changes committed by concurrent transactions in between statements).
  • Concurrent writes and reads do not synchronize by magic.
  • All the assertions above would sometimes fail.

In the above examples, you can see two kinds of APIs:

  1. Database access methods (write, read, writeWithoutTransaction, spawnConcurrentRead...), which have a well-defined concurrent behavior and provide database access to a closure argument.
  2. Database methods (execute, fetchOne...), which must be run inside the closure argument of a database access method.

The unit of concurrency is the closure argument of a database access method. Inside such a closure, all API calls are synchronous. It is inside such a closure that isolation guarantees apply.
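For instance, a minimal sketch reusing the dbPool from the examples above: the closure can compute a value and return it to the caller, while the db argument must not escape the closure.

    let playerCount: Int = try dbPool.read { db in
        // Runs synchronously, isolated from concurrent writes.
        try Int.fetchOne(db, sql: "SELECT COUNT(*) FROM player") ?? 0
    }
    // playerCount is available here, but db is not: all database
    // statements must run inside a database access method.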

Those closures are all serialized, or run concurrently, depending on the kind of database you use (WAL or non-WAL).

  • Non-WAL databases serialize all reads and writes.
  • WAL databases serialize writes, and allow reads to run concurrently with each other and with writes (see the sketch below).
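For illustration, here is a minimal sketch of how one would open each kind with GRDB's DatabaseQueue and DatabasePool types (the paths are placeholders):

    import GRDB

    // Non-WAL: a DatabaseQueue serializes all reads and writes.
    let dbQueue = try DatabaseQueue(path: "/path/to/queue.sqlite")

    // WAL: a DatabasePool serializes writes, and lets reads run in
    // parallel with each other and with writes.
    let dbPool = try DatabasePool(path: "/path/to/pool.sqlite")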

We can build quite useful use cases on top of those building blocks:

  1. Local reasoning: one does not have to think about the behavior of other application components. As long as you are in the closure of a database access method, you are always fully isolated.
  2. Transactional reasoning: one can establish clear transaction boundaries.
  3. Single source of truth: a write access guarantees full exclusivity on the database.
  4. One can run the app on an efficient WAL database on disk, and tests and previews on an in-memory database, with the guarantee of identical behavior (see the sketch after this list).
  5. Database observation does not miss a single change.
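As a sketch of point 4 above: both DatabasePool and DatabaseQueue conform to GRDB's DatabaseWriter protocol, so the very same code can target a WAL pool on disk in the app, and an in-memory queue in tests and previews (the path and the player table are placeholders):

    import GRDB

    func deleteAllPlayers(_ writer: DatabaseWriter) throws {
        try writer.write { db in
            try db.execute(sql: "DELETE FROM player")
        }
    }

    // App: efficient WAL database on disk.
    let appDatabase = try DatabasePool(path: "/path/to/database.sqlite")
    try deleteAllPlayers(appDatabase)

    // Tests and previews: in-memory database, identical behavior.
    let testDatabase = try DatabaseQueue()
    try deleteAllPlayers(testDatabase)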

In this landscape, and going back to the topic of the thread:

  • Reentrancy would clearly be a problem if it were allowed inside one database access (one read or one write). Local reasoning would no longer be possible. For example, opening transactions would not be possible at all, if anyone could interleave a commit or a rollback at any time.

  • I don't know how to implement "serialized writes, concurrent read and writes" with the new Swift async APIs. I'm sure it is possible, but this is not a built-in feature of actors.

  • I (personally) prefer thinking about the user needs, and eventually supporting them with Swift async features, rather than the opposite (exposing async features and letting the user guess how to build what she needs on top of those primitives). SQLite is difficult to use correctly in an asynchronous world.

Another data point: some people want to access several databases simultaneously: Multiple Databases Simultaneously

The current GRDB implementation allows this by maintaining a list of valid database handles in DispatchQueue-specific values. Task-local values are the new way to do it (see the sketch below).
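Here is a hypothetical sketch of the task-local technique (all identifiers below are made up for illustration; this is not GRDB's actual implementation):

    enum DatabaseContext {
        // The set of database handles that are valid for the current task.
        @TaskLocal static var validHandles: Set<ObjectIdentifier> = []
    }

    func withDatabaseAccess<T>(
        _ handle: AnyObject,
        _ body: () async throws -> T
    ) async rethrows -> T {
        // Bind the task-local value for the duration of the access, so that
        // nested code can check whether it is using a valid handle.
        var handles = DatabaseContext.validHandles
        handles.insert(ObjectIdentifier(handle))
        return try await DatabaseContext.$validHandles.withValue(handles) {
            try await body()
        }
    }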

This is only problematic if you share the SQLite connection between writer and reader. If you use WAL mode, reader connections should be pooled while writer connections should be separated (ideally per table, if you use SQLite as a document storage engine like Dflat does).

(Of course, there are cases where a read connection can write, for example a triggered checkpoint operation in SQLite, and that can cause the deadlocks described below. Luckily, the SQLite that ships today can open pure read-only connections for WAL mode.)
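For reference, a minimal sketch of opening such a read-only connection with the plain C API, through Apple's SQLite3 module (the path is a placeholder):

    import SQLite3

    var db: OpaquePointer? = nil
    // SQLITE_OPEN_READONLY: this connection cannot write at all, so it
    // cannot start write transactions or trigger a checkpoint itself.
    let rc = sqlite3_open_v2("/path/to/database.sqlite", &db, SQLITE_OPEN_READONLY, nil)
    guard rc == SQLITE_OK else {
        fatalError("Cannot open database: \(String(cString: sqlite3_errstr(rc)))")
    }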

There could be deadlocks though. The only place SQLite (in WAL mode) would take an internal pthread lock is when you execute write transactions from two SQLite connections at the same time. If one actor holds one SQLite write connection, then because the executor implementation (from WWDC) caps concurrency, it is probable that you will end up with deadlocks.

In particular, imagine you have this code from two actors:

// In actor A method
db0.execute("SAVEPOINT t0")
db0.write("INSERT XXX") // This write SQL will grab an internal pthread lock
await someAsyncFunction() // Deschedule current work, potentially schedule actor B method here 
db0.write("SELECT YYY")
db0.execute("RELEASE t0") // The internal pthread lock will be released here

...

// In actor B method
db1.execute("SAVEPOINT t1")
await someOtherAsyncFunction() // Resume from actor A.
db1.write("REPLACE XXX") // This need to grab the file-level pthread lock for write access, and cannot.
db1.execute("RELEASE t1")

With limited concurrency (and an executor that is unaware of the pthread lock), the above scenario can happen.

I will start to work on an async interface for Dflat later this year, and this thread is interesting to follow. The async / await interface is not that important ATM for me because it is iOS 15 only, and Dflat doesn't have a good server-side story with other more popular storage backends (such as PostgreSQL).

I don't think that actors are going to be the ideal solution for every concurrency problem. Specifically, if you have a reference type and you want multiple concurrent tasks to be able to access it, actors won't do the job (since they allow only one in at a time).

That's ok, though, because the greater Swift concurrency story has an answer: just make these things be a Swift class, mark it Sendable, and implement concurrency internal to the class however you'd like (atomics, locks, whatever).
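A minimal sketch of that pattern (the names are made up; the class does its own locking with NSLock and is marked @unchecked Sendable, since the compiler cannot verify hand-rolled synchronization):

    import Foundation

    final class SharedCounter: @unchecked Sendable {
        private let lock = NSLock()
        private var value = 0

        // Multiple concurrent tasks may call this; the lock serializes
        // access to the mutable state.
        func increment() -> Int {
            lock.lock()
            defer { lock.unlock() }
            value += 1
            return value
        }
    }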

Actors are intended to cover a widely used common case, while interacting with legacy code and allowing advanced cases like this to compose in. If you'd like to learn more, I'd recommend reading the SE-0302 Sendable proposal and the related discussion threads.

-Chris

One thing unclear from the Sendable proposal is how things such as SQLite, which use pthread mutexes under the hood, can interoperate well with Swift executors. Per swift-evolution/0000-custom-executors.md at custom-executors · rjmccall/swift-evolution · GitHub, default concurrent executors will have fixed width. To wrap C libraries like SQLite as Sendable types, there must be a way to implement these pthread mutexes so that they can yield within Swift executors.

On iOS, Apple engineers ship their own version of SQLite, so they have probably already done this internally (i.e. they have a Swift-runtime-friendly pthread mutex implementation). But this does not seem to have been communicated, and mixing pthread mutexes with Swift async / await can be a bit trickier than I would expect.

Hi @liuliu ,

As you say, this is more a question about custom executors and how we manually place work on them than it is about Sendable. The custom executors pitch has never been run as a formal review, and I don't know its status, but I hope it includes something that will enable this use case.

-Chris
