Hello, I just wanted to share the concurrency patterns used in GRDB the SQLite wrapper. They have succesfully supported the development of many apps.
-
All writes are serialized.
// ONE write made of TWO steps try dbPool.write { db in try db.execute(sql: "INSERT INTO player ...") let count = try Int.fetchOne(db, sql: "SELECT COUNT(*) FROM player")! assert(count > 0) // Will never fail }
Serializing all writes avoids the app from conflicting with itself with SQLITE_BUSY (5).
The application developer can write subsequent related statements, such as the insert and the count above, while being sure no other thread in the app can break reasonable assumptions.
-
All writes are wrapped in a transaction by default.
// ONE write made of TWO steps try dbPool.write { db in try db.execute(sql: "INSERT INTO debit ...") try db.execute(sql: "INSERT INTO credit ...") }
Explicit transaction handling is possible, but by default all writes are wrapped in a transaction. This avoids two kinds of bugs:
- Committing partial and inconsistent database states, should one step fail.
- Letting concurrent reads see partial updates, even if no step fails (see below).
-
All reads are isolated:
// ONE read made of TWO steps try dbPool.read { db in let count1 = try Int.fetchOne(db, sql: "SELECT COUNT(*) FROM player")! let count2 = try Int.fetchOne(db, sql: "SELECT COUNT(*) FROM player")! assert(count1 == count2) // Will never fail }
By isolating all reads, the application developer can write subsequent related statements, such as the two counts above, while being sure no other thread in the app can break reasonable assumptions.
Isolation can be acheived in two ways: either by serializing all reads and writes, either by wrapping all reads in a transaction, when one wants to run parallel reads and writes on a WAL database (this is called snapshot isolation).
-
It is possible for perform a read from the last committed state of the database that does not block concurrent writes.
This is an advanced optimization that is more a building block for high-level tools than an application tool. Basically, when you perform a write and then a read that depends on the previous write, you may want to be able to allow concurrent writes during this dependent read. For example, when you observe the database, you may want to fetch freshly modified values without blocking concurrent writes. Such optimization requires keeping the writer locked until a reader could acquire snapshot isolation (this is usually quite fast).
In the sample code below, the count and the insertion run concurrently, maybe in parallel. Yet the assertion always succeed (even if the count happens to run after the insertion).
try dbPool.writeWithoutTransaction { db in // Delete all players try db.inTransaction { try db.execute(sql: "DELETE FROM player") return .commit } // Count players concurrently, // from the last committed state of the database dbPool.spawnConcurrentRead { dbResult in do { let db = try dbResult.get() let count = try Int.fetchOne(db, sql: "SELECT COUNT(*) FROM player")! assert(count == 0) // Will never fail } catch { ... } } // Insert a player try db.execute(sql: "INSERT INTO player ...") }
Absolutely none of the above behaviors are SQLite defaults:
- Writes are not serialized by default (you'll just get an
SQLITE_BUSY
error) - Writes are not wrapped in a transaction by default (so you end up with inconsistent database states, or concurrent reads of unwanted values).
- Reads are not isolated by default (they see all concurrent transactions)
- Concurrent writes and reads do not synchronize by magic.
- All
assert
above would sometimes fail.
In the above examples, you can see two kinds of APIs:
- Database access methods (
write
,read
,writeWithoutTransaction
,spawnConcurrentRead
...), with well-defined concurrent behavior, which provide database access to a closure argument. - Database methods (
execute
,fetchOne
...) which must be run inside the closure argument of a database access method.
The unit of concurrency is the closure argument of a database access methods. Inside such a closure, all api calls are synchronous. It is inside such a closure that isolation guarantees apply.
Those closures are all serialized, or run concurrently, depending on the kind of database you use (WAL or non-WAL).
- Non-WAL databases serialize all reads and writes.
- WAL databases serialize writes, and allow concurrent read and writes.
We can build quite useful use cases on top of those building blocks:
- Local reasoning: one does not have to think about the behavior of other application components. As long as you are in the closure of a database access method, you are always fully isolated.
- Transactional reasoning: one can establish clear transaction boundaries.
- Single source of truth: a write access guarantees full exclusivity on the database.
- One can run the app on a efficient WAL database on disk, and tests and previews in an in-memory database, with the guarantee of identical behavior.
- Database observation is able to not miss a single change.
In this landscape, and going back to the topic of the thread:
-
Reentrancy would clearly be a problem if allowed inside one database access (one read or one write). Local reasoning could no longer be possible. For example, opening transactions would not be possible as all, if anyone could interleave a commit or a rollback at any time.
-
I don't know how to implement "serialized writes, concurrent read and writes" with the new Swift async APIs. I'm sure it is possible, but this is not a built-in feature of actors.
-
I (personally) prefer thinking about the user needs, and eventually support them with Swift async features, than the opposite (expose async features and let the user guess how to build what she needs on top of those primitives). SQLite is difficult to use correctly in an asynchronous world.