concurrentRead Question

mazz · February 9, 2019, 3:57pm

So I have a function that does a writeInTransaction then a read. Should I do a dbPool.writeWithoutTransaction and then a concurrentRead right after, like your documentation suggests? Your docs:

try dbPool.writeWithoutTransaction { db in
    // Increment the number of players
    try Player(...).insert(db)
    
    let futureCount = dbPool.concurrentRead { db in
        // Guaranteed to be non-zero
        return try Player.fetchCount(db)
    }
    
    try Player.deleteAll(db)
}

My current db function:

    // always return ALL books, even when appending
    public func addBooks(books: [Book]) -> Single<[Book]> {
        return Single.create { [unowned self] single in
            do {
                try self.dbPool.writeInTransaction { db in
                    if let user = try User.fetchOne(db) {
                        for book in books {
                            print("book: \(book)")
                            //            try! self.dbQueue.inDatabase { db in
                            var storeBook: Book = book
                            storeBook.userId = user.userId
                            try storeBook.insert(db)
                        }
                    }
                    return .commit
                }
                var fetchBooks: [Book] = []
                try self.dbPool.read { db in
                    fetchBooks = try Book.fetchAll(db)
                }
                single(.success(fetchBooks))
            } catch {
                print(error)
                single(.error(error))
            }
            return Disposables.create {}
        }
    }

gwendal.roue · February 9, 2019, 5:34pm

Hello Michael,

So I have a function that does a writeInTransaction then a read

OK. Let's assume the piece of code below:

func doTheJob() throws {
    // Write
    try dbPool.write { db in
        ...
    }
    // Read
    let value = try dbPool.read { db in
        ...
    }
    // Use value
}

(Note: I use write instead of writeInTransaction because, since GRDB 3, DatabasePool.write opens a transaction.)

Should I do a dbPool.writeWithoutTransaction and then a concurrentRead right after, like your documentation suggests?

This is a good question, because DatabasePool.concurrentRead is an advanced, yet useful method. Knowing how and when to use it can solve real problems.

Apologies if this post is rather long.

To answer your question, we first have to decide if the doTheJob() method above is correct, or not. Then we'll wonder if it is worth optimizing it with concurrentRead.

Two subjects: correctness first, optimization second.

Correctness is obviously the most important topic.

As written, doTheJob() looks legit. But does it do what you want? Let's get practical, and look at actual database requests. We'll insert a row in the write block, and fetch rows in the read block:

// WRONG (as we'll see below)
func doTheJob() throws {
    // Write
    try dbPool.write { db in
        try Book(title: "Moby-Dick").insert(db)
    }
    // Read
    let books = try dbPool.read { db in
        try Book.fetchAll(db)
    }
    print("After insertion, the database contains \(books.count) books")
}

The doTheJob method above does not actually do the job! It inserts a book, but it prints an unreliable number of books.

This is because, between the write and read calls, any other thread of your application can write to the database:

// WRONG (explained)
func doTheJob() throws {
    // Write
    try dbPool.write { db in
        try Book(title: "Walden").insert(db)
    }
    // <- Any other thread can modify books here.
    // Read (a random value)
    let books = try dbPool.read { db in
        try Book.fetchAll(db)
    }
    // A lie
    print("After insertion, the database contains \(books.count) books")
}

Of course, if no other thread modifies books in between, the doTheJob method behaves correctly. But there is room for one of the nastiest kinds of bugs: a race condition. It is nasty because it only happens sometimes (usually during a demo in front of one thousand people).
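To make the interleaving concrete, here is a minimal sketch that does not use GRDB at all. ToyStore, racyCount, groupedCount and the intruder closure are hypothetical names made up for this illustration: the store serializes every access on one queue (much like a single database access method does), and intruder stands for whatever another thread might write at the worst possible moment.

```swift
import Foundation

// Hypothetical stand-in for a database (NOT GRDB): every call to `access`
// runs on one serial queue, so a single call cannot be interrupted.
final class ToyStore {
    private let queue = DispatchQueue(label: "toy-store")
    private var titles: [String] = []

    func access<T>(_ block: (inout [String]) -> T) -> T {
        return queue.sync { block(&titles) }
    }
}

// WRONG: two separate accesses leave a gap in between.
func racyCount(_ store: ToyStore, intruder: () -> Void) -> Int {
    store.access { $0.append("Walden") }
    intruder() // stands for any other thread writing at this point
    return store.access { $0.count }
}

// CORRECT: the write and the dependent read share a single access.
func groupedCount(_ store: ToyStore, intruder: () -> Void) -> Int {
    let count: Int = store.access { titles in
        titles.append("Walden")
        return titles.count
    }
    intruder() // too late to affect `count`
    return count
}

let racyStore = ToyStore()
let racy = racyCount(racyStore) { racyStore.access { $0.removeAll() } }

let safeStore = ToyStore()
let safe = groupedCount(safeStore) { safeStore.access { $0.removeAll() } }

print(racy, safe) // prints "0 1"
```

In the racy version the inserted book is already gone when the count is read. In the grouped version the count reflects the insertion, whatever happens afterwards.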

So the first question you have to ask yourself is: are the values I fetch in the read block dependent on the modifications performed in the write block? If the answer is no, then a write followed by a separate read is correct:

// CORRECT (as long as apples and oranges are unrelated)
func doTheJob() throws {
    // Write
    try dbPool.write { db in
        try Apple(...).insert(db)
    }
    // Read
    let oranges = try dbPool.read { db in
        try Orange.fetchAll(db)
    }
}

Furthermore, there isn't any opportunity for optimization in this scenario. End of story: you don't need concurrentRead.

Let's consider the other scenario, where the values you fetch depend on the previous modifications. In this case, you achieve correctness (preventing other threads of your application from messing with your application logic) by grouping related database accesses in a single "database access method" (here dbPool.write):

// CORRECT
func doTheJob() throws {
    let books = try dbPool.write { db -> [Book] in
        // Write
        try Book(title: "The Grapes of Wrath").insert(db)
        // <- Other threads cannot mess with our business here
        // Read (safely)
        return try Book.fetchAll(db)
    }
    print("After insertion, the database contains \(books.count) books")
}

Grouping related statements in a single database access method (write, read, writeInTransaction, ...) is a fundamental safety rule in GRDB. This is Rule 2 of the GRDB Concurrency Guide. It will save your demos!

Now we're done with correctness. We can look at optimization.

When you wrap a write then a read in a writing method such as write or writeInTransaction, you may experience undesired locking of the database:

// POTENTIAL WRITE CONTENTION
func doTheJob() throws {
    let value = try dbPool.write { db in
        // Write...
        // Read... (concurrent threads cannot write here)
    }
    // <- Now concurrent threads can write
    // Use value
}

Most reads are very fast: there is no real problem, then.

But when the read is very slow, problems may arise. For example, your UI freezes when the user of your application hits the "Save" button, because your main thread has to wait for some background job to finish its slow read before it can write to the database. This is not good.

So the second question you ask yourself is: does my application suffer from undesired blocking due to write contention induced by slow reads? If the answer is no, then you have nothing to do: end of the story.

If the answer is yes, if your app really does suffer from write contention, then you may indeed enjoy the DatabasePool.concurrentRead method:

// PREVENT WRITE CONTENTION
func doTheJob() throws {
    let future: Future<Value> = try dbPool.writeWithoutTransaction { db in
        // Write
        try db.inTransaction { // transaction is recommended
            ...
            return .commit
        }
        // Read
        return dbPool.concurrentRead { db in
            try slowComputation(db)
        }
    }
    // <- Now concurrent threads can write
    let value: Value = try future.wait()
    // Use value
}

Optimization: The concurrentRead method returns very quickly, and this is why write contention is avoided. The value is fetched concurrently, and you call the wait() method in order to access it. Slow fetches are still slow, but they no longer block concurrent writes.

Correctness: the fetched values are still isolated from modifications performed by concurrent writes. I'm still amazed by this database feature, called snapshot isolation.
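If it helps to picture why the result stays isolated, here is a toy model in plain Swift. It is an assumption-laden sketch, not how GRDB or SQLite implement snapshot isolation: ToyFuture and demo are hypothetical names, the "database" is an array, and the "snapshot" is simply a copy of that array taken while writes are still blocked. It assumes the Swift 5 language mode (strict concurrency checking would reject the unsynchronized captures).

```swift
import Foundation

// Toy future (hypothetical, not GRDB's type): wait() blocks until the
// background computation has finished.
final class ToyFuture<T> {
    private let group = DispatchGroup()
    private var value: T?

    init(_ compute: @escaping () -> T) {
        group.enter()
        DispatchQueue.global().async {
            self.value = compute()
            self.group.leave()
        }
    }

    func wait() -> T {
        group.wait()
        return value! // assigned before leave(), so never nil here
    }
}

func demo() -> Int {
    let writer = DispatchQueue(label: "toy-writer") // serializes writes
    var titles = ["Walden"]                          // toy "database"

    let future: ToyFuture<Int> = writer.sync { () -> ToyFuture<Int> in
        titles.append("Moby-Dick")             // write
        let snapshot = titles                   // taken while writes are still blocked
        return ToyFuture<Int> {
            Thread.sleep(forTimeInterval: 0.1)  // simulate a slow read
            return snapshot.count
        }
    }

    // The writer is free again, long before the slow read completes.
    writer.sync { titles.removeAll() }

    return future.wait()
}

print(demo()) // prints "2"
```

The later removeAll() does not change the result, because the slow computation only ever sees the state captured inside the write block. This is also why the concurrent read has to be started from inside the write block: started afterwards, it could capture a state that another writer has already modified.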

It is time to conclude. Whenever these two conditions are met:

- Your app needs to perform a write and then a read.
- That read depends on the previous modifications.

then look for correctness first, and optimize second, if needed:

// WRONG
func doTheJob() throws {
    // Write
    try dbPool.write { db in
        ...
    }
    // Read
    let value: Value = try dbPool.read { db in
        return ...
    }
    // Use value
}

// CORRECT
func doTheJob() throws {
    let value: Value = try dbPool.write { db in
        // Write
        ...
        // Read
        return ...
    }
    // Use value
}

// CORRECT, WITH OPTIMIZATION
func doTheJob() throws {
    let future: Future<Value> = try dbPool.writeWithoutTransaction { db in
        // Write
        try db.inTransaction {
            ...
            return .commit
        }
        // Read
        return dbPool.concurrentRead { db in
            return ...
        }
    }
    let value = try future.wait()
    // Use value
}

gwendal.roue · February 9, 2019, 5:46pm

In your case, the fetch should be quick (unless a user has thousands of books). So you can simply write:

// always return ALL books, even when appending
public func addBooks(books: [Book]) -> Single<[Book]> {
    return Single.create { [unowned self] single in
        do {
            // Named allBooks so that the result does not shadow the `books` parameter
            let allBooks: [Book] = try self.dbPool.write { db in
                if let user = try User.fetchOne(db) {
                    for var book in books {
                        book.userId = user.userId
                        try book.insert(db)
                    }
                }
                return try Book.fetchAll(db)
            }
            single(.success(allBooks))
        } catch {
            single(.error(error))
        }
        return Disposables.create {}
    }
}