For a long time, when using Swift, I have used a home-grown SQLite solution. It is a simple approach that uses the JSON abilities of SQLite to behave as a simple, fast noSQL storage. However, with changing needs and with the stringent requirements in Swift 6 I am strongly exploring a move to GRDB. Core Data feels archaic and makes decoupling hard, Couchbase and Firebase are proprietary and make me dependent on a third party, Realm is abandoned, SQLite.swift is ... less than I had before. Whether I like GRDB or not (and I hope I will): it really feels like it is the only game in town. I know this sounds a bit like it is my last pick, but what I'm really trying to say is that there is a very strong motivation to move to GRDB.
GRDB is great for local storage and I know this is its goal. Nevertheless, for my own needs, that's not enough. I develop on both iOS and Android (Room, so also SQLite). I want devices to sync, so there will be some server (Rust, some SQL database) or ad-hoc communication in the middle. I don't want to be tied to a third-party. And security is always a concern. This brings me to three requirements that I am missing. Two of them are "easy", one is hard. Each requirement leads to a number of questions which I have thought about, but I really would appreciate insides from those with extensive GRDB experience or those that contribute to GRDB how to best handle these.
Database security
There's no way around SQLCipher and, having worked with it in the past, I do feel the pain. But I did not know about Apple's data protection. This gives rise to two questions (yes, I want GRDB 7):
- is there a guide on how to (manually) include SQLCipher in a build alongside GRDB? Is it simply following the source integration outlined here: SQLCipher Community Edition - Adding Full Database Encryption for SQLite to iOS and macOS | Zetetic ? Will GRDB 7 still have the API to use a SQLCipher database or am I on my own?
- is there a worked-out example of using GRDB alongside Apple's data protection? I'm using SwiftUI, so the database setup would happen right at the
struct ...: App
. This gives very little opportunity to work with the APIs such asapplicationProtectedDataDidBecomeAvailable(_:)
. And I don't immediately have an idea how to handle a situation where access isn't available.
Knowing when the database changed
For any kind of syncing it is going to be important that I know when the database changed. I haven't worked out the whens and wheres, but the basic idea is that a database change triggers a synchronisation request. This can obviously fail for various reasons, and it should also try to synchronise using a schedule too, but these would be the entry point for any synchronisation. How to synchronise is not a concern for GRDB, but sync will require some extra data on top of what is stored through GRDB. At its simplest, it requires an extra isSynced
column. This leads to two questions:
- what is the best way to detect database changes for such a low-level (background) task? Triggers? Or any of the Database observations tools: GRDB Documentation – Swift Package Index ? Please note that this ties in with the next requirement, as only some observations will be of interest.
- I want to build tools on top of GRDB instead of within GRDB if this is at all possible, akin to e.g. GRDPQuery. What is the best way to build an extension on top of GRDB that enforces additional columns?
I am aware of Harmony (GitHub - aaronpearce/Harmony: Harmony provides CloudKit sync for GRDB and only GRDB.) but I haven't checked the code in any great detail. This may (or may not) be a good starting point.
Handling conflicts
I don't want the server to handle conflicts, when each device can do it itself. This naturally brings me to Conflict-Free Replicated Data Types (CRDT). And this is its own can of worms. Since this is related to syncing it comes with some very similar questions.
Handling CRDT requires extra database fields. For example, a simple and often sufficient CRDT is the "last write wins". If I have a struct with different properties, some of which can change, then I can assume that the latest write is correct. I can handle this by keeping track of an audit table that records changes to the changeable properties. For example, if I have a Movie
struct, with a title
and director
, I can assume these will never change. However, a property isWatched
and rating
are expected to change. This change can happen on multiple devices and they could be offline when the edit happens. This requires some extra fields such as a logical clock value to know which is first, and a flag whether the change has synced. I keep saying fields, but since there can be many we are really talking about an extra table here. The questions are as follows:
- I can implement this database structure manually. Even better would be if some simple property wrapper or protocol did the work for me. This is similar to the question before but, really, this time it feels more like property wrappers are creating an extra table instead of a protocol adding some extra rows. Can it be done and where do I best look for it?
- Implementing CRDT is hard and part of the problem is that they tend to grow quickly. For example, assume an undecided user who is offline. He/she sets the rating to 3, then to 5, then to 4. There is no use in syncing all these values as the last write wins. Since none of it synced yet, I can trivially collapse this into the last value. How can I handle this? To be precise, this would be an insert trigger on an extra (CRDT) table that itself can delete (and maybe insert?) rows. Are triggers the way to do this? The ultimate goal would be to collapse it into only a single (last) row where the rating is set to 4.
I know this is a lot. Many thanks for any insights on how best to tackle these problems!