Database API/Framework

(Sorry for the fake threading; I read this thread in the web archive and wanted to chime in.)

Helge Heß:

> 4. Coming from a diverse backend/server background, I’ve seen different lang+environments handle sql interfaces differently. Most have left it up to third party frameworks/ecosystems (ruby/node/python) but some like Golang provide at least a default SQL interface that driver authors can adapt to. I’m starting to lean towards the latter of the two but I’m by no means and expert on anything :). This probably needs more community engagement/proposal.

IMHO that sounds more like something the higher level frameworks (ORMs) should be concerned about (as they will define what they want to map from and how, and probably in pretty different ways). But maybe not. Personally I’d like to avoid something like JDBC as it provides little value at extra overhead.

As an example: if I have received a DispatchData from the socket containing a PG row values, I don’t want to first convert that to a JDBC or EOAdaptor record-Dictionary<Key,Any>> which is then mapped to a Customer.purchaseStatus enum. I would rather like to pass up that DispatchData to the highest level and only have it converted at the very last point and only if necessary - sometimes an ORM, but other times it may be just directly streamed into a different format, say JSON (in such cases you can often accomplish zero-copy).

> My hope is we can start writing some of the drivers in pure Swift and help all those frameworks/libs get better too.

Not sure a generic ‘driver' interface will help a lot here. I think you help most by providing a SwiftPQ library which can then be integrated by the ORMs/frameworks in a way which fits best.

I think Perl has a good model we might want to pay attention to. Perl has a largely database-independent library called DBI (Database Interface) which basically just standardizes the interfaces to connect to databases, run queries, iterate through results, examine schemas, and so on. A system of plug-ins, called DBDs (Database Drivers), handle each individual database. Applications can either use DBI directly or through a library that functions as an ORM. (Most of these higher-level libraries are called DBI Extensions (DBIx), though this is just a convention.)

    Drivers Interface Extension/Client
  DBD::MySQL -+ +- DBIx::Class
  DBD::PG -+----- DBI ----- +- DBIx::DataModel
  DBD::Oracle -+ +- Direct use

DBI doesn't abstract away *all* differences between databases. But it gets most of them, and in practice, it's not very difficult to switch from one database to another, or use different databases in development and production. It saves individual ORMs from needing to individually implement different database backends, and facilitates sharing of code for features like connection pooling. If the interest is there, this can support a wide and varied ecosystem: Perl's CPAN module repository includes 103 DBD modules and 473 DBIx modules.

In Swift, we may also be able to use it to encourage good database practices. The control Swift offers over string literals and interpolation could be leveraged to encourage use of prepared statements and good escaping (or rather, use of parameterized queries instead of escaping). Imagine if, when you wrote something like this, it was perfectly safe:

  let userID = …
  for row in try connection.query("SELECT * FROM posts WHERE user_id = \(userID)") {
    // `query`'s parameter is actually a `SQLStatement`, so the string literal is actually a SQL statement literal.

Because `userID` would be an `Int`—that is, a type expected to contain data—it would be automatically passed as a parameter so there were no escaping issues. If you used a type for a table, column, or fragment of a statement, on the other hand, it would be handled appropriately:

  let userID = …
  let sortColumnName = …
  let ascending = …
  let postsTable = try! SQLTable("posts", in: connection)
  let sortColumn = try SQLColumn(named: sortColumnName, in: postsTable)
  let sortDirection: SQLFragment = ascending ? "ASC" : "DESC"
  for row in try connection.query("SELECT * FROM \(postsTable) WHERE user_id = \(userID) ORDER BY \(sortColumn) \(sortDirection)") {
    // In the above, the SQLTable, SQLColumn, and SQLStatementFragment are inserted without escaping,
    // while the non-database-specific userID is passed as a parameter.

This is cool stuff that the dynamic languages don't offer, but you probably don't want to write it for each individual database driver. You want to write it once, either at the database-independent interface layer or as a layer above it. To do that, you need to funnel all database use through a single chokepoint. In other words, you need something like DBI.

I'm not saying we should adopt DBI exactly as it is—Swift would demand a very different design. But I think we should consider using a similar architecture. It has a lot of advantages over forcing high-level frameworks to talk directly to low-level databases.


Brent Royal-Gordon

This is essentially the original message of the request, the desire to have a JDBC. Maybe people really want to have that :-) I guess a few protocols for simple use cases do not hurt, but I would advise against over designing this.

IMHO it doesn’t make that much sense since todays databases are so different. If you are a PostgreSQL user, you quite likely want to make use of its features (be it JSON columns, FTS, custom types, row permissions, table inheritance, etc).
Even pooling has very different requirements based on the setup of the database (e.g. in PG connections are pretty expensive, in others not at all, and SQLite doesn’t even have them). Let alone transactions and how they behave.
Presumably a HL framework has to have specific support to account for dealing with such differences, it can’t just rely on ‘JDBC'.

Personally I’m more interested in comprehensive interaction modules for the respective databases that can be used by all HL projects than in an API trying to bundle everything into a set of basic protocols. It is the HL project presenting a consistent API to the enduser (be it a full ORM or something more low level [e.g. I would rather stream batches of tuples coming from the database than building an EOF/CoreData like object graph]).

But I’m quite interested to see what proposals are coming in :-)



On 26 Oct 2016, at 11:34, Brent Royal-Gordon via swift-server-dev <> wrote:

I'm not saying we should adopt DBI exactly as it is—Swift would demand a very different design. But I think we should consider using a similar architecture. It has a lot of advantages over forcing high-level frameworks to talk directly to low-level databases.