SQLite Temp Directory Setting

Dear All,

Is there a way to configure the sqlite3_temp_directory global variable in the SQLite library used by the SQLite Fluent Driver?

The motivation is to have a service with persistent storage with very light resource requirements (hence SQLite) and be able to sandbox it to a specific directory.

However, SQLite has a default temp directory baked in, which is in /var/folders on macOS. And issuing PRAGMAs to keep the files in memory with app.db.raw(... appears to be too late, the sandbox will have crashed the app by then, when the app attempts to open the temp files outside the data directory.

The environment variables SQLITE_TMPDIR and TMPDIR are not effective, because the internal global variable appears to take precedence.

cc @graskind

Making this directly settable would require making SQLiteNIO's CSQLiteNIO module part of the package products, which we deliberately avoid in order to prevent conflicting usage of the library.

Additionally, the SQLite documentation has this to say about using this setting:

Applications are strongly discouraged from using this global variable. It is required to set a temporary folder on Windows Runtime (WinRT). But for all other platforms, it is highly recommended that applications neither read nor write this variable. This global variable is a relic that exists for backwards compatibility of legacy applications and should be avoided in new projects.

I don't know of a better option for a sandboxed project, unfortunately :confused:.

This is a bit lazy of me to not look up, but are you sure the path is hardcoded rather than based on a different environment variable…or rather configuration option? I believe the /var/folders/ paths are user-specific, so it would be odd for sqlite to hardcode one of them in the source or even at compile time. env doesn't show anything helpful except TMPDIR, but…

% getconf -a | grep /var/folders                              
DARWIN_USER_DIR: /var/folders/0d/<snip>/0/
DARWIN_USER_TEMP_DIR: /var/folders/0d/<snip>/T/
DARWIN_USER_CACHE_DIR: /var/folders/0d/<snip>/C/

Unfortunately, I didn't spot anyway to override these settings (see also man confstr), but you could at least check that these match what you're seeing.

As, sadly, the internal global variable is the first priority, even if it is "strongly discouraged" and the corresponding PRAGMA is already deprecated, it should be settable. Or SQLite would need to be compiled from source with the sqlite3_temp_directory variable unset to allow the environment variables to take effect.

There could perhaps be two options:

  1. When the .file storage option is selected, it could have an optional tempPath parameter passed down, so the beautifully done isolation between the modules can be maintained. This would still rely on the legacy global variable, but I doubt that will be going away altogether - if it did, even better, then the environment variables will take precedence.

  2. For the .file storage option, there could be an optional memCache Bool parameter and if set, the driver would execute the PRAGMAs for journal and cache to be in memory. Not ideal, as a journal should be persistent, but sandboxable.

SQLite does use the per-user temp directory set by MacOS. However, when sandboxing the app, I wouldn't want to include the user level temp directories in the sandbox, because it would expose data not related to the app.

I should probably point out that when using SQLiteKit or the even higher-level FluentSQLiteDriver, the .memory option is internally remapped to .file using temporary files in order to support multiple connections to "in-memory" databases in the absence of the deprecated shared cache option. It's very possible that if you're using .memory databases, the "environment variables don't take effect" behavior you're seeing isn't coming from sqlite at all, but rather from this line of code. If that is the case, that's an issue in Vapor's implementation and something most certainly can and should be done about it.

I am using the .file option and the file in the sandboxed directory does get created -- empty, because as the next step the library tries to open the temp files outside the sandbox and gets killed.

Didn't realise that .memory was in fact .tempfileonly... :slight_smile: The location of the temp files could be passed down with the .memory option, though.

Perhaps a tempDir optional parameter for both .file and .memory might be appropriate.

That would require source-incompatible changes to existing public API and would thus be a breaking (e.g. semver-major) change. That being said, a new major version of SQLiteNIO that's based on Concurrency rather than relying on NIO is planned for Fluent 5.

What might be possible for the current version would be an API that allows setting the SQLite global at the SQLiteNIO level.

However, I'm looking at libsqlite3's source code, and it appears that sqlite3_temp_directory defaults to NULL and is never set to anything else. The library does check the global first, but falls back on the SQLITE_TMPDIR and TMPDIR environment variables when it is NULL. The /var/folders/... path is what I'd expect to see coming from TMPDIR; is it possible the sandbox is preventing you from overriding these environment variables?

Well, then there is something more fishy going on, because adding

let sqliteTempLocation = Environment.get("SQLITE_TMPDIR") ?? "<NOT SET>"
L.info("SQLite temp location: \(sqliteTempLocation)")

to configuration.swift does print the location set in the environment.

How can I make sure that the SQLite library loaded by the app is actually the brew version? Is there a way to print the version of the library the driver is talking to?

I have:

$ sqlite3 --version
3.47.0 2024-10-21 16:30:22 03a9703e27c44437c39363d0baf82db4ebc94538a0f28411c85dda156f82636e (64-bit)

and

$ echo $LDFLAGS ; echo $CPPFLAGS ; echo $LIBRARY_PATH ; echo $DYLD_LIBRARY_PATH 
-L/opt/homebrew/opt/sqlite/lib
-I/opt/homebrew/opt/sqlite/include
/opt/homebrew/opt/sqlite3/lib
/opt/homebrew/opt/sqlite/lib

as per the instructions for the keg and adding the library paths.

I have built the app using swift build -c release, so would not be a statically linked executable, and... would it find the right library or use the macOS default?

Just to chime in here as I know the answer for this bit - SQLiteNIO builds it's own copy on SQLite so you get whatever version we currently have bundled with that. This gives us more control over the version and feature support and removes the need to ensure it's installed on every system where the app is deployed.

The current version we bundle is 3.47.0 - sqlite-nio/Sources/CSQLite/version.txt at main · vapor/sqlite-nio · GitHub

Thanks!

That's good -- gives stability, but bad, as despite the NULL global variable set in the library and the environment variable set to the desired path, the library appears to try to go outside the sandbox.

Perhaps not worth digging further, perhaps it might be a deeper macOS behaviour.

It's a bit of a shame, as I progressed through my PoC, I found a number of good packages for various capabilities, which unfortunately don't compile on Linux, so no virtualisation or containers, and without the sandbox, I wouldn't run an internet facing server process on macOS... that seems to hit the end for now.

If you're interested in digging a little deeper on this (you've certainly caught my interest!), we may be able to trace where the code is getting the default TMPDIR path from by setting a breakpoint in the open(2) system call and reading the backtrace upwards.

That's very kind!

Swift definitely gets 10 out of 5 for community!

Where I am is that I am yearning to write something like

var someInt = 32

instead of

let someInt = Arc::new(RwLock::new(32));

and so Swift looks really attractive.

With on-going projects, I don't have any capacity left. Available libraries, flexibility of deployment, operational risks etc. are paramount considerations and I have exhausted my night hours with the PoC. :slight_smile:

At the end of the day deployment on macOS only is not encouraging to begin with. And I see that many packages rely on a lot of macOS services, which is fair enough, but macOS servers are too expensive for my taste and I have little experience defending them in production, so... with Linux as main target, need to live with the Arc::news (for now).