In the last days I've been improving the low-level layers of GRDB the SQLite toolkit by asking SQLite to not copy C strings or blob buffers arguments when appropriate (SQLITE_STATIC vs. SQLITE_TRANSIENT).
// Avoid a useless copy of the "Arthur\0" buffer by
// instructing SQLite to use the temporary one created by Swift
let sql = "INSERT INTO player (name) VALUES (?)"
let statement = try db.makeStatement(sql: sql)
try statement.execute(arguments: ["Arthur"])
To this end, I execute the SQLite statement from within the String.withCString closure, in order to guarantee the validity of the C string pointer for the whole duration of SQLite operations:
// Simulated execution
"Arthur".withCString { cString in
// Give cString to SQLite (asking to not copy it)
// Execute statement
}
When executing a statement that accepts several arguments, it's the same, but I must take care of nesting calls to String.withCString:
try statement.execute(arguments: ["Arthur", "Barbara", "Charles"])
// Simulated execution
"Arthur".withCString { cString in
// Give cString to SQLite (asking to not copy it)
"Barbara".withCString { cString in
// Give cString to SQLite (asking to not copy it)
"Charles".withCString { cString in
// Give cString to SQLite (asking to not copy it)
// Execute statement
}
}
}
The real code, which handles an arbitrary number of elements, must use recursion in order to nest the calls to String.withCString.
This recursion can lead to stack overflow when there are a large number of arguments.
Hence my question: How can I prevent stack overflow without hard-coding a limit (say, only use this optimization when there are less than 20 arguments)?
I believe that relying on tail call optimization is not an option, since the recursion happens from within the String.withCString closure, not at the very end of the recursive function.
What's supposed to happen when you prevented it? a "safe" crash (e.g. precondition)? a throw?
You can check remaining stack size and act accordingly, although this is rarely done.
Do you see these stack overflows in practice? What's the relevant stack frame size of your functions? How much stack size is used per one level? For example if available stack size is 200K and one parameter eats 1K it'll take some 200 parameters before overflowing stack.
Here's a different approach that sidesteps recursion altogether by utilising the older autoreleased c-strings obtained via NSString API:
extension String {
var cstring: UnsafePointer<CChar> {
(self as NSString).cString(using: String.Encoding.utf8.rawValue)!
}
}
let a = "Arthur, King of Britain".cstring
let b = "Barbara of Portugal".cstring
let c = "Charles, Prince of Wales".cstring
print(strlen(a), strlen(b), strlen(c)) // 23 19 24
In my particular case, the optimization should be avoided if it creates a risk. The user has written valid code, so the library ought to run it correctly. I'm currently writing a fix, and don't recurse if there are more than 20 (chosen arbitrarily) arguments.
You can check remaining stack size and act accordingly, although this is rarely done.
Right, but I believe it will be overkill in my particular case. Is it worth the complexity/knowledge? What about future maintenance?
Do you see these stack overflows in practice? What's the relevant stack frame size of your functions? How much stack size is used per one level? For example if available stack size is 200K and one parameter eats 1K it'll take some 200 parameters before overflowing stack.
Those are excellent questions. On the SQLite version that ships on my Mac, statements can accept up to 500000 arguments. So GRDB users can provide up to 500000 arguments. Very few will do so. But they can. And they can complain if this does not work. With unbounded String.withCString { ... } nesting, that's up to 500000 nested function calls. I actually don't know if this could create a stack overflow (and I don't know if local variables in the recursive function are stored on the stack or in registers), but I also feel that the stack trace, in case of thrown errors from the top of the stack, or in case or hard crash, would look pathological
Eventually, providing a hard-coded and "sensible" limit is maybe not a bad option. Frequent use cases will profit from the optimization (we're talking about ~4% faster). Thanks again for your questions :-)
Here's a different approach that sidesteps recursion altogether
Avoiding copies of C strings is the whole point of the optimization, so unless I'm mistaken only Swift.withCString { cString in ... } guarantees the absolute minimum amount of work when I start from a Swift String: there is one and only one copy of the C string, the temp one produced by this method.
It all started from a challenge that I found funny and interesting :-) You never know where you end up!
Still I'm not sure I feel comfortable with the inability to produce an arbitrary number of temp C strings without performing copies, and generally the inability to use an arbitrary number of temp values produced by the many withTempStuff { temp in … } methods of the standard library and Foundation. I understand the need for those apis, but how are really demanding users supposed to do???
I’m not aware of a great answer here specifically because materializing a C string from a String might require alloca at the very least, to handle the inline representation. I think ultimately this is something only the stdlib can provide, perhaps as a Collection-of-Strings to Array-of-Pointers API, or perhaps an UnsafeBuffer-of-Pointers to underscore their temporary nature. People have been asking for this kind of API forever for exec-like char ** arguments anyway, and today’s recommendations all require copying the strings.
Weirdly, as a workaround you could convert everything to NSString and ask for UTF8String, which is documented to potentially be an interior pointer and thus should do the minimum number of copies. Just don’t forget to explicitly keep the NSStrings alive until you’re done.
Yes it is, because people look for iterative solutions when recursion won't do it. The withTempStuff { ... } apis force recursion, hence the feature requests mentioned by Jordan above.