[This is not intended to even be a pitch yet; it's exploring a problem and the design space even before that. There are more important things to happen in Swift 5.3 anyway and I won't be pushing for this to happen in that timeframe.]
TLDR:
- Variadics and array literals both default to allocating Arrays, which usually means a heap allocation.
- The compiler can already stack-promote Arrays if it can prove that there are no outstanding references to the Array instance.
- But it's hard to do that through a non-inlinable function call.
- We can sidestep that problem today by using UnsafeBufferPointer.
- When move-only types come along, we're close to being able to make a safe BorrowedBuffer type. (Which we'll very likely want anyway, for other purposes.)
- If we then come up with a syntax to allow types other than Array to be used for variadic parameters, we get safe stack-allocated variadics out of it. (I don't much care what the syntax is at the moment.)
Background
Variadics are a convenient syntax for passing several arguments to a function:
// Without variadics: clearly too hideous to ship
func max(_ values: [Int]) -> Int { … }
let requiredHeightOfShowerhead = max([myHeight, yourHeight, theirHeight, plumbingHeight])
// With variadics: ahh, much better!
func max(_ values: Int...) -> Int { … }
let requiredHeightOfShowerhead = max(myHeight, yourHeight, theirHeight, plumbingHeight)
As you can see, they're pretty much syntactic sugar for an array. Other threads have talked about ways to make these two calling syntaxes interoperate—that is, if you already have an Array, you should be able to pass it to a variadic function, which is important for forwarding arguments from one variadic function to another.
That's not what this is about. In fact, I've somewhat done a bait-and-switch here. For most of the rest of this post I'm going to talk about array literals.
Passing Arrays requires a heap allocation
In both versions of max, we have the nice property that we can treat the values parameter as a plain old Array. We can call methods on it, pass it to other functions, store it in a global…
*record scratch* …wait, store it in a global? That means the storage for the Array has to outlive the call! And that means calling a variadic function or passing an array literal is going to result in a heap allocation.
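To make the escape concrete, here is a minimal sketch. The `HeightLog` class and the `maxAndRemember` function are hypothetical names made up for illustration; the class stands in for "a global", that is, any storage that outlives the call.

```swift
// Hypothetical holder that outlives any single call; it stands in for "a global".
final class HeightLog {
    var lastValues: [Int] = []
}

// Hypothetical function name, chosen to avoid clashing with Swift.max.
func maxAndRemember(_ values: Int..., in log: HeightLog) -> Int {
    // Inside the function, `values` is an ordinary [Int], so it can escape.
    log.lastValues = values
    return values.max() ?? Int.min
}

let log = HeightLog()
let tallest = maxAndRemember(170, 182, 165, in: log)
// log.lastValues still holds the three values after the call has returned,
// so their storage cannot simply live in the caller's stack frame.
```

Because the callee is free to do this, a caller that can't see the function body has to assume the array's storage might be retained.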
(Aside: In a language like C++ with copy constructors, there'd be a chance for the array to initially be allocated on the stack and then copy itself to the heap when stored somewhere else, much like what "copy-on-write" logic does when you mutate an array with shared storage. But in Swift, copying one value somewhere doesn't allow calling custom user code, except for the possible deallocation of the value that was there before.)
So calling variadic functions, as well as passing one-off arrays to functions, has a bit of a cost over just passing individual arguments (assuming the function can't be inlined). That's probably the right trade-off for approachability, but as a (small) part of "Making Swift A Systems Language", I feel like there should be support for passing a variable number of values to a function without a heap allocation. After all, the stack is right there.
Making it work in today's Swift
Let's add another overload of max. (And please forgive the simple example; of course the max that's on Sequence would provide the real implementation of this.)
func max(_ values: UnsafeBufferPointer<Int>) -> Int
let requiredHeightOfShowerhead =
    [myHeight, yourHeight, theirHeight, plumbingHeight].withUnsafeBufferPointer { max($0) }
An unsafe buffer pointer offers no guarantees about lifetime, so as far as the compiler is concerned, the storage for the temporary array is not referenced after the call to withUnsafeBufferPointer. That's already enough for the compiler to promote the array to the stack when optimizations are turned on.
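For anyone who wants to try this, here is a self-contained sketch of the buffer-pointer overload with a body filled in. The name `maxHeight` and the concrete numbers are assumptions for illustration; the declaration in the post has no body. Whether the array actually ends up on the stack depends on the optimizer, which this snippet doesn't check.

```swift
// Hypothetical name; the post's `max` overload is declared without a body.
func maxHeight(_ values: UnsafeBufferPointer<Int>) -> Int {
    precondition(!values.isEmpty, "maxHeight needs at least one value")
    var best = values[0]
    // UnsafeBufferPointer is a Collection, so plain iteration works.
    for value in values where value > best {
        best = value
    }
    return best
}

let heights = [170, 182, 165, 200]
// The pointer is only valid inside the closure passed to withUnsafeBufferPointer.
let requiredHeight = heights.withUnsafeBufferPointer { maxHeight($0) }
```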
Since I said this is meant for systems programming, I suppose this is technically good enough to package up and use, even with the sharp edges UnsafeBufferPointer has.
@_transparent
func withUnsafeBuffer<Element, Result>(
    _ elements: Element...,
    do body: (UnsafeBufferPointer<Element>) throws -> Result
) rethrows -> Result {
    try elements.withUnsafeBufferPointer(body)
}
let requiredHeightOfShowerhead =
    withUnsafeBuffer(myHeight, yourHeight, theirHeight, plumbingHeight) { max($0) }
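One of those sharp edges, sketched with made-up values: nothing in today's Swift stops the closure from handing the pointer back out, at which point it refers to storage that may already be gone.

```swift
// The closure simply returns its argument, so the pointer outlives the call.
let dangling: UnsafeBufferPointer<Int> =
    [170, 182, 165].withUnsafeBufferPointer { $0 }

// `count` is stored in the pointer value itself, so reading it is harmless,
// but reading dangling[0] at this point would be undefined behavior.
let elementCount = dangling.count
```

This compiles without complaint, which is exactly the problem.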
Making it safe(r)
The primary problem I see with passing an UnsafeBufferPointer is that it's, well, not safe. I'm not talking about all the operations you shouldn't do here, like trying to deallocate the buffer when you don't own it. No, the concern I have is about someone saving the buffer pointer past the end of the call, when it's no longer valid—something you can do with a simple assignment in Swift. Can we disallow that somehow? Not today, but we can in the future with move-only types.
moveonly struct BorrowedBuffer<Element> {
    var rawBuffer: UnsafeBufferPointer<Element>
    // forwarding APIs like subscript
}
func max(_ values: __shared BorrowedBuffer<Int>) -> Int
Note that the parameter to max is "shared" rather than "owned". If it were "owned", it would still be possible (if unlikely) for the implementer of max to save the BorrowedBuffer off somewhere. Fortunately, I think the plan is for "shared" to be the default for function arguments even within move-only types, which means the most common path will still be the safe one.
(I'm not going to go over these ownership terms here; you can read the Ownership Manifesto, or you can think of __shared as & in Rust and const & in C++, and __owned as an unannotated value in both languages.)
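As a rough sketch of how this could look in a newer toolchain: the code below assumes Swift 5.9 or later, where `~Copyable` and the `borrowing` parameter modifier play roughly the roles of the hypothetical `moveonly` and `__shared` spellings above. The `count` property, the subscript, and the name `maxHeight` are illustrative additions, and the memberwise initializer that takes an UnsafeBufferPointer is only a stand-in, since it does nothing to make the creation side safe.

```swift
// Assumes Swift 5.9+. A noncopyable wrapper cannot be duplicated by assignment.
struct BorrowedBuffer<Element>: ~Copyable {
    let rawBuffer: UnsafeBufferPointer<Element>

    // Forwarding APIs (illustrative subset).
    var count: Int { rawBuffer.count }
    subscript(index: Int) -> Element { rawBuffer[index] }
}

// `borrowing` means the callee may read the buffer but not take ownership of it,
// so it has no value it could store away.
func maxHeight(_ values: borrowing BorrowedBuffer<Int>) -> Int {
    precondition(values.count > 0, "maxHeight needs at least one value")
    var best = values[0]
    var index = 1
    while index < values.count {
        if values[index] > best { best = values[index] }
        index += 1
    }
    return best
}

// Stand-in creation path: still built from an unsafe pointer by hand.
let tallest = [170, 182, 165].withUnsafeBufferPointer { pointer in
    maxHeight(BorrowedBuffer(rawBuffer: pointer))
}
```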
What about the call side, though? How do we make a borrowed buffer? If we want to be safe all around, we can't just add an initializer that takes an UnsafeBufferPointer and call it a day. We want to make sure that a BorrowedBuffer in "safe" code can't accidentally reference memory past its lifetime. After all, it doesn't have "unsafe" in the name.
Fortunately, one of the goals of move-only types is to allow resource management; we should be able to do something like this:
moveonly struct BorrowedBuffer<Element> {
    private var owner: Unmanaged<AnyObject>?
    var rawBuffer: UnsafeBufferPointer<Element>
    deinit {
        owner?.release()
    }
    // forwarding APIs like subscript
}
extension BorrowedBuffer: Collection { … }
I've written the owner reference as Unmanaged and done some manual memory management to make it clear that the reference does not escape; unfortunately the compiler is (correctly) conservative and still refuses to promote the owning array to the stack. (It's not wrong, either; even with a private field someone could still get the value out by reflection.) So we'd have to add additional logic to the compiler to understand that this field's value never escapes, which this Unmanaged logic is standing in for.
@_transparent
deinit {
    precondition(
        isKnownUniquelyReferenced(&owner),
        "you mustn't access the 'owner' field of a BorrowedBuffer")
}
(I don't think a bare precondition would work today, but since there is no memory-unsafety introduced by stack-promoting if you abort whenever the stack isn't the only reference, this ought to be optimizable.)
But without this additional "owner" field, we've only made stack buffers "safer" but not "safe", at least not without making the caller side clunky again. If we could solve this problem, though…
Making it pretty
…I'd suggest lifting an idea from an old proposal from @Haravikk3 (no longer around on the site, I guess): [Proposal] Variadics as Attribute (with later [Discussion] Variadics as an Attribute)
func max(_ values: @variadic BorrowedBuffer<Int>) -> Int
// Hooray, we're back to the original syntax!
let requiredHeightOfShowerhead = max(myHeight, yourHeight, theirHeight, plumbingHeight)
Specifically, a "desugared" syntax for variadics that allows choosing the type used for the variadic argument, as long as it is ExpressibleByArrayLiteral. If we can safely make BorrowedBuffer ExpressibleByArrayLiteral, that gives us all the tools we need.
Oh, and I just snuck that in there, but it ought to work for regular old array literals too. BorrowedBuffer ensures that the backing storage is kept alive as long as the BorrowedBuffer is alive, and that as long as the BorrowedBuffer instance isn't passed "owned", the backing storage can be stack-promoted. That's true whether the array literal's passed as an argument or stored in a variable first—it can even be used multiple times.
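For reference, here is what the array-literal half of this already looks like in today's Swift, as a minimal sketch. `HeightList` and `maxHeight` are hypothetical names. Note that the protocol's initializer itself takes `Element...`, so this version still goes through an Array; it only shows the hook that a variadic attribute could build on.

```swift
// Hypothetical non-Array type that can be written as an array literal.
struct HeightList: ExpressibleByArrayLiteral {
    private(set) var storage: [Int]

    // Required by ExpressibleByArrayLiteral; `elements` arrives as [Int].
    init(arrayLiteral elements: Int...) {
        storage = elements
    }

    var tallest: Int? { storage.max() }
}

func maxHeight(_ values: HeightList) -> Int? {
    values.tallest
}

// The literal is type-checked as a HeightList, not as an Array.
let requiredHeight = maxHeight([170, 182, 165])
```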
I don't want to discuss the syntax for this, just whether people think it's reasonable to allow other types to be used for variadic parameters besides Array. (I only briefly skimmed the old discussions so I should probably read those again.)
Conclusion
Thoughts? In particular, any better ideas on how to make BorrowedBuffer safe on the creation side?
Appendix: Because This Post Isn't Long Enough
There's one other possible way to handle the "lifetime extension" needed for borrowed buffer: generalized coroutines. That would look something like this:
moveonly struct BorrowedBuffer<Element> {
    var rawBuffer: UnsafeBufferPointer<Element>
    // as usual, please ignore syntax
    static func borrow(
        from owner: AnyObject?,
        rawBuffer: UnsafeBufferPointer<Element>
    ) -> __shared BorrowedBuffer {
        // Normally we'd use withExtendedLifetime here,
        // but I'm not sure how yields work from inside a closure.
        defer { _fixLifetime(owner) }
        self.rawBuffer = rawBuffer
        yield &self
    }
}
What this means is that anyone who calls borrow(from:rawBuffer:) gets back a BorrowedBuffer, but also a sort of "cleanup" call that needs to happen when they're done using it. It's the same basic behavior as passing a callback to withUnsafeBufferPointer, but without the nesting. That alone isn't enough to do this, though; we'd also need to change ExpressibleByArrayLiteral to use it:
__shared init(arrayLiteral elements: Element...)
And, you know, maintain backwards compatibility somehow. I don't know if generalized coroutines are ever going to happen, though—even if they're implementable, the core team or community might decide they make the caller side's behavior too subtle. So I don't want to pin the idea of stack-allocated variadic arguments to getting generalized coroutines.