SE-0322: Temporary Uninitialized Buffers

scanon · September 10, 2021, 12:00am

That's fair, but boundTo: still feels like a red herring. We don't talk about "binding" anywhere else with buffer types unless we're explicitly manipulating the binding, and that's not happening here; the bound type is statically known and does not change. This is an API for "experts", but the details of binding are confusing even to a lot of the people who will want to use these API (just like the analogous C and C++ "effective type" rules, which are understood by approximately nobody).

Nevin · September 10, 2021, 12:14am

Maybe:

withUnsafeBuffer(Int.self, capacity: 24) {
  foo($0)
}

grynspan · September 10, 2021, 12:22am

At this time, I think we're going to have to kick the naming can down the road and have the Swift core team iterate on it, since consensus doesn't really seem to be forming here.

Not to shut down discussion, but it might be helpful to focus our attention on any other open questions about this proposal.

scanon · September 10, 2021, 1:23am

Generally speaking the core team doesn't do that (when it has, doing so has tended to annoy people). In theory at least, the proposal is the proposal, and these sorts of things should already be hashed out (at least from the proposer's perspective).

So, scolding myself for getting dragged into bikeshedding, and reviewing what you've actually proposed:

withUnsafeUninitializedMutableBufferPointer(to:capacity:_)

I think this name is actually pretty good (it uses "buffer" instead of "memory" and doesn't mention "binding", all of which are important), but too verbose for my taste. A bunch of these words aren't adding anything to the meaning, as @beccadax noted. I don't see a good argument why it shouldn't be shortened at least to withUnsafeUninitializedBuffer( ), and possibly shorter, so I would expect to see the proposal updated with an explanation of why the longer name is appropriate. Absent that justification, I'm (weakly) opposed to the proposal as is, even though I think the functionality is important to have and the abstract API is a good model for it.

The proposal should also be updated to eliminate the "we could optionally provide additional free functions ..." section of the detailed design; either they're part of the proposal or they aren't. They should be specifically included or listed under "alternatives considered" with a note that while we may want to add them later, they are out-of-scope for the proposal. Including "we could also do X" in the detailed design section is a recipe for confusion--if the proposal is accepted by the core team, have those features been approved or not?

5sw · September 10, 2021, 2:15pm

I agree. I would find boundTo: confusing, even though I generally believe I understand what all this binding of pointer types does.

Did we consider "of" though?

withUnsafeBuffer(of: SomeType.self, capacity: c) { ... }

We use "of" all the time when speaking about containers. Its an array of int or a buffer of bytes.

grynspan · September 10, 2021, 2:45pm

I believe that's what I used in my original proposal before changing it to to.

Karl · September 13, 2021, 10:19am

So, I'm -1 on adding this new API. Since this new function does not guarantee stack allocation, I think we should be able to rely on the optimiser to perform the same heuristic described in the proposal and stack-allocate via the usual .allocate API as an optimisation.

Yes, escape analysis is, in the general case, non-decidable. That does not mean that all cases are not decidable. Here's a silly example of code which the compiler could very well reason about, but doesn't:

func test(_ n: Int) -> Int {
    let ptr = UnsafeMutableBufferPointer<Int>.allocate(capacity: 1)
    defer { ptr.deallocate() }
    ptr.baseAddress!.initialize(to: n)
    return ptr[0]
}

Here's what the (latest, nightly) compiler produces, at -O. Note the calls to swift_slowAlloc and swift_slowDealloc are not eliminated for some reason :

output.test(Swift.Int) -> Swift.Int:
        push    rbx
        mov     rbx, rdi
        mov     edi, 8
        mov     rsi, -1
        call    swift_slowAlloc@PLT
        mov     qword ptr [rax], rbx
        mov     rdi, rax
        mov     rsi, -1
        mov     rdx, -1
        call    swift_slowDealloc@PLT
        mov     rax, rbx
        pop     rbx
        ret

Godbolt

It's a silly example, for sure, but it illustrates the point. If we could rely on the compiler to optimise these obvious scoped allocations, the proposed function could simply be written as:

extension UnsafeMutableBufferPointer {

    static func withTemporaryBuffer<Result>(
        capacity: Int,
        _ body: (Self) -> Result
    ) -> Result {
        let ptr = allocate(capacity: capacity)
        defer { ptr.deallocate() }
        return body(ptr)
    }
}

So again - yeah, maybe an undecidable problem in the general case, but we could at least solve the same use-cases this proposal would solve in a more general way as an implementation enhancement, rather than by adding new APIs to the standard library.

My understanding is that, in C, this pattern of having a function which allocates memory, passes the pointer in to some opaque function, and frees it when the function returns, cannot be optimised because the opaque function might escape the pointer and longjmp out, thus escaping the outer function's control-flow and keeping the memory alive. AFAIK, that kind of use of longjmp (or C++ exceptions) is undefined behaviour in Swift, so we can be more aggressive about optimising the above pattern.

Andrew_Trick · September 13, 2021, 6:27pm

This API targets developers who need a guarantee of stack allocation (at reasonable sizes). The optimizer can't provide a clear guarantee unless we formalize some system of rules around that.

This doesn't need to be a perfect long-term solution. Ideally, these sort of APIs would be built with move-only values, but it seems reasonable to add yet another unchecked API as long as it has high enough near-term utility, which I think this does.

I actually think whatever the implementation picks for the stack allocation threshold will become de-facto API. I'd be in favor of exposing that threshold in the API.

The most important things to communicate to the API user are that the buffer is temporary and they are responsible for initializing the memory. In my mind, withUninitialized[Raw]Buffer communicates both those things and works for both typed and raw buffers.

I do think making it a static method nicely solves both the documentation problem and element type problem, but those aren't critical issues.

EDIT: If this ends up remaining a free-standing function, where the closure's argument type is inferred, then we might want to keep the "Unsafe" part of the name to more precisely denote the argument type. Then we can reuse the "Buffer" name for future "ownership-safe" move-only buffers. For that reason, I think withUnsafe[Raw]Buffer is also a fine name.

Braden_Scothern · September 14, 2021, 1:36am

Big +1 from me.

This fills a big gap in Swift performance when interfacing with other languages.

I have been following this since the pitch phase.

In general I’m not particularly picky on the name but I would rather it not have the notion of binding in it.

grynspan · September 14, 2021, 5:55pm

I hear your concerns about optimizing the existing API. That actually does sound like an area where the optimizer can be smarter. Would you please file a bug report at https://bugs.swift.org? Thanks!

The heuristic for stack allocation here may well end up being platform-dependent. For instance, when building for an embedded system or a system with small stacks, we might want to reduce the cap, while for a hypothetical system on which the stack cannot be practically exhausted (e.g. where each thread's stack has a size ≥ 2⁴⁸) we might want to just say "always stack-allocate." As well, the current implementation proposal allows the caller to customize the cap by passing an option to swiftc.

Convention for scoped unsafe access is that the functions' names should start with withUnsafe regardless of what comes after.

Joe_Groff · September 14, 2021, 10:07pm

The Core Team discussed this review today, and although the review still continues into next week, we wanted to provide our response to the discussion thus far. First of all, we agree that this is useful functionality for programmers who need stronger behavior guarantees from their code than the existing language can currently provide. Although in some circumstances the compiler can optimize heap allocations into stack allocations, and the optimizer will continue to improve in time, it is valuable to have an API that can guarantee that stack allocation occurs in more circumstances.

However, in the spirit of providing that control, some participants raised concerns about the level of control the proposed API offers in the stack/heap allocation threshold. As proposed, the API uses an unspecified threshold to decide whether the provided allocation is on the stack or not, leaving the answers to some questions unspecified:

What is the threshold?
Can the threshold be changed over time?
Can the threshold vary on different platforms, or be influenced by clients or the environment?
Is the threshold decided by the Swift runtime, where it could be changed by deploying a newer runtime, or is it inlined into callers?

The Core Team believes that, for applications that require maximum predictability, these APIs should provide an optional parameter for the maximum stack allocation threshold, for fully predictable behavior. For most applications, a reasonable default threshold ought to suffice, although there are many considerations that could go into picking that default, so we would like to see the review continue to explore that aspect of the design.

On the subject of naming, the Core Team sees this API as following the precedent of other top-level scoped memory manipulation APIs like withUnsafePointer(to:), so it too should remain a set of top-level functions. The Core Team also sees the Unsafe qualifier is also an important part of the name, but we agree that "Mutable" and "Pointer" are probably redundant, given that the only way Swift can interact with an uninitialized block of memory must be mutable and via pointers. We are in favor of a shorter name that still captures the essence of the API, that it provides a temporary buffer.

grynspan · September 15, 2021, 6:01pm

Even if the maximum stack allocation size is configurable, we still cannot safely guarantee that a stack allocation will occur for a given set of inputs in all cases. For instance, a value is passed for alignment that is not supported by the current architecture. So such a parameter remains a hint only.

I certainly understand why caller control may be desireable, but I also worry that if we provide this parameter then folks will set it without understanding the implications (i.e. the risk of a stack overflow.) Any thoughts on preventing misuse? Is it sufficient to document the optional parameter and say something like "most callers do not need to specify a value for this parameter"?

Edit: I've already implemented the ability to override, at compile-time, the heuristic value used (i.e. the "1024" we've talked about.) For developers that want to always stack-allocate, they can simply set this value to some large number and call it a day. If we then find that developers really need to configure this value at runtime, we can always augment the proposed functions in a future language update.

Ben_Cohen · September 16, 2021, 2:36pm

This seems hard to justify (unless I'm missing something about the motivation) – it's trivial to grab the baseAddress from a buffer to get the pointer. This could always be revisited if evidence of actual usage suggests that it's so frequent bit of boilerplate that a whole (top level) additional method is justified.

grynspan · September 16, 2021, 3:53pm

@here: Thanks for the feedback so far, everyone! Based on individual feedback as well as the core team's input (per @Joe_Groff), I'm going to propose different names for the functions. (Please keep in mind that there are a lot of strong opinions here and it won't be possible to satisfy everyone who has contributed.)

public func withUnsafeTemporaryAllocation<R>(
  byteCount: Int,
  alignment: Int,
  _ body: (UnsafeMutableRawBufferPointer) throws -> R
) rethrows -> R

public func withUnsafeTemporaryAllocation<T, R>(
  of type: T.Type,
  capacity: Int,
  _ body: (UnsafeMutableBufferPointer<T>) throws -> R
) rethrows -> R

public func withUnsafeTemporaryAllocation<T, R>(
  of type: T.Type,
  _ body: (UnsafeMutablePointer<T>) throws -> R
) rethrows -> R

I appreciate the desire to avoid simple one-liner functions but in this case I really do think we benefit from having a separate single-element-yielding function. Remember that UnsafeBufferPointer.baseAddress is an optional value, so all callers must say .baseAddress!, if let _ = .baseAddress, or similar. It's definitely extra boilerplate that's unnecessary since we know the allocation of a single value will always succeed in normal usage.

And there are going to be plenty of use cases that only need a pointer to a single temporary value. As a trivial example, consider the C function stat() on POSIX platforms. It takes the address of an uninitialized struct stat value and populates it:

try withUnsafeTemporaryAllocation(of: stat.self) { statValue in
  guard 0 == stat(path, statValue) else {
    throw POSIXError(...)
  }
}

But if the caller must work with a buffer pointer, they have to say:

try withUnsafeTemporaryAllocation(of: stat.self, capacity: 1) { statValue in
  guard 0 == stat(path, statValue.baseAddress!) else {
    throw POSIXError(...)
  }
}

The capacity parameter and .baseAddress! call (with force-unwrapping) serve to complicate the call to stat() and make it less readable.

Joe_Groff · September 16, 2021, 4:00pm

grynspan:

Even if the maximum stack allocation size is configurable, we still cannot safely guarantee that a stack allocation will occur for a given set of inputs in all cases. For instance, a value is passed for alignment that is not supported by the current architecture. So such a parameter remains a hint only.

I certainly understand why caller control may be desireable, but I also worry that if we provide this parameter then folks will set it without understanding the implications (i.e. the risk of a stack overflow.) Any thoughts on preventing misuse? Is it sufficient to document the optional parameter and say something like "most callers do not need to specify a value for this parameter"?

Edit: I've already implemented the ability to override, at compile-time, the heuristic value used (i.e. the "1024" we've talked about.) For developers that want to always stack-allocate, they can simply set this value to some large number and call it a day. If we then find that developers really need to configure this value at runtime, we can always augment the proposed functions in a future language update.

Having a way to adjust the threshold at call sites sounds sufficient to me (talking personally, not on behalf of the Core Team). In certain critical sections where falling back to malloc would be unacceptable, letting the attempt to stack-allocate trap is arguably preferable to falling back.

Ben_Cohen · September 16, 2021, 4:02pm

Specifically, they should write .baseAddress!. This is trivial to do and IMO nowhere close to enough justification for adding a whole second function. To anticipate a counter: ! is the right thing to use here. The pointer is guaranteed non-nil in this case. Projects that lint for ! and require if let instead should not do that.

Chris_Lattner3 · September 16, 2021, 4:08pm

Sure, it isn't required for "functionality", but your suggestion would be a pretty significant issue for ergonomics. It's roughly the same reason that C has the x->f sugar operator instead of requiring everyone to use x[0].f.

-Chris

Ben_Cohen · September 16, 2021, 4:11pm

That's not a good analogy because -> is fundamental to the very essence of C. It is everywhere, and it's part of the language. This function, OTOH, is a trivial bit of library sugar for a relatively niche bit of functionality.

Chris_Lattner3 · September 16, 2021, 4:13pm

Right, -> is a core operator so it has pervasive impact (pro and con). The addition we're talking about is an inconsequential library function that comes at effectively zero cost. Your assumption is that stack allocated arrays in the primary use-case, I can see many scalar use cases.

Ben_Cohen · September 16, 2021, 4:19pm

While the impact of adding yet another variant is low, I think it's much higher than the sugar benefit. "Why are there so many of these and how are they all different" is the first thing people ask about these micro-variants, of which we already have too many IMO, all sitting in prime real estate like the top level or methods on Array, and we need to cut back on adding them so freely given the tiny benefit they bring.