SE-0322: Temporary Uninitialized Buffers

Joe_Groff · September 7, 2021, 4:18pm

The review of SE-0322: Temporary Uninitialized Buffers, begins now and runs through September 20, 2021.

Reviews are an important part of the Swift evolution process. All review feedback should be either on this forum thread or, if you would like to keep your feedback private, directly to the review manager, by email or by direct message in the Swift forums. Please include "SE-0322" somewhere in the subject text if sending review feedback by private correspondence.

What goes into a review of a proposal?

The goal of the review process is to improve the proposal under review through constructive criticism and, eventually, determine the direction of Swift.

When reviewing a proposal, here are some questions to consider:

What is your evaluation of the proposal?
Is the problem being addressed significant enough to warrant a change to Swift?
Does this proposal fit well with the feel and direction of Swift?
If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?
How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

Thank you for helping improve Swift!

Joe Groff
Review Manager

spadafiva · September 7, 2021, 4:20pm

Hey @Joe_Groff I think that you accidentally linked to the wrong SE, as it's pointing to swift-evolution/0321-package-registry-publish.md at main · apple/swift-evolution · GitHub instead of the correct one.

Joe_Groff · September 7, 2021, 4:20pm

Thanks! I fixed the link.

spadafiva · September 7, 2021, 4:21pm

No problem! Always happy to help out a fellow Joe

Chris_Lattner3 · September 7, 2021, 5:28pm

+1, yes it is, yes it does. This is an important bit of missing functionality for working with imported APIs and otherwise for low level fiddling with things. More detailed thoughts/comments/questions below.

I've worked with C and C++ quite a bit, but I think that this approach fits well with existing Swift patterns. I read the proposal, but not the pitch phase or other discussions.

A nit about the proposal is "We could optionally provide". For clarity, it helps if the proposal would state one clear opinion about what is being proposed, instead of offering up multiple design points. Alternates can be discussed in the "alternatives considered" section.

My opinion is that we should take the scalar version of this as well. Arrays are important, but scalar aggregates are important too, and in the scalar case you want to get a UnsafeMutablePointer (not a buffer pointer) as you've written.

With respect to the new builtin, does this generate the llvm.lifetime.start/end intrinsics?

Is this enough to provide a new Array initializer that puts its elements on-the-stack ala llvm::SmallVector?

I think we should eventually discuss how we're going to un-pyramid-of-doom the with style methods, but that is definitely orthogonal to this proposal.

-Chris

grynspan · September 7, 2021, 6:07pm

Hey Chris! Some answers below:

A nit about the proposal is "We could optionally provide". For clarity, it helps if the proposal would state one clear opinion about what is being proposed, instead of offering up multiple design points. Alternates can be discussed in the "alternatives considered" section.

Note taken. It's my personal preference that we include all three proposed functions, but I know some members of the core team feel differently and I don't want my proposal to read like it's trying to shut down that discussion.

With respect to the new builtin, does this generate the llvm.lifetime.start/end intrinsics ?

The implementation will generate them, yes. As of right now the implementation branch is definitely generating incorrect IR.

Is this enough to provide a new Array initializer that puts its elements on-the-stack ala llvm::SmallVector ?

As discussed in the pitch thread, encapsulating a temporary buffer like this in a value triggers the need for escape analysis, which often prevents stack promotion. The scoped with functions avoid that issue by explicitly defining a lifetime for their temporary buffers.

I think we should eventually discuss how we're going to un-pyramid-of-doom the with style methods, but that is definitely orthogonal to this proposal.

Agreed. Please ping me if you get that discussion going!

xwu · September 7, 2021, 8:21pm

I agree with this feedback essentially in its entirety.

This proposal is well written and the motivation is persuasive. The solution addresses the problem and does so in a way that is consistent with the overall design of Swift’s other unsafe APIs.

I would echo Chris’s point that the proposal in the form as reviewed should stake out precisely what is proposed. The core team can disagree and adopt an alternative but there needs to be one design under review since at the end of the process only one design can be adopted. By presenting instead a set of options, the degree to which that presentation elicits broader agreement is exactly commensurate with the degree to which it encourages glossing over one or more possible controversial decisions, leaving the core team with less community feedback.

As it is I agree that all three APIs are worth adding.

Naming nit: since a UMBP (I’m not going to try typing it out on my phone) is a discrete thing, seems it’d be more consistent to call this API withUninitializedUnsafeMBP rather than to split the term and call it withUnsafeUninitializedMBP. I don’t think the unsafe-ness of the API will suffer when it’s named in that way.

Nobody1707 · September 7, 2021, 8:33pm

+1. The problem is definitely significant, and the proposal certainly seems to fit the direction and feel of Swift.

In comparison to C and C++, I feel that this makes uninitialized values much safer to use. The only possible improvement (for the single value case) would be the addition of out parameters.

I read the proposal in its entirety, and read through the original pitch thread.

P.S. Do we know how well the compiler is able to optimize out the return ptr.move()? Are there any test cases that guarantee some minimal level of copy elision?

Nickolas_Pohilets · September 7, 2021, 10:10pm

Nit:

`body` does not need to deallocate the buffer pointer.

This sounds as if body may deallocate the buffer pointer. "Should not"/"must not" probably would be a better wording.

Question:
What happens if passed meta-type is not T.self? Can it be some some U.self, statically typed as T.Type, where T and U have different sizes, strides or alignments? I was not able to construct any such examples. That's impossible, right?

grynspan · September 8, 2021, 12:10am

The only use of T in the signature is the T.Type parameter. (Well, or in the signature of body, but either way type inference is going to require that T == T.) There is no separate U for the function to be generic over.

So no, the language does not allow what you're describing.

grynspan · September 8, 2021, 12:11am

I'll see if I can reduce my weasel word count if/when I revise the proposal.

jrose · September 8, 2021, 12:23am

Nice work with this one, @grynspan! I’ll stay out of the naming discussions this time, but ask for one clarifying note: is the stack-vs-heap heuristic also emitted into client binaries, or is it a standard library function whose implementation may change at runtime? The former means you need to recompile to get a better heuristic, but the latter would need some kind of fallback anyway for backwards-deployment.

lorentey · September 8, 2021, 3:53am

These look desirable, especially when interacting with low-level system APIs.

Is it okay to bake the stack vs heap limit into executables?

The names are what they are; they derive directly from our existing naming scheme for these sorts of constructs. FWIW, I find top-level with… functions are both undiscoverable (when I need them) and are too much in my face (when I don’t), at the same time. I think I’d prefer if we moved these as static functions under their corresponding types, like @benrimmington suggested during the pitch:

UnsafeMutableBufferPointer<Foo>.withEphemeral(capacity: 42) { buffer in … }
UnsafeMutableRawBufferPointer.withEphemeral(byteCount: 42, alignment: 8) { buffer in }
UnsafeMutablePointer<Foo>.withEphemeral { pointer in … }

The new functions are a good thematic fit with the existing allocate members, and tucking these away in the preexisting unsafe types would reduce concerns about adding even more top-level unsafe constructs to the stdlib.

Nickolas_Pohilets · September 8, 2021, 8:11am

I was talking about something like this:

class C {}
class D: C {}
let t: C.Type = D.self
// T is deduced to be C, but value of `type` is D.self
withUnsafeUninitializedMutableBufferPointer(to: t, capacity: 42) { buffer in
}

In this example C and D are both single word class references. But if there exists an example for let t: T.Type = U.self where T and U have different memory layout, it will cause issues. But looks like no such example exists.

grynspan · September 8, 2021, 9:25am

In this example, these are reference types so the buffer element size will be the size of a reference, i.e. a word. For a protocol type, they'll be the size of an existential value (box), which IIRC is four words. So in either case, the memory layout of the buffer will be sufficient. Does that make sense?

grynspan · September 8, 2021, 9:31am

It's possible we'll come up with a better mechanism for heap fallback over in the implementation PR on GitHub, so I don't want to examine it too closely. My current implementation bakes in the 1024 value (configurable at compile-time.) If the size of the requested buffer is statically known, the branch elimination pass typically ends up removing it outright.

The optimal codegen here is basically just ADD SP, SP, #N, and I don't think we want to lose that by requiring a call to the stdlib, so at least some of the heuristic ought to be baked in. The implementation PR leaves room for standard library support in a future revision (requiring a recompile) if we think it would be beneficial.

I don't feel strongly here, but it would be good to get consensus from the core team.

1-877-547-7272 · September 8, 2021, 10:10am

+1

However, I don’t really like the name withUnsafeUninitializedMutableBufferPointer(to:capacity:_:). IMO it would be better to go with something like withUnsafeMutableBuffer(boundTo:capacity:_:). It’s more concise without losing clarity (IMO). In fact, I think it’s more clear that we aren’t creating a buffer from the metatype (which is what a function like withUnsafePointer(to:_:) would do), but instead binding memory to that type. The fact that the buffer is uninitialized can be put in the documentation, like with UnsafeMutable(Raw)(Buffer)Pointer.allocate.

I don’t like UnsafeMutable(Raw)(Buffer)Pointer.withEphemeral since “ephemeral” is an obscure term.

To address @Nickolas_Pohilets’s concerns:
This type of thing is already possible with current APIs.

class Foo {}
class Bar: Foo {}
let metatype: Foo.Type = Bar.self
withUnsafePointer(to: 0) {
    $0.withMemoryRebound(to: metatype, capacity: 1) {
        print(type(of: $0))
    }
}

However, if you change Foo to a protocol, then you will get an error that “generic parameter ‘T’ could not be inferred” for withMemoryRebound. Since this is only possible with classes and classes are always one word, there’s no risk of under/over-allocating.

(Also, small correction for @grynspan: protocol existentials are five words. IIRC there are three words for the instance, one word for the protocol witness table, and one word for the value witness table.)

I have a question: can the concept of dynamic stack allocations be used to remove the “allocation size must be known at compile-time” restriction on stack-promoting references?

grynspan · September 8, 2021, 10:24am

That's what I get for posting pre-coffee.

It is conceivable that if, at runtime, an object's size is determined to be "small" and is known not to escape the current context, it could be stack-promoted. However, this proposal does not cover such a change. You might want to start a pitch thread for it if you feel strongly!

AlexanderM · September 8, 2021, 11:27am

I’m definitely a little worried about this. There’s an ever-growing number of these, and their long (but similar) names can be hard to distinguish at-a-glance.

johannesweiss · September 8, 2021, 3:23pm

+1, this is important.

Also +1 to the concerns about further top-level with*Unsafe* APIs. I think nesting this into URBP makes sense.