[Pitch] Pointer family initialization improvements

Hello,

Here is the third in a series of pitches related to improvements to the Pointer and BufferPointer family. This pitch is about initialization, and the management of initialization state in your manual allocations.

Initialization improvements for UnsafePointer and UnsafeBufferPointer family

Introduction

The types in the UnsafeMutablePointer family typically require manual management of memory allocations, including the management of their initialization state. The states involved are, after allocation:

  1. Unbound and uninitialized (as returned from UnsafeMutableRawPointer.allocate())
  2. Bound to a type, and uninitialized (as returned from UnsafeMutablePointer<T>.allocate())
  3. Bound to a type, and initialized

Memory can be safely deallocated whenever it is uninitialized.
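As a rough illustration of the three states, the following sketch walks a single allocation through each of them using existing API only. The element type, the count and the variable names are arbitrary choices made for the example.

```swift
let count = 4

// State 1: unbound and uninitialized.
let raw = UnsafeMutableRawPointer.allocate(
  byteCount: count * MemoryLayout<Int>.stride,
  alignment: MemoryLayout<Int>.alignment
)

// State 2: bound to Int, still uninitialized.
let typed = raw.bindMemory(to: Int.self, capacity: count)

// State 3: bound to Int and initialized.
typed.initialize(repeating: 7, count: count)
let sum = (0..<count).reduce(0) { $0 + typed[$1] }
print(sum)

// Back to uninitialized; deinitialize(count:) hands back a raw pointer.
let rawAgain = typed.deinitialize(count: count)

// The memory is uninitialized, so it can be deallocated safely.
rawAgain.deallocate()
```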

Unfortunately, not every relevant type in the family has the necessary functionality to fully manage the initialization state of its memory. We intend to address this issue in this proposal, and provide functionality to manage initialization state in a much expanded variety of situations.

Motivation

Memory allocated using UnsafeMutablePointer, UnsafeMutableRawPointer, UnsafeMutableBufferPointer and UnsafeMutableRawBufferPointer is passed to the user in an uninitialized state. In the general case, such memory needs to be initialized before it is used in Swift. Memory can be "initialized" or "uninitialized". We hereafter refer to this as a memory region's "initialization state".

The methods of UnsafeMutablePointer that interact with initialization state are:

  • func initialize(to value: Pointee)
  • func initialize(repeating repeatedValue: Pointee, count: Int)
  • func initialize(from source: UnsafePointer<Pointee>, count: Int)
  • func assign(repeating repeatedValue: Pointee, count: Int)
  • func assign(from source: UnsafePointer<Pointee>, count: Int)
  • func move() -> Pointee
  • func moveInitialize(from source: UnsafeMutablePointer<Pointee>, count: Int)
  • func moveAssign(from source: UnsafeMutablePointer<Pointee>, count: Int)
  • func deinitialize(count: Int) -> UnsafeMutableRawPointer

This is a fairly complete set.

  • The initialize functions change the state of memory locations from uninitialized to initialized, then assign the corresponding value(s).
  • The assign functions assign values into memory locations that are already initialized.
  • deinitialize changes the state of a range of memory from initialized to uninitialized.
  • The move() function returns the value stored at a memory location and leaves that location uninitialized.
  • The move prefix means that the source range of memory will be deinitialized after the function returns.
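The state transitions described in this list can be sketched with existing UnsafeMutablePointer API. This is only an illustration: the String values and variable names are made up, and the assignment step uses the pointee setter rather than one of the bulk assign functions.

```swift
let source = UnsafeMutablePointer<String>.allocate(capacity: 2)
let destination = UnsafeMutablePointer<String>.allocate(capacity: 2)

// initialize: uninitialized -> initialized.
source.initialize(to: "a")
(source + 1).initialize(to: "b")

// Assignment: the memory must already be initialized; the old value is destroyed.
source.pointee = "A"

// moveInitialize: destination becomes initialized, source becomes uninitialized.
destination.moveInitialize(from: source, count: 2)

// move(): returns the value and leaves its slot uninitialized.
let last = (destination + 1).move()

// Read the remaining element, then deinitialize it.
let first = destination.pointee
destination.deinitialize(count: 1)

// Both allocations are fully uninitialized at this point.
source.deallocate()
destination.deallocate()
```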

In a complex use-case such as a custom-written data structure, a subrange of memory may transition between the initialized and uninitialized state multiple times during the life of a memory allocation. For example, if a mutable and contiguously allocated CustomArray is called with a sequence of alternating append and removeLast calls, one storage location will get repeatedly initialized and deinitialized. The implementor of CustomArray might want to represent the allocated buffer using UnsafeMutableBufferPointer, but that means they will have to use the UnsafeMutablePointer type instead for initialization and deinitialization.
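A minimal sketch of such a CustomArray might look like the following. The type and its members are hypothetical and exist only to show where the implementor currently has to step down from the buffer to its base pointer.

```swift
final class CustomArray<Element> {
  private let storage: UnsafeMutableBufferPointer<Element>
  private(set) var count = 0

  init(capacity: Int) {
    storage = UnsafeMutableBufferPointer<Element>.allocate(capacity: capacity)
  }

  func append(_ value: Element) {
    precondition(count < storage.count, "CustomArray is full")
    // The buffer cannot initialize a single element, so use a pointer instead.
    (storage.baseAddress! + count).initialize(to: value)
    count += 1
  }

  @discardableResult
  func removeLast() -> Element {
    precondition(count > 0, "CustomArray is empty")
    count -= 1
    // Moving out of the slot leaves it uninitialized again.
    return (storage.baseAddress! + count).move()
  }

  deinit {
    // Only the first `count` slots are initialized.
    _ = storage.baseAddress?.deinitialize(count: count)
    storage.deallocate()
  }
}

// The same storage slot is initialized and deinitialized twice.
let array = CustomArray<String>(capacity: 2)
array.append("x")
let removed1 = array.removeLast()
array.append("y")
let removed2 = array.removeLast()
```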

We would like to have a full complement of corresponding functions to operate on UnsafeMutableBufferPointer, but we only have the following:

  • func initialize(repeating repeatedValue: Element)
  • func initialize<S: Sequence>(from source: S) -> (S.Iterator, Index)
  • func assign(repeating repeatedValue: Element)

Missing are methods to assign from a Sequence or a Collection, move elements from another UnsafeMutableBufferPointer, modify the initialization state of a range of memory for a particular index of the buffer, or to deinitialize (at all). Such functions would add some safety to these operations, as they would add some bounds checking, unlike the equivalent operations on UnsafeMutablePointer, which have no concept of bounds checking.

Similarly, the functions that change the initialization state for UnsafeMutableRawPointer are:

  • func initializeMemory<T>(as type: T.Type, repeating repeatedValue: T, count: Int) -> UnsafeMutablePointer<T>
  • func initializeMemory<T>(as type: T.Type, from source: UnsafePointer<T>, count: Int) -> UnsafeMutablePointer<T>
  • func moveInitializeMemory<T>(as type: T.Type, from source: UnsafeMutablePointer<T>, count: Int) -> UnsafeMutablePointer<T>

Since initialized memory is bound to a type, these cover the essential operations. (The assign and deinitialize operations only make sense on typed UnsafeMutablePointer<T>.)
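For illustration, here is how one of the existing raw-pointer functions binds and initializes memory in a single step, after which assignment and deinitialization go through the typed pointer it returns. The element type and values are arbitrary.

```swift
let raw = UnsafeMutableRawPointer.allocate(
  byteCount: 3 * MemoryLayout<UInt16>.stride,
  alignment: MemoryLayout<UInt16>.alignment
)

// Binds the memory to UInt16 and initializes it; returns a typed pointer.
let typed = raw.initializeMemory(as: UInt16.self, repeating: 0xFFFF, count: 3)

// Assignment happens on the typed pointer, not on the raw pointer.
typed[1] = 1
let values = [typed[0], typed[1], typed[2]]

// Deinitialization is likewise a typed operation.
typed.deinitialize(count: 3)
raw.deallocate()
```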

On UnsafeMutableRawBufferPointer, we only have:

  • func initializeMemory<T>(as type: T.Type, repeating repeatedValue: T) -> UnsafeMutableBufferPointer<T>
  • func initializeMemory<S: Sequence>(as type: S.Element.Type, from source: S) -> (unwritten: S.Iterator, initialized: UnsafeMutableBufferPointer<S.Element>)

Missing is an equivalent to moveInitializeMemory, in particular.
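Without a buffer-level equivalent, moving elements from a typed buffer into a raw buffer currently means going through the base addresses and passing the count by hand. The following is a sketch of that workaround with made-up names and values; nothing checks that the destination is large enough.

```swift
let typedSource = UnsafeMutableBufferPointer<String>.allocate(capacity: 2)
_ = typedSource.initialize(from: ["left", "right"])

let rawDestination = UnsafeMutableRawBufferPointer.allocate(
  byteCount: 2 * MemoryLayout<String>.stride,
  alignment: MemoryLayout<String>.alignment
)

// Step down to pointers: the caller is responsible for the bounds.
let moved = rawDestination.baseAddress!.moveInitializeMemory(
  as: String.self,
  from: typedSource.baseAddress!,
  count: typedSource.count
)

// Rebuild a typed buffer view over the destination by hand.
let result = UnsafeMutableBufferPointer(start: moved, count: typedSource.count)
let copy = Array(result)

// The source is uninitialized after the move; only the destination needs cleanup.
moved.deinitialize(count: result.count)
rawDestination.deallocate()
typedSource.deallocate()
```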

Additionally, the buffer initialization functions from Sequence parameters are overly strict, and trap in many situations where the buffer length and the number of elements in a Collection do not match exactly. We can improve on this situation with initialization functions from Collections that behave more nicely.

Proposed solution

We propose to modify UnsafeMutableBufferPointer as follows:

extension UnsafeMutableBufferPointer {
    func initialize(repeating repeatedValue: Element)
    func initialize<S>(from source: S) -> (S.Iterator, Index) where S: Sequence, S.Element == Element
+++ func initialize<C>(fromElements: C) -> Index where C: Collection, C.Element == Element
    func assign(repeating repeatedValue: Element)
+++ func assign<S>(from source: S) -> (unwritten: S.Iterator, assigned: Index) where S: Sequence, S.Element == Element
+++ func assign<C>(fromElements: C) -> Index where C: Collection, C.Element == Element
+++ func moveInitialize(fromElements: `Self`) -> Index
+++ func moveInitialize(fromElements: Slice<`Self`>) -> Index
+++ func moveAssign(fromElements: `Self`) -> Index
+++ func moveAssign(fromElements: Slice<`Self`>) -> Index
+++ func deinitialize() -> UnsafeMutableRawBufferPointer

+++ func initializeElement(at index: Index, to value: Element)
+++ func assignElement(at index: Index, _ value: Element)
+++ func moveElement(at index: Index) -> Element
+++ func deinitializeElement(at index: Index)
}

The methods that initialize or assign from a Collection will have forgiving semantics: they copy as many elements as they can and return the next index in the buffer. Unlike the existing Sequence functions, they include no preconditions, with the understanding that a user who wishes stricter behaviour can compose it from these functions.
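The forgiving behaviour can be sketched as an extension built on existing API. The method name sketchInitialize(fromElements:) is a hypothetical stand-in chosen for this example; it is not the pitched implementation and takes no fast path.

```swift
extension UnsafeMutableBufferPointer {
  /// Hypothetical stand-in for the pitched initialize(fromElements:).
  /// Copies as many elements as fit and returns the next index in the buffer.
  func sketchInitialize<C: Collection>(fromElements source: C) -> Index
  where C.Element == Element {
    var index = startIndex
    for element in source {
      if index == endIndex { break }  // buffer is full: stop, don't trap
      (baseAddress! + index).initialize(to: element)
      index += 1
    }
    return index
  }
}

let buffer = UnsafeMutableBufferPointer<Int>.allocate(capacity: 4)

// Fewer elements than the buffer can hold.
let afterShort = buffer.sketchInitialize(fromElements: [1, 2])
buffer.baseAddress!.deinitialize(count: afterShort)

// More elements than the buffer can hold.
let afterLong = buffer.sketchInitialize(fromElements: 0..<10)

// An empty source writes nothing and returns startIndex.
let afterEmpty = buffer.sketchInitialize(fromElements: EmptyCollection<Int>())

buffer.baseAddress!.deinitialize(count: afterLong)
buffer.deallocate()
```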

The above additions include a method to assign a single element. Evidently that is a synonym for the subscript(_ i: Index) setter. We hope that documenting the assignment action specifically will help clarify the requirements of that action, which are evidently muddled when documented along with the subscript getter.

Similarly, we propose adding to UnsafeMutablePointer and UnsafeMutableRawPointer:

extension UnsafeMutablePointer {
    func initialize(to value: Pointee)
    func initialize(repeating repeatedValue: Pointee, count: Int)
    func initialize(from source: UnsafePointer<Pointee>, count: Int)
+++ func assign(_ value: Pointee)
    func assign(repeating repeatedValue: Pointee, count: Int)
    func assign(from source: UnsafePointer<Pointee>, count: Int)
    func move() -> Pointee
    func moveInitialize(from source: UnsafeMutablePointer, count: Int)
    func moveAssign(from source: UnsafeMutablePointer, count: Int)
    func deinitialize(count: Int) -> UnsafeMutableRawPointer
}

extension UnsafeMutableRawPointer {
+++ func initializeMemory<T>(as type: T.Type, to value: T) -> UnsafeMutablePointer<T>
    func initializeMemory<T>(as type: T.Type, repeating repeatedValue: T, count: Int) -> UnsafeMutablePointer<T>
    func initializeMemory<T>(as type: T.Type, from source: UnsafePointer<T>, count: Int) -> UnsafeMutablePointer<T>
    func moveInitializeMemory<T>(as type: T.Type, from source: UnsafeMutablePointer<T>, count: Int) -> UnsafeMutablePointer<T>
}

Finally, we propose adding additional functions to initialize UnsafeMutableRawBufferPointers. The first will initialize from a Collection and have less stringent semantics than the existing function that initializes from a Sequence. The other two enable moving a range of memory into an UnsafeMutableRawBufferPointer while deinitializing a typed UnsafeMutableBufferPointer.

extension UnsafeMutableRawBufferPointer {
    func initializeMemory<T>(as type: T.Type, repeating repeatedValue: T) -> UnsafeMutableBufferPointer<T>
    func initializeMemory<S>(as type: S.Element.Type, from source: S) -> (unwritten: S.Iterator, initialized: UnsafeMutableBufferPointer<S.Element>) where S: Sequence
+++ func initializeMemory<C>(as type: C.Element.Type, fromElements: C) -> UnsafeMutableBufferPointer<C.Element> where C: Collection
+++ func moveInitializeMemory<T>(as type: T.Type, fromElements: UnsafeMutableBufferPointer<T>) -> UnsafeMutableBufferPointer<T>
+++ func moveInitializeMemory<T>(as type: T.Type, fromElements: Slice<UnsafeMutableBufferPointer<T>>) -> UnsafeMutableBufferPointer<T>
}

Detailed design

Note: please see the draft pull request or the full proposal for details.

Source compatibility

This proposal consists solely of additions.

Effect on ABI stability

The functions proposed here are generally small wrappers around existing functionality. We expect to implement them as @_alwaysEmitIntoClient functions, which means they would have no ABI impact.

Effect on API resilience

All functionality implemented as @_alwaysEmitIntoClient will back-deploy.

Alternatives considered

UnsafeMutableBufferPointer.moveElement(at index: Index) -> Element could use from as an argument label. from does read slightly better, but implies the existence of a to that does not exist. On the other hand, the three other related functions harmonize better with the at argument label, and having consistency with those three is good.

The single-element assignment functions, UnsafeMutablePointer.assign(_:) and UnsafeMutableBufferPointer.assignElement(at:_:), are synonyms for the setters of UnsafeMutablePointer.pointee and UnsafeMutableBufferPointer.subscript(_ i: Index), respectively. Clearly we can elect to not add them. The setters in question, like the assignment functions, have a required precondition that the memory they refer to must be initialized. Somehow this precondition is often overlooked, and that leads to bug reports. The proposed names and cross-references will hopefully help clarify the requirements to users.
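To see why the precondition exists, note that assigning through the setter destroys the previous value, so that value must be valid. The sketch below makes this observable with two hypothetical helper classes, Counter and Tracker, that count deinitializations.

```swift
final class Counter { var value = 0 }

final class Tracker {
  private let counter: Counter
  init(counting counter: Counter) { self.counter = counter }
  deinit { counter.value += 1 }
}

let deinits = Counter()
let slot = UnsafeMutablePointer<Tracker>.allocate(capacity: 1)

// Uninitialized memory: initialize(to:) never touches the previous contents.
slot.initialize(to: Tracker(counting: deinits))

// Initialized memory: the setter releases the previous value before storing.
// On uninitialized memory it would release garbage instead.
slot.pointee = Tracker(counting: deinits)
let afterAssignment = deinits.value

slot.deinitialize(count: 1)
let afterDeinitialize = deinits.value
slot.deallocate()
```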

The initialization and assignment functions that copy from Collection inputs use the argument label fromElements. This is different from the pre-existing functions that copy from Sequence inputs. We could use the same argument label (from) as with the Sequence inputs, but that would result in pairs of functions that are overloaded by their return type. Functions with return-type overloads are often unwieldy, and we chose to avoid them in this situation.

Acknowledgments

Kelvin Ma (aka Taylor Swift)'s initial versions of the pitch that became SE-0184 included more functions to manipulate initialization state. These were deferred, but the functionality has not been pitched again until now.


A bit of naming discussion: on using the word "assign" in function names.

First of all: it is the status quo.

But, in the documentation for these functions, effort is expended to say that what these functions do is update or replace an existing value. As @rxwei pointed out in a different channel, TSPL's description of the "assignment operator" says that it either initializes or updates depending on context. Given this, should all functions that use the word "assign" be renamed to use the word "update"?


I have certainly run into the “must extract a non-buffer pointer from a buffer-pointer in order to interact with an element” annoyance, so I look forward to seeing that resolved.

I haven’t yet taken a deep dive into all the other proposed changes—in particular I have not yet considered how the various collection-related methods will look and feel in practical use—but overall the ideas seem sound.

I think that “reading well” trumps “using the same preposition as other methods that share a verb”.


I'm not sure I understand correctly:

If the existing functions are overly strict (also: in what way are they overly strict?), then it seems to me the solution should be to make them not overly strict.

If, on the other hand, the strictness may after all be necessary, then it cannot be said to be "overly strict", and we should not be vending versions without preconditions, which would then be underly strict, as it were.

In either case, I don't understand why it is pitched that initialize(from:) should be the possibly "overly" strict method but initialize(fromElements:) the possibly "underly" strict method. They all initialize "from elements."

Changing the behaviour of initialize(as:from: Sequence) would simply be source-breaking. Also, it isn’t really fixable, because it outsources its behaviour to an overridable Sequence method that behaves differently e.g. in Array than in UMBP. (Probably because it predates withContiguousStorageIfAvailable.) In practice this makes it incomprehensibly capricious. It’s generally fine with non-Collection Sequences, though.

This new proposed method has a well-defined behaviour that makes trapping assertions unnecessary. If the source is empty, it just returns startIndex. If the source has more elements than the buffer can hold, it just returns endIndex after filling the buffer.

The argument label is different in order to ensure source compatibility. The same argument label (fromElements) is proposed for the methods that have similar buffer-filling behaviour.

Can you describe how relaxing "overly strict" conditions is source-breaking in this scenario? In general, if a function allows a less restricted range of inputs than before, it is an improvement that doesn't break any currently working code.

Can you please elaborate with some details? What behaviors are capricious and what makes them generally fine with non-collection sequences? I think this would be important background to understand what’s being addressed here with the changes you’re pitching.

If the behavior before could not be relied upon or easily reasoned about, and the new proposed behavior is well-defined and doesn't trap where it currently traps unnecessarily, why shouldn't it replace the old behavior?

If not replacing the old behavior, then the label needs to provide a meaningful indication of the relevant difference in behavior, since after all the argument for not replacing would be that it is important for the user to be able to choose between the two different behaviors. Initializing "from" a sequence and initializing "from elements" don't provide any indication why they're different any more than "from1" and "from2".

And if a meaningful name can't be presented to the user, I think that's another hint that the new behavior should simply replace the old behavior. But perhaps there’s some missing background info that explains why they must co-exist.

Consider the following program:

// 11 elements: one more than the destination's capacity of 10.
let source = (0...10).map({"This string ends with an integer equal to \($0)"})

// 1. The source is viewed as an UnsafeBufferPointer.
var destination = Array(unsafeUninitializedCapacity: 10) {
  buffer, initializedCount in
  _ = source.withContiguousStorageIfAvailable({
    sourceAsBuffer in
    (_, initializedCount) = buffer.initialize(from: sourceAsBuffer)
  })
}

destination.forEach { print($0) }


// 2. The source is wrapped as a plain (non-Collection) Sequence.
destination = Array(unsafeUninitializedCapacity: 10) {
  buffer, initializedCount in
  (_, initializedCount) = buffer.initialize(from: IteratorSequence(source.makeIterator()))
}

destination.forEach { print($0) }


// 3. The source is passed directly as an Array. This is the call that crashes.
destination = Array(unsafeUninitializedCapacity: 10) {
  buffer, initializedCount in
  (_, initializedCount) = buffer.initialize(from: source)
}

destination.forEach { print($0) }

It calls the same function (initialize<S: Sequence>(from: S)) three times, with Sequences of different types that contain the same data. Can you explain why the third use crashes based on its documentation?

This happens because the function is effectively implemented 28 different times in the standard library ($ find stdlib/public -type f -exec egrep -H func.+_copyContents {} \; | wc -l). Some of these implementations make different decisions than others, with the result above.
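Until a more forgiving function exists, one way to sidestep the mismatch is to trim the source so that its element count matches the buffer exactly. This is a sketch of that workaround, assuming the trap is caused only by the length mismatch; the string contents are made up.

```swift
// 11 elements: one more than the destination's capacity of 10.
let source = (0...10).map { "item \($0)" }

let destination = [String](unsafeUninitializedCapacity: 10) {
  buffer, initializedCount in
  // prefix(_:) yields an ArraySlice whose count matches the buffer's count.
  (_, initializedCount) = buffer.initialize(from: source.prefix(buffer.count))
}

destination.forEach { print($0) }
```

The obvious cost is that the caller has to remember to do the trimming, which is the kind of ceremony the Collection-taking function is meant to remove.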

It is conceivable that we might manage to fix this "in place". However, this is all in inlinable code, so we would need to replace every version with an alwaysEmitIntoClient copy, and leave the old implementation in place with some awful trickery involving sil-gen names. Even if everything goes right, we would still have 28 different copies of that dreaded function. However, an unintentional source (or compatibility) break may be more likely than the ideal outcome.

The proposed replacement relies on withContiguousStorageIfAvailable for its fast path, a composable component. For the rest, it consists of a single implementation. In the long run, this is a win for maintainability.

The behaviour of the function with pure Sequences is generally correct, and the function signature is fine for Sequences. We could possibly add a deprecated, @_alwaysEmitIntoClient overload that takes a Collection implementor in order to nudge users towards the new Collection-taking initialization function.

(see above.)

Initializing from a Sequence should return the iterator, but the iterator is not relevant when doing the same with a Collection. In other words, the ideal embodiment of this function has a different return type based on whether it takes a Sequence or a Collection. Overloading on the return type is quite unwieldy, as a change in the argument can result in a change of the return type, resulting in unfortunate ergonomics. Forcing the Sequence return type on the Collection use case isn't a particularly good solution, either. The previous two sentences embody my opinion why it's better to have a different label. I'd like to see what others think.


Thank you for the feedback, everyone. I will soon post an updated pitch with clarifications and a few changes.

The largest change is a proposal to rename the "assign" family of functions to "update". "Update" better expresses the initialization requirement, since for data to be updated, there must be pre-existing data. There are 4 functions from Swift 5.5 and earlier affected, and 5 new additions. Updating the doc-comments for these functions to use the "update" terminology has made their documentation more understandable. This being said, it is clearly a change that could be omitted from the final proposal.

I'm interested in hearing the community's opinion about this.

Here is an updated pitch, below. The full document is here.

The changes are clarifications, the assign renaming mentioned in the previous post, and moveElement(at index:) to moveElement(from index:) (h/t Nevin).

Initialization improvements for UnsafePointer and UnsafeBufferPointer family

Introduction

The types in the UnsafeMutablePointer family typically require manual management of memory allocations, including the management of their initialization state. The states involved are, after allocation:

  1. Unbound and uninitialized (as returned from UnsafeMutableRawPointer.allocate())
  2. Bound to a type, and uninitialized (as returned from UnsafeMutablePointer<T>.allocate())
  3. Bound to a type, and initialized

Memory can be safely deallocated whenever it is uninitialized.

Unfortunately, not every relevant type in the family has the necessary functionality to fully manage the initialization state of its memory. We intend to address this issue in this proposal, and provide functionality to manage initialization state in a much expanded variety of situations.

Swift-evolution thread: Pitch thread

Motivation

Memory allocated using UnsafeMutablePointer, UnsafeMutableRawPointer, UnsafeMutableBufferPointer and UnsafeMutableRawBufferPointer is passed to the user in an uninitialized state. In the general case, such memory needs to be initialized before it is used in Swift. Memory can be "initialized" or "uninitialized". We hereafter refer to this as a memory region's "initialization state".

The methods of UnsafeMutablePointer that interact with initialization state are:

  • func initialize(to value: Pointee)
  • func initialize(repeating repeatedValue: Pointee, count: Int)
  • func initialize(from source: UnsafePointer<Pointee>, count: Int)
  • func assign(repeating repeatedValue: Pointee, count: Int)
  • func assign(from source: UnsafePointer<Pointee>, count: Int)
  • func move() -> Pointee
  • func moveInitialize(from source: UnsafeMutablePointer<Pointee>, count: Int)
  • func moveAssign(from source: UnsafeMutablePointer<Pointee>, count: Int)
  • func deinitialize(count: Int) -> UnsafeMutableRawPointer

This is a fairly complete set.

  • The initialize functions change the state of memory locations from uninitialized to initialized, then assign the corresponding value(s).
  • The assign functions update the values stored at memory locations that have previously been initialized.
  • deinitialize changes the state of a range of memory from initialized to uninitialized.
  • The move() function returns the value stored at a memory location and leaves that location uninitialized.
  • The move prefix means that the source range of memory will be deinitialized after the function returns.

In a complex use-case such as a custom-written data structure, a subrange of memory may transition between the initialized and uninitialized state multiple times during the life of a memory allocation. For example, if a mutable and contiguously allocated CustomArray is called with a sequence of alternating append and removeLast calls, one storage location will get repeatedly initialized and deinitialized. The implementor of CustomArray might want to represent the allocated buffer using UnsafeMutableBufferPointer, but that means they will have to use the UnsafeMutablePointer type instead for initialization and deinitialization.

We would like to have a full complement of corresponding functions to operate on UnsafeMutableBufferPointer, but we only have the following:

  • func initialize(repeating repeatedValue: Element)
  • func initialize<S: Sequence>(from source: S) -> (S.Iterator, Index)
  • func assign(repeating repeatedValue: Element)

Missing are methods to assign from a Sequence or a Collection, move elements from another UnsafeMutableBufferPointer, modify the initialization state of a range of memory for a particular index of the buffer, or to deinitialize (at all). Such functions would add some safety to these operations, as they would add some bounds checking, unlike the equivalent operations on UnsafeMutablePointer, which have no concept of bounds checking.

Similarly, the functions that change the initialization state for UnsafeMutableRawPointer are:

  • func initializeMemory<T>(as type: T.Type, repeating repeatedValue: T, count: Int) -> UnsafeMutablePointer<T>
  • func initializeMemory<T>(as type: T.Type, from source: UnsafePointer<T>, count: Int) -> UnsafeMutablePointer<T>
  • func moveInitializeMemory<T>(as type: T.Type, from source: UnsafeMutablePointer<T>, count: Int) -> UnsafeMutablePointer<T>

Since initialized memory is bound to a type, these cover the essential operations. (The assign and deinitialize operations only make sense on typed UnsafeMutablePointer<T>.)

On UnsafeMutableRawBufferPointer, we only have:

  • func initializeMemory<T>(as type: T.Type, repeating repeatedValue: T) -> UnsafeMutableBufferPointer<T>
  • func initializeMemory<S: Sequence>(as type: S.Element.Type, from source: S) -> (unwritten: S.Iterator, initialized: UnsafeMutableBufferPointer<S.Element>)

Missing is an equivalent to moveInitializeMemory, in particular.

Additionally, the buffer initialization functions from Sequence parameters are overly strict, and trap in many situations where the buffer length and the number of elements in a Collection do not match exactly. We can improve on this situation with initialization functions from Collections that behave more nicely.

There are four existing functions that use the assign (or moveAssign) name. This name is unfortunately not especially clear. In The Swift Programming Language, = is called the "assignment operator", and is said to either initialize or update a variable. The word "update" here is much clearer, as it implies the existence of a prior value, which communicates the requirement that a given memory location must have been previously initialized. For this reason, we propose to rename "assign" to "update". This would involve deprecating the existing (rarely-used) functions, with a straightforward fixit. The existing symbol can be reused for purposes of ABI stability.

Proposed solution

Note: in the pseudo-diffs presented in this section, +++ indicates an added symbol, while --- indicates a renamed symbol.

We propose to modify UnsafeMutableBufferPointer as follows:

extension UnsafeMutableBufferPointer {
    func initialize(repeating repeatedValue: Element)
    func initialize<S>(from source: S) -> (S.Iterator, Index) where S: Sequence, S.Element == Element
+++ func initialize<C>(fromElements: C) -> Index where C: Collection, C.Element == Element
--- func assign(repeating repeatedValue: Element)
+++ func update(repeating repeatedValue: Element)
+++ func update<S>(from source: S) -> (unwritten: S.Iterator, updated: Index) where S: Sequence, S.Element == Element
+++ func update<C>(fromElements: C) -> Index where C: Collection, C.Element == Element
+++ func moveInitialize(fromElements: UnsafeMutableBufferPointer) -> Index
+++ func moveInitialize(fromElements: Slice<UnsafeMutableBufferPointer>) -> Index
+++ func moveUpdate(fromElements: `Self`) -> Index
+++ func moveUpdate(fromElements: Slice<`Self`>) -> Index
+++ func deinitialize() -> UnsafeMutableRawBufferPointer

+++ func initializeElement(at index: Index, to value: Element)
+++ func updateElement(at index: Index, to value: Element)
+++ func moveElement(from index: Index) -> Element
+++ func deinitializeElement(at index: Index)
}

The methods that initialize or update from a Collection will have forgiving semantics, and copy the number of elements that they can, be that every available element or none, and then return the next index in the buffer. Unlike the existing Sequence functions, they include no preconditions beyond having a valid Collection and valid buffer, with the understanding that if a user wishes stricter behaviour, they can compose it from these functions.
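As an illustration of composing stricter behaviour from a forgiving primitive, the following sketch defines two hypothetical methods. Both sketchUpdate(fromElements:) and sketchUpdateExactly(fromElements:) are names invented for this example and are not part of the pitch.

```swift
extension UnsafeMutableBufferPointer {
  /// Hypothetical stand-in for the pitched update(fromElements:).
  /// The buffer's memory must already be initialized.
  func sketchUpdate<C: Collection>(fromElements source: C) -> Index
  where C.Element == Element {
    var index = startIndex
    for element in source {
      if index == endIndex { break }  // stop when the buffer is exhausted
      self[index] = element
      index += 1
    }
    return index
  }

  /// A stricter variant composed from the forgiving one.
  func sketchUpdateExactly<C: Collection>(fromElements source: C)
  where C.Element == Element {
    precondition(source.count == count, "source and buffer lengths differ")
    let end = sketchUpdate(fromElements: source)
    assert(end == endIndex)
  }
}

let buffer = UnsafeMutableBufferPointer<Int>.allocate(capacity: 3)
buffer.initialize(repeating: 0)

// Forgiving: a short source updates only a prefix.
let next = buffer.sketchUpdate(fromElements: [9, 8])
let partial = Array(buffer)

// Strict: lengths must match.
buffer.sketchUpdateExactly(fromElements: [1, 2, 3])
let exact = Array(buffer)

buffer.baseAddress!.deinitialize(count: buffer.count)
buffer.deallocate()
```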

The above changes include a method to update a single element. Evidently that is a synonym for the subscript(_ i: Index) setter. We hope that documenting the update action specifically will help clarify the requirements of that action, which are evidently muddled when documented along with the subscript getter. Similarly, we propose adding to UnsafeMutablePointer and UnsafeMutableRawPointer:

extension UnsafeMutablePointer {
    func initialize(to value: Pointee)
    func initialize(repeating repeatedValue: Pointee, count: Int)
    func initialize(from source: UnsafePointer<Pointee>, count: Int)
+++ func update(to value: Pointee)
--- func assign(repeating repeatedValue: Pointee, count: Int)
+++ func update(repeating repeatedValue: Pointee, count: Int)
--- func assign(from source: UnsafePointer<Pointee>, count: Int)
+++ func update(from source: UnsafePointer<Pointee>, count: Int)
    func move() -> Pointee
    func moveInitialize(from source: UnsafeMutablePointer, count: Int)
--- func moveAssign(from source: UnsafeMutablePointer, count: Int)
+++ func moveUpdate(from source: UnsafeMutablePointer, count: Int)
    func deinitialize(count: Int) -> UnsafeMutableRawPointer
}

extension UnsafeMutableRawPointer {
+++ func initializeMemory<T>(as type: T.Type, to value: T) -> UnsafeMutablePointer<T>
    func initializeMemory<T>(as type: T.Type, repeating repeatedValue: T, count: Int) -> UnsafeMutablePointer<T>
    func initializeMemory<T>(as type: T.Type, from source: UnsafePointer<T>, count: Int) -> UnsafeMutablePointer<T>
    func moveInitializeMemory<T>(as type: T.Type, from source: UnsafeMutablePointer<T>, count: Int) -> UnsafeMutablePointer<T>
}

Finally, we propose adding additional functions to initialize UnsafeMutableRawBufferPointers. The first will initialize from a Collection and have less stringent semantics than the existing function that initializes from a Sequence. The other two enable moving a range of memory into an UnsafeMutableRawBufferPointer while deinitializing a typed UnsafeMutableBufferPointer.

extension UnsafeMutableRawBufferPointer {
    func initializeMemory<T>(as type: T.Type, repeating repeatedValue: T) -> UnsafeMutableBufferPointer<T>
    func initializeMemory<S>(as type: S.Element.Type, from source: S) -> (unwritten: S.Iterator, initialized: UnsafeMutableBufferPointer<S.Element>) where S: Sequence
+++ func initializeMemory<C>(as type: C.Element.Type, fromElements: C) -> UnsafeMutableBufferPointer<C.Element> where C: Collection
+++ func moveInitializeMemory<T>(as type: T.Type, fromElements: UnsafeMutableBufferPointer<T>) -> UnsafeMutableBufferPointer<T>
+++ func moveInitializeMemory<T>(as type: T.Type, fromElements: Slice<UnsafeMutableBufferPointer<T>>) -> UnsafeMutableBufferPointer<T>
}

Detailed design

Note: please see the draft pull request or the full proposal for details.

Source compatibility

This proposal consists mostly of additions.

The proposal includes the renaming of four existing functions from assign to update. The existing function names would be deprecated, producing a warning. A fixit will support an easy transition to the renamed versions of these functions.
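The deprecate-and-rename pattern can be sketched with the standard @available attribute. The two methods below use invented names (sketchUpdate and sketchAssign) on purpose, so they are an illustration of the mechanism rather than the standard library's actual declarations.

```swift
extension UnsafeMutablePointer {
  /// Hypothetical new spelling: overwrites `count` initialized elements.
  func sketchUpdate(repeating value: Pointee, count: Int) {
    for offset in 0..<count {
      self[offset] = value
    }
  }

  /// Hypothetical old spelling, kept so existing code still compiles.
  /// Callers get a warning and a fix-it pointing at the new name.
  @available(*, deprecated, renamed: "sketchUpdate(repeating:count:)")
  func sketchAssign(repeating value: Pointee, count: Int) {
    sketchUpdate(repeating: value, count: count)
  }
}

let pointer = UnsafeMutablePointer<Int>.allocate(capacity: 3)
pointer.initialize(repeating: 0, count: 3)
pointer.sketchUpdate(repeating: 5, count: 3)
let updated = [pointer[0], pointer[1], pointer[2]]
pointer.deinitialize(count: 3)
pointer.deallocate()
```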

Effect on ABI stability

The functions proposed here are generally small wrappers around existing functionality. We expect to implement them as @_alwaysEmitIntoClient functions, which means they would have no ABI impact.

The renamed functions can reuse the existing symbol, while the deprecated functions can use an @_alwaysEmitIntoClient wrapper to support the functionality under its previous name. This would have no ABI impact.

Effect on API resilience

All functionality implemented as @_alwaysEmitIntoClient will back-deploy. Renamed functions that reuse a previous symbol will also back-deploy.

Alternatives considered

The single-element update functions, UnsafeMutablePointer.update(to:) and UnsafeMutableBufferPointer.updateElement(at:to:), are synonyms for the setters of UnsafeMutablePointer.pointee and UnsafeMutableBufferPointer.subscript(_ i: Index), respectively. Clearly we can elect to not add them. The setters in question, like the update functions, have a required precondition that the memory they refer to must be initialized. Somehow this precondition is often overlooked and leads to bug reports. The proposed names and cross-references should help clarify the requirements to users.

The renaming of assign to update could be omitted entirely, although we believe that update communicates intent much better than assign does. There are only four symbols affected by this renaming, and their replacements are easily migrated by a fixit. For context, this renaming would change only 6 lines of code in the standard library, outside of the function definitions. If the renaming is omitted, the four new functions proposed in the family should use the name assign as well. The two single-element versions would be assign(_:) and assignElement(at:_:).

The initializing and updating functions that copy from Collection inputs use the argument label fromElements. This is different from the pre-existing functions that copy from Sequence inputs. We could use the same argument label (from) as with the Sequence inputs, but that would mean that we must return the Iterator for the Collection versions, and that is generally not desirable. If we did not return Iterator, then the Sequence and Collection versions of initialize(from:) would be overloaded by their return type, and that would be source-breaking: an existing use of the current function that doesn't immediately destructure the returned tuple could pick up the Collection overload, which would have a return value incompatible with the existing code that makes use of the return value.
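The overload hazard can be reproduced with two toy functions. The name fill(from:) and both bodies are hypothetical; the only point is how the call site's shape decides which overload is chosen.

```swift
// Sequence version: returns the iterator and a count, like the existing API.
func fill<S: Sequence>(from source: S) -> (S.Iterator, Int) {
  var iterator = source.makeIterator()
  var written = 0
  while iterator.next() != nil { written += 1 }
  return (iterator, written)
}

// Collection version with the same label but a different return type.
func fill<C: Collection>(from source: C) -> Int {
  return source.count
}

let numbers = [1, 2, 3]

// Destructuring forces a tuple result, so the Sequence overload is chosen.
let (_, written) = fill(from: numbers)

// Without destructuring, the more specific Collection overload wins,
// and `result` is an Int rather than a tuple.
let result = fill(from: numbers)
print(type(of: result), written)
```

Code written against the tuple-returning version, such as reading result.1 later on, would stop compiling once the second overload appears.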

Acknowledgments

Kelvin Ma (aka Taylor Swift)'s initial versions of the pitch that became SE-0184 included functions to manipulate initialization state. These were deferred, but the functionality has not been pitched again until now.
