Hello,
Here is the third in a series of pitches related to improvements to the Pointer and BufferPointer family. This pitch is about initialization, and the management of initialization state in your manual allocations.
Initialization improvements for UnsafePointer and UnsafeBufferPointer family
- Proposal: SE-NNNN Initialization improvements for UnsafePointer and UnsafeBufferPointer family
- Author: Guillaume Lessard
- Review Manager: TBD
- Implementation: Draft Pull Request
- Bugs: pending
- Previous Revision: none
Introduction
The types in the UnsafeMutablePointer family typically require manual management of memory allocations, including the management of their initialization state. The states involved are, after allocation:
- Unbound and uninitialized (as returned from UnsafeMutableRawPointer.allocate())
- Bound to a type, and uninitialized (as returned from UnsafeMutablePointer<T>.allocate())
- Bound to a type, and initialized
Memory can be safely deallocated whenever it is uninitialized.
Unfortunately, not every relevant type in the family has the necessary functionality to fully manage the initialization state of its memory. We intend to address this issue in this proposal, and provide functionality to manage initialization state in a much expanded variety of situations.
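As a rough illustration of the three states listed above, here is a minimal sketch that uses only existing standard library calls. The function name demoAllocationStates is made up for this example.

```swift
// Hypothetical helper: walks one Int-sized allocation through the three states.
func demoAllocationStates() -> Int {
    // State 1: unbound and uninitialized raw memory.
    let raw = UnsafeMutableRawPointer.allocate(
        byteCount: MemoryLayout<Int>.stride,
        alignment: MemoryLayout<Int>.alignment)

    // State 2: bound to Int, still uninitialized.
    let typed = raw.bindMemory(to: Int.self, capacity: 1)

    // State 3: bound to Int and initialized; reading is now valid.
    typed.initialize(to: 42)
    let value = typed.pointee

    // Return to the uninitialized state before deallocating.
    typed.deinitialize(count: 1)
    raw.deallocate()
    return value
}
```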
Motivation
Memory allocated using UnsafeMutablePointer, UnsafeMutableRawPointer, UnsafeMutableBufferPointer and UnsafeMutableRawBufferPointer is passed to the user in an uninitialized state. In the general case, such memory needs to be initialized before it is used in Swift. Memory can be "initialized" or "uninitialized". We hereafter refer to this as a memory region's "initialization state".
The methods of UnsafeMutablePointer that interact with initialization state are:
func initialize(to value: Pointee)
func initialize(repeating repeatedValue: Pointee, count: Int)
func initialize(from source: UnsafePointer<Pointee>, count: Int)
func assign(repeating repeatedValue: Pointee, count: Int)
func assign(from source: UnsafePointer<Pointee>, count: Int)
func move() -> Pointee
func moveInitialize(from source: UnsafeMutablePointer<Pointee>, count: Int)
func moveAssign(from source: UnsafeMutablePointer<Pointee>, count: Int)
func deinitialize(count: Int) -> UnsafeMutableRawPointer
This is a fairly complete set.
- The initialize functions change the state of memory locations from uninitialized to initialized, then assign the corresponding value(s).
- The assign functions assign values into memory locations that are already initialized.
- deinitialize changes the state of a range of memory from initialized to uninitialized.
- The move() function deinitializes a memory location, then returns its current contents.
- The move prefix means that the source range of memory will be deinitialized after the function returns.
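The transitions described in the list above can be sketched with the existing UnsafeMutablePointer methods. This is only an illustration: the function name demoPointerStateTransitions and the String values are assumptions made for the example.

```swift
// Hypothetical demo of initialize, move, moveInitialize and deinitialize.
func demoPointerStateTransitions() -> (String, [String]) {
    let count = 3
    let source = UnsafeMutablePointer<String>.allocate(capacity: count)
    let destination = UnsafeMutablePointer<String>.allocate(capacity: count)
    defer {
        // Both allocations are uninitialized again by the time this runs.
        source.deallocate()
        destination.deallocate()
    }

    // uninitialized -> initialized, for all three slots.
    source.initialize(repeating: "a", count: count)

    // move() returns the first value and leaves that slot uninitialized.
    let first = source.move()

    // The slot is uninitialized, so it must be initialized, not assigned.
    source.initialize(to: "b")

    // The `move` prefix: destination becomes initialized,
    // and the source range becomes uninitialized.
    destination.moveInitialize(from: source, count: count)

    let contents = Array(UnsafeMutableBufferPointer(start: destination, count: count))

    // initialized -> uninitialized, so deallocation is safe.
    destination.deinitialize(count: count)
    return (first, contents)
}
```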
In a complex use-case such as a custom-written data structure, a subrange of memory may transition between the initialized and uninitialized state multiple times during the life of a memory allocation. For example, if a mutable and contiguously allocated CustomArray receives a sequence of alternating append and removeLast calls, one storage location will get repeatedly initialized and deinitialized. The implementor of CustomArray might want to represent the allocated buffer using UnsafeMutableBufferPointer, but they would then have to fall back to the UnsafeMutablePointer type for initialization and deinitialization.
We would like to have a full complement of corresponding functions to operate on UnsafeMutableBufferPointer, but we only have the following:
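To make the CustomArray scenario concrete, here is a minimal sketch written against the existing API. The type name CustomArraySketch and its fixed-capacity design are assumptions for illustration only. Note how both append and removeLast have to step down from the buffer to a bare UnsafeMutablePointer.

```swift
// Hypothetical fixed-capacity array; not part of the pitch itself.
final class CustomArraySketch<Element> {
    private let storage: UnsafeMutableBufferPointer<Element>
    private(set) var count = 0

    init(capacity: Int) {
        precondition(capacity > 0, "capacity must be positive")
        storage = UnsafeMutableBufferPointer<Element>.allocate(capacity: capacity)
    }

    func append(_ element: Element) {
        precondition(count < storage.count, "storage is full")
        // The buffer type cannot initialize a single slot today,
        // so we use pointer arithmetic on its base address instead.
        (storage.baseAddress! + count).initialize(to: element)
        count += 1
    }

    func removeLast() -> Element {
        precondition(count > 0, "array is empty")
        count -= 1
        // move() returns the value and leaves the slot uninitialized.
        return (storage.baseAddress! + count).move()
    }

    deinit {
        // Only the first `count` slots are initialized at this point.
        storage.baseAddress?.deinitialize(count: count)
        storage.deallocate()
    }
}
```

In this sketch the last slot is initialized and deinitialized over and over when append and removeLast alternate, and none of those pointer operations are bounds checked beyond the explicit preconditions.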
func initialize(repeating repeatedValue: Element)
func initialize<S: Sequence>(from source: S) -> (S.Iterator, Index)
func assign(repeating repeatedValue: Element)
Missing are methods to assign from a Sequence or a Collection, move elements from another UnsafeMutableBufferPointer, modify the initialization state of the memory at a particular index of the buffer, or to deinitialize (at all). Such functions would add some safety to these operations, as they would add some bounds checking, unlike the equivalent operations on UnsafeMutablePointer, which have no concept of bounds checking.
Similarly, the functions that change the initialization state for UnsafeMutableRawPointer are:
func initializeMemory<T>(as type: T.Type, repeating repeatedValue: T, count: Int) -> UnsafeMutablePointer<T>
func initializeMemory<T>(as type: T.Type, from source: UnsafePointer<T>, count: Int) -> UnsafeMutablePointer<T>
func moveInitializeMemory<T>(as type: T.Type, from source: UnsafeMutablePointer<T>, count: Int) -> UnsafeMutablePointer<T>
Since initialized memory is bound to a type, these cover the essential operations.
(The assign and deinitialize operations only make sense on typed UnsafeMutablePointer<T>.)
On UnsafeMutableRawBufferPointer, we only have:
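As a sketch of how the existing raw pointer functions bind and initialize in a single step, consider the following. The function name demoRawPointerInitialization is hypothetical.

```swift
// Hypothetical demo of initializeMemory and moveInitializeMemory.
func demoRawPointerInitialization() -> [Int] {
    let count = 4
    let byteCount = count * MemoryLayout<Int>.stride
    let alignment = MemoryLayout<Int>.alignment
    let rawA = UnsafeMutableRawPointer.allocate(byteCount: byteCount, alignment: alignment)
    let rawB = UnsafeMutableRawPointer.allocate(byteCount: byteCount, alignment: alignment)
    defer {
        rawA.deallocate()
        rawB.deallocate()
    }

    // Binds rawA's memory to Int and initializes `count` elements.
    let typedA = rawA.initializeMemory(as: Int.self, repeating: 7, count: count)

    // Binds and initializes rawB's memory, leaving typedA's memory uninitialized.
    let typedB = rawB.moveInitializeMemory(as: Int.self, from: typedA, count: count)

    let result = Array(UnsafeMutableBufferPointer(start: typedB, count: count))

    // Deinitialization goes through the typed pointer, not the raw one.
    typedB.deinitialize(count: count)
    return result
}
```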
func initializeMemory<T>(as type: T.Type, repeating repeatedValue: T) -> UnsafeMutableBufferPointer<T>
func initializeMemory<S: Sequence>(as type: S.Element.Type, from source: S) -> (unwritten: S.Iterator, initialized: UnsafeMutableBufferPointer<S.Element>)
Missing is an equivalent to moveInitializeMemory, in particular.
Additionally, the buffer initialization functions from Sequence parameters are overly strict, and trap in many situations where the buffer length and the number of elements in a Collection do not match exactly. We can improve on this situation with initialization functions from Collections that behave more nicely.
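For reference, here is a small sketch of the existing Sequence-based initializer in a case where it does not trap: the source is shorter than the buffer. The function name demoSequenceInitialization is hypothetical, and the example deliberately avoids the mismatched cases mentioned above.

```swift
// Hypothetical demo: initialize part of a buffer from a shorter array.
func demoSequenceInitialization() -> (initialized: Int, values: [Int]) {
    let buffer = UnsafeMutableBufferPointer<Int>.allocate(capacity: 4)
    defer { buffer.deallocate() }

    // Returns the leftover iterator and the index just past
    // the last element that was initialized.
    let (_, end) = buffer.initialize(from: [10, 20, 30])

    // Only the slots below `end` are initialized and safe to read.
    let values = Array(buffer[..<end])

    // There is no buffer-level deinitialize today, so use the base pointer.
    buffer.baseAddress?.deinitialize(count: end)
    return (end, values)
}
```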
Proposed solution
We propose to modify UnsafeMutableBufferPointer as follows:
extension UnsafeMutableBufferPointer {
func initialize(repeating repeatedValue: Element)
func initialize<S>(from source: S) -> (S.Iterator, Index) where S: Sequence, S.Element == Element
+++ func initialize<C>(fromElements: C) -> Index where C: Collection, C.Element == Element
func assign(repeating repeatedValue: Element)
+++ func assign<S>(from source: S) -> (unwritten: S.Iterator, assigned: Index) where S: Sequence, S.Element == Element
+++ func assign<C>(fromElements: C) -> Index where C: Collection, C.Element == Element
+++ func moveInitialize(fromElements: `Self`) -> Index
+++ func moveInitialize(fromElements: Slice<`Self`>) -> Index
+++ func moveAssign(fromElements: `Self`) -> Index
+++ func moveAssign(fromElements: Slice<`Self`>) -> Index
+++ func deinitialize() -> UnsafeMutableRawBufferPointer
+++ func initializeElement(at index: Index, to value: Element)
+++ func assignElement(at index: Index, _ value: Element)
+++ func moveElement(at index: Index) -> Element
+++ func deinitializeElement(at index: Index)
}
The methods that initialize or assign from a Collection will have forgiving semantics: they copy as many elements as they can, and return the next index in the buffer. Unlike the existing Sequence functions, they include no preconditions, with the understanding that if a user wishes stricter behaviour, they can compose it from these functions.
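The forgiving semantics can be approximated today with a free function. The sketch below is not the proposed implementation; the names initializeAsManyAsFit and demoForgivingInitialization are made up, and the function simply illustrates "copy what fits, return the next index".

```swift
// Hypothetical stand-in for the proposed initialize(fromElements:).
func initializeAsManyAsFit<C: Collection>(
    _ buffer: UnsafeMutableBufferPointer<C.Element>,
    fromElements source: C
) -> Int {
    guard var next = buffer.baseAddress else { return 0 }
    var index = 0
    // prefix(_:) caps the copy at the buffer's capacity, so nothing traps
    // when the source is longer than the buffer.
    for element in source.prefix(buffer.count) {
        next.initialize(to: element)
        next += 1
        index += 1
    }
    return index
}

// A caller who wants strict behaviour can layer a check on top.
func demoForgivingInitialization() -> (next: Int, values: [Int]) {
    let buffer = UnsafeMutableBufferPointer<Int>.allocate(capacity: 2)
    defer { buffer.deallocate() }
    let source = [1, 2, 3, 4]
    let next = initializeAsManyAsFit(buffer, fromElements: source)
    // Example of a composed, stricter policy (not enforced here):
    // precondition(next == source.count, "source did not fit")
    let values = Array(buffer[..<next])
    buffer.baseAddress?.deinitialize(count: next)
    return (next, values)
}
```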
The above additions include a method to assign a single element. Evidently that is a synonym for the subscript(_ i: Index) setter. We hope that documenting the assignment action specifically will help clarify the requirements of that action, which are evidently muddled when documented along with the subscript getter.
Similarly, we propose adding to UnsafeMutablePointer and UnsafeMutableRawPointer:
extension UnsafeMutablePointer {
func initialize(to value: Pointee)
func initialize(repeating repeatedValue: Pointee, count: Int)
func initialize(from source: UnsafePointer<Pointee>, count: Int)
+++ func assign(_ value: Pointee)
func assign(repeating repeatedValue: Pointee, count: Int)
func assign(from source: UnsafePointer<Pointee>, count: Int)
func move() -> Pointee
func moveInitialize(from source: UnsafeMutablePointer, count: Int)
func moveAssign(from source: UnsafeMutablePointer, count: Int)
func deinitialize(count: Int) -> UnsafeMutableRawPointer
}
extension UnsafeMutableRawPointer {
+++ func initializeMemory<T>(as type: T.Type, to value: T) -> UnsafeMutablePointer<T>
func initializeMemory<T>(as type: T.Type, repeating repeatedValue: T, count: Int) -> UnsafeMutablePointer<T>
func initializeMemory<T>(as type: T.Type, from source: UnsafePointer<T>, count: Int) -> UnsafeMutablePointer<T>
func moveInitializeMemory<T>(as type: T.Type, from source: UnsafeMutablePointer<T>, count: Int) -> UnsafeMutablePointer<T>
}
Finally, we propose adding additional functions to initialize UnsafeMutableRawBufferPointers. The first will initialize from a Collection and have less stringent semantics than the existing function that initializes from a Sequence. The other two enable moving a range of memory into an UnsafeMutableRawBufferPointer while deinitializing a typed UnsafeMutableBufferPointer.
extension UnsafeMutableRawBufferPointer {
func initializeMemory<T>(as type: T.Type, repeating repeatedValue: T) -> UnsafeMutableBufferPointer<T>
func initializeMemory<S>(as type: S.Element.Type, from source: S) -> (unwritten: S.Iterator, initialized: UnsafeMutableBufferPointer<S.Element>) where S: Sequence
+++ func initializeMemory<C>(as type: C.Element.Type, fromElements: C) -> UnsafeMutableBufferPointer<C.Element> where C: Collection
+++ func moveInitializeMemory<T>(as type: T.Type, fromElements: UnsafeMutableBufferPointer<T>) -> UnsafeMutableBufferPointer<T>
+++ func moveInitializeMemory<T>(as type: T.Type, fromElements: Slice<UnsafeMutableBufferPointer<T>>) -> UnsafeMutableBufferPointer<T>
}
Detailed design
Note: please see the draft pull request or the full proposal for details.
Source compatibility
This proposal consists solely of additions.
Effect on ABI stability
The functions proposed here are generally small wrappers around existing functionality. We expect to implement them as @_alwaysEmitIntoClient functions, which means they would have no ABI impact.
Effect on API resilience
All functionality implemented as @_alwaysEmitIntoClient will back-deploy.
Alternatives considered
UnsafeMutableBufferPointer.moveElement(at index: Index) -> Element could use from as an argument label. from does read slightly better, but implies the existence of a to that does not exist. On the other hand, the three other related functions harmonize better with the at argument label, and having consistency with those three is good.
The single-element assignment functions, UnsafeMutablePointer.assign(_:) and UnsafeMutableBufferPointer.assignElement(at:_:), are synonyms for the setters of UnsafeMutablePointer.pointee and UnsafeMutableBufferPointer.subscript(_ i: Index), respectively. Clearly we can elect to not add them.
The setters in question, like the assignment functions, have a required precondition that the memory they refer to must be initialized. Somehow this precondition is often overlooked and that leads to bug reports. The proposed names and cross-references will hopefully help clarify the requirements to users.
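A minimal sketch of that precondition, using a reference type so that the distinction matters. The Token class and the function name demoInitializeThenAssign are assumptions for this example.

```swift
// Hypothetical reference type, so that stores involve retain/release.
final class Token {
    let id: Int
    init(id: Int) { self.id = id }
}

func demoInitializeThenAssign() -> Int {
    let pointer = UnsafeMutablePointer<Token>.allocate(capacity: 1)
    defer { pointer.deallocate() }

    // The memory is uninitialized here. Writing `pointer.pointee = ...`
    // at this point would treat garbage bits as an existing Token
    // and try to release it, which is undefined behaviour.
    pointer.initialize(to: Token(id: 1))

    // Now the memory is initialized, so the setter is valid:
    // it releases the old value and stores the new one.
    pointer.pointee = Token(id: 2)

    let id = pointer.pointee.id
    pointer.deinitialize(count: 1)
    return id
}
```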
The initialization and assignment functions that copy from Collection inputs use the argument label fromElements. This is different from the pre-existing functions that copy from Sequence inputs. We could use the same argument label (from) as with the Sequence inputs, but that would result in pairs of functions that are overloaded by their return type. Functions with return-type overloads are often unwieldy, and we chose to avoid them in this situation.
Acknowledgments
Kelvin Ma (aka Taylor Swift)'s initial versions of the pitch that became SE-0184 included more functions to manipulate initialization state. These were deferred, but the functionality has not been pitched again until now.