Rounding out (for the moment) a series of proposed improvements to the UnsafePointer and UnsafeBufferPointer families, this pitch merges two previous pitches that were a little too similar. The result is rather large, with apologies.
Besides merging those two pitches, the changes here consist mainly of improvements in the motivations and discussions.
Pointer Family Initialization Improvements & Better Buffer Slices
- Proposal: SE-NNNN Pointer Family Initialization Improvements & Better Buffer Slices
- Author: Guillaume Lessard
- Review Manager: TBD
- Status: pending
- Implementation: Draft Pull Request
- Bugs: rdar://51817146
- Previous Revision: pitch A, pitch B
Introduction
The types in the UnsafeMutablePointer family typically require manual management of memory allocations, including the management of their initialization state. Unfortunately, not every relevant type in the family has the necessary functionality to fully manage the initialization state of the memory it represents. The states involved are, after allocation:
- Unbound and uninitialized (as returned from UnsafeMutableRawPointer.allocate())
- Bound to a type, and uninitialized (as returned from UnsafeMutablePointer<T>.allocate())
- Bound to a type, and initialized
Memory can be safely deallocated whenever it is uninitialized.
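As an illustration of the three states, here is a minimal sketch that walks one allocation through each of them using only long-standing API. The element type (Int), the element count and the variable names are assumptions chosen for the example.

```swift
let count = 4

// State 1: unbound and uninitialized.
let raw = UnsafeMutableRawPointer.allocate(
  byteCount: count * MemoryLayout<Int>.stride,
  alignment: MemoryLayout<Int>.alignment
)

// State 2: bound to Int, still uninitialized.
let typed = raw.bindMemory(to: Int.self, capacity: count)

// State 3: bound to Int and initialized.
typed.initialize(repeating: 7, count: count)
let sum = (0..<count).reduce(0) { $0 + typed[$1] }

// Return to the uninitialized state before deallocating.
typed.deinitialize(count: count)
raw.deallocate()
```

Reading the elements is only valid between the initialize and deinitialize calls.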
We intend to round out initialization functionality for every relevant member of that family: UnsafeMutablePointer, UnsafeMutableRawPointer, UnsafeMutableBufferPointer, UnsafeMutableRawBufferPointer, Slice<UnsafeMutableBufferPointer> and Slice<UnsafeMutableRawBufferPointer>. The functionality will allow managing initialization state in a much greater variety of situations, including easier handling of partially-initialized buffers.
Motivation
Memory allocated using UnsafeMutablePointer, UnsafeMutableRawPointer, UnsafeMutableBufferPointer and UnsafeMutableRawBufferPointer is passed to the user in an uninitialized state. In the general case, such memory needs to be initialized before it is used in Swift. Memory can be "initialized" or "uninitialized". We hereafter refer to this as a memory region's "initialization state".
The methods of UnsafeMutablePointer that interact with initialization state are:
func initialize(to value: Pointee)
func initialize(repeating repeatedValue: Pointee, count: Int)
func initialize(from source: UnsafePointer<Pointee>, count: Int)
func assign(repeating repeatedValue: Pointee, count: Int)
func assign(from source: UnsafePointer<Pointee>, count: Int)
func move() -> Pointee
func moveInitialize(from source: UnsafeMutablePointer<Pointee>, count: Int)
func moveAssign(from source: UnsafeMutablePointer<Pointee>, count: Int)
func deinitialize(count: Int) -> UnsafeMutableRawPointer
This is a fairly complete set.
- The initialize functions change the state of memory locations from uninitialized to initialized, then assign the corresponding value(s).
- The assign functions update the values stored at memory locations that have previously been initialized.
- deinitialize changes the state of a range of memory from initialized to uninitialized.
- The move() function moves the value out of a memory location: it returns the current contents and leaves the location uninitialized.
- The move prefix means that the source range of memory will be deinitialized after the function returns.
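The following sketch exercises each kind of operation from the list above on two small allocations. It uses String so that initialization state actually matters, and it updates through the pointee setter rather than an assign function; the variable names are assumptions made for the example.

```swift
let a = UnsafeMutablePointer<String>.allocate(capacity: 2)
let b = UnsafeMutablePointer<String>.allocate(capacity: 2)

// initialize: uninitialized -> initialized
a.initialize(to: "one")
(a + 1).initialize(to: "two")

// update: the location must already be initialized
a.pointee = "uno"

// moveInitialize: b becomes initialized, a becomes uninitialized
b.moveInitialize(from: a, count: 2)

// move: returns the value and leaves b[0] uninitialized
let first = b.move()

// deinitialize: b[1] goes from initialized to uninitialized
let second = b[1]
(b + 1).deinitialize(count: 1)

// Both allocations are now fully uninitialized and can be freed.
a.deallocate()
b.deallocate()
```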
Unfortunately, UnsafeMutablePointer is the only type listed in the introduction that allows full control of initialization state. As a result, complex use cases such as partial initialization of a buffer become overly complicated.
An example of partial initialization is the insertion of elements in the middle of a collection. This is one of the possible operations needed in an implementation of RangeReplaceableCollection.replaceSubrange(_:with:). Given a RangeReplaceableCollection whose unique storage can be represented by a partially-initialized UnsafeMutableBufferPointer:
mutating func replaceSubrange<C>(_ subrange: Range<Index>, with newElements: C)
  where C: Collection, Element == C.Element {
  // obtain unique storage as UnsafeMutableBufferPointer
  let buffer: UnsafeMutableBufferPointer<Element> = self.myUniqueStorage()
  let oldCount = self.count
  let growth = newElements.count - subrange.count
  let newCount = oldCount + growth
  if growth > 0 {
    assert(newCount < buffer.count)
    let oldTail = subrange.upperBound..<oldCount
    let newTail = subrange.upperBound+growth..<newCount
    let oldTailBase = buffer.baseAddress!.advanced(by: oldTail.lowerBound)
    let newTailBase = buffer.baseAddress!.advanced(by: newTail.lowerBound)
    newTailBase.moveInitialize(from: oldTailBase,
                               count: oldCount - subrange.upperBound)
    // Update still-initialized values in the original subrange
    var j = newElements.startIndex
    for i in subrange {
      buffer[i] = newElements[j]
      newElements.formIndex(after: &j)
    }
    // Initialize the remaining range
    for i in subrange.upperBound..<newTail.lowerBound {
      buffer.baseAddress!.advanced(by: i).initialize(to: newElements[j])
      newElements.formIndex(after: &j)
    }
    assert(newElements.distance(from: newElements.startIndex, to: j) == newElements.count)
  }
  ...
}
Here, we had to convert to UnsafeMutablePointer to use some of its API, as well as resort to element-by-element copying and initialization. With API enabling buffer operations on the slices of buffers, we could simplify things greatly:
mutating func replaceSubrange<C>(_ subrange: Range<Index>, with newElements: C)
  where C: Collection, Element == C.Element {
  // obtain unique storage as UnsafeMutableBufferPointer
  let buffer: UnsafeMutableBufferPointer<Element> = self.myUniqueStorage()
  let oldCount = self.count
  let growth = newElements.count - subrange.count
  let newCount = oldCount + growth
  if growth > 0 {
    assert(newCount < buffer.count)
    let oldTail = subrange.upperBound..<oldCount
    let newTail = subrange.upperBound+growth..<newCount
    var m = buffer[newTail].moveInitialize(fromElements: buffer[oldTail])
    assert(m == newTail.upperBound)
    // Update still-initialized values in the original subrange
    m = buffer[subrange].update(fromElements: newElements)
    // Initialize the remaining range
    m = buffer[m..<newTail.lowerBound].initialize(
      fromElements: newElements.dropFirst(m - subrange.lowerBound)
    )
    assert(m == newTail.lowerBound)
  }
  ...
}
In addition to simplifying the implementation, the new methods have the advantage of having the same bounds-checking behaviour as UnsafeMutableBufferPointer, relieving the implementation from being required to do its own bounds checking.
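For readers who want to run the tail-move pattern in isolation, here is a self-contained sketch of an insertion into a partially-initialized buffer, written only with existing pointer API. The capacity, the element values and the names (at, count, result) are assumptions for illustration; this is not the proposed implementation.

```swift
let buffer = UnsafeMutableBufferPointer<Int>.allocate(capacity: 8)
let base = buffer.baseAddress!

// Partially initialize the buffer: only the first 4 elements are valid.
var count = 0
for value in [1, 2, 5, 6] {
  (base + count).initialize(to: value)
  count += 1
}

let newElements = [3, 4]
let at = 2  // insertion position
precondition(count + newElements.count <= buffer.count)

// Move the tail [2..<4) up to [4..<6); the old tail becomes uninitialized.
(base + at + newElements.count).moveInitialize(from: base + at, count: count - at)

// Initialize the gap [2..<4) from the new elements.
newElements.withUnsafeBufferPointer { source in
  (base + at).initialize(from: source.baseAddress!, count: source.count)
}
count += newElements.count

let result = Array(buffer[0..<count])

base.deinitialize(count: count)
buffer.deallocate()
```

Every range computation here is done by hand and none of the pointer calls is bounds-checked, which is the burden the slice-based API is meant to remove.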
This proposal aims to add API to control initialization state and improve multiple-element copies for UnsafeMutableBufferPointer, UnsafeMutableRawBufferPointer, Slice<UnsafeMutableBufferPointer> and Slice<UnsafeMutableRawBufferPointer>.
Proposed solution
Note: the pseudo-diffs presented in this section denote added functions with +++ and the previous names of renamed functions with ---. Unmarked functions are unchanged.
UnsafeMutableBufferPointer
We propose to modify UnsafeMutableBufferPointer as follows:
extension UnsafeMutableBufferPointer {
func initialize(repeating repeatedValue: Element)
func initialize<S>(from source: S) -> (S.Iterator, Index) where S: Sequence, S.Element == Element
+++ func initialize<C>(fromElements: C) -> Index where C: Collection, C.Element == Element
--- func assign(repeating repeatedValue: Element)
+++ func update(repeating repeatedValue: Element)
+++ func update<S>(from source: S) -> (unwritten: S.Iterator, updated: Index) where S: Sequence, S.Element == Element
+++ func update<C>(fromElements: C) -> Index where C: Collection, C.Element == Element
+++ func moveInitialize(fromElements: UnsafeMutableBufferPointer) -> Index
+++ func moveInitialize(fromElements: Slice<UnsafeMutableBufferPointer>) -> Index
+++ func moveUpdate(fromElements: `Self`) -> Index
+++ func moveUpdate(fromElements: Slice<`Self`>) -> Index
+++ func deinitialize() -> UnsafeMutableRawBufferPointer
+++ func initializeElement(at index: Index, to value: Element)
+++ func updateElement(at index: Index, to value: Element)
+++ func moveElement(from index: Index) -> Element
+++ func deinitializeElement(at index: Index)
}
We would like to use the verb update instead of assign, in order to better communicate the intent of the API. It is currently a common programmer error to use one of the existing assign functions for uninitialized memory; using the verb update instead would express the precondition in the API itself.
The methods that initialize or update from a Collection will have forgiving semantics: they copy as many elements as they can, be that every available element or none, and then return the index in the buffer that follows the last element copied, which is cheaper than returning an iterator and a count. Unlike the existing Sequence functions, they include no preconditions beyond having a valid Collection and valid buffer, with the understanding that if a user needs stricter behaviour, it can be composed from these functions.
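One way to picture the forgiving semantics is the sketch below. The helper name sketchInitialize(fromElements:) is hypothetical and exists only for this illustration; it approximates the described behaviour on top of existing API and should not be read as the proposed implementation.

```swift
extension UnsafeMutableBufferPointer {
  // Hypothetical helper: copies as many elements as fit and returns
  // the index following the last element initialized.
  func sketchInitialize<C: Collection>(fromElements source: C) -> Index
  where C.Element == Element {
    var index = startIndex
    for element in source {
      if index == endIndex { break }  // buffer is full: stop quietly
      (baseAddress! + index).initialize(to: element)
      index += 1
    }
    return index
  }
}

let buffer = UnsafeMutableBufferPointer<Int>.allocate(capacity: 4)

// Source shorter than the buffer: only part of the buffer is initialized.
let afterShort = buffer.sketchInitialize(fromElements: [1, 2])
buffer.baseAddress!.deinitialize(count: afterShort)

// Source longer than the buffer: the extra elements are left uncopied.
let afterLong = buffer.sketchInitialize(fromElements: 0..<10)
buffer.baseAddress!.deinitialize(count: afterLong)

buffer.deallocate()
```

A caller that needs stricter behaviour can compare the returned index with endIndex, or with the source count, and trap on a mismatch.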
The above changes include a method to update a single element. Evidently that is a synonym for the subscript(_ i: Index) setter. We hope that documenting the update action specifically will help clarify the requirements of that action, namely that the buffer element must already be initialized. Experience shows that the initialization requirement of the subscript setter is frequently missed by users in the current situation, where it is only documented along with the subscript getter.
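A short sketch of the requirement, using only existing API and a non-trivial element type (String) so that the distinction matters. The variable names are assumptions for the example.

```swift
let buffer = UnsafeMutableBufferPointer<String>.allocate(capacity: 1)

// Writing `buffer[0] = "first"` at this point would be incorrect:
// the setter updates an existing value, but the memory holds none yet.
buffer.baseAddress!.initialize(to: "first")

// Now the element is initialized, so the subscript setter is valid.
buffer[0] = "second"
let value = buffer[0]

buffer.baseAddress!.deinitialize(count: 1)
buffer.deallocate()
```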
UnsafeMutablePointer
The proposed modifications to UnsafeMutablePointer are renamings, plus the addition of a single-element update(to:):
extension UnsafeMutablePointer {
func initialize(to value: Pointee)
func initialize(repeating repeatedValue: Pointee, count: Int)
func initialize(from source: UnsafePointer<Pointee>, count: Int)
+++ func update(to value: Pointee)
--- func assign(repeating repeatedValue: Pointee, count: Int)
+++ func update(repeating repeatedValue: Pointee, count: Int)
--- func assign(from source: UnsafePointer<Pointee>, count: Int)
+++ func update(from source: UnsafePointer<Pointee>, count: Int)
func move() -> Pointee
func moveInitialize(from source: UnsafeMutablePointer, count: Int)
--- func moveAssign(from source: UnsafeMutablePointer, count: Int)
+++ func moveUpdate(from source: UnsafeMutablePointer, count: Int)
func deinitialize(count: Int) -> UnsafeMutableRawPointer
}
The motivation for these renamings is explained above.
UnsafeMutableRawBufferPointer
We propose to add new functions to initialize memory referenced by UnsafeMutableRawBufferPointer instances.
extension UnsafeMutableRawBufferPointer {
func initializeMemory<T>(
as type: T.Type, repeating repeatedValue: T
) -> UnsafeMutableBufferPointer<T>
func initializeMemory<S>(
as type: S.Element.Type, from source: S
) -> (unwritten: S.Iterator, initialized: UnsafeMutableBufferPointer<S.Element>) where S: Sequence
+++ func initializeMemory<C>(
as type: C.Element.Type, fromElements: C
) -> UnsafeMutableBufferPointer<C.Element> where C: Collection
+++ func moveInitializeMemory<T>(
as type: T.Type, fromElements: UnsafeMutableBufferPointer<T>
) -> UnsafeMutableBufferPointer<T>
+++ func moveInitializeMemory<T>(
as type: T.Type, fromElements: Slice<UnsafeMutableBufferPointer<T>>
) -> UnsafeMutableBufferPointer<T>
}
The first addition will initialize raw memory from a Collection and behave similarly to UnsafeMutableBufferPointer.initialize(fromElements:), described above. The other two initialize raw memory by moving data from another range of memory, leaving that other range of memory deinitialized.
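For comparison, this is roughly how the existing Sequence-based function listed above is used today; it hands back an iterator alongside the typed buffer, which the Collection-based addition would avoid. The element type and values are assumptions for the example.

```swift
let raw = UnsafeMutableRawBufferPointer.allocate(
  byteCount: 3 * MemoryLayout<UInt32>.stride,
  alignment: MemoryLayout<UInt32>.alignment
)

// Existing API: returns (unwritten: iterator, initialized: typed buffer).
let (_, typed) = raw.initializeMemory(
  as: UInt32.self, from: [10, 20, 30] as [UInt32]
)
let values = Array(typed)

typed.baseAddress!.deinitialize(count: typed.count)
raw.deallocate()
```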
UnsafeMutableRawPointer
extension UnsafeMutableRawPointer {
+++ func initializeMemory<T>(as type: T.Type, to value: T) -> UnsafeMutablePointer<T>
func initializeMemory<T>(
as type: T.Type, repeating repeatedValue: T, count: Int
) -> UnsafeMutablePointer<T>
func initializeMemory<T>(
as type: T.Type, from source: UnsafePointer<T>, count: Int
) -> UnsafeMutablePointer<T>
func moveInitializeMemory<T>(
as type: T.Type, from source: UnsafeMutablePointer<T>, count: Int
) -> UnsafeMutablePointer<T>
}
The addition here initializes a single value.
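Without a single-value function, one way to get the same effect with the existing API is the repeating variant with a count of 1, as in this sketch (the Double element type and the names are assumptions for the example):

```swift
let raw = UnsafeMutableRawPointer.allocate(
  byteCount: MemoryLayout<Double>.size,
  alignment: MemoryLayout<Double>.alignment
)

// Existing API: initialize exactly one value via the repeating variant.
let pointer = raw.initializeMemory(as: Double.self, repeating: 2.5, count: 1)
let stored = pointer.pointee

pointer.deinitialize(count: 1)
raw.deallocate()
```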
Slices of BufferPointer
We propose to add to slices of Unsafe[Mutable][Raw]BufferPointer all the BufferPointer-specific methods of their Base. The following declarations detail the additions, which are all intended to behave exactly as the functions on the base BufferPointer types:
extension Slice<UnsafeBufferPointer<T>> {
public func withMemoryRebound<T, Result>(
to type: T.Type,
_ body: (UnsafeBufferPointer<T>) throws -> Result
) rethrows -> Result
}
extension Slice<UnsafeMutableBufferPointer<T>> {
func initialize(repeating repeatedValue: Element)
func initialize<S: Sequence>(from source: S) -> (S.Iterator, Index)
where S.Element == Element
func initialize<C: Collection>(fromElements: C) -> Index
where C.Element == Element
func update(repeating repeatedValue: Element)
func update<S: Sequence>(
from source: S
) -> (unwritten: S.Iterator, updated: Index) where S.Element == Element
func update<C: Collection>(
fromElements: C
) -> Index where C.Element == Element
func moveInitialize(fromElements source: UnsafeMutableBufferPointer<Element>) -> Index
func moveInitialize(fromElements source: Slice<UnsafeMutableBufferPointer<Element>>) -> Index
func moveUpdate(fromElements source: UnsafeMutableBufferPointer<Element>) -> Index
func moveUpdate(fromElements source: Slice<UnsafeMutableBufferPointer<Element>>) -> Index
func deinitialize() -> UnsafeMutableRawBufferPointer
func initializeElement(at index: Index, to value: Element)
func updateElement(at index: Index, to value: Element)
func moveElement(from index: Index) -> Element
func deinitializeElement(at index: Index)
func withMemoryRebound<T, Result>(
to type: T.Type,
_ body: (UnsafeMutableBufferPointer<T>) throws -> Result
) rethrows -> Result
}
extension Slice<UnsafeRawBufferPointer> {
func bindMemory<T>(to type: T.Type) -> UnsafeBufferPointer<T>
func assumingMemoryBound<T>(to type: T.Type) -> UnsafeBufferPointer<T>
func withMemoryRebound<T, Result>(
to type: T.Type, _ body: (UnsafeBufferPointer<T>) throws -> Result
) rethrows -> Result
}
extension Slice<UnsafeMutableRawBufferPointer> {
func copyMemory(from source: UnsafeRawBufferPointer)
func copyBytes<C: Collection>(from source: C) where C.Element == UInt8
func initializeMemory<T>(
as type: T.Type, repeating repeatedValue: T
) -> UnsafeMutableBufferPointer<T>
func initializeMemory<S: Sequence>(
as type: S.Element.Type, from source: S
) -> (unwritten: S.Iterator, initialized: UnsafeMutableBufferPointer<S.Element>)
func initializeMemory<C: Collection>(
as type: C.Element.Type, fromElements: C
) -> UnsafeMutableBufferPointer<C.Element>
func moveInitializeMemory<T>(
as type: T.Type, fromElements: UnsafeMutableBufferPointer<T>
) -> UnsafeMutableBufferPointer<T>
func moveInitializeMemory<T>(
as type: T.Type, fromElements: Slice<UnsafeMutableBufferPointer<T>>
) -> UnsafeMutableBufferPointer<T>
func bindMemory<T>(to type: T.Type) -> UnsafeMutableBufferPointer<T>
func assumingMemoryBound<T>(to type: T.Type) -> UnsafeMutableBufferPointer<T>
func withMemoryRebound<T, Result>(
to type: T.Type,
_ body: (UnsafeMutableBufferPointer<T>) throws -> Result
) rethrows -> Result
}
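To show what these slice methods would replace, here is a sketch of the current approach: a slice must first be rebased into a new buffer pointer before buffer-level initialization can be applied to it. The values and names are assumptions for the example.

```swift
let buffer = UnsafeMutableBufferPointer<Int>.allocate(capacity: 4)
let base = buffer.baseAddress!

// Initialize the two ends element by element.
base.initialize(to: 1)
(base + 3).initialize(to: 4)

// Today: rebase the slice to reach the buffer-level API.
let middle = UnsafeMutableBufferPointer(rebasing: buffer[1..<3])
middle.initialize(repeating: 0)

let contents = Array(buffer)

base.deinitialize(count: buffer.count)
buffer.deallocate()
```

With the proposed additions, the rebasing step would no longer be needed because the slice itself would offer the initialization functions.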
Detailed design
Note: please see the draft PR or the full proposal for details.
Source compatibility
This proposal consists mostly of additions, which are by definition source compatible.
The proposal includes the renaming of four existing functions from assign to update. The existing function names would be deprecated, producing a warning. A fixit will support an easy transition to the renamed versions of these functions.
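The general mechanism can be sketched on a stand-in type. Counter and its methods are hypothetical and unrelated to the standard library; the point is only that a deprecated name can forward to its replacement while the renamed: argument gives the compiler what it needs to offer a fixit.

```swift
// Hypothetical type used only to illustrate a deprecation-with-rename.
struct Counter {
  var value = 0

  // The new spelling.
  mutating func update(to newValue: Int) {
    value = newValue
  }

  // The old spelling still compiles, with a warning and a suggested rename.
  @available(*, deprecated, renamed: "update(to:)")
  mutating func assign(to newValue: Int) {
    update(to: newValue)
  }
}

var counter = Counter()
counter.update(to: 3)
```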
Effect on ABI stability
The functions proposed here are generally small wrappers around existing functionality. We expect to implement them as @_alwaysEmitIntoClient functions, which means they would have no ABI impact.
The renamed functions can reuse the existing symbol, while the deprecated functions can forward using an @_alwaysEmitIntoClient stub to support the functionality under its previous name. This means they would have no ABI impact.
Effect on API resilience
All functionality implemented as @_alwaysEmitIntoClient will back-deploy. Renamed functions that reuse a previous symbol will also back-deploy.
Alternatives considered
Single element update functions
The single-element update functions, UnsafeMutablePointer.update(to:) and UnsafeMutableBufferPointer.updateElement(at:to:), are synonyms for the setters of UnsafeMutablePointer.pointee and UnsafeMutableBufferPointer.subscript(_ i: Index), respectively. Clearly we can elect to not add them.
The setters in question, like the update functions, have a required precondition that the memory they refer to must be initialized. Somehow this precondition is often overlooked and leads to bug reports. The proposed names and cross-references should help clarify the requirements to users.
Renaming assign to update
The renaming of assign to update could be omitted entirely, although we believe that update communicates intent much better than assign does. In The Swift Programming Language, the = symbol is named "the assignment operator", and its function is described as to either initialize or to update a value. The current name (assign) is not as clear as the documentation in TSPL, while the proposed name (update) builds on it.
There are only four current symbols to be renamed by this proposal, and their replacements are easily migrated by a fixit. For context, this renaming would change only 6 lines of code in the standard library, outside of the function definitions. If the renaming is omitted, the four new functions proposed in the family should use the name assign as well. The two single-element versions would be assign(_ value:) and assignElement(at:_ value:).
Element-by-element copies from Collection inputs
The initialization and updating functions that copy from Collection inputs use the argument label fromElements. This is different from the pre-existing functions that copy from Sequence inputs. We could use the same argument label (from) as with the Sequence inputs, but that would mean that we must return the Iterator for the Collection versions, and that is generally not desirable, especially if a particular Iterator cannot be copied cheaply. If we did not return Iterator, then the Sequence and Collection versions of initialize(from:) would be overloaded by their return type, and that would be source-breaking: an existing use of the current function that doesn't destructure the returned tuple on assignment could now pick up the Collection overload, which would have a return value incompatible with the existing code which assumes that the return value is of type (Iterator, Int).
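The overloading concern can be reproduced with a pair of stand-in functions. countElements(from:) is a hypothetical name used only here; the two overloads share an argument label and differ in constraint and return type, mirroring the shape discussed above.

```swift
// Hypothetical Sequence overload: returns the iterator and a count.
func countElements<S: Sequence>(from source: S) -> (S.Iterator, Int) {
  var iterator = source.makeIterator()
  var n = 0
  while iterator.next() != nil { n += 1 }
  return (iterator, n)
}

// Hypothetical Collection overload: returns only a count.
func countElements<C: Collection>(from source: C) -> Int {
  return source.count
}

// With an explicit type annotation, each overload can be selected.
let plain: Int = countElements(from: [1, 2, 3])
let pair: ([Int].Iterator, Int) = countElements(from: [1, 2, 3])

// Without an annotation, e.g. `let r = countElements(from: [1, 2, 3])`,
// the compiler has to choose between the two overloads by itself, so
// code written against the tuple-returning version could change meaning
// or stop compiling once the Collection overload exists.
```

A distinct argument label, as proposed, sidesteps the issue entirely because the two functions no longer compete in overload resolution.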
Acknowledgments
Kelvin Ma (aka Taylor Swift)'s initial versions of the pitch that became SE-0184 included more functions to manipulate initialization state. These were deferred, but much of the deferred functionality has not been pitched again until now.
Members of the Swift Standard Library team for valuable discussions.