Hello all,
I've been thinking about a way of creating an array that would let us access the buffer of uninitialized memory, so that we could implement algorithms that don't necessarily know the size in advance or need access to noncontiguous parts of an array. Here's a draft of my proposed solution:
There is currently no way to work with a buffer of uninitialized memory that then becomes a fully memory-managed Array
. This limitation means that operations that don't know the final size of their data in advance, or that need to access noncontiguous parts of an array, are less efficient than necessary, often requiring an extra copy. This proposal suggests a new initializer for Array
and ContiguousArray
that would provide access to a newly created array's entire storage buffer.
Motivation
Some collection operations require working on a fixed-size buffer of uninitialized memory. For example, one O(n) algorithm for performing a stable partition of an array is as follows:
- Create a new array the same size as the original array.
- Iterate over the original array, copying matching elements to the beginning of the new array and non-matching elements to the end.
- When finished iterating, reverse the slice of non-matching elements.
Unfortunately, the standard library provides no way to create an array of a particular size without initializing every element, or to copy elements to the end of an array's buffer without initializing every preceding element. Even if we manually allocate the memory using an UnsafeMutableBufferPointer
, there's no way to convert that buffer into an array without copying the contents. There simply isn't a way to implement this particular algorithm with maximum efficiency in Swift.
Proposed solution
Adding a new Array
initializer that lets a program manipulate an array's uninitialized buffer would fill in this missing functionality. The new initializer takes the capacity of the array as a parameter as well as a closure that operates on an UnsafeMutableBufferPointer
. This closure has access to the uninitialized contents of the full capacity of the newly created array's storage, and returns the intialized count of the array.
let myArray = Array<Int>(uninitializedCapacity: 10) { buffer in
for x in 1..<5 {
buffer[x] = x
}
buffer[0] = 10
return 5
}
// myArray == [10, 1, 2, 3, 4]
// myArray.capacity == 10
With this new initializer, it's possible to implement the stable partition as an extension to the Collection
protocol, without any unnecessary copies:
func stablyPartitioned(by belongsInFirstPartition: (Element) -> Bool) -> [Element] {
var _count = self.count
return Array(uninitializedCapacity: _count) { buffer in
var lowIndex = 0
var highIndex = _count
for element in self {
if belongsInFirstPartition(element) {
buffer[lowIndex] = element
lowIndex += 1
} else {
highIndex -= 1
buffer[highIndex] = element
}
}
buffer[highIndex..<_count].reverse()
return _count
}
}
Detailed design
A single new initializer is added to both Array
and ContiguousArray
:
extension Array {
/// Creates a new array with the specified capacity, then calls the given
/// closure to initialize its contents.
///
/// You use this initializer when you need to access noncontiguous positions
/// of an array while initializing its elements. Pass the required capacity
/// and a closure that can intialize the array's elements. The closure must
/// return `c`, the number of initialized elements in the buffer, such that
/// the elements in the range `0..<c` are initialized and the elements
/// in the range `c..<capacity` are uninitialized. The resulting array
/// has a `count` equal to `c`.
///
/// - Parameters:
/// - capacity: The capacity of the new array.
/// - body: A closure that can initialize the array's elements. This
/// closure must return the count of the initialized elements, starting
/// at the beginning of the buffer.
init(
uninitializedCapacity capacity: Int,
initializingWith body: (UnsafeMutableBufferPointer<Element>) -> Int
)
}
extension ContiguousArray {
// same documentation as above
init(
uninitializedCapacity capacity: Int,
initializingWith body: (UnsafeMutableBufferPointer<Element>) -> Int
)
}
Source compatibility
This is an additive change to the standard library, so there is no effect on source compatibility.
Effect on ABI stability
This addition has no effect on ABI stability.
Effect on API resilience
This addition will be a permanent part of the standard library, and will need to remain public API.
Alternatives considered
- An
Array
initializer that simply converts anUnsafeMutableBufferPointer
into an array's backing storage seems like it would be another solution. However, an array's storage includes information about the count and capacity at the beginning of its buffer, so anUnsafeMutableBufferPointer
created from scratch isn't usable.