Here is the second in a series of proposed improvements to the UnsafePointer and UnsafeBufferPointer families. This one concentrates on the pointers.
Note: an updated pitch document can be found below.
Pointer API Usability Improvements
- Proposal: SE-NNNN full proposal draft
- Authors: Guillaume Lessard, Andrew Trick
- Review Manager: TBD
- Status: Draft pull request
- Implementation: pending
- Bugs: rdar://64342031, SR-11156 (rdar://53272880), rdar://22541346
- Previous Revision: none
Introduction
This proposal introduces some quality-of-life improvements for UnsafePointer and its Mutable and Raw variants.
- Add an API to obtain an UnsafeRawPointer instance that is advanced to a given alignment from its starting point.
- Add an API to obtain a pointer to a stored property of an aggregate T, given an UnsafePointer<T>.
- Rename the unchecked subscript of Unsafe[Mutable]Pointer to include the argument label unchecked.
- Add the ability to compare pointers of any two types.
Motivation
The everyday use of UnsafePointer and its variants comes with many difficulties unrelated to the unsafeness of the type. We can improve the ergonomics of these types without hiding the unsafeness.
For example, if one needs to advance a pointer to a given alignment, there is no need to force the programmer to derive the proper calculation (or consult a textbook, or copy an answer from Stack Overflow). An API that provides this utility would not take away from the fact that the type is called "unsafe".
Similarly, it is rather difficult to pass a pointer to a property of a struct to (e.g.) a C function. In such cases, the poor ergonomics lead to code that is less safe than it should be.
From another perspective, the integer subscript on UnsafePointer is different from other subscripts in Swift. Normally, similar-looking subscripts perform bounds checking. The UnsafePointer version does not warn that it does not check its parameter, even though it looks similar to a Collection subscript at the point of use. It would be an improvement to give a name to the way that subscript is unsafe: it is unchecked.
Finally, when dealing with pointers of different types, we can often get into situations where Swift's type system gets in the way. Regardless of its type, a pointer represents one unique storage location in memory. As such, casting the type of a pointer in order to be able to compare it to another is not a useful exercise.
Proposed solution
Ability to obtain a pointer properly aligned to store a given type
When using pointers into untyped (raw) memory, it is often desirable to obtain another pointer that is advanced to a given alignment, rather than advanced by a particular offset. The current API provides no help in performing this task, even though the calculation isn't entirely obvious. The programmer should not need to derive the proper calculation, or to consult a textbook.
For example, consider implementing a complex data structure whose nodes include atomic pointers to other nodes in the graph. In order to avoid two allocations per node, we allocate a range of raw memory and manually bind subranges of the allocation. Our example node allocates space for one atomic pointer value and one value of type T:
import Atomics // module provided by the swift-atomics package
struct Node<T>: RawRepresentable, AtomicValue, AtomicOptionalWrappable {
  typealias AtomicRepresentation = AtomicRawRepresentableStorage<Self>
  typealias AtomicOptionalRepresentation =
    AtomicOptionalRawRepresentableStorage<Self>
  typealias NodeStorage = (AtomicOptionalRepresentation, T)
  let rawValue: UnsafeMutableRawPointer
  init(_ element: T) {
    rawValue = .allocate(byteCount: MemoryLayout<NodeStorage>.size,
                         alignment: MemoryLayout<NodeStorage>.alignment)
    // bind and initialize atomic storage
    rawValue.initializeMemory(as: AtomicOptionalRepresentation.self,
                              repeating: AtomicOptionalRepresentation(nil),
                              count: 1)
    // bind and initialize payload storage
    let tMask = MemoryLayout<T>.alignment - 1
    let tOffset = (MemoryLayout<AtomicOptionalRepresentation>.size + tMask) & ~tMask
    rawValue.advanced(by: tOffset)
      .initializeMemory(as: T.self, repeating: element, count: 1)
  }
}
The calculation of tOffset above is overly complex. Calculating the offset from the start of the data structure to the field of type T should be straightforward!
We propose to add a function to help perform this operation on raw pointer types:
extension UnsafeRawPointer {
  public func roundedUp<T>(toAlignmentOf type: T.Type) -> Self
}
This function would round the current pointer up to the next address that satisfies the alignment of T. When applied to a pointer that is already aligned for T, UnsafeRawPointer.roundedUp(toAlignmentOf:) would return that pointer unchanged.
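One way the proposed function could be written is sketched below. The body is an assumption made for illustration, not the implementation from the pull request. It relies on type alignments being powers of two and uses the same mask arithmetic as the tOffset calculation above.

```swift
extension UnsafeRawPointer {
  /// Sketch of the proposed API; the body is an illustrative assumption.
  func roundedUp<T>(toAlignmentOf type: T.Type) -> UnsafeRawPointer {
    // Alignments are powers of two, so `alignment - 1` masks the low bits.
    let mask = UInt(MemoryLayout<T>.alignment) - 1
    let address = UInt(bitPattern: self)
    // Round up to the next multiple of the alignment;
    // an address that is already aligned is left unchanged.
    let alignedAddress = (address &+ mask) & ~mask
    return self + Int(alignedAddress - address)
  }
}

// Usage: step one byte into an allocation, then round up for a UInt64.
let base = UnsafeMutableRawPointer.allocate(byteCount: 64, alignment: 16)
let start = UnsafeRawPointer(base)
let rounded = (start + 1).roundedUp(toAlignmentOf: UInt64.self)
// Distance from the start of the allocation to the rounded pointer.
let distance = start.distance(to: rounded)
// Rounding an already aligned pointer returns the same address.
let unchanged = rounded.roundedUp(toAlignmentOf: UInt64.self) == rounded
base.deallocate()
```

Adding the difference to self, rather than rebuilding a pointer from the integer bit pattern, keeps the result derived from the original pointer.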
The new function would make identifying the storage location of T much more straightforward than in the example above:
init(_ element: T) {
  rawValue = .allocate(byteCount: MemoryLayout<NodeStorage>.size,
                       alignment: MemoryLayout<NodeStorage>.alignment)
  // bind and initialize atomic storage
  rawValue.initializeMemory(as: AtomicOptionalRepresentation.self,
                            repeating: AtomicOptionalRepresentation(nil),
                            count: 1)
  // bind and initialize payload storage
  rawValue.advanced(by: MemoryLayout<AtomicOptionalRepresentation>.size)
    .roundedUp(toAlignmentOf: T.self)
    .initializeMemory(as: T.self, repeating: element, count: 1)
}
Ability to obtain a pointer to a member of an aggregate value
When using a pointer to a struct with multiple stored properties, it isn't obvious how to obtain pointers to more than one of the stored properties. For example, consider using the pthreads library, a major C API. The pthreads library uses the return value to indicate error conditions, and modifies values through pointers it receives as parameters. It has many APIs with multiple pointer arguments. One would query a thread's scheduling parameters using pthread_getschedparam, which has the following prototype:
int pthread_getschedparam(pthread_t tid, int *policy, struct sched_param *param);
A Swift user, concerned with keeping related data packaged together, might have elected to define a struct as follows:
struct ThreadSchedulingParameters {
  var policy: Int32 = 0 // Int32 matches the C `int *policy` parameter
  var parameters = sched_param()
  var priority: Int { Int(parameters.sched_priority) }
}
Updating a ThreadSchedulingParameters instance using the above C function is not obvious:
var scheduling = ThreadSchedulingParameters()
let thread: pthread_t = ... // a thread handle, e.g. one created with pthread_create(...)
var e = withUnsafeMutableBytes(of: &scheduling) { bytes in
  let o1 = MemoryLayout<ThreadSchedulingParameters>.offset(of: \.policy)!
  let policy_p = bytes.baseAddress!.advanced(by: o1).assumingMemoryBound(to: Int32.self)
  let o2 = MemoryLayout<ThreadSchedulingParameters>.offset(of: \.parameters)!
  let params_p = bytes.baseAddress!.advanced(by: o2).assumingMemoryBound(to: sched_param.self)
  return pthread_getschedparam(thread, policy_p, params_p)
}
We must first reach for the non-obvious withUnsafeMutableBytes rather than for withUnsafePointer. In so doing, we suppress statically-known type information, only to immediately assert the type using assumingMemoryBound. We can use KeyPath to do better. We shall add a new subscript to UnsafePointer and UnsafeMutablePointer:
extension UnsafeMutablePointer {
  subscript<Property>(property: WritableKeyPath<Pointee, Property>) -> UnsafeMutablePointer<Property>? { get }
}
The return value of this subscript must be optional, because a KeyPath represents a property regardless of its kind (stored or computed). In the case of a computed property, there is no pointer to return and we must return nil.
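The behaviour described above can be sketched with existing standard library API. This is a minimal illustration under stated assumptions, not the proposed implementation: it builds on MemoryLayout.offset(of:), which returns nil for key paths that do not refer to directly addressable storage, and the Sample type is hypothetical.

```swift
extension UnsafeMutablePointer {
  /// Sketch of the proposed subscript; the body is an illustrative assumption.
  subscript<Property>(
    property: WritableKeyPath<Pointee, Property>
  ) -> UnsafeMutablePointer<Property>? {
    // A computed property has no storage offset, so there is no pointer to return.
    guard let offset = MemoryLayout<Pointee>.offset(of: property) else { return nil }
    return (UnsafeMutableRawPointer(self) + offset)
      .assumingMemoryBound(to: Property.self)
  }
}

// Hypothetical aggregate with two stored properties and one computed property.
struct Sample {
  var count: Int32 = 1
  var total: Int64 = 2
  var doubled: Int64 {
    get { total * 2 }
    set { total = newValue / 2 }
  }
}

var sample = Sample()
let computedIsNil = withUnsafeMutablePointer(to: &sample) { pointer -> Bool in
  pointer[\.total]!.pointee = 40     // writes through a pointer to the stored property
  return pointer[\.doubled] == nil   // computed property: no storage, hence nil
}
```

The type of the returned pointer comes from the key path itself, so no call to assumingMemoryBound appears at the use site.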
With this new subscript, a correct call to pthread_getschedparam becomes the much simpler:
var e = withUnsafeMutablePointer(to: &scheduling) {
  pthread_getschedparam(thread, $0[\.policy]!, $0[\.parameters]!)
}
Add unchecked argument label to UnsafePointer's integer subscript
In Swift, it is customary for subscripts to have a precondition that their argument be valid. It is reasonable and expected that UnsafePointer should have a less-safe subscript. Unfortunately, the unsafe usage is unmarked at the point of use.
We propose to replace (via deprecation) the existing subscript of UnsafePointer with a subscript that adds an argument label (unchecked). The label will help visually distinguish the unchecked pointer subscript from a "normal" (checked) subscript.
extension UnsafeMutablePointer {
  public subscript(unchecked i: Int) -> Pointee { get set }
}
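As a rough illustration of how the labelled subscript would read at the point of use, the sketch below defines it in terms of pointer arithmetic. The body is an assumption for demonstration purposes and performs no bounds checking, just like the existing unlabelled subscript.

```swift
extension UnsafeMutablePointer {
  /// Sketch of the proposed labelled subscript; the body is an illustrative assumption.
  subscript(unchecked i: Int) -> Pointee {
    get { (self + i).pointee }
    nonmutating set { (self + i).pointee = newValue }
  }
}

let storage = UnsafeMutablePointer<Int>.allocate(capacity: 4)
storage.initialize(repeating: 0, count: 4)
storage[unchecked: 2] = 42         // no bounds check: the caller vouches for the index
let third = storage[unchecked: 2]
let viaOldSpelling = storage[2]    // existing unlabelled subscript, same element
storage.deinitialize(count: 4)
storage.deallocate()
```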
There is precedent for using the word "unchecked" in the standard library. It is frequently used in internal names: the word currently appears as part of a Swift symbol on 197 lines of the standard library source code. It is also used to mark unchecked preconditions in these public APIs: @unchecked Sendable and Range.init(uncheckedBounds:).
Allow comparisons of pointers of any type
Pointers are effectively an index into the fundamental collection that is the computer's memory. Regardless of their type, they represent a unique storage location in memory. As such, having to cast the type of a pointer in order to be able to compare it to another is not a useful exercise.
It's very common to end up with a combination of Mutable and non-Mutable pointers into the same buffer, and the programmer needs to write conversions that satisfy the compiler but have no real effect in the generated code.
To remedy this, we propose to add the following static functions, scoped to the existing _Pointer protocol:
extension _Pointer {
  public static func == <Other: _Pointer>(lhs: Self, rhs: Other) -> Bool
  public static func < <Other: _Pointer>(lhs: Self, rhs: Other) -> Bool
  public static func <= <Other: _Pointer>(lhs: Self, rhs: Other) -> Bool
  public static func > <Other: _Pointer>(lhs: Self, rhs: Other) -> Bool
  public static func >= <Other: _Pointer>(lhs: Self, rhs: Other) -> Bool
}
Note that it is always possible to enclose both pointers in a conversion to UnsafeRawPointer. This addition simply removes the necessity to insert conversions that are always legal.
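The sketch below contrasts the conversion-based spelling with a type-agnostic comparison. The helper functions sameAddress and precedes are hypothetical names used only for illustration. They are written as free functions rather than operators and compare the address bit patterns, which is assumed here to match what the proposed operators would compute.

```swift
// Hypothetical helpers; these names are not part of the proposal.
func sameAddress<P: _Pointer, Q: _Pointer>(_ lhs: P, _ rhs: Q) -> Bool {
  UInt(bitPattern: lhs) == UInt(bitPattern: rhs)
}
func precedes<P: _Pointer, Q: _Pointer>(_ lhs: P, _ rhs: Q) -> Bool {
  UInt(bitPattern: lhs) < UInt(bitPattern: rhs)
}

let elements = UnsafeMutablePointer<Int>.allocate(capacity: 2)
let typed = elements + 1
let raw = UnsafeRawPointer(elements) + MemoryLayout<Int>.stride

// Conversion-based spelling: wrap the typed pointer before comparing.
let equalViaConversion = UnsafeRawPointer(typed) == raw
// Type-agnostic spelling: no conversion at the call site.
let equalViaHelper = sameAddress(typed, raw)
let baseComesFirst = precedes(elements, raw)
elements.deallocate()
```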
Detailed design
Note: please see the draft pull request or the full proposal for details.
Source compatibility
Most of the proposed changes are additive, and therefore are source-compatible.
The existing pointer subscript would be deprecated, and a fixit will support an easy transition.
Effect on ABI stability
We intend to implement these changes in an ABI-neutral manner.
Effect on API resilience
The proposed additions will be public API, and will all be marked @_alwaysEmitIntoClient to support back-deployability.
The deprecated integer subscript will remain in place, and will therefore support pre-existing binaries.
Alternatives considered
API to obtain a pointer properly aligned to store a given type
Instead of the proposed function, we could add an API that simply takes an integer, and rounds the value of the pointer to a multiple of that number. We believe that having a type parameter is the correct default. The disadvantage is that it is not possible at this juncture to define a type whose alignment is greater than 16. Consequently this function cannot be used to obtain a pointer aligned to a cache line, for example. On the other hand, this API does not make such a pointer any harder to obtain than it is today.
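For comparison, the integer-based alternative might look like the following sketch. The name roundedUp(toMultipleOf:) and its body are hypothetical, and the value 64 stands in for an assumed cache-line size.

```swift
extension UnsafeRawPointer {
  /// Hypothetical spelling of the integer-based alternative; not part of the proposal.
  func roundedUp(toMultipleOf alignment: Int) -> UnsafeRawPointer {
    // The mask arithmetic below is only valid for powers of two.
    precondition(alignment > 0 && alignment & (alignment - 1) == 0)
    let mask = UInt(alignment) - 1
    let address = UInt(bitPattern: self)
    return self + Int(((address &+ mask) & ~mask) - address)
  }
}

// 64 is an assumed cache-line size, used only for illustration.
let block = UnsafeMutableRawPointer.allocate(byteCount: 256, alignment: 64)
let lineStart = (UnsafeRawPointer(block) + 1).roundedUp(toMultipleOf: 64)
let lineOffset = UnsafeRawPointer(block).distance(to: lineStart)   // 64
block.deallocate()
```

Unlike the type-based function, this variant has to validate its argument at run time, since an arbitrary integer is not guaranteed to be a power of two.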
The name of the function could simply be advanced<T>(toAlignmentOf: T.Type). This pairs well with the existing pointer advancement functions, but implies that the returned value is always different from self. The name roundedUp correctly describes the idempotent behaviour.
There is a pre-existing internal API to obtain pointers aligned with a type's alignment, consisting of static members of MemoryLayout<T>. We believe that the functionality is a more natural fit as methods of Unsafe[Mutable]RawPointer.
API to obtain a pointer to a member of an aggregate value
It might be possible to use the @dynamicMemberLookup functionality to make this even more elegant. It isn't clear to the authors what the ABI impact of that approach would be. On the other hand, we know that the approach suggested above can be ABI-neutral.
We could provide the same functionality as a function instead:
func pointer<Property>(to: KeyPath<Pointee, Property>) -> UnsafePointer<Property>?
A subscript could be misconstrued as providing access directly to the stored property, but we feel that the subscript is nevertheless a more elegant solution.
Add unchecked argument label to UnsafePointer's integer subscript
The community could decide not to do this. The authors believe that unsafe API would be improved by indicating the nature of their unsafety at the point of use, and this pitch is a first step for such improvements.
In addition to changing the UnsafePointer subscript, we could also add a subscript to UnsafeBufferPointer that includes the unchecked argument label. The behaviour of this additional subscript would be different from the behaviour of the existing integer subscript, and would not be a replacement. As a reminder, UnsafeBufferPointer.subscript(_ i: Int) performs bounds-checking in debug mode, and skips bounds-checking in release mode.
This behaviour leads to optimization issues when there are three compilation units (the standard library, user code, and a third-party library that uses UnsafeBufferPointer), limiting the optimizations available to the library code.
Adding an unchecked subscript to UnsafeBufferPointer could help the ultimate performance of such third-party libraries. Changing the default behaviour of UnsafeBufferPointer's subscript with regard to bounds-checking is out of scope for this proposal.
Allow comparisons of pointers of any type
Compiler performance is a concern, and operator overloads have been the cause of performance issues in the past. Preliminary compiler performance testing suggests that this addition does not appreciably affect performance.
Acknowledgements
Thanks to Kyle Macomber and the Swift Standard Library team for valuable feedback.