[Pitch] Expand usability of `withMemoryRebound`

Thanks for everyone's feedback. I made modest updates to the proposal document. The updated version is pasted below:

Expand usability of withMemoryRebound

Introduction

The function withMemoryRebound(to:capacity:_ body:)
executes a closure while temporarily binding a range of memory to a type other than the one it is currently bound to.
We propose to lift some notable limitations of withMemoryRebound and enable rebinding to a larger set of types,
as well as rebinding from raw memory pointers and buffers.

Swift-evolution thread: Pitch thread

Motivation

When using Swift in a systems programming context or using Swift with libraries written in C,
we occasionally need to temporarily access a range of memory as instances of a different type than has been declared
(the pointer's Pointee type parameter).
In those cases, withMemoryRebound is the tool to reach for,
allowing scoped access to the range of memory as another type.

As a reminder, the function is declared as follows on the type UnsafePointer<Pointee>:

func withMemoryRebound<T, Result>(
  to type: T.Type,
  capacity count: Int,
  _ body: (UnsafePointer<T>) throws -> Result
) rethrows -> Result

This function is currently more limited than necessary.
It requires that the stride of Pointee and T be equal.
This requirement makes many legitimate use cases technically illegal,
even though they could be supported by the compiler.

We propose to allow temporarily binding to a type T whose stride is
a whole fraction or whole multiple of Pointee's stride,
when the starting address is properly aligned for type T.
As before, T's memory layout must be compatible with that of Pointee.
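
The stride and alignment conditions above can be sketched as a runtime check. This is only an illustration: satisfiesProposedRule is a hypothetical helper, not part of the proposal, and it tests only the two mechanical conditions. Layout compatibility cannot be checked this way and remains the caller's responsibility.

```swift
// Hypothetical helper (not part of the proposal): reports whether rebinding
// from Pointee to T would meet the proposed stride and alignment conditions.
func satisfiesProposedRule<Pointee, T>(
  _ pointer: UnsafePointer<Pointee>,
  to type: T.Type
) -> Bool {
  let from = MemoryLayout<Pointee>.stride
  let to = MemoryLayout<T>.stride
  // One stride must evenly divide the other (whole fraction or whole multiple).
  let stridesCompatible = (from % to == 0) || (to % from == 0)
  // The starting address must be properly aligned for the temporary type T.
  let address = UInt(bitPattern: pointer)
  let aligned = address % UInt(MemoryLayout<T>.alignment) == 0
  return stridesCompatible && aligned
}

let values: [Double] = [1, 2, 3, 4]

// Double (stride 8) to a pair of Double (stride 16): a whole multiple.
let pairOK = values.withUnsafeBufferPointer {
  satisfiesProposedRule($0.baseAddress!, to: (Double, Double).self)
}

// Double (stride 8) to three UInt8 (stride 3): neither divides the other.
let tripleOK = values.withUnsafeBufferPointer {
  satisfiesProposedRule($0.baseAddress!, to: (UInt8, UInt8, UInt8).self)
}
```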

For example, suppose that a buffer of Double consisting of a series of (x,y) pairs is returned from data analysis code written in C.
The next step might be to display it in a preview graph, which needs to read CGPoint values.
We need to copy the Double values as pairs to values of type CGPoint (when executing on a 64-bit platform):

var count = 0
let pointer: UnsafePointer<Double> = calculation(&count)

var points = Array<CGPoint>(unsafeUninitializedCapacity: count/2) {
  buffer, initializedCount in
  var p = pointer
  for i in buffer.indices where p+1 < pointer+count {
    buffer.baseAddress!.advanced(by: i).initialize(to: CGPoint(x: p[0], y: p[1]))
    p += 2
  }
  initializedCount = pointer.distance(to: p)/2
}

We could do better with an improved version of withMemoryRebound,
since CGPoint values consist of a pair of CGFloat values,
and CGFloat values are themselves layout-compatible with Double (when executing on a 64-bit platform):

var points = Array<CGPoint>(unsafeUninitializedCapacity: count/2) {
  buffer, initializedCount in
  pointer.withMemoryRebound(to: CGPoint.self, capacity: buffer.count) {
    buffer.baseAddress!.initialize(from: $0, count: buffer.count)
  }
  initializedCount = buffer.count
}

Alternately, the data could have been received as bytes from a network request, wrapped in a Data instance.
Previously we would have needed to do:

let data: Data = ...

var points = Array<CGPoint>(unsafeUninitializedCapacity: data.count/MemoryLayout<CGPoint>.stride) {
  buffer, initializedCount in
  data.withUnsafeBytes { data in
    var read = 0
    for i in buffer.indices where (read+2*MemoryLayout<CGFloat>.stride)<=data.count {
      let x = data.load(fromByteOffset: read, as: CGFloat.self)
      read += MemoryLayout<CGFloat>.stride
      let y = data.load(fromByteOffset: read, as: CGFloat.self)
      read += MemoryLayout<CGFloat>.stride
      buffer.baseAddress!.advanced(by: i).initialize(to: CGPoint(x: x, y: y))
    }
    initializedCount = read / MemoryLayout<CGPoint>.stride
  }
}

In this case, being able to use withMemoryRebound with UnsafeRawBufferPointer improves readability in much the same way as in the example above:

var points = Array<CGPoint>(unsafeUninitializedCapacity: data.count/MemoryLayout<CGPoint>.stride) {
  buffer, initializedCount in
  data.withUnsafeBytes {
    $0.withMemoryRebound(to: CGPoint.self) {
      (_, initializedCount) = buffer.initialize(from: $0)
    }
  }
}

Proposed solution

We propose to lift the restriction that the strides of T and Pointee must be equal.
This means that it will now be considered correct to rebind from a homogeneous aggregate type to the type of its constituent elements,
as they are layout-compatible, even though their strides differ.
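
As a minimal sketch of rebinding an aggregate to its element type, assuming a toolchain where the relaxed stride rule from this pitch is in effect: the array and variable names are illustrative only. A homogeneous tuple of two Int32 values has a stride of 8, while Int32 has a stride of 4.

```swift
// A homogeneous aggregate: each element is a pair of Int32 values.
let int32Pairs: [(Int32, Int32)] = [(1, 2), (3, 4), (5, 6)]

let sum = int32Pairs.withUnsafeBufferPointer { buffer -> Int32 in
  // The capacity is counted in Int32 values: two for every pair.
  let elementCount = buffer.count * 2
  return buffer.baseAddress!.withMemoryRebound(
    to: Int32.self, capacity: elementCount
  ) { ints -> Int32 in
    var total: Int32 = 0
    for i in 0..<elementCount {
      total += ints[i]
    }
    return total
  }
}
```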

Instance methods of UnsafePointer<Pointee> and UnsafeMutablePointer<Pointee>

We propose to lift the restriction that the strides of T and Pointee must be equal, when calling withMemoryRebound.
The function declarations remain the same on these two types,
though given the relaxed restriction,
we must clarify the meaning of the capacity argument.
capacity shall mean the number of strides of elements of the temporary type (T) to be temporarily bound.
The documentation will be updated to reflect the changed behaviour.
We will also add parameter labels to the closure type declaration to benefit code completion (a source-compatible change).

extension UnsafePointer {
  public func withMemoryRebound<T, Result>(
    to type: T.Type,
    capacity count: Int,
    _ body: (_ pointer: UnsafePointer<T>) throws -> Result
  ) rethrows -> Result
}

extension UnsafeMutablePointer {
  public func withMemoryRebound<T, Result>(
    to type: T.Type,
    capacity count: Int,
    _ body: (_ pointer: UnsafeMutablePointer<T>) throws -> Result
  ) rethrows -> Result
}
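
To illustrate the clarified meaning of capacity, here is a small sketch that rebinds in the opposite direction, from elements to pairs. It assumes the proposed semantics are available; the names are illustrative. Six Double values are viewed as three pairs, so the capacity passed is 3, not 6.

```swift
var coordinates: [Double] = [1, 2, 3, 4, 5, 6]

coordinates.withUnsafeMutableBufferPointer { buffer in
  let pairCount = buffer.count / 2
  // capacity counts (Double, Double) pairs, not Double values.
  buffer.baseAddress!.withMemoryRebound(
    to: (Double, Double).self, capacity: pairCount
  ) { pairs in
    for i in 0..<pairCount {
      // Swap x and y in place through the temporarily rebound pointer.
      pairs[i] = (pairs[i].1, pairs[i].0)
    }
  }
}
```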

Instance methods of UnsafeRawPointer and UnsafeMutableRawPointer

We propose adding a withMemoryRebound method, which currently does not exist on these types.
Since it operates on raw memory, this version of withMemoryRebound places no restriction on the temporary type (T).
It is therefore up to the program author to ensure type safety when using these methods.
As in the UnsafePointer case, capacity means the number of strides of elements of the temporary type (T) to be temporarily bound.

extension UnsafeRawPointer {
  public func withMemoryRebound<T, Result>(
    to type: T.Type,
    capacity count: Int,
    _ body: (_ pointer: UnsafePointer<T>) throws -> Result
  ) rethrows -> Result
}

extension UnsafeMutableRawPointer {
  public func withMemoryRebound<T, Result>(
    to type: T.Type,
    capacity count: Int,
    _ body: (_ pointer: UnsafeMutablePointer<T>) throws -> Result
  ) rethrows -> Result
}
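
A minimal sketch of how the raw-pointer variant might be used, assuming the methods declared above are available. sumViaRawPointer is a hypothetical function written for this example. The memory is initialized as pairs of UInt32 and then read back as individual UInt32 values; ensuring that this is type-safe is up to the caller.

```swift
// Hypothetical example function; not part of the proposal.
func sumViaRawPointer() -> UInt32 {
  typealias Pair = (UInt32, UInt32)
  let raw = UnsafeMutableRawPointer.allocate(
    byteCount: 2 * MemoryLayout<Pair>.stride,
    alignment: MemoryLayout<Pair>.alignment
  )
  defer { raw.deallocate() }

  // Initialize the memory as two (1, 2) pairs.
  raw.initializeMemory(as: Pair.self, repeating: (1, 2), count: 2)

  // capacity counts UInt32 values (four of them), not bytes or pairs.
  return raw.withMemoryRebound(to: UInt32.self, capacity: 4) { words -> UInt32 in
    var total: UInt32 = 0
    for i in 0..<4 {
      total += words[i]
    }
    return total
  }
}

let rawTotal = sumViaRawPointer()
```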

Instance methods of UnsafeBufferPointer and UnsafeMutableBufferPointer

We propose to lift the restriction that the strides of T and Pointee must be equal, when calling withMemoryRebound.
The function declarations remain the same on these two types.
The element count of the buffer as rebound to the temporary type will be calculated using the length of the UnsafeBufferPointer<Element> and the stride of the temporary type.
The documentation will be updated to reflect the changed behaviour.
We will add parameter labels to the closure type declaration to benefit code completion (a source-compatible change).

extension UnsafeBufferPointer {
  public func withMemoryRebound<T, Result>(
    to type: T.Type,
    _ body: (_ buffer: UnsafeBufferPointer<T>) throws -> Result
  ) rethrows -> Result
}

extension UnsafeMutableBufferPointer {
  public func withMemoryRebound<T, Result>(
    to type: T.Type,
    _ body: (_ buffer: UnsafeMutableBufferPointer<T>) throws -> Result
  ) rethrows -> Result
}
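
The buffer variant takes no capacity argument, since the count is derived from the buffer itself. A short sketch, assuming the proposed behaviour and using illustrative names: three pairs of Int32 occupy 24 bytes, so the rebound buffer holds 24 / 4 = 6 Int32 values.

```swift
let pairBuffer: [(Int32, Int32)] = [(1, 2), (3, 4), (5, 6)]

let flattened: [Int32] = pairBuffer.withUnsafeBufferPointer { buffer in
  buffer.withMemoryRebound(to: Int32.self) { ints in
    // ints.count is computed from the byte length and the stride of Int32.
    Array(ints)
  }
}
```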

Instance methods of UnsafeRawBufferPointer and UnsafeMutableRawBufferPointer

We propose adding a withMemoryRebound method, which currently does not exist on these types.
Since it operates on raw memory, this version of withMemoryRebound places no restriction on the temporary type (T).
It is therefore up to the program author to ensure type safety when using these methods.
The element count of the buffer as rebound to the temporary type will be calculated using the length of the UnsafeRawBufferPointer and the stride of the temporary type.

Finally, to complete the set, we propose to add an assumingMemoryBound function that calculates the capacity of the returned UnsafeBufferPointer.

extension UnsafeRawBufferPointer {
  public func withMemoryRebound<T, Result>(
    to type: T.Type,
    _ body: (_ buffer: UnsafeBufferPointer<T>) throws -> Result
  ) rethrows -> Result
  
  public func assumingMemoryBound<T>(to type: T.Type) -> UnsafeBufferPointer<T>
}

extension UnsafeMutableRawBufferPointer {
  public func withMemoryRebound<T, Result>(
    to type: T.Type,
    _ body: (_ buffer: UnsafeMutableBufferPointer<T>) throws -> Result
  ) rethrows -> Result

  public func assumingMemoryBound<T>(to type: T.Type) -> UnsafeMutableBufferPointer<T>
}
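
The two raw-buffer additions might be used as follows. This is a sketch that assumes the methods declared above exist as proposed; the variable names are illustrative. Four Double values occupy 32 bytes, so rebinding to pairs yields 2 elements, while assuming the memory is bound to Double yields 4.

```swift
let samples: [Double] = [0.5, 1.5, 2.5, 3.5]

let (reboundCount, assumedCount) = samples.withUnsafeBytes { bytes -> (Int, Int) in
  // 32 bytes / 16 bytes per (Double, Double) = 2 elements.
  let pairCount = bytes.withMemoryRebound(to: (Double, Double).self) { pairs -> Int in
    pairs.count
  }
  // The array's memory is already bound to Double, so no rebinding is needed;
  // assumingMemoryBound only computes the typed buffer: 32 / 8 = 4 elements.
  let typed = bytes.assumingMemoryBound(to: Double.self)
  return (pairCount, typed.count)
}
```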

Detailed design

Note: please see the draft PR or the full proposal for details.

Source compatibility

This proposal is source-compatible.
Some changes are compatible with existing correct uses of the API,
while others are additive.

Effect on ABI stability

This proposal consists of ABI-preserving changes and ABI-additive changes.

Effect on API resilience

The behaviour change for withMemoryRebound is compatible with previous uses,
since restrictions were lifted.
Code that depends on the new semantics may not be compatible with old versions of these functions.
Back-deployment of new binaries will be supported by making the updated versions @_alwaysEmitIntoClient.
Compatibility of old binaries with a new standard library will be supported by ensuring that a compatible entry point remains.

Alternatives considered

One alternative is to implement none of this change, and leave withMemoryRebound as is.
The usability problems of withMemoryRebound would remain.

Another alternative is to leave the type layout restrictions as they are for the typed Pointer and BufferPointer types,
but add the withMemoryRebound functions to the RawPointer and RawBufferPointer variants.
In that case, the stride restriction would be no more than a speedbump,
because it would be straightforward to bypass it by transiting through the appropriate Raw variant.
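
To make the bypass concrete, here is a sketch of what transiting through the Raw variant could look like, assuming the raw-pointer withMemoryRebound is available. The names are illustrative. Converting to UnsafeRawPointer discards the Pointee type, so a stride rule kept only on the typed variants would not apply to the call.

```swift
let doubles: [Double] = [1, 2, 3, 4]

let firstPair: (Double, Double) = doubles.withUnsafeBufferPointer { buffer in
  // Step through the raw pointer, then rebind to a type with a different stride.
  UnsafeRawPointer(buffer.baseAddress!).withMemoryRebound(
    to: (Double, Double).self, capacity: buffer.count / 2
  ) { pairs in
    pairs[0]
  }
}
```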
