After digging deeper...
The most problematic APIs that I've seen w.r.t. pointer types are Foundation.Data and SwiftNIO. I'll go through a lot of the issues with those below.
In short, withUnsafeTypePunnedPointer would only be useful when implementing Swift wrappers on top of some "difficult" C APIs. It does not solve the vast majority of the usability and correctness issues that I've seen in practice.
I still think it's fine to add such a thing, as long as it doesn't make those correctness issues worse. Along those lines, something called withUnsafeTypePunnedPointer should not actually bind memory as proposed:
func withUnsafeTypePunnedPointer<T, U, R>(
    to: T,
    do body: (UnsafePointer<U>) -> R
) -> R {
    return withUnsafePointer(to: to) {
        return $0.withMemoryRebound(
            to: U.self,
            capacity: MemoryLayout<T>.size / MemoryLayout<U>.size,
            body)
    }
}
The above semantics will conflict with later support for type punned pointers. Given the name, people would naturally think it's OK to use type punning within the closure, but that would be undefined behavior because the closure actually takes a strongly typed UnsafePointer.
For now, we could add the helper, but call it withUnsafeTypeConvertedPointer. Later we can introduce the safer variant, withUnsafeTypePunnedPointer, once we have an UnsafeTypePunnedPointer type.
Here are the actual issues that I see today with developers using memory binding APIs:
Foundation.Data
In Foundation.Data, we have a raw buffer that may be shared with other code, and we want to temporarily view that buffer as some user-provided type. Memory binding is not designed for this. Memory binding only makes sense when the code doing the binding has exclusive ownership of the memory at the time and knows the memory's current type. When you bind memory to a type, you either persistently bind uninitialized memory to a type, or temporarily rebind memory to a different type.
You cannot temporarily rebind memory of unknown type that is used somewhere else, then rebind it back to that unknown type when the closure completes.
Another way to look at this is that the memory model gives each memory location a single global type state. Verification of the model can make use of this property. We could add more complexity to the model with the addition of "memory type scopes" in SIL. I just don't think that's desirable.
Really, the only way to fix Foundation.Data's API is for the user-provided closure to either take UnsafeRawPointer (which can be accessed with its type safe API), or take a weakly typed UnsafeTypePunnedPointer. Specifically, Data should definitely declare this method:
public func withUnsafeBytes<ResultType>(
_ body: (UnsafeRawBufferPointer) throws -> ResultType
) rethrows -> ResultType
And it should possibly eventually have this method too:
public func withUnsafeTypePunnedPointer<ResultType, ContentType>(
_ body: (UnsafeTypePunnedBufferPointer<ContentType>) throws -> ResultType
) rethrows -> ResultType
Alternatively, we could deprecate Data's bytesNoCopy initializer so that the only way Data's memory can be typed is if Data binds the type itself.
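To make "accessed with its type safe API" concrete, here is a minimal sketch of typed loads and stores going through a raw buffer without ever binding the memory. The function name `roundTripThroughRawMemory` is mine, purely for illustration; only the standard library raw pointer calls are real.

```swift
// Stores two UInt32 values into untyped memory and loads them back.
// The memory is never bound to a type; every access names its type explicitly.
func roundTripThroughRawMemory() -> (first: UInt32, second: UInt32, byteCount: Int) {
    let raw = UnsafeMutableRawBufferPointer.allocate(
        byteCount: 8,
        alignment: MemoryLayout<UInt32>.alignment)
    defer { raw.deallocate() }

    // Typed stores through the raw buffer.
    raw.storeBytes(of: 0x01020304 as UInt32, toByteOffset: 0, as: UInt32.self)
    raw.storeBytes(of: 0xAABBCCDD as UInt32, toByteOffset: 4, as: UInt32.self)

    // Typed loads through the raw buffer. Offsets 0 and 4 are aligned for UInt32.
    let first = raw.load(fromByteOffset: 0, as: UInt32.self)
    let second = raw.load(fromByteOffset: 4, as: UInt32.self)
    return (first, second, raw.count)
}

let roundTrip = roundTripThroughRawMemory()
```

Because no binding happens, the same bytes can be viewed as another type by some other code without either side invalidating the other's assumptions.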
SwiftNIO
SwiftNIO has a lot of code that works with raw byte buffers and needs to call out to various typed C APIs. Here are the basic use cases...
Nested TLDR: There's a lot of code here that I think should be simplified. @johannesweiss pointed out that, with full knowledge of the codebase, most of these cases are actually correct, in that memory model verification will succeed once we have it. However, the safe way to write the code is always simpler.
Load and store bytes of a raw buffer
Current Swift
let address = buffer.baseAddress!.assumingMemoryBound(to: UInt8.self)
for { //...
let byte = address.advanced(by: idx).pointee
}
Safe Swift
for { //...
let byte = buffer[idx]
}
And...
Current Swift
func returnStorage() -> UnsafeMutablePointer<UInt8> {
return self.bytes.advanced(by: idx).assumingMemoryBound(to: UInt8.self)
}
while { //...
base = returnStorage()
base[idx] = byte
idx += 1
}
Safe Swift
while { //...
self.bytes[idx] = byte
idx += 1
}
UnsafeRawBufferPointer is already a collection of UInt8 bytes.
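A small self-contained sketch of that point, using helper names (`byteSum`, `fill`) that I made up for the example: both the read and the write path index the raw buffer directly, with no typed pointer in sight.

```swift
// Reads every byte through the raw buffer's own subscript.
func byteSum(_ buffer: UnsafeRawBufferPointer) -> Int {
    var sum = 0
    for idx in buffer.indices {
        sum += Int(buffer[idx])
    }
    return sum
}

// Writes every byte through the mutable raw buffer's subscript.
func fill(_ buffer: UnsafeMutableRawBufferPointer, with byte: UInt8) {
    var idx = 0
    while idx < buffer.count {
        buffer[idx] = byte
        idx += 1
    }
}

var storage: [UInt8] = [1, 2, 3, 250]
let sumBefore = storage.withUnsafeBytes { byteSum($0) }
storage.withUnsafeMutableBytes { fill($0, with: 2) }
let sumAfter = storage.withUnsafeBytes { byteSum($0) }
```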
Copy any UInt8 Collection into memory
Current Swift
let base = outBytes.assumingMemoryBound(to: UInt8.self)
inCollection.withUnsafeBytes { srcPtr in
base.assign(from: srcPtr.baseAddress!.assumingMemoryBound(to: UInt8.self), count: n)
}
Safe Swift
outBytes.copyBytes(from: inCollection)
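Here is a runnable sketch of the copyBytes version. The `copyIn` wrapper and the bounds precondition are my additions for illustration; they are not taken from NIO.

```swift
// Copies any collection of UInt8 into raw memory without a typed pointer.
func copyIn<C: Collection>(_ source: C, to destination: UnsafeMutableRawBufferPointer)
    where C.Element == UInt8 {
    // copyBytes requires the destination to be at least as large as the source.
    precondition(source.count <= destination.count, "destination too small")
    destination.copyBytes(from: source)
}

func demoCopy() -> [UInt8] {
    let outBytes = UnsafeMutableRawBufferPointer.allocate(byteCount: 4, alignment: 1)
    defer { outBytes.deallocate() }

    // Zero the whole buffer first so every byte read back below is initialized.
    outBytes.copyBytes(from: repeatElement(0 as UInt8, count: 4))
    copyIn([10, 20, 30] as [UInt8], to: outBytes)
    return [UInt8](outBytes)
}

let copied = demoCopy()
```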
Initialize a UInt8 Array
Current Swift
return Array.init(UnsafeBufferPointer<UInt8>(
start: rawptr.baseAddress?.advanced(by: index).assumingMemoryBound(to: UInt8.self),
count: length))
Safe Swift
return [UInt8](rawptr[index..<(index + length)])
Decode a String
Current Swift
String(decoding: UnsafeBufferPointer(
start: rawptr.baseAddress?.assumingMemoryBound(to: UInt8.self).advanced(by: index),
count: length),
as: UTF8.self)
Safe Swift
String(decoding: rawptr[index..<(index + length)], as: UTF8.self)
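The array and string cases above both reduce to slicing the raw buffer. A minimal sketch, where the `decode` helper is hypothetical and only there to make the example testable:

```swift
// Builds a [UInt8] and a String from the same slice of a raw buffer.
func decode(_ rawptr: UnsafeRawBufferPointer,
            index: Int,
            length: Int) -> (bytes: [UInt8], text: String) {
    // A slice of a raw buffer is itself a collection of UInt8.
    let slice = rawptr[index..<(index + length)]
    return ([UInt8](slice), String(decoding: slice, as: UTF8.self))
}

let message = Array("hello, world".utf8)
let decoded = message.withUnsafeBytes { decode($0, index: 7, length: 5) }
```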
Store a typed pointer pointing to the interior of a raw buffer
C
struct z_stream {
    unsigned char *next_in;
    //...
};
Current Swift
let typedPtr = dataPtr.baseAddress!.assumingMemoryBound(to: UInt8.self)
let typedDataPtr = UnsafeMutableBufferPointer(start: typedPtr, count: dataPtr.count)
zstream.next_in = typedDataPtr.baseAddress!
Safe Swift
zstream.next_in = dataPtr.bindMemory(to: UInt8.self).baseAddress
This is one of the rare cases where binding memory is appropriate. We need a persistent, strongly typed pointer into a raw byte buffer. (NIO is essentially implementing its own memory allocator.)
The only alternative I can think of would be to provide some annotation on top of zlib to indicate that the struct fields should be imported as a weakly typed pointer, like the proposed UnsafeTypePunnedPointer. Then the byte buffer's memory never needs to be given a type.
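To show the binding approach end to end without pulling in zlib, here is a sketch with a made-up `FakeStream` struct standing in for z_stream. Its fields and the consuming loop are assumptions for illustration only.

```swift
// Hypothetical stand-in for a C struct that keeps a typed interior pointer.
struct FakeStream {
    var nextIn: UnsafeMutablePointer<UInt8>?
    var availIn: Int = 0
}

func demoBind() -> [UInt8] {
    let dataPtr = UnsafeMutableRawBufferPointer.allocate(byteCount: 4, alignment: 1)
    defer { dataPtr.deallocate() }
    dataPtr.copyBytes(from: [1, 2, 3, 4] as [UInt8])

    // Bind once. The memory now stays typed as UInt8 for as long as the
    // stream holds on to the pointer.
    let typed = dataPtr.bindMemory(to: UInt8.self)
    var stream = FakeStream(nextIn: typed.baseAddress, availIn: typed.count)

    // Consume the input the way a C library would: through the typed pointer.
    var consumed: [UInt8] = []
    while stream.availIn > 0, let p = stream.nextIn {
        consumed.append(p.pointee)
        stream.nextIn = p + 1
        stream.availIn -= 1
    }
    return consumed
}

let consumedBytes = demoBind()
```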
Call an arbitrary C API (through a wrapper)
Current Swift
public static func read(pointer: UnsafeMutablePointer<UInt8>, size: size_t)
buf.writeWithUnsafeMutableBytes { ptr in
read(pointer: ptr.baseAddress!.assumingMemoryBound(to: UInt8.self), size: n)
}
Safe Swift
public static func read(pointer: UnsafeMutableRawPointer, size: size_t)
read(pointer: pointer, size: n)
There's already a Swift wrapper around the C API. Just make its byte buffer argument type UnsafeMutableRawPointer. Defining a wrapper is a far better approach than always calling it via withUnsafeTypePunnedPointer(to:) or withMemoryRebound(to:).
The pthread API
Current and Safe Swift
let res = pthread_create(
&pt,
nil,
{ p in
let box = Unmanaged<ThreadBox>.fromOpaque((p as UnsafeMutableRawPointer?)!
.assumingMemoryBound(to:ThreadBox.self)).takeRetainedValue()
// ... The rest of the thread start routine
},
Unmanaged.passRetained(box).toOpaque())
Wow, this is horrible. Exactly as horrible as it should be! This is a great use of the assumingMemoryBound API that proves that we still need it even though it's massively misused. Notice that, within the same statement, we pass a typed pointer in, which is then passed to a void * callback. Naturally, within the callback, we can "assume" the type of the pointer!
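The same pattern, reduced so it runs without pthread: `callWithContext` is a hypothetical stand-in for a C API that hands a void * context back to a callback. Because the typed pointer goes in and comes back within one statement, assuming its type inside the callback is justified.

```swift
// Hypothetical stand-in for a C API taking a `void *` context and a callback.
func callWithContext(_ context: UnsafeMutableRawPointer,
                     _ callback: (UnsafeMutableRawPointer) -> Void) {
    callback(context)
}

func demoAssume() -> Int {
    // This memory is allocated and initialized as Int, so its bound type is known.
    let counter = UnsafeMutablePointer<Int>.allocate(capacity: 1)
    counter.initialize(to: 41)
    defer {
        counter.deinitialize(count: 1)
        counter.deallocate()
    }

    callWithContext(UnsafeMutableRawPointer(counter)) { p in
        // We passed a pointer to Int memory in this very statement, so the
        // assumption holds by construction. No rebinding happens here.
        let typed = p.assumingMemoryBound(to: Int.self)
        typed.pointee += 1
    }
    return counter.pointee
}

let assumedResult = demoAssume()
```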
The socket API
Current Swift
class Socket {
    func read(pointer: UnsafeMutablePointer<UInt8>, size: Int)
}
mutating func withMutableWritePointer<T>(body: (UnsafeMutablePointer<UInt8>, Int) throws -> T) rethrows -> T {
    //...
    let localWriteResult = try body(rawptr.baseAddress!.assumingMemoryBound(to: UInt8.self), rawptr.count)
    //...
}
buffer.withMutableWritePointer(body: socket.read(pointer:size:))
Safe Swift
class Socket {
    func read(pointer: UnsafeMutableRawBufferPointer)
}
mutating func withMutableWritePointer<T>(body: (UnsafeMutableRawBufferPointer) throws -> T) rethrows -> T {
    //...
    let localWriteResult = try body(rawptr)
    //...
}
buffer.withMutableWritePointer(body: socket.read(pointer:))
I think it's extremely rare to call a C API like socket or pthread directly from Swift. There's always going to be a wrapper. Just use raw pointers in the wrapper, since the underlying buffer is really untyped. And use buffer pointers rather than passing around counts!
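A runnable sketch of the raw-buffer wrapper shape. `FakeSocket` and `ByteStore` are invented for this example and do no real I/O; they only show that neither side needs a typed pointer or a separate count.

```swift
// Hypothetical socket: "reads" from an in-memory queue instead of a file descriptor.
final class FakeSocket {
    var pending: [UInt8]
    init(pending: [UInt8]) { self.pending = pending }

    // The wrapper takes a raw buffer: pointer and size travel together.
    func read(into buffer: UnsafeMutableRawBufferPointer) -> Int {
        let n = min(buffer.count, pending.count)
        buffer.copyBytes(from: pending[0..<n])
        pending.removeFirst(n)
        return n
    }
}

// Hypothetical byte buffer that hands out its writable region as raw bytes.
struct ByteStore {
    var storage = [UInt8](repeating: 0, count: 8)
    var writerIndex = 0

    mutating func withMutableWritePointer(
        _ body: (UnsafeMutableRawBufferPointer) throws -> Int
    ) rethrows -> Int {
        let start = writerIndex
        let written = try storage.withUnsafeMutableBytes { raw in
            // Pass only the unwritten tail, still as raw bytes.
            try body(UnsafeMutableRawBufferPointer(rebasing: raw[start...]))
        }
        writerIndex += written
        return written
    }

    var readableBytes: [UInt8] { Array(storage[0..<writerIndex]) }
}

let fakeSocket = FakeSocket(pending: [1, 2, 3])
var store = ByteStore()
let bytesRead = store.withMutableWritePointer(fakeSocket.read(into:))
```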
Extending sockaddr_in
Current Swift
extension sockaddr_in: SockAddrProtocol {
mutating func withSockAddr<R>(_ body: (UnsafePointer<sockaddr>, Int) throws -> R) rethrows -> R {
var me = self
return try withUnsafeBytes(of: &me) { p in
try body(p.baseAddress!.assumingMemoryBound(to: sockaddr.self), p.count)
}
}
}
func doBind(ptr: UnsafePointer<sockaddr>, bytes: Int) throws {
try Posix.bind(descriptor: fd, ptr: ptr, bytes: bytes)
}
switch address {
case .v4(let address):
address.withSockAddr(doBind)
Safe Swift
extension sockaddr_in: SockAddrProtocol {
    mutating func withSockAddr<R>(_ body: (UnsafePointer<sockaddr>, Int) throws -> R) rethrows -> R {
        return try withUnsafePointer(to: self) {
            try $0.withMemoryRebound(to: sockaddr.self, capacity: 1) { p in
                try body(p, MemoryLayout<sockaddr_in>.size)
            }
        }
    }
}
func doBind(ptr: UnsafePointer<sockaddr>, bytes: Int) throws {
try Posix.bind(descriptor: fd, ptr: ptr, bytes: bytes)
}
switch address {
case .v4(let address):
address.withSockAddr(doBind)
Here you can finally see a legitimate use of withMemoryRebound(to:). We have a pointer to memory, of a known type sockaddr_in. We need to bridge out to an API that needs to view the same memory as a different type for the duration of the call. The scope is nicely contained with no chance of accessing the same memory from differently typed pointers.
Here's where withUnsafeTypePunnedPointer could be used, at the lowest level of C interop. But let's call it withUnsafeTypeConvertedPointer.
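For illustration, here is roughly what the renamed helper could look like when applied to the sockaddr_in case. This is a sketch of a hypothetical API, not something in the standard library: the `as:` label, the precondition, and the `family(of:)` function are my assumptions.

```swift
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#elseif canImport(Musl)
import Musl
#endif

// Hypothetical helper: temporarily rebinds the value's memory to U for the
// duration of the closure. The closure gets a strongly typed pointer, so
// type punning inside the closure is still not allowed.
func withUnsafeTypeConvertedPointer<T, U, R>(
    to value: T,
    as type: U.Type,
    _ body: (UnsafePointer<U>) throws -> R
) rethrows -> R {
    precondition(MemoryLayout<U>.stride > 0
        && MemoryLayout<T>.size >= MemoryLayout<U>.size,
        "converted type must fit in the original value")
    return try withUnsafePointer(to: value) {
        try $0.withMemoryRebound(
            to: U.self,
            capacity: MemoryLayout<T>.size / MemoryLayout<U>.stride,
            body)
    }
}

// Views a sockaddr_in as the generic sockaddr that C socket APIs expect.
func family(of address: sockaddr_in) -> Int32 {
    return withUnsafeTypeConvertedPointer(to: address, as: sockaddr.self) {
        Int32($0.pointee.sa_family)
    }
}

var exampleAddress = sockaddr_in()
exampleAddress.sin_family = sa_family_t(AF_INET)
let exampleFamily = family(of: exampleAddress)
```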