withUnsafeTypePunnedPointer

Andrew_Trick · July 23, 2018, 2:03am

@johannesweiss is correct. If the code always binds the buffer to UInt8 before it initializes the memory (via either initializeMemory or bindMemory), and the memory is never bound to a different type, then all of your assumingMemoryBound(to: UInt8.self) calls would pass a hypothetical memory model verifier. Wherever I said "correct" in my previous post, I should have said "safe" (except maybe the sockaddr_in example).
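As a minimal sketch of that pattern (the function name roundTripBytes is made up for illustration and is not from SwiftNIO), the memory is bound to UInt8 as it is initialized and never rebound, so the later assumingMemoryBound(to:) call only restates a fact that already holds:

```swift
// Hypothetical helper: copies bytes into raw memory and reads them back.
func roundTripBytes(_ bytes: [UInt8]) -> [UInt8] {
  let raw = UnsafeMutableRawPointer.allocate(byteCount: bytes.count, alignment: 1)
  defer { raw.deallocate() }

  // initializeMemory binds the memory to UInt8 as it initializes it.
  raw.initializeMemory(as: UInt8.self, from: bytes, count: bytes.count)

  // Safe: the memory was bound to UInt8 above and is never bound to
  // any other type before this call.
  let typed = raw.assumingMemoryBound(to: UInt8.self)
  return Array(UnsafeBufferPointer(start: typed, count: bytes.count))
}
```

The justification is only easy to check here because the binding and the assumption sit a few lines apart; the further apart they drift, the weaker the argument becomes.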

I know you fully understand how the Swift APIs work and why those uses of assumingMemoryBound(to:) were dangerous as written. But for the sake of a larger audience...

UnsafeRaw[Buffer]Pointer is an important family of types for working with byte buffers and streams. Regular developers should feel comfortable with and confident about using them safely. Developers should not need to reason about the type that memory is "bound" to, and they should not normally need to use the explicit memory binding APIs: bindMemory(to:capacity:), withMemoryRebound(to:capacity:), and assumingMemoryBound(to:).

It's validating for me to see, after reviewing SwiftNIO, that the C APIs causing the need for memory binding workarounds today are exactly the same ones that originally motivated the memory binding APIs in the UnsafeRawPointer Migration Guide.

The usability of raw pointers was improved quickly after the initial Swift 3 release by introducing UnsafeRawBufferPointer. That reduced the temptation for pure Swift code to bind raw memory to a type just as a convenience.
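To illustrate what working with bytes without any binding looks like, here is a small sketch. The function readLittleEndianUInt32 is a hypothetical name chosen for this example; it reads through UnsafeRawBufferPointer's byte subscript, so no binding API appears at all:

```swift
// Hypothetical helper: decodes a little-endian UInt32 from a byte array.
func readLittleEndianUInt32(from bytes: [UInt8], at offset: Int) -> UInt32 {
  return bytes.withUnsafeBytes { (buffer: UnsafeRawBufferPointer) -> UInt32 in
    precondition(offset >= 0 && offset + 4 <= buffer.count, "read out of bounds")
    var value: UInt32 = 0
    for i in 0..<4 {
      // The subscript yields one UInt8 per byte; nothing is bound or rebound.
      value |= UInt32(buffer[offset + i]) << (8 * i)
    }
    return value
  }
}
```

Reading byte by byte also sidesteps alignment concerns, which is why the offset can be arbitrary in this sketch.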

`assumingMemoryBound(to:)`

I'm very concerned that developers seeing code that uses assumingMemoryBound will think it is a generally safe and intended mechanism for pointer type conversion in Swift. It's not that. It is entirely a backdoor for writing wrappers around C APIs.

I consider any Swift code calling assumingMemoryBound to be unsafe unless:

- It is accompanied by a comment explaining why the assumption holds in this particular case.
- That logic can be easily confirmed by locally reasoning about the code.

I know of two legitimate use cases for assumingMemoryBound:

- Writing a wrapper around a C API taking a void * callback. The pthread_create example from SwiftNIO that I posted above is a perfect example of that.
- Passing a Swift Unsafe[Mutable]RawPointer to a C API taking char * or unsigned char *.

(The original Swift 3 migration guide justified the need for this API using exactly the same two pthread and CString examples.)

In both cases, the typed pointer produced by assumingMemoryBound(to:) should be passed directly to the imported C code without accessing it in Swift. There's always a danger that the imported C function could be replaced by a Swift shim. However, I don't know how to avoid assumingMemoryBound(to:) in these cases without special support in the Swift type system, which is a fairly high bar.
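A minimal sketch of the char * case, including the kind of justifying comment described above, might look like the following. The function cStringLength is a hypothetical name for illustration; strlen stands in for any imported C function taking char *:

```swift
import Foundation

// Hypothetical helper: builds a NUL-terminated copy of `text` in raw
// memory and asks C's strlen for its length.
func cStringLength(of text: String) -> Int {
  let utf8 = Array(text.utf8)
  let byteCount = utf8.count + 1  // room for the trailing NUL
  let raw = UnsafeMutableRawPointer.allocate(byteCount: byteCount, alignment: 1)
  defer { raw.deallocate() }

  // Bind once to CChar, then initialize every element including the NUL.
  let chars = raw.bindMemory(to: CChar.self, capacity: byteCount)
  for (i, byte) in utf8.enumerated() {
    (chars + i).initialize(to: CChar(truncatingIfNeeded: byte))
  }
  (chars + utf8.count).initialize(to: 0)

  // assumingMemoryBound is safe here: `raw` was bound to CChar a few lines
  // above and is never rebound. The typed pointer goes straight to the
  // imported C function and is not dereferenced in Swift.
  return strlen(raw.assumingMemoryBound(to: CChar.self))
}
```

In real code the raw pointer would usually arrive from elsewhere, which is exactly when the comment and the local reasoning matter most.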

`withMemoryRebound(to:capacity:)`

withMemoryRebound(to:capacity:) is safer than assumingMemoryBound(to:), but is also only intended for C interop when two distinct imported C types are known to have a compatible layout. Again, the canonical example of this from the original Swift 3 migration guide is exactly the same case where it's needed in SwiftNIO:

var addr = sockaddr_in()
let sock = socket(PF_INET, SOCK_STREAM, 0)

let result = withUnsafePointer(to: &addr) {
  // Temporarily bind the memory at &addr to a single instance of type sockaddr.
  $0.withMemoryRebound(to: sockaddr.self, capacity: 1) {
    connect(sock, $0, socklen_t(MemoryLayout<sockaddr_in>.stride))
  }
}

The subject of this thread is (or was) introducing a new API to make this particular case more convenient:

// Temporarily bind the memory at &addr to a single instance of type sockaddr.
let result = withUnsafeTypeConvertedPointer(to: &addr, as: sockaddr.self) {
  connect(sock, $0, socklen_t(MemoryLayout<sockaddr_in>.stride))
}
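For readers wondering what such a convenience would do, one plausible reading is a thin wrapper over the two nested calls shown earlier. The sketch below is only a guess at the shape of the API under discussion, not an existing standard library function, and familyOfIPv4Address is a made-up usage helper:

```swift
import Foundation

// Hypothetical implementation of the convenience discussed in this thread:
// temporarily rebinds the storage of `value` to a single instance of U.
func withUnsafeTypeConvertedPointer<T, U, R>(
  to value: inout T,
  as type: U.Type,
  _ body: (UnsafePointer<U>) throws -> R
) rethrows -> R {
  return try withUnsafePointer(to: &value) { pointer in
    try pointer.withMemoryRebound(to: U.self, capacity: 1) { rebound in
      try body(rebound)
    }
  }
}

// Usage sketch: view a sockaddr_in as a sockaddr and read the shared
// address-family field through the rebound pointer.
func familyOfIPv4Address() -> sa_family_t {
  var addr = sockaddr_in()
  addr.sin_family = sa_family_t(AF_INET)
  return withUnsafeTypeConvertedPointer(to: &addr, as: sockaddr.self) {
    $0.pointee.sa_family
  }
}
```

Like withMemoryRebound(to:capacity:) itself, this is only reasonable when the two types are known to be layout compatible, as sockaddr_in and sockaddr are.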

`bindMemory(to:capacity:)`

bindMemory(to:capacity:) is the only one of the three meant for use in pure Swift. It allows Swift code to decouple memory allocation from knowledge of the type that memory will hold. So it's useful for layering a raw memory allocator or raw byte buffer underneath a typed view of the buffer.
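A small sketch of that layering, assuming a trivial "allocator" that just hands out raw bytes (sumViaTypedView is a hypothetical name for this example): the allocation knows only a byte count and an alignment, and the typed view is created afterwards with bindMemory(to:capacity:).

```swift
// Hypothetical helper: stores values in raw memory, then sums them
// through a typed view of that same memory.
func sumViaTypedView(_ values: [Int32]) -> Int32 {
  // Allocation step: no element type is involved, only size and alignment.
  let raw = UnsafeMutableRawPointer.allocate(
    byteCount: values.count * MemoryLayout<Int32>.stride,
    alignment: MemoryLayout<Int32>.alignment)
  defer { raw.deallocate() }

  // Typed layer: bind the raw bytes to Int32, then initialize them.
  let typed = raw.bindMemory(to: Int32.self, capacity: values.count)
  typed.initialize(from: values, count: values.count)
  defer { typed.deinitialize(count: values.count) }

  return UnsafeBufferPointer(start: typed, count: values.count).reduce(0, +)
}
```

The deferred blocks run in reverse order, so the elements are deinitialized before the raw memory is deallocated.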

Interestingly, SwiftNIO reveals a situation where bindMemory(to:capacity:) is also needed for C interop. The C struct z_stream exposes a typed pointer into a raw byte buffer.

struct z_stream {
  unsigned char *next_in;
  //...
}

zstream.next_in = dataPtr.bindMemory(to: UInt8.self, capacity: count)

I'm not sure how else to get around this without defining your own layout-compatible z_stream struct in C that takes void *. Then you would need to deal with the additional type mismatch when calling zlib.

If you ask me, this is a pretty good example of why it could be worthwhile to add some type system support for importing [unsigned] char * "differently", at least in some cases.