The key to this is that you can read arbitrary data from a raw pointer without needing to bind it. The right terminology here is that you load data from the pointer, as with UnsafeRawBufferPointer.load(fromByteOffset:as:).
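As a rough illustration of loading without binding, here is a minimal sketch. The function name loadWithoutBinding is made up for this example; the pointer calls are the standard library's raw pointer API.

```swift
// Hypothetical helper: round-trips two UInt32 values through raw, unbound memory.
func loadWithoutBinding() -> (UInt32, UInt32) {
    // 8 bytes of raw memory, aligned so both UInt32 slots are properly aligned.
    let raw = UnsafeMutableRawPointer.allocate(byteCount: 8, alignment: 4)
    defer { raw.deallocate() }

    // storeBytes writes the value's bytes without binding the memory to a type.
    raw.storeBytes(of: 0x0102_0304 as UInt32, as: UInt32.self)
    raw.storeBytes(of: 42 as UInt32, toByteOffset: 4, as: UInt32.self)

    // load reads the bytes back as whatever (trivial) type is requested.
    let first = raw.load(as: UInt32.self)
    let second = raw.load(fromByteOffset: 4, as: UInt32.self)
    return (first, second)
}
```

At no point is the memory bound to UInt32; the type only appears at each individual store and load.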
Reframed slightly: when you bind memory in Swift, you are asserting to the compiler and optimizer that the memory can only contain data of a certain type, be it UInt8, Double, String, or MyCustomType. Memory can only be bound to a single type at a time, cannot be arbitrarily rebound (see UnsafePointer.withMemoryRebound(to:capacity:) for some more nuance), and cannot be unbound without being deallocated.
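For the rebinding nuance, a small sketch may help. It assumes two layout-compatible types (UInt32 and Int32), and the function name reboundSum is hypothetical. The memory is bound to UInt32 for its whole lifetime and only viewed as Int32 for the duration of the closure.

```swift
// Hypothetical helper: temporarily views UInt32 memory as Int32.
func reboundSum() -> Int32 {
    let count = 2
    // Typed allocation: this memory is bound to UInt32.
    let typed = UnsafeMutablePointer<UInt32>.allocate(capacity: count)
    defer { typed.deallocate() }
    typed.initialize(repeating: 7, count: count)
    defer { typed.deinitialize(count: count) }  // runs before deallocate

    // Inside the closure the memory is rebound to Int32; on return it is
    // bound to UInt32 again, so `rebound` must not escape the closure.
    return typed.withMemoryRebound(to: Int32.self, capacity: count) { rebound in
        rebound[0] + rebound[1]
    }
}
```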
Raw memory, on the other hand, does not have these restrictions, and can be accessed byte-wise as any type, so long as you get the stride and alignment correct. From the UnsafeRawBufferPointer docs:
Each byte in memory is viewed as a UInt8 value independent of the type of values held in that memory. Reading from memory through a raw buffer is an untyped operation.
In addition to its collection interface, an UnsafeRawBufferPointer instance also supports the load(fromByteOffset:as:) method provided by UnsafeRawPointer, including bounds checks in debug mode.
Leaving memory unbound makes reading from it significantly more manual (you have to correctly manage byte offsets and ensure your stride and alignment are correct), but it means that you can safely read anything you want out of it.
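To show what "more manual" looks like in practice, here is a sketch of computing the byte offset from the stride and checking alignment by hand before loading. The helper element(at:of:in:) is hypothetical, and it assumes T is a trivial type such as an integer or floating-point type.

```swift
// Hypothetical helper: reads the index-th T out of a raw buffer.
// Assumes T is a trivial type and the buffer really holds values of that layout.
func element<T>(at index: Int, of type: T.Type, in buffer: UnsafeRawBufferPointer) -> T {
    precondition(index >= 0, "negative index")
    // Consecutive elements are one stride apart, not one size apart.
    let offset = index * MemoryLayout<T>.stride
    precondition(offset + MemoryLayout<T>.size <= buffer.count, "read out of bounds")
    // load(fromByteOffset:as:) requires the address to be aligned for T.
    let address = Int(bitPattern: buffer.baseAddress!) + offset
    precondition(address % MemoryLayout<T>.alignment == 0, "misaligned read")
    return buffer.load(fromByteOffset: offset, as: T.self)
}

let doubles: [Double] = [1.5, 2.5, 3.5]
let secondDouble = doubles.withUnsafeBytes { element(at: 1, of: Double.self, in: $0) }
```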
So this is how both UnsafeRawBufferPointer and Data can be sequences of UInt8: the UInt8 Element type is really a "byte" type which you're getting raw access to. The difference between that and UnsafeBufferPointer<UInt8> is... subtle... if there is truly a meaningful difference. The language could guarantee that the compiler and optimizer treat UnsafeBufferPointer<UInt8> as UnsafeRawBufferPointer and vice versa, and that rebinding that way is always safe; it just doesn't, yet. If and when it does, Data can certainly hand out an UnsafeBufferPointer<UInt8> safely; until then, the safest thing to do is have it hand out UnsafeRawBufferPointer exclusively.
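For reference, this is roughly what consuming Data through its raw buffer looks like. It is a small sketch with made-up sample bytes; the closure receives an UnsafeRawBufferPointer whose elements are UInt8.

```swift
import Foundation

let data = Data([0xDE, 0xAD, 0xBE, 0xEF])

// The raw buffer is itself a collection of UInt8, so ordinary collection
// methods work without binding the memory to any type.
let byteSum = data.withUnsafeBytes { (raw: UnsafeRawBufferPointer) -> Int in
    raw.reduce(0) { $0 + Int($1) }
}
```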
This whole thing started for me when I was trying to fwrite some information. Sometimes that information comes in the form of a String, sometimes in the form of the Data output of JSONEncoder. I tried to unify that by way of withContiguousStorageIfAvailable, clearly unsuccessfully. In light of this discussion, it feels like JSONEncoder should be returning UnsafeBufferPointer<UInt8> and not Data.
I don't have access to a machine with Swift on it at the moment to verify, but IIRC, fwrite takes a const void *, which I believe should be imported into Swift as an UnsafeRawPointer. If this is the case, you should still be able to abstract over these types using a custom protocol: because UnsafeBufferPointer<UInt8> itself conforms to ContiguousBytes, you should be able to get a consistent buffer from both String and Data, and write that out.
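One possible shape for that abstraction, as an untested-in-anger sketch: the protocol RawBytesProviding and the function writeBytes(_:to:) are names invented here, not existing API. The String conformance copies the UTF-8 bytes into an array first, which is simple but not free.

```swift
import Foundation

// Hypothetical protocol: anything that can lend out its bytes as a raw buffer.
protocol RawBytesProviding {
    func withRawBytes<R>(_ body: (UnsafeRawBufferPointer) throws -> R) rethrows -> R
}

extension Data: RawBytesProviding {
    func withRawBytes<R>(_ body: (UnsafeRawBufferPointer) throws -> R) rethrows -> R {
        try withUnsafeBytes(body)
    }
}

extension String: RawBytesProviding {
    func withRawBytes<R>(_ body: (UnsafeRawBufferPointer) throws -> R) rethrows -> R {
        // Copies the UTF-8 code units so there is always contiguous storage.
        try Array(utf8).withUnsafeBytes(body)
    }
}

// Hypothetical helper: writes the bytes to a C FILE stream and returns the
// number of bytes fwrite reports as written.
@discardableResult
func writeBytes<T: RawBytesProviding>(_ value: T, to file: UnsafeMutablePointer<FILE>) -> Int {
    value.withRawBytes { raw in
        fwrite(raw.baseAddress, 1, raw.count, file)
    }
}
```

With that in place, something like writeBytes("hello\n", to: stdout) and writeBytes(try JSONEncoder().encode(value), to: stdout) would go through the same code path.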
FWIW, this is definitely a really thorny topic! I think life would be a lot simpler if the language could make some clearer guarantees about convertibility between UInt8 pointers and Raw pointers, but with the goal here being safety, and raising the bar above how easy it is to make memory-aliasing mistakes in languages like C, it can be a bit tough to prevent easy mistakes.