Unaligned pointers: Why doesn't this crash?

So, accessing an unaligned pointer this way in debug mode causes a crash, as you'd expect:

import Foundation

@inline(never)
func foo(_ ptr: UnsafeRawBufferPointer) -> UInt64 {
  let q = ptr.load(as: UInt64.self) // <- KABOOM!

  return q
}

let q = Data(0..<255).withUnsafeBytes {
  // Drop the first byte so the buffer starts at a misaligned address.
  foo(UnsafeRawBufferPointer(rebasing: $0.dropFirst(1)))
}

print(String(q, radix: 16))

No surprise there.

But bind it to a typed pointer instead, and it actually works:

import Foundation

@inline(never)
func foo(_ ptr: UnsafeRawBufferPointer) -> UInt64 {
  let q = ptr.bindMemory(to: UInt64.self)[0]

  return q
}

let q = Data(0..<255).withUnsafeBytes {
  // Same misaligned buffer as before: the first byte is dropped.
  foo(UnsafeRawBufferPointer(rebasing: $0.dropFirst(1)))
}

print(String(q, radix: 16)) // <- outputs "807060504030201"

In release mode, both of these and the proper variant using ptr.loadUnaligned basically compile down to an ldr instruction (the .load() and .loadUnaligned() versions also add a nil check against the buffer's baseAddress).

My question: Is the bindMemory version incorrect, since it's doing an unaligned memory access, or does bindMemory imply loadUnaligned on subsequent accesses of the resulting pointer? I want to say it would be the former (in C that would be undefined behavior for sure), but the fact that it's not crashing in debug mode is making me doubt myself a bit. Is the lack of a crash there just an oversight, or is this actually legal somehow?

I suspect the alignment check is simply elided when building for Release. Unaligned loads to integer registers are legal on both x86 and Apple Silicon.

Most likely, but I'm wondering which is more correct according to the rules of the language, and whether unaligned typed pointers are legal or whether they constitute undefined behavior that just happens to work (as this would be in C).

IIRC access through an unaligned Unsafe{Mutable}Pointer<T> does whatever LLVM does by default. So, yes, I'd consider this UB that just happens to work.

Many UnsafePointer APIs explicitly require alignment and will trap (at least in debug mode) if you attempt to create an unaligned pointer. For example, even UnsafePointer(bitPattern: Int) requires alignment.

Many UnsafePointer APIs have undefined behaviour if you somehow manage to create an unaligned pointer anyway. For example, the .pointee property. So even if you can construct one, almost every API would become a major hazard -- even more than is typical for an unsafe pointer.

And this is inherent in what you are asking the computer to do - if you say "I have a pointer to an Int, now get the pointee", you are saying that the pointer is valid for accessing data of type Int and the computer should just load the data directly. Types have alignment so of course your pointer needs to be aligned.

I think what you're trying to do is more general type-punning, and typed pointers are not really a great tool for that. It would be better to create some kind of ReinterpretedBuffer/ReinterpretedSpan type which explicitly does not require any particular alignment or memory binding (just stores a RawPointer/RawSpan internally) and performs unaligned loads as required.

Basically, there is a difference between "this is a pointer to an Int" and "this is a pointer to bytes that can be interpreted as an Int". Typed pointers model the former, and there is nothing built-in (yet) to model the latter.
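
The ReinterpretedBuffer idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the type name comes from the post, but its shape (the `bytes` property, `count`, and the subscript) is hypothetical and not part of the standard library, and it assumes Swift 5.7 or later for loadUnaligned.

```swift
// Hypothetical type sketched from the description above; NOT a standard library API.
struct ReinterpretedBuffer<Element: FixedWidthInteger> {
  // Only raw bytes are stored: no memory binding and no alignment requirement.
  let bytes: UnsafeRawBufferPointer

  // Number of whole elements that fit in the buffer.
  var count: Int { bytes.count / MemoryLayout<Element>.stride }

  subscript(index: Int) -> Element {
    precondition(index >= 0 && index < count, "index out of bounds")
    // Every access is an unaligned load at a byte offset.
    return bytes.loadUnaligned(
      fromByteOffset: index * MemoryLayout<Element>.stride,
      as: Element.self
    )
  }
}

let bytes: [UInt8] = Array(0..<255)
let first = bytes.withUnsafeBytes { raw -> UInt64 in
  // Start one byte in, so the elements are not naturally aligned.
  let view = ReinterpretedBuffer<UInt64>(
    bytes: UnsafeRawBufferPointer(rebasing: raw.dropFirst(1))
  )
  return view[0]
}

print(String(first, radix: 16)) // "807060504030201" on a little-endian machine
```

The point of the design is that the wrapper never claims "this is a pointer to a UInt64"; it only interprets bytes on each access.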

bindMemory(to:) requires that the memory address is suitably aligned. This should be spelled out more explicitly in the documentation, like it is for assumingMemoryBound(to:) (@glessard, can you add this note, please?). As with all unsafe API, not all preconditions are checked--this is what makes them unsafe. You want .loadUnaligned().
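
For reference, here is a minimal sketch of the original foo rewritten with loadUnaligned, assuming Swift 5.7 or later. The name `fooUnaligned` is made up for this example, and an Array stands in for the Data value so that the snippet does not need Foundation.

```swift
// Sketch: the same read as `foo`, done as an unaligned load from the raw buffer.
@inline(never)
func fooUnaligned(_ ptr: UnsafeRawBufferPointer) -> UInt64 {
  // No binding and no alignment requirement; the buffer must hold at least 8 bytes.
  return ptr.loadUnaligned(as: UInt64.self)
}

let bytes: [UInt8] = Array(0..<255)
let value = bytes.withUnsafeBytes { buffer in
  // Skip the first byte so the load starts at a misaligned address.
  fooUnaligned(UnsafeRawBufferPointer(rebasing: buffer.dropFirst(1)))
}

print(String(value, radix: 16)) // "807060504030201" on a little-endian machine
```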

It seems strange that debug builds trap for unaligned pointer access, when unaligned binding would be a much less frequent but equally effective check closer to the root cause.

We have tried adding debug-mode checks in a variety of places, but they tend to be overly expensive for code that depends on unsafe pointers for the right reasons.

We have been gradually adding more debug checks to these operations in the last few years. It’s a thing we do when we touch something, rather than an organized effort to do them all at once. Feel free to put up a PR for any you find.

As Guillaume noted, we only add them when we can do so with essentially zero impact on correct code. Sometimes there are compiler limitations that prevent adding them yet. Ultimately code that uses unsafe API is responsible for validating the required invariants—any checks that are added are merely a convenience.
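
As an illustration of validating that invariant in user code, here is a rough sketch of an alignment check performed before binding. The helper `boundIfAligned` is hypothetical (it is not a standard library API), and the 64-byte allocation exists only to give the example memory with a known alignment.

```swift
// Hypothetical helper: bind only when the base address is a multiple of
// T's alignment, otherwise return nil instead of producing a bad typed pointer.
func boundIfAligned<T>(
  _ buffer: UnsafeRawBufferPointer,
  to type: T.Type
) -> UnsafeBufferPointer<T>? {
  guard let base = buffer.baseAddress else { return nil }
  let address = UInt(bitPattern: base)
  guard address % UInt(MemoryLayout<T>.alignment) == 0 else { return nil }
  return buffer.bindMemory(to: T.self)
}

// Raw storage with a known 16-byte alignment, filled with the bytes 0..<64.
let storage = UnsafeMutableRawBufferPointer.allocate(byteCount: 64, alignment: 16)
for i in 0..<64 { storage[i] = UInt8(i) }

let whole = UnsafeRawBufferPointer(storage)
let shifted = UnsafeRawBufferPointer(rebasing: whole.dropFirst(1))

let alignedCount = boundIfAligned(whole, to: UInt64.self)?.count ?? 0
let rejected = boundIfAligned(shifted, to: UInt64.self) == nil

print(alignedCount) // 8
print(rejected)     // true

storage.deallocate()
```

Swapping the second guard for an assert would turn this into a debug-only check, at the cost of leaving release builds unchecked.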

A typed pointer is an aligned pointer, regardless of the type of Pointee. Until loadUnaligned(), abusing the pointee property like this was the only option. Putting aside compatibility, adding a check in that accessor is bad for performance. Another option seems to be adding checks that make it harder to construct unaligned pointers (a picket fence approach), but those also suffer from perf issues. I may not have tried reinforcing bindMemory in particular though.

What is the right incantation for loading an unaligned integer in Swift? It cannot be bindMemory(to:) + loadUnaligned(), according to @scanon above.

You can loadUnaligned directly from an UnsafeRaw[Buffer]Pointer without binding (see loadUnaligned(fromByteOffset:as:) in the Apple Developer Documentation).

Ah, I’m sorry. I thought @glessard was referring to some UnsafePointer<T>.loadUnaligned() method, but such a method doesn’t exist.

That would never happen, because UnsafePointer<T> is required to be correctly aligned.

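// Raw pointer: any byte offset into valid memory, with no alignment requirement.
let ptr: UnsafeRawPointer
let q = ptr.loadUnaligned(fromByteOffset: anything, as: UInt64.self)

// Raw buffer pointer: the offset plus the size of UInt64 must stay within the buffer.
let buf: UnsafeRawBufferPointer
let r = buf.loadUnaligned(fromByteOffset: anythingInBounds, as: UInt64.self)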

Yes, that’s why I was confused. I should have double-checked where loadUnaligned() was declared before asking.

I suppose the documentation isn't entirely clear, as it says:

You use instances of the UnsafePointer type to access data of a specific type in memory. The type of data that a pointer can access is the pointer’s Pointee type. UnsafePointer provides no automated memory management or alignment guarantees. You are responsible for handling the life cycle of any memory you work with through unsafe pointers to avoid leaks or undefined behavior.

You could potentially read "UnsafePointer provides no alignment guarantees" as meaning it supports unaligned pointers.

I don't think that is what is meant -- it means that it does not (always) check things like liveness and alignment, not that it does not require them. For example, it also does not support dead pointers. But while there is some elaboration about manually managing lifetimes, there is no stronger type-level statement about having to manually satisfy alignment requirements.

It could be a bit clearer.

Agreed. Thanks for pointing this sentence out.

I'm not really trying to do anything here; this is mainly intellectual curiosity on my part. I noticed that the documentation for bindMemory didn't mention alignment, and neither did the documentation for the pointer types, so I decided to play with it in debug mode expecting it to crash, and when it didn't, I started wondering if I'd been assuming the wrong thing about Swift's memory model all this time due to my history with C. This thread answered the questions I had nicely.

Personally, if it were up to me, I'd probably make bindMemory trap in debug mode, since after all, debug mode is expected to be slow (that's why we build in release mode for distribution!), but it looks like we're getting a documentation fix out of this thread, and I think that's a win overall.
