Why is `UnsafeBufferPointer`'s `baseAddress` optional?

I don't dispute that for a moment. I do dispute that you can have the following three things be true:

  1. All 64-bit bit patterns are valid addresses.
  2. `MemoryLayout<UnsafePointer<T>?>.size == 8`, i.e. the optional pointer occupies exactly 64 bits.
  3. A nil `UnsafePointer<T>?` is distinguishable from all non-nil pointer values.

If you have only 64 bits to play with, and all 64 are part of your valid address space, then there is nowhere to put the information that distinguishes `nil` from `.zero`.

To @ksluder's example, if nil and .zero have the same bit pattern, then it is impossible to have a situation where the pointer value is .zero but not nil, or vice versa. There is just nowhere to put that information. Unless @ksluder has invented a perfect compression algorithm, his premise cannot work.
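The overlap is directly observable in today's Swift, where constraint #1 does not hold and the zero bit pattern is reserved for nil. A small sketch:

```swift
// In current Swift, the failable initializer encodes exactly this overlap:
// the all-zero bit pattern *is* the nil representation, so constructing a
// pointer from bit pattern 0 yields nil rather than a distinct "zero pointer".
let zero = UnsafeRawPointer(bitPattern: 0)
assert(zero == nil)

// Any non-zero bit pattern constructs a non-nil pointer value (constructing
// one is fine; dereferencing it would not be).
let nonZero = UnsafeRawPointer(bitPattern: 0x1000)
assert(nonZero != nil)
```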

While following up, I'll also add: a lot of people's code sure does.

You are correct, but nobody said all 64-bit bit patterns are valid addresses, only that all patterns on the null page might be valid addresses. `0xFFFFFFFF_00000000` is a perfectly cromulent null-pointer bit pattern if everyone agrees it is.

EDIT: yes, yes, obviously porting the standard library to that environment would mean changing the implementation of `init(bitPattern:)`, but that’s a normal possibility for porting any stdlib code to a new environment.
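For what it's worth, the reason constraint #2 holds in today's Swift is the "extra inhabitant" optimization: the reserved zero bit pattern is reused in-band to encode nil, so wrapping a pointer in Optional costs no extra storage. A quick check (note that `MemoryLayout.size` reports bytes, 8 on a 64-bit platform):

```swift
// Optional<UnsafePointer> stores nil in-band as the reserved zero bit
// pattern, so the optional wrapper adds no storage: both sizes match
// (8 bytes each on a 64-bit platform).
let plain = MemoryLayout<UnsafePointer<Int>>.size
let optional = MemoryLayout<UnsafePointer<Int>?>.size
assert(plain == optional)
```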


You can’t have all three all the time, but you can have a different subset at different points of the program’s execution.

To reuse the above example:

if let ptr = returnsAnOptionalPointer(),
   ptr == .zero {
    return 1
} else {
    return 0
}
For the `if let ptr = returnsAnOptionalPointer()` portion, you can temporarily break constraint #2 and use an alternative representation of `Optional<UnsafeRawPointer>` that can distinguish returning nil from `UnsafeRawPointer(bitPattern: 0)`. Once the conditional pattern has been matched, the inner branches can use the narrower representation, because the type system guarantees a single interpretation of the all-zeroes bit pattern, which means constraint #3 can be ignored without consequence.
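A minimal sketch of such a widened representation (all names invented for illustration; this is not how the compiler models values): carry the optionality discriminant out-of-band, so a nil pointer and a zero-valued pointer share the same 64-bit payload but remain distinguishable.

```swift
// Hypothetical widened optional pointer: the "hidden variable" (isNil)
// lives alongside the full 64-bit address payload instead of inside it.
struct WideOptionalPointer {
    var bits: UInt64   // the full 64-bit address payload
    var isNil: Bool    // the out-of-band discriminant

    static let none = WideOptionalPointer(bits: 0, isNil: true)
    static func some(_ bits: UInt64) -> WideOptionalPointer {
        WideOptionalPointer(bits: bits, isNil: false)
    }
}

let zeroPointer = WideOptionalPointer.some(0)
let nilPointer = WideOptionalPointer.none

// Same payload, different meaning -- possible only because the representation
// is wider than 64 bits, i.e. constraint #2 is (temporarily) broken.
assert(zeroPointer.bits == nilPointer.bits)
assert(!zeroPointer.isNil && nilPointer.isNil)
```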

Looking at the implementation of `returnsAnOptionalPointer`:

func returnsAnOptionalPointer() -> UnsafeRawPointer? {
    let optionalPtr: UnsafeRawPointer?
    if coinFlip() == .heads {
        optionalPtr = nil
    } else {
        optionalPtr = .init(bitPattern: 0)
    }
    return optionalPtr
}
The compiler would need to track optionalPtr’s optionality as a hidden variable. That does mean taking a pointer to an optional pointer with such a representation would be unsafe, and the compiler should warn about it:

let optionalPtr: UnsafeRawPointer?
withUnsafePointer(to: optionalPtr) { }
// warning: cannot distinguish between a nil pointee and a zero pointee when taking a pointer to an optional pointer type
// fixit: unwrap optionalPtr before taking a pointer to it

This would be a fantastic argument except for the fact that it has entirely lost sight of what we're talking about. I'd like to rewind to where we started:

This entire conversation is predicated on the question of whether it is safer to dereference nil than any other pointer value. If we allow that there exists a pointer value that is always assigned to nil/NULL/nullptr, then it doesn't matter one iota what its bit pattern is: it is always safer to dereference that than not to. I'm gravitating to the question of the all-zero bit pattern because @ksluder is, but nothing in the point I made 18 posts ago rests on the question of that bit pattern.

Yes, this 100% works. So does a CHERI-like solution, or having a side table full of pointer-provenance values. We can always widen the value of the pointer. I agree with you. But at this point, a pointer is no longer a 64-bit bit pattern: nil and the zero-valued pointer no longer have the same representation. The hidden variable is an intrinsic part of their bit pattern.


I would suggest that this thread has become entirely derailed and is not making any meaningful progress on the actual question posed.


Question posed:

The answer is “when Swift becomes usable in environments where the bit representation of nil clashes with the bit representation of pointer-to-zero.”

Seems quite on-topic to me.

This is incorrect. In such an environment, {nil, 0} would be a valid empty buffer, and nil would be “no buffer”, exactly as it is in existing environments. The representation of a null pointer does not affect this decision either way; both choices for UnsafeBufferPointer behave exactly the same whatever the representation of nil is.
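This distinction is already expressible in today's Swift: a buffer whose `baseAddress` is nil and whose `count` is 0 is a valid, empty `UnsafeBufferPointer`, distinct at the use site from `Optional.none` ("no buffer"). A quick sketch:

```swift
// {nil, 0} as an empty buffer: valid today, and behaves like any other
// empty collection.
let empty = UnsafeBufferPointer<Int>(start: nil, count: 0)
assert(empty.count == 0)
assert(empty.baseAddress == nil)
assert(Array(empty).isEmpty)
```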

Swift will never support an environment where every 64-bit value is a valid, non-nil UnsafeRawPointer.

(P.S. I think it could, though it would be a fair amount of work because a lot of the compiler and runtime assume that’s not the case. But that still wouldn’t affect whether {nil, 0} was a valid UnsafeBufferPointer.)


Yeah, I agree with the folks saying that the discussion here is no longer particularly useful as a response to the original question. If people would like to start a thread about using Swift for system implementation, that could be very interesting. I am closing this thread, though.
