The behavior of UnsafeMutablePointer<Foo>.pointee.bar

I have this low level data type:

struct HierarchicalHashGrid {
    private var cellsPointer: UnsafeMutablePointer<Cell>
    // ...
}

Where Cell is this:

/// A cell in a hierarchical hash grid. Its size and stride are 32 bytes.
struct Cell {
    /// This 32-bit header holds the element count, grid index, cell coordinate,
    /// and 9 remaining bits reserved for future use. 
    /// ```
    /// 0b•••••••••_yyyyyyyy_xxxxxxxx_GGG_CCCC
    ///                                | | '--'- element count
    ///                                '-'------ grid index
    /// ```
    private var header: UInt32
    private var elements: (UInt16, UInt16, UInt16, UInt16, UInt16,
                           UInt16, UInt16, UInt16, UInt16, UInt16,
                           UInt16, UInt16, UInt16, UInt16)
    var count: UInt8 {
        get { return UInt8(truncatingIfNeeded: header & 0b1111) }
        set {
            // Assumes newValue < 16; larger values would spill into the
            // grid index bits.
            let mask = ~UInt32(0b1111)
            header = (header & mask) | UInt32(newValue)
        }
    }
    var gridIndex: UInt8 {
        get {
            return UInt8(truncatingIfNeeded: (header &>> 4) & 0b111)
        }
        set {
            // Assumes newValue < 8; larger values would spill into the
            // coordinate bits.
            let mask = ~UInt32(0b111_0000)
            header = (header & mask) | (UInt32(newValue) &<< 4)
        }
    }
    var coordinate: SIMD2<UInt8> {
        get {
            return unsafeBitCast(
                UInt16(truncatingIfNeeded: (header &>> 7) & 0xff_ff),
                to: SIMD2<UInt8>.self)
        }
        set {
            let mask = ~(UInt32(0xff_ff) &<< 7)
            let ui16 = unsafeBitCast(newValue, to: UInt16.self)
            header = (header & mask) | (UInt32(ui16) &<< 7)
        }
    }
    subscript(index: UInt8) -> UInt16 {
        get {
            return withUnsafePointer(to: elements) {
                UnsafeRawPointer($0)
                    .assumingMemoryBound(to: UInt16.self)[Int(index)]
            }
        }
        set {
            withUnsafeMutablePointer(to: &elements) {
                UnsafeMutableRawPointer($0)
                    .assumingMemoryBound(to: UInt16.self)[Int(index)]
                    = newValue
            }
        }
    }
    // ...
}
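
As a sanity check on the 32-byte claim, here is a minimal sketch. LayoutProbe is a hypothetical stand-in that only mirrors the stored properties of Cell (a UInt32 header followed by 14 UInt16 elements); it is not the real type.

```swift
/// Hypothetical stand-in with the same stored properties as `Cell`.
struct LayoutProbe {
    var header: UInt32
    var elements: (UInt16, UInt16, UInt16, UInt16, UInt16,
                   UInt16, UInt16, UInt16, UInt16, UInt16,
                   UInt16, UInt16, UInt16, UInt16)
}

// 4 bytes of header + 14 * 2 bytes of elements, aligned to 4 bytes.
print(MemoryLayout<LayoutProbe>.size)       // 32
print(MemoryLayout<LayoutProbe>.stride)     // 32
print(MemoryLayout<LayoutProbe>.alignment)  // 4
```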

Now, assuming that:

let ptr = cellsPointer.advanced(by: someGlobalCellIndex)

I'd like to know if there's a more efficient way of accessing the computed properties and subscript of a Cell than this:

ptr.pointee.gridIndex = gridIndex
ptr.pointee.coordinate = SIMD2<UInt8>(x, y)
ptr.pointee[ptr.pointee.count] = someValue
ptr.pointee.count &+= 1

For example, will each .pointee result in a separate read/write of an entire Cell value (of 32 bytes)?

Would it be more, less or equally efficient if I wrote it like this instead:

var cell = ptr.pointee
cell.gridIndex = gridIndex
cell.coordinate = SIMD2<UInt8>(x, y)
cell[cell.count] = someValue
cell.count &+= 1
ptr.pointee = cell

?

Note that ideally, I'd be accessing just a few (4+2=6) of the Cell's 32 bytes there. But perhaps it doesn't matter because all 32 bytes will end up being in the L1 cache or on the stack anyway?

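No, each .pointee access does not result in a separate read/write of the entire 32-byte Cell. You can test this out on godbolt (Compiler Explorer) by looking at the optimized assembly.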
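
A minimal sketch of something you could paste into godbolt with the Swift compiler and -O, to compare the assembly of the two styles side by side. MiniCell, bumpInPlace and bumpViaCopy are hypothetical names made up for this illustration; MiniCell only mimics the layout of Cell and its count accessor.

```swift
/// Hypothetical stand-in for `Cell`: a 4-byte header plus 28 bytes of elements.
struct MiniCell {
    var header: UInt32 = 0
    var elements: (UInt16, UInt16, UInt16, UInt16, UInt16, UInt16, UInt16,
                   UInt16, UInt16, UInt16, UInt16, UInt16, UInt16, UInt16)
        = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)

    /// Low 4 bits of the header. The setter masks the new value so it
    /// cannot touch the other header bits.
    var count: UInt8 {
        get { UInt8(truncatingIfNeeded: header & 0b1111) }
        set { header = (header & ~UInt32(0b1111)) | UInt32(newValue & 0b1111) }
    }
}

/// Mutates the count in place, through the pointer.
func bumpInPlace(_ ptr: UnsafeMutablePointer<MiniCell>) {
    ptr.pointee.count &+= 1
}

/// Copies the whole cell out, mutates the copy, and writes it back.
func bumpViaCopy(_ ptr: UnsafeMutablePointer<MiniCell>) {
    var cell = ptr.pointee
    cell.count &+= 1
    ptr.pointee = cell
}
```

Comparing the instructions emitted for the two functions shows how many bytes each one actually loads and stores on your target.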

The second version (copy out, mutate, write back) is less efficient, but it's also saying something subtly different. If you only access the cell through a pointer (as in the first version), it's possible that somebody else might mutate it via another pointer between your accesses. The second version protects against such things by making an independent copy (including all the elements, because the subscript uses that value).
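
To make the semantic difference concrete, here is a small sketch. Counter and demonstrateAliasing are hypothetical names used only for this illustration, and the "somebody else" is simulated with a second pointer to the same memory.

```swift
struct Counter {
    var value = 0
}

func demonstrateAliasing() -> (inPlace: Int, afterWriteBack: Int) {
    let storage = UnsafeMutablePointer<Counter>.allocate(capacity: 1)
    storage.initialize(to: Counter())
    defer {
        storage.deinitialize(count: 1)
        storage.deallocate()
    }
    let alias = storage  // a second pointer to the same memory

    // In-place access: a write through `alias` is visible right away.
    storage.pointee.value = 1
    alias.pointee.value = 10
    let inPlace = storage.pointee.value  // 10

    // Copy out, mutate, write back: the local copy never sees the write
    // made through `alias`, and the write-back overwrites it.
    var copy = storage.pointee           // copy.value == 10
    alias.pointee.value = 99             // not seen by `copy`
    copy.value += 1                      // copy.value == 11
    storage.pointee = copy               // the 99 is lost

    return (inPlace, alias.pointee.value)
}

print(demonstrateAliasing())  // (inPlace: 10, afterWriteBack: 11)
```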

As for what the real implications of that are, that depends on the broader context of what you're doing, the architecture you're running it on, overall resource pressure, etc. It's very hard to say without actually measuring; there are tools like LLVM-MCA which try to predict performance, but it's really hard to make those predictions accurate.
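
If you do want to measure, a rough timing sketch could look like the following. Everything here is an assumption for illustration: Slot is a hypothetical stand-in for Cell, and measureNanoseconds and compareAccessPatterns are made-up helpers. Build with -O, and treat the numbers as a hint rather than a verdict, since a micro-benchmark like this says little about cache behavior in your real workload.

```swift
import Dispatch

/// Hypothetical stand-in for `Cell`; only the header is touched below.
struct Slot {
    var header: UInt32 = 0
    var payload: (UInt64, UInt64, UInt64) = (0, 0, 0)
}

/// Hypothetical helper: runs `body` `iterations` times and returns the
/// elapsed time in nanoseconds.
func measureNanoseconds(iterations: Int, _ body: () -> Void) -> UInt64 {
    let start = DispatchTime.now().uptimeNanoseconds
    for _ in 0..<iterations { body() }
    return DispatchTime.now().uptimeNanoseconds - start
}

func compareAccessPatterns(slotCount: Int = 4096, iterations: Int = 100)
    -> (inPlace: UInt64, viaCopy: UInt64, finalHeader: UInt32) {
    let slots = UnsafeMutablePointer<Slot>.allocate(capacity: slotCount)
    slots.initialize(repeating: Slot(), count: slotCount)
    defer {
        slots.deinitialize(count: slotCount)
        slots.deallocate()
    }

    // Variant 1: mutate each header in place through the pointer.
    let inPlace = measureNanoseconds(iterations: iterations) {
        for i in 0..<slotCount { slots[i].header &+= 1 }
    }

    // Variant 2: copy each slot out, mutate the copy, write it back.
    let viaCopy = measureNanoseconds(iterations: iterations) {
        for i in 0..<slotCount {
            var slot = slots[i]
            slot.header &+= 1
            slots[i] = slot
        }
    }

    // Returning a header keeps the work observable to the optimizer.
    return (inPlace, viaCopy, slots[0].header)
}

let result = compareAccessPatterns()
print("in place: \(result.inPlace) ns, via copy: \(result.viaCopy) ns")
```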
