Clarification on initializing Unsafe…Pointers

I think I’m starting to get the hang (finally) of the various Unsafe…Pointer types. I’d like some clarification:

  1. Is it necessary to initialize a RawBufferPointer or RawPointer if I’m going to write to it like this? This code seems to work correctly.

    var fd: FileDescriptor
    
    func
    get<T>()
        throws
        -> T where T : FixedWidthInteger
    {
        let pointer = UnsafeMutablePointer<T>.allocate(capacity: 1)
        defer
        {
            pointer.deallocate()
        }
    
        let buf = UnsafeMutableRawBufferPointer(start: pointer, count: MemoryLayout<T>.size)
        let bytesRead = try self.fd.read(into: buf)
        if bytesRead < MemoryLayout<T>.size { throw Errors.unexpectedEOF }
    
        return pointer.pointee      //  Not sure if this is safe because of the
                                    //  defer { pointer.deallocate() }; I'm actually
                                    //  doing return T(bigEndian: pointer.pointee)
    }
    
  2. Is the return above safe as-is?

  3. Is there a way to wrap an UnsafeMutableRawBufferPointer around a trivial type without first initializing it? I don’t want to take the hit when I’m just going to read from disk to fill it. I'd like to do the equivalent of this:

    let fd: FileDescriptor = ...
    var bigArray = [UInt16](count: <many elements>)     //  Note: don't waste time initializing the values
    let buf = UnsafeMutableRawBufferPointer(start: &bigArray, count: bigArray.count * MemoryLayout<UInt16>.stride)
    try fd.read(into: buf)
    <use bigArray and not worry about deallocating it>
    

    I tried this, but it crashes in the print() statement:

    let count = 1024
    let buf = UnsafeMutableRawBufferPointer.allocate(byteCount: count * MemoryLayout<UInt16>.stride, alignment: MemoryLayout<UInt16>.alignment)
    try fd.read(into: buf)
    let values = buf.bindMemory(to: [UInt16].self)
    print("Got \(values.count) values: \(values[0]), \(values[1])")
    

My Core Image code to read an 11 GB, 100,000 x 50,000 pixel GeoTIFF image takes nearly a minute to load it. QGIS can load it, scale it, and display it in under one second. And it manages to find the minimum and maximum pixel values in the file. Question 3 above is more academic, since I don’t need typed arrays for the large data, but I’d like to be able to do that.

As always, thanks!

  1. No, read will initialise the bytes. As T is a trivial type no further initialization is required.

  2. Yes. pointer.pointee copies the value out of the pointer. The defer executes after that statement finishes.

  3. You have two examples. The first is wrong, and it should emit a warning: you are producing a pointer that immediately dangles. You cannot allocate an array and then use & on it in the argument to a pointer initializer. It is very important to remember that the & operator on an Array in this context is equivalent to wrapping the exact statement you call in array.withUnsafeMutableBufferPointer, and the pointer is valid only for the duration of that closure. Thus, your code desugars to:

    let buf = bigArray.withUnsafeMutableBufferPointer {
        UnsafeMutableRawBufferPointer(start: $0.baseAddress, count: bigArray.count * MemoryLayout<UInt16>.stride)
    }
    

    Assuming we care about your second example, the problem is that you bound it to the wrong type. You want buf.bindMemory(to: UInt16.self). You're storing UInt16s, not arrays of UInt16s.
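The three answers above can be sketched roughly as follows. This is only an illustration, not the original poster's code: `copyIn` is a hypothetical stand-in for `FileDescriptor.read(into:)` so the example runs without a file, and the function names are made up. The `UInt16` results are interpreted as little-endian purely so the output does not depend on the host.

```swift
// Hypothetical stand-in for FileDescriptor.read(into:): copies as many bytes
// as fit from `source` into `buffer` and returns how many were copied.
func copyIn(_ source: [UInt8], into buffer: UnsafeMutableRawBufferPointer) -> Int {
    let n = min(source.count, buffer.count)
    buffer.copyBytes(from: source.prefix(n))
    return n
}

// Points 1 and 2: a trivial type needs no initialization beyond the bytes
// written by the read. Here the value lives on the stack instead of in a
// manually allocated pointer, so nothing has to be deallocated.
func readInteger<T: FixedWidthInteger>(_ type: T.Type, from source: [UInt8]) -> T? {
    var value: T = 0
    let n = withUnsafeMutableBytes(of: &value) { copyIn(source, into: $0) }
    guard n == MemoryLayout<T>.size else { return nil }
    return T(bigEndian: value)
}

// Point 3, first example: keep all pointer use inside the closure, where the
// pointer is guaranteed to be valid, rather than letting it escape via `&`.
func fillExistingArray(from source: [UInt8], count: Int) -> [UInt16] {
    var bigArray = [UInt16](repeating: 0, count: count)
    _ = bigArray.withUnsafeMutableBytes { copyIn(source, into: $0) }
    return bigArray.map { UInt16(littleEndian: $0) }
}

// Point 3, second example: bind the raw bytes to UInt16, not [UInt16].
func readBound(from source: [UInt8], capacity: Int) -> [UInt16] {
    let buf = UnsafeMutableRawBufferPointer.allocate(
        byteCount: capacity * MemoryLayout<UInt16>.stride,
        alignment: MemoryLayout<UInt16>.alignment)
    defer { buf.deallocate() }
    let bytesRead = copyIn(source, into: buf)
    let values = buf.bindMemory(to: UInt16.self)
    // Only the elements actually written are copied out before deallocation.
    return values.prefix(bytesRead / MemoryLayout<UInt16>.stride)
        .map { UInt16(littleEndian: $0) }
}
```

Note that `fillExistingArray` still pays for zero-filling the array up front, which is the cost the question wanted to avoid.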


For part 3, if what you really wanted was an array, then what you actually want to use is Array's initializer init(unsafeUninitializedCapacity:initializingWith:). This will vend you a buffer pointer to uninitialised memory that will become the Array storage.


Ah! This is super cool! This worked:

let count = 1024
var values = [UInt16](unsafeUninitializedCapacity: count)
                { (ioBuf: inout UnsafeMutableBufferPointer<UInt16>, ioCount: inout Int) in
                    let buffer = UnsafeMutableRawBufferPointer(ioBuf)
                    let bytesRead = try fd.read(into: buffer)
                    ioCount = bytesRead / MemoryLayout<UInt16>.size
                }
print("Got \(values.count) values: \(values[0]), \(values[1])")

I find that quite elegant.

Interestingly, when I ran that code as-is in a Playground, it executed flawlessly. If I run it in an Xcode unit test, the compiler complains that the array init call can throw but is not marked with try. That makes me feel better, since I was wondering what would happen to an error in there.
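For reference, a sketch of the same call with the `try` the compiler asks for. `throwingRead` and `UnexpectedEOF` are hypothetical stand-ins for `fd.read(into:)` and its errors, so the example runs without a file. Because the initializer rethrows, an error thrown inside the closure propagates out of the `try` and no array is produced.

```swift
struct UnexpectedEOF: Error {}

// Hypothetical stand-in for FileDescriptor.read(into:): throws on an empty
// source, otherwise copies what fits and returns the byte count.
func throwingRead(_ source: [UInt8], into buffer: UnsafeMutableRawBufferPointer) throws -> Int {
    guard !source.isEmpty else { throw UnexpectedEOF() }
    let n = min(source.count, buffer.count)
    buffer.copyBytes(from: source.prefix(n))
    return n
}

func loadValues(from source: [UInt8], capacity: Int) throws -> [UInt16] {
    // `try` marks the initializer call itself, since the closure can throw.
    return try [UInt16](unsafeUninitializedCapacity: capacity) { buffer, initializedCount in
        let raw = UnsafeMutableRawBufferPointer(buffer)
        let bytesRead = try throwingRead(source, into: raw)
        // Report only the elements that were completely written.
        initializedCount = bytesRead / MemoryLayout<UInt16>.stride
    }
}
```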

Thank you!


Pity there's no corresponding initializer for Data, or am I missing something? I tried to do

let fd = …
let offset = …
var block = Data(capacity: blockSize)
let bytesRead = try block.withUnsafeMutableBytes { ioBuffer in
    return try fd.read(fromAbsoluteOffset: offset, into: ioBuffer)
}

But it reads zero bytes (as you would expect, since the Data is empty).
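To illustrate why: `Data(capacity:)` only reserves storage, so `count` stays 0 and `withUnsafeMutableBytes` hands out an empty buffer. One workaround, sketched below, is `Data(count:)`, which does pay for zero-filling, followed by trimming to the bytes actually read. `fill` is a hypothetical stand-in for `fd.read(fromAbsoluteOffset:into:)`.

```swift
import Foundation

// Hypothetical stand-in for the file read: copies what fits, returns the count.
func fill(_ buffer: UnsafeMutableRawBufferPointer, with source: [UInt8]) -> Int {
    let n = min(source.count, buffer.count)
    buffer.copyBytes(from: source.prefix(n))
    return n
}

func readBlock(of blockSize: Int, from source: [UInt8]) -> Data {
    // Zero-filled, so count == blockSize and the buffer below is not empty.
    var block = Data(count: blockSize)
    let bytesRead = block.withUnsafeMutableBytes { (raw: UnsafeMutableRawBufferPointer) -> Int in
        fill(raw, with: source)
    }
    // Drop the tail that the read did not touch.
    block.count = bytesRead
    return block
}
```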

For Data the corresponding initializer is .init(bytesNoCopy:count:deallocator:) if you don't need to be able to mutate the Data. If you do, then yes, there is no corresponding initializer.

Oh I did see that. I guess the technique there would be something like:

let buffer = UnsafeMutableRawPointer.allocate(byteCount: blockSize, alignment: MemoryLayout<UInt8>.alignment)
let bp = UnsafeMutableRawBufferPointer(start: buffer, count: blockSize)
let bytesRead = try fd.read(fromAbsoluteOffset: 0, into: bp)
// Use bytesRead, not blockSize, so a short read does not expose uninitialized bytes.
let data = Data(bytesNoCopy: buffer, count: bytesRead, deallocator: .custom({ b,c in b.deallocate() }))

Not quite as elegant, is it?

I took a stab at implementing the initializer for Data. Not sure if I’ve caught all the behaviors of Array, and I changed the kind of buffer I pass because I think it makes more sense (and saves me a step when using FileDescriptor.read()), but:

extension
Data
{
    init(unsafeUninitializedCapacity inCapacity: Int,
            initializingWith initializer: (inout UnsafeMutableRawBufferPointer, inout Int) throws -> Void)
        rethrows
    {
        let buffer = UnsafeMutableRawPointer.allocate(byteCount: inCapacity, alignment: MemoryLayout<UInt8>.alignment)
        var bp = UnsafeMutableRawBufferPointer(start: buffer, count: inCapacity)
        let originalAddress = bp.baseAddress
        var count: Int = 0
        defer
        {
            precondition(count <= inCapacity, "Initialized count set to greater than specified capacity.")
            precondition(bp.baseAddress == originalAddress, "Can't reassign buffer in Data(unsafeUninitializedCapacity:initializingWith:)")
        }
        
        do
        {
            try initializer(&bp, &count)
        }
        
        catch (let e)
        {
            buffer.deallocate()     //  Free the buffer directly; no Data needs to be built just to release it
            throw e
        }
        
        self = Data(bytesNoCopy: buffer, count: count, deallocator: .custom({ b,c in b.deallocate() }))
    }
}
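
A possible variation on the initializer above, with a usage example. This is only a sketch: the labels `uninitializedCapacity:fillingWith:` are made up to keep it distinct from the version above, the buffer is passed by value so it cannot be reassigned, and the count is validated before the memory is handed to Data.

```swift
import Foundation

extension Data {
    // Hypothetical variation: the closure fills `buffer` and reports how many
    // bytes it initialized through the inout count.
    init(uninitializedCapacity capacity: Int,
         fillingWith initializer: (UnsafeMutableRawBufferPointer, inout Int) throws -> Void) rethrows {
        let buffer = UnsafeMutableRawBufferPointer.allocate(
            byteCount: capacity, alignment: MemoryLayout<UInt8>.alignment)
        var count = 0
        do {
            try initializer(buffer, &count)
        } catch {
            // Nothing owns the buffer yet, so free it before rethrowing.
            buffer.deallocate()
            throw error
        }
        precondition(count >= 0 && count <= capacity,
                     "Initialized count must be between 0 and the specified capacity.")
        guard let base = buffer.baseAddress, count > 0 else {
            // Nothing was written: release the buffer and produce empty Data.
            buffer.deallocate()
            self = Data()
            return
        }
        // Data takes ownership and frees the allocation when it is done with it.
        self = Data(bytesNoCopy: base, count: count,
                    deallocator: .custom({ pointer, _ in pointer.deallocate() }))
    }
}

// Usage: the closure stands in for a call such as fd.read(into: buffer).
let payload: [UInt8] = [1, 2, 3]
let block = Data(uninitializedCapacity: 8) { buffer, count in
    buffer.copyBytes(from: payload)
    count = payload.count
}
```

As with bytesNoCopy in general, mutating the resulting Data may copy the bytes into storage that Data manages itself.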