How To Read UInt32 from a Data?

emag · August 3, 2022, 12:19pm

Hello, I have downloaded a binary file into a Data buffer.

How do I get a UInt32 from the Data? Only getBytes seems to be available.

thank you!
emag

michelf · August 3, 2022, 12:36pm

You can read four bytes and combine them, like this:

let value =
  (UInt32(data[0]) << (0*8)) | // shifted by zero bits (not shifted)
  (UInt32(data[1]) << (1*8)) | // shifted by 8 bits
  (UInt32(data[2]) << (2*8)) | // shifted by 16 bits
  (UInt32(data[3]) << (3*8))   // shifted by 24 bits

This works if the bytes for this UInt32 were written in little-endian order. For big endian, you need to reverse the order of the shift values, since the most significant byte comes first.
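For illustration, here is a minimal sketch of both byte orders side by side. The sample bytes are made up, and the code assumes a freshly created Data whose first index is 0.

```swift
import Foundation

// Hypothetical sample bytes, used only for this illustration.
let data = Data([0x0A, 0x12, 0x32, 0x15])

// Little endian: the first byte is the least significant one.
let littleEndianValue =
  (UInt32(data[0]) << (0*8)) |
  (UInt32(data[1]) << (1*8)) |
  (UInt32(data[2]) << (2*8)) |
  (UInt32(data[3]) << (3*8))

// Big endian: the first byte is the most significant one,
// so the shift amounts are applied in the reverse order.
let bigEndianValue =
  (UInt32(data[0]) << (3*8)) |
  (UInt32(data[1]) << (2*8)) |
  (UInt32(data[2]) << (1*8)) |
  (UInt32(data[3]) << (0*8))

print(String(littleEndianValue, radix: 16)) // 1532120a
print(String(bigEndianValue, radix: 16))    // a123215
```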

emag · August 3, 2022, 12:42pm

Thank you michelf, this is a solution! But it's strange there isn't a direct way to read with different types from a buffer.

(I have already explained that I'm a newbie.)

itaiferber · August 3, 2022, 1:16pm

There is actually a more direct way to read the data, using UnsafeRawBufferPointer.load(fromByteOffset:as:):

let data: Data = ...
let integer = data.withUnsafeBytes { rawBuffer in
    rawBuffer.load(as: UInt32.self)
}

This takes advantage of the fact that Data.withUnsafeBytes can hand you an UnsafeRawBufferPointer representing the raw memory it wraps, and that buffer pointer allows you to load data of a given type directly from that memory.

Two caveats:

1. This still has the same endianness concerns that @michelf pointed out.
2. You have to be very careful to only load trivial types from memory this way (i.e., integers and Float/Double, but not arbitrary structs like String or Array, or similar; their in-memory representation is not transferable in this way).
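One way to handle the endianness caveat is to load the raw value and then pass it through the standard library's UInt32(littleEndian:) or UInt32(bigEndian:) initializers. The sketch below makes two assumptions that are not in the post above: the sample bytes are invented, and it uses loadUnaligned(as:) (Swift 5.7 and later) in place of load(as:), so it does not depend on the buffer being 4-byte aligned.

```swift
import Foundation

// Hypothetical sample bytes, used only for this illustration.
let data = Data([0x0A, 0x12, 0x32, 0x15])

// Copy four bytes out of the buffer in the host's native byte order.
// loadUnaligned (Swift 5.7+) is an assumption here; it avoids the
// alignment requirement that load(as:) has.
let raw = data.withUnsafeBytes { rawBuffer in
    rawBuffer.loadUnaligned(as: UInt32.self)
}

// State which byte order the file used; these initializers swap the
// bytes only when the host's byte order differs from the stated one.
let fromBigEndian = UInt32(bigEndian: raw)
let fromLittleEndian = UInt32(littleEndian: raw)

print(String(fromBigEndian, radix: 16))    // a123215
print(String(fromLittleEndian, radix: 16)) // 1532120a
```

Written this way, the result does not depend on which kind of machine the code runs on, only on the byte order you declare for the file.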

nixberg · August 3, 2022, 1:39pm

Since I have to deal with this fairly often: endianbytes-swift.

import EndianBytes
assert(data.count == 4)
UInt32(littleEndianBytes: data)

emag · August 3, 2022, 1:43pm

Hello, I'm a bit confused. I just want to read UInt32 values sequentially from a Data.

Does integer contain all the values as UInt32?

Thank you

itaiferber · August 3, 2022, 1:52pm

Good point, let's back up a step. What are you looking to do?

1. Read a single UInt32 value from a Data blob, such that

// Example data
[0x0A, 0x12, 0x32, 0x15, 0x1F, 0x53, 0x73, 0x2B, 0xCA, 0xBB, 0x00, 0x15, ...]

produces a single value 0x0A123215

2. Read many UInt32 values from a Data blob such that

[0x0A, 0x12, 0x32, 0x15, 0x1F, 0x53, 0x73, 0x2B, 0xCA, 0xBB, 0x00, 0x15, ...]

is read as

[0x0A123215, 0x1F53732B, 0xCABB0015, ...]

(i.e., reinterpret the data buffer as a buffer of 32-bit integers)

3. Read many UInt32 values from a Data blob such that

[0x0A, 0x12, 0x32, 0x15, 0x1F, 0x53, 0x73, 0x2B, 0xCA, 0xBB, 0x00, 0x15, ...]

is read as

[0x0000000A, 0x00000012, 0x00000032, 0x00000015, 0x0000001F, ...]

(i.e., read each individual byte from the data as a UInt32 value)

4. Something else?

The solution I provided will solve problem (1), but if you're looking to solve a different problem, it will help to know what you need so we can provide a solution.
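To make the difference between the "many values" cases concrete, here is a rough sketch of both using the example bytes from above (without the trailing "..."). It assumes the values are stored big endian, since that is the reading that matches the listed results, and it uses loadUnaligned (Swift 5.7 and later), which is my assumption rather than something from the posts.

```swift
import Foundation

// The example bytes from the question above.
let data = Data([0x0A, 0x12, 0x32, 0x15, 0x1F, 0x53, 0x73, 0x2B, 0xCA, 0xBB, 0x00, 0x15])

// Reinterpret every group of four bytes as one big-endian UInt32.
// Trailing bytes that do not fill a whole group are ignored.
let wordSize = MemoryLayout<UInt32>.size
let words: [UInt32] = data.withUnsafeBytes { rawBuffer in
    stride(from: 0, to: rawBuffer.count - wordSize + 1, by: wordSize).map { offset in
        UInt32(bigEndian: rawBuffer.loadUnaligned(fromByteOffset: offset, as: UInt32.self))
    }
}

// Widen each individual byte into its own UInt32.
let widened: [UInt32] = data.map { UInt32($0) }

print(words.count)   // 3
print(widened.count) // 12
```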

emag · August 3, 2022, 2:39pm

The problem is that I tried to adapt your snippet:

var value: UInt32

value = bin.withUnsafeBytes { rawBuffer in
    rawBuffer.load(as: UInt32.self)
}

I call it every cycle, but I always read the same value. How do I advance in the buffer?

itaiferber · August 3, 2022, 3:52pm

The first parameter to UnsafeRawBufferPointer.load(fromByteOffset:as:) indicates what offset in the buffer you want to read from. Its default value is 0, and it was omitted from my snippet, so if you call rawBuffer.load(as: UInt32.self), it will always read from the same location and give the same result.

If you want to iterate over the buffer through various points, you can maintain an offset variable, and call the method as

rawBuffer.load(fromByteOffset: offset, as: UInt32.self)

One thing to know is that if you expect to read the entire buffer sequentially as UInt32 values this way, there are more efficient ways to do that (e.g., you can convert the UnsafeRawBufferPointer to an UnsafeBufferPointer<UInt32> and iterate over that in a more effective way), but to know specifically what to recommend, we'd need a higher-level understanding of the whole problem you're trying to solve.
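As a rough sketch of the offset-variable approach, the loop below advances by four bytes per read and stops when fewer than four bytes remain. The contents of bin are invented, the values are assumed to be little endian, and loadUnaligned (Swift 5.7 and later) stands in for load so that the sketch does not rely on alignment.

```swift
import Foundation

// Hypothetical file contents: the values 1, 2, 3 stored as little-endian UInt32.
let bin = Data([0x01, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x03, 0x00, 0x00, 0x00])
let size = MemoryLayout<UInt32>.size
var values: [UInt32] = []

bin.withUnsafeBytes { (rawBuffer: UnsafeRawBufferPointer) -> Void in
    var offset = 0
    // Stop as soon as fewer than four bytes remain.
    while offset + size <= rawBuffer.count {
        let raw = rawBuffer.loadUnaligned(fromByteOffset: offset, as: UInt32.self)
        values.append(UInt32(littleEndian: raw))
        offset += size // advance to the next value
    }
}

print(values) // [1, 2, 3]
```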

If you get the above working (reading from various offsets), feel free to share the updated code and we can offer more suggestions for improvement.

emag · August 3, 2022, 4:08pm

Thank you for your support!!!

tera · August 4, 2022, 1:00am

This is a high level implementation that doesn't use "Unsafe" in its implementation.
Supports signed/unsigned integers of various sizes, convenient subscript API, reading from offset, range checking, and optional endian conversion.

import Foundation

extension Data {
    
    subscript<T: BinaryInteger>(at offset: Int, convertEndian convertEndian: Bool = false) -> T? {
        value(ofType: T.self, at: offset, convertEndian: convertEndian)
    }
    
    func value<T: BinaryInteger>(ofType: T.Type, at offset: Int, convertEndian: Bool = false) -> T? {
        let right = offset &+ MemoryLayout<T>.size
        guard offset >= 0 && right > offset && right <= count else {
            return nil
        }
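        // Offsets are relative to startIndex, which is nonzero for Data slices.
        let bytes = self[(startIndex + offset) ..< (startIndex + right)]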
        if convertEndian {
            return bytes.reversed().reduce(0) { T($0) << 8 + T($1) }
        } else {
            return bytes.reduce(0) { T($0) << 8 + T($1) }
        }
    }
}

Usage example:

let value: UInt32 = data[at: 123]!
// or
let value = data.value(ofType: UInt32.self, at: 123)!

let value: Int16 = data[at: 123, convertEndian: true]!

Tests

func test() {
    let data = Data([0, 1, 2, 3, 4, 5, 6, 7])
    
    // subscript API:
    let value1: UInt32 = data[at: 3]!
    precondition(value1 == 0x03040506)
    let value2: UInt32 = data[at: 3, convertEndian: true]!
    precondition(value2 == 0x06050403)
    let value3: Int? = data[at: 1234]
    precondition(value3 == nil)
    let value4: UInt64 = data[at: 0]!
    precondition(value4 == 0x0001020304050607)
    let value5: UInt16 = data[at: 1]!
    precondition(value5 == 0x0102)

    // value API:
    let val1 = data.value(ofType: UInt32.self, at: 3)!
    precondition(val1 == 0x03040506)
    let val2 = data.value(ofType: UInt32.self, at: 3, convertEndian: true)!
    precondition(val2 == 0x06050403)
    let val3 = data.value(ofType: Int.self, at: 1234)
    precondition(val3 == nil)
    let val4 = data.value(ofType: UInt64.self, at: 0)!
    precondition(val4 == 0x0001020304050607)
    let val5 = data.value(ofType: UInt16.self, at: 1)!
    precondition(val5 == 0x0102)

    print("done")
}

test()

emag · August 5, 2022, 4:38pm

Thank you! It works well.

But is there a MemoryStream class with seek, write, and read that imports bytes from a Data?

I need to read a binary file from a URL, then move to an offset and change some bytes.

Then I need to read the modified stream again as a sequence of UInt32 values; that's why I was looking for a way to read a UInt32.

I used to work in Delphi, where this is very easy to do with a TMemoryStream.

BRs

michelf · August 5, 2022, 6:46pm

I suggest you make your own memory stream type. It's just a Data and a position. Maybe this can get you started. You can add other integer types as needed.

struct MemoryStream {
  var data: Data
  var position: Int = 0

  mutating func readBytes(count: Int) throws -> Data {
    guard position+count <= data.count else {
      throw MemoryStreamError.endOfStream
    }
    let bytes = data[position..<position+count]
    position += count
    return bytes
  }

  mutating func readUInt32LE() throws -> UInt32 {
    let bytes = try readBytes(count: 4)
    // `bytes` is a slice that keeps the indices of `data`,
    // so index relative to its startIndex rather than from 0.
    let i = bytes.startIndex
    return
      (UInt32(bytes[i+0]) << (0*8)) | // shifted by zero bits (not shifted)
      (UInt32(bytes[i+1]) << (1*8)) | // shifted by 8 bits
      (UInt32(bytes[i+2]) << (2*8)) | // shifted by 16 bits
      (UInt32(bytes[i+3]) << (3*8))   // shifted by 24 bits
  }

  mutating func writeByte(_ byte: UInt8) {
    if position == data.count {
      data.append(byte) // extend data
    } else {
      data[position] = byte // overwrite
    }
    position += 1
  }

  mutating func writeUInt32LE(_ value: UInt32) {
    writeByte(UInt8(truncatingIfNeeded: value >> (0*8))) // shifted by zero bits (not shifted)
    writeByte(UInt8(truncatingIfNeeded: value >> (1*8))) // shifted by 8 bits
    writeByte(UInt8(truncatingIfNeeded: value >> (2*8))) // shifted by 16 bits
    writeByte(UInt8(truncatingIfNeeded: value >> (3*8))) // shifted by 24 bits
  }
}

enum MemoryStreamError: Error {
  case endOfStream
}

Warning: this code is not really tested.

eskimo · August 6, 2022, 11:29am

I tend to exploit the fact that Data is its own slice type here. That means that removing bytes from the front of a Data value is super cheap. So I don’t bother keeping track of position, I just remove the bytes I’ve parsed. And I keep a copy of the original Data value around if I want to ‘reset’ the stream.

Share and Enjoy

Quinn “The Eskimo!” @ DTS @ Apple
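A minimal sketch of what that consume-from-the-front style could look like follows. The helper name popUInt32LE and the ParseError type are hypothetical, invented for this illustration, and the values are assumed to be little endian.

```swift
import Foundation

// Hypothetical error type for this sketch.
enum ParseError: Error { case endOfData }

// Hypothetical helper: removes four bytes from the front of `remaining`
// and returns them as a little-endian UInt32.
func popUInt32LE(from remaining: inout Data) throws -> UInt32 {
    guard remaining.count >= 4 else { throw ParseError.endOfData }
    // Fold the bytes from last to first so the last byte ends up most significant.
    let value = remaining.prefix(4).reversed().reduce(UInt32(0)) { ($0 << 8) | UInt32($1) }
    // dropFirst returns a Data slice; no index bookkeeping is needed.
    remaining = remaining.dropFirst(4)
    return value
}

let original = Data([0x01, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00])
var remaining = original
let first = try? popUInt32LE(from: &remaining)  // 1
let second = try? popUInt32LE(from: &remaining) // 2
let third = try? popUInt32LE(from: &remaining)  // nil, nothing left

// "Reset" the stream by going back to the untouched copy.
remaining = original
```

Because the helper only uses prefix, dropFirst, and count, it never assumes that the first index is 0.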

eskimo · August 6, 2022, 11:32am

Oh, I knew I’d posted an example of this previously, but it wasn’t here; it was over on DevForums.


michelf · August 6, 2022, 11:37am

You're making me realize the code I posted has a flaw: Data, unlike Array, does not guarantee that the first index is zero. Slices of Data in particular don't start at index zero (because they're slices). So if you happen to set MemoryStream's data to a slice, it'll fall apart. It's an easy mistake to make and not notice, since in most cases Data's first index will be zero. To fix this, position needs to start at data.startIndex and be compared with data.endIndex rather than data.count.

Keeping a slice of the remaining data (instead of position) is a good idea for reading, but won't work well for the writing case.
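Here is one possible sketch of the position-based stream with that index fix applied. The type name SliceSafeStream, the seek(toOffset:) method, and the sample bytes are hypothetical additions for illustration; like the earlier code, treat it as a starting point, not a finished implementation.

```swift
import Foundation

enum StreamError: Error { case endOfStream }

// Hypothetical variant of MemoryStream that keeps `position` as an
// absolute index into `data`, so it also works when `data` is a slice.
struct SliceSafeStream {
    private(set) var data: Data
    private(set) var position: Data.Index

    init(_ data: Data) {
        self.data = data
        self.position = data.startIndex // not necessarily 0
    }

    // Moves to `offset` bytes past the start of the data.
    mutating func seek(toOffset offset: Int) {
        precondition(offset >= 0 && offset <= data.count, "offset out of range")
        position = data.startIndex + offset
    }

    mutating func readUInt32LE() throws -> UInt32 {
        guard data.endIndex - position >= 4 else { throw StreamError.endOfStream }
        var value: UInt32 = 0
        for i in 0..<4 {
            value |= UInt32(data[position + i]) << (8 * i)
        }
        position += 4
        return value
    }

    mutating func writeByte(_ byte: UInt8) {
        if position == data.endIndex {
            data.append(byte)     // extend data
        } else {
            data[position] = byte // overwrite
        }
        position += 1
    }
}

// Usage with a slice whose first index is 2, the case that broke the original.
let whole = Data([0xFF, 0xFF, 0x01, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00])
var stream = SliceSafeStream(whole.dropFirst(2))
let before = try? stream.readUInt32LE() // 1
stream.seek(toOffset: 0)
stream.writeByte(0x07)                  // overwrite the first byte
stream.seek(toOffset: 0)
let after = try? stream.readUInt32LE()  // 7
```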