Wind speed in 2 bytes, little-endian unsigned short (UInt16)

I receive the data via Bluetooth LE as a 2-byte little-endian unsigned short (UInt16).
But how do I convert it to an Int in Swift in an easy way?
Example: 39 02 -> 569 -> 5.69 m/s

let data = characteristic.value
let wind: Int = data.withUnsafeBytes{ $0.pointee }

...gives the wrong result

Thanks for your help

When dealing with endianness, usually the best thing to do is just stick to basics:

Int(data[0]) | (Int(data[1]) << 8)

That's pretty much the definition of "little-endian 16-bit value", the only tricky part being that you have to convert to the wider type first. (We get to ignore Int being signed because it's bigger than 16 bits.)

If this ends up being a hot path, you can bring back withUnsafeBytes and subscript the pointer instead of the Data struct, so the compiler can combine the two loads in an optimized build. But then you'd better be extra careful that you actually got at least two bytes!
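Putting that together for the example above, a minimal sketch (the function name and the 1/100 m/s scaling are assumptions taken from the 39 02 -> 5.69 m/s example):

```swift
import Foundation

// Decode a 2-byte little-endian UInt16 from raw characteristic data.
// Returns nil rather than trapping if fewer than two bytes arrived.
func decodeWindSpeed(_ data: Data) -> Int? {
    guard data.count >= 2 else { return nil }
    // Index relative to startIndex: a Data slice need not start at 0.
    let lo = Int(data[data.startIndex])
    let hi = Int(data[data.index(after: data.startIndex)])
    // Definition of little-endian: low byte first, high byte shifted up.
    return lo | (hi << 8)
}

let raw = Data([0x39, 0x02])                    // bytes as received
let hundredths = decodeWindSpeed(raw)!          // 569
let metersPerSecond = Double(hundredths) / 100  // 5.69 m/s
```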

EDIT: I do think it would be reasonable for e.g. UInt16 to provide something like init(littleEndian: some Collection<UInt8>), or possibly Vector<UInt8, 2> or however that ends up getting spelled. But the initializers that are there today are more harmful than helpful.


OK, thanks.

Unfortunately I'm still fairly new to this and struggling a bit, which makes it hard for me to work that in.

Swift already has endian conversion stuff in the stdlib:

let data = UInt16(0x1234)
let value = Int(data.littleEndian)
print(String(value, radix: 16))

(Using littleEndian to convert from little-endian, as opposed to to little-endian, is a bit weird, but it works fine: all you need is "byte-swapped if the native endianness is not little-endian".)


Yes, those are the ones I don't like.

  • They don't help when converting from raw bytes (you still have to get an initial value somehow).
  • They weaken the idea of a "UInt16 value", since both 0x1234 and 0x1234.littleEndian have the same type. They shouldn't, because they represent the numeric value 0x1234 in two different ways. (Well, the two would only be different values on a big-endian system, but the most reliable endian code does not care about the current system's endianness.)

I want to further emphasise Jordan's point here: in my view it is almost always a mistake to have an integer type floating around in your program whose value has the "wrong" endianness. Raw bytes may represent an integer with a given endianness, but UInt16(0x1234) is always 0x1234: endianness isn't a feature of it.

That implies that thinking about endianness should always be restricted to the point of serialization/deserialization. This is the pattern NIO follows with its ByteBuffer type, where we provide an endianness parameter on readInteger/writeInteger. The user should always hold native endianness: what the ByteBuffer holds is an entirely different question.
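For illustration, a minimal sketch of that pattern with SwiftNIO's ByteBuffer (assumes the swift-nio package and its NIOCore module as a dependency):

```swift
import NIOCore

var buffer = ByteBufferAllocator().buffer(capacity: 2)

// Endianness is stated only at the serialization boundary...
buffer.writeInteger(UInt16(569), endianness: .little)

// ...and at the deserialization boundary. The value the user holds
// before and after is always native-endian.
let wind = buffer.readInteger(endianness: .little, as: UInt16.self)  // 569
```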


To be clear, this would count as serialization, right?

// fills a buffer with random bytes, the bytes are reproducible if
// the provided generator is reproducible.
func fill(_ buffer: UnsafeMutableRawBufferPointer, using rng: inout some RandomNumberGenerator) {
    guard var base = buffer.baseAddress else { return }
    var count = buffer.count
    while count > 0 {
        let stride = min(count, MemoryLayout<UInt64>.size)
        withUnsafeBytes(of: rng.next().littleEndian) {
            let bytes = $0.baseAddress.unsafelyUnwrapped
            base.copyMemory(from: bytes, byteCount: stride)
        }
        (base, count) = (base + stride, count &- stride)
    }
}

Endianness concerns how a numeric value is represented in bytes.

In this case your UInt64 is only being used as fixed-size byte storage - its numeric value isn't important, so you don't need to care about endianness. Just copy the bytes of the UInt64 to the destination.

But then won't it give a different byte sequence if you run the same code with the same generator with the same state on a machine with different endianness?

The function claims to work at the byte level.

Every machine will observe the same bytes, in the same order. If they decide to interpret those bytes as multi-byte integers, they may consider them to have different numeric values (e.g. one system says 300, another says -23938923 - but they're still the same bytes). But the function doesn't promise anything about that.

Sorry, I thought you were saying that the .littleEndian inside the implementation could be removed. That's what I was speaking of when I asked if this counted as serialization.

I am - I'm saying it can be removed and should be removed. By fiddling with the endianness, it is not exactly reproducing the byte sequence from the generator in the buffer.

Instead, reading from the buffer becomes needlessly complex - where you need to read in specifically-sized chunks and then un-fiddle with the endianness to recover the original bytes. In fact, no endianness operations are actually necessary here, because the function describes itself in terms of a byte sequence, and individual bytes do not have endianness - just dump the bytes in to the buffer as you get them, without fiddling with them at all.
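Concretely, the suggestion amounts to the same fill function with the endianness call dropped (a sketch; otherwise unchanged from the code above):

```swift
// fills a buffer with random bytes, the bytes being whatever the
// generator produced, in the order it produced them.
func fill(_ buffer: UnsafeMutableRawBufferPointer, using rng: inout some RandomNumberGenerator) {
    guard var base = buffer.baseAddress else { return }
    var count = buffer.count
    while count > 0 {
        let stride = min(count, MemoryLayout<UInt64>.size)
        // No .littleEndian: the UInt64 is just 8 bytes of entropy,
        // and individual bytes have no endianness.
        withUnsafeBytes(of: rng.next()) {
            base.copyMemory(from: $0.baseAddress.unsafelyUnwrapped, byteCount: stride)
        }
        (base, count) = (base + stride, count &- stride)
    }
}
```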

In summary: if you need machines to agree on the numeric value of a multi-byte integer, you need to define the endianness. If you only need machines to agree on a byte sequence, you do not.

In your case, the numeric value does not matter. In OP's case, it does.

But the byte streams are generated from numeric values (which will presumably be the same numeric values regardless of endianness), so surely if you read the bytes from those values in native endian order you get a different stream depending on the endianness of the machine. Which can even result in a different set of bytes being written out if the size of the buffer isn't a multiple of 8.

RandomNumberGenerator.next() describes the result as binary data, and your function describes itself as working on binary data. If you want to preserve the numeric value you can, but it isn't necessary, and in that case it would be more accurate to say // fills a buffer with a string of random little-endian UInt64s.

If you don't do the endian stuff and treat this just as binary data, all machines see the same bytes, in the same order, in their UInt64s (even if one thinks that is number 300 and another thinks it is number 231234424). They write them to the buffer in the same order.

Then, you should be able to use the following generator (which also does no endianness manipulation) to reproduce the exact same byte sequence from that buffer, regardless of the endianness of the machine it runs on:

struct ReplayRNG: RandomNumberGenerator {
    var remaining: Slice<UnsafeRawBufferPointer>

    mutating func next() -> UInt64 {
        guard remaining.count >= 8 else { fatalError("Empty") }
        let val = remaining.loadUnaligned(as: UInt64.self)
        remaining.removeFirst(8)
        return val
    }
}

Those bytes, of course, may be interpreted as different numeric values, but both machines see the same bytes, and you can fill -> replay -> fill -> replay, again across machines of different endianness, and always get the same bytes.

No, because the integers contain the same bytes, in the same order. The only place endianness becomes apparent is when you interpret it as a number (e.g. performing math operations or printing the numeric value).

I don't understand how that can be true. Could you please explain why this is not proof that the bytes are in a different order depending on native endianness? I'm clearly missing something here.

Your example contains 2 wrapper RNGs, their next() functions return wrapped.next().littleEndian and .bigEndian respectively, and you are comparing the resulting byte streams. And of course they are different - big/little endian values do have different byte orders.

But that's not the issue - I'm saying that you don't need .littleEndian or .bigEndian at all to serialise or deserialise a byte stream.

Let's say the RNG gives your LE machine a binary integer which happens to have the numeric value 36029346783166592. Let's inspect its bytes:

// On a LE machine

func didGenerate(value: UInt64) {
  withUnsafeBytes(of: value) { print(Array($0)) }
  // [128, 0, 128, 0, 128, 0, 128, 0] <- byte sequence we want to preserve.

  print(value)
  // 36029346783166592 <- how this machine interprets it as a number. We don't really care about this; it's basically arbitrary.
}

That byte stream is the thing we actually want to preserve - not the number 36029346783166592. So I write those bytes in to a buffer, in the order they are, and pass them over to a BE machine. Like I showed in the ReplayRNG, when the BE machine reads, it just loads those exact bytes in to UInt64, except now:

// On a BE machine

func didRead(value: UInt64) {
  withUnsafeBytes(of: value) { print(Array($0)) }
  // [128, 0, 128, 0, 128, 0, 128, 0]  <- correct byte sequence! That's what we want!

  print(value)
  // 9223512776490647552 <- interpreted as a different numeric value, but who cares?
}

This really illustrates the choice that you have: you can have the same bytes, or the same numeric values, but not both. That is just inherent in the fact that the two machines use different byte patterns to represent the same number.

In your case, your function talks about preserving bytes, so it's a byte-stream. In OP's case, it's a windspeed reading from a sensor -- clearly, a numeric value, so endianness is relevant for OP but not for your use-case.

The fact that byte streams are endianness-independent is a major advantage. For example, it means that UTF-8 is endianness-independent while UTF-16 comes in big/little variants.
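As a quick illustration of that last point, using Foundation's string encodings (a sketch; the byte values shown are for the single character "A", U+0041):

```swift
import Foundation

// UTF-8 is defined as a byte sequence, so it has no endianness variants.
let utf8 = Array("A".data(using: .utf8)!)                  // [0x41]

// UTF-16 is defined as a sequence of 16-bit code units, so the
// byte order must be pinned down, giving two wire formats.
let utf16le = Array("A".data(using: .utf16LittleEndian)!)  // [0x41, 0x00]
let utf16be = Array("A".data(using: .utf16BigEndian)!)     // [0x00, 0x41]
```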
