Two Data instances not Equal

I have two Data values that compare as not equal (!=), but when I compare them as strings, the values are equal:

data != originalData

but

data.map { String(format: "%02hhX", $0) }.joined() == originalData.map { String(format: "%02hhX", $0) }.joined()

Does anyone have any pointers on how to determine why the two Data values are not equal?

(lldb) po data
▿ 1878 bytes
  - count : 1878
  ▿ pointer : 0x000000016f067a00
    - pointerValue : 6157662720
(lldb) po originalData
▿ 1878 bytes
  - count : 1878
  ▿ pointer : 0x000000016f891200
    - pointerValue : 6166221312

I see the pointer values are different, which could be the reason?

I would guess that if data != originalData then they are not equal and perhaps something is lost in the string formatting?

Out of curiosity what happens if you add the following:

var index = 0
for (byte1, byte2) in zip(data, originalData) {
	if byte1 != byte2 {
		print("\(byte1) != \(byte2) at index \(index)")
	}
	index += 1
}

I assume that Data's equality check checks that the contents are the same, and not that both data have the same pointer. That seems to be true for the Foundation rewrite anyway...

It's quite easy for strings constructed from non-identical data to end up being equal.

Edited out: see my post below.

I'd recommend going through the data byte by byte to see where the difference is:

// could be typos as I'm typing this in a browser window
// assumes both values start at index 0 (i.e. they are not slices)
precondition(data1.count == data2.count)
for i in 0 ..< data1.count {
    if data1[i] != data2[i] {
        print("first difference found at offset \(i): \(data1[i]) != \(data2[i])")
        break // comment this out to see all differences
    }
}
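
The same check can be sketched as a small reusable function. The name firstDifference is hypothetical, not a Foundation API. It walks both values with zip, so it does not assume that either Data starts at index 0, and the lengths are checked separately because zip stops at the shorter input.

```swift
import Foundation

/// Hypothetical helper (not part of Foundation): returns the offset and the
/// two differing bytes at the first mismatch, or nil if no byte differs
/// within the common length of the two values.
func firstDifference(_ a: Data, _ b: Data) -> (offset: Int, lhs: UInt8, rhs: UInt8)? {
    // enumerated() counts offsets from 0 regardless of each Data's startIndex,
    // so this also works when either argument is a slice.
    for (offset, (x, y)) in zip(a, b).enumerated() where x != y {
        return (offset, x, y)
    }
    return nil
}

let a = Data("Hello".utf8)
let b = Data("Hella".utf8)

// zip silently stops at the shorter input, so report a length mismatch first.
if a.count != b.count {
    print("lengths differ: \(a.count) vs \(b.count)")
}
if let diff = firstDifference(a, b) {
    print("first difference at offset \(diff.offset): \(diff.lhs) != \(diff.rhs)")
}
```

With the sample values above, the only mismatch is the last byte ("o" versus "a").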

AFAICT the code you use to render the data to a string is correct, which makes it a bit of a mystery as to why this is failing. I’d also like to see the results of the byte-by-byte comparison test suggested by Diggory and tera.

I see the pointer values are different, which could be the reason?

That shouldn’t be the reason. Consider this code:

let d1 = Data("Hello Cruel World!".utf8)
let d2 = Data("Hello Cruel World!".utf8)

The resulting pointers are different but the values compare equal:

(lldb) po d1 
▿ 18 bytes
  - count : 18
  ▿ pointer : 0x0000600001ee5a60
    - pointerValue : 105553148664416
  ▿ bytes : 18 elements
	…
(lldb) po d2
▿ 18 bytes
  - count : 18
  ▿ pointer : 0x0000600001ee5a80
    - pointerValue : 105553148664448
  ▿ bytes : 18 elements
	…
(lldb) p d1 == d2
(Bool) true

Data is supposed to implement byte-by-byte equality.
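
A minimal sketch of what that implies, assuming the current Foundation behavior where == compares the count and then the bytes: two values with the same contents compare equal even when they live in different buffers or, for a slice, start at a different index. The sample bytes are made up for illustration.

```swift
import Foundation

let whole = Data([0x00, 0x48, 0x69])   // 0x00 followed by "Hi"
let slice = whole.dropFirst()          // a Data slice that keeps the parent's indices
let fresh = Data("Hi".utf8)            // a separate value built from the same two bytes

// The two values have different start indices...
print(slice.startIndex, fresh.startIndex)

// ...but equality only looks at the bytes they contain.
print(slice == fresh)
print(fresh == Data("Ho".utf8))
```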

Share and Enjoy

Quinn “The Eskimo!” @ DTS @ Apple

Quinn is right. I didn't pay attention to the format string: what you wrote initially should work, as you are converting the data to a hex string. I'm curious why it doesn't work for you.

BTW, I've seen instances when the debugger lied to me, e.g. showing true for false or vice versa. "print" the result of the comparison to be sure.

Comparing the data byte by byte results in the data also not being equal. The magic must be in the decoding from data to string?

I copied and used the Foundation implementation and it also returns false.

If you can present this as a reproducible standalone test app I'd love to see that.

What does base64 encoding comparison yield?

If it differs perhaps you can share the two base64 encoded strings and we can play around with the bytes ourselves

print(data.base64EncodedString() == originalData.base64EncodedString())

prints false as well.

Can you share the two base64Encoded strings?

Or is it sensitive data?

When I wrote my tiny test code I found that my two datas had the same pointerValue - is that because when the data are small enough, they can be packed together into the size of one pointer? (like small strings?)

Or is this a bug in Xcode?


import Foundation

let hello = "Hello"
var hello2 = "Hell"
hello2.append("a")

let data1 = Data(hello.utf8)
let data2 = Data(hello2.utf8)

if data1 == data2 {
	print("data1 == data2 is true")
}

//  BREAKPOINT HERE

var index = 0
for (byte1, byte2) in zip(data1, data2) {
	if byte1 != byte2 {
		print("\(byte1) != \(byte2) at index \(index)")
	}
	index += 1
}

LLDB PO

(lldb) po data1
▿ 5 bytes
  - count : 5
  ▿ pointer : 0x000000016fdfebe8
    - pointerValue : 6171913192
  ▿ bytes : 5 elements
    - 0 : 72
    - 1 : 101
    - 2 : 108
    - 3 : 108
    - 4 : 111
(lldb) po data2
▿ 5 bytes
  - count : 5
  ▿ pointer : 0x000000016fdfebe8
    - pointerValue : 6171913192
  ▿ bytes : 5 elements
    - 0 : 72
    - 1 : 101
    - 2 : 108
    - 3 : 108
    - 4 : 97

Indeed, small Data values are not stored on the heap. Check this thread for more information: Swift 5: How to test Data(bytesNoCopy:count:deallocator:)? - #2 by benrimmington

No. There is a length threshold beyond which the pointers become distinct.
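
One way to probe this is to print the address Data hands out for its bytes at a few sizes. This is only a sketch: bufferAddress is a hypothetical helper, the sizes are arbitrary picks, and where the switch happens is an implementation detail that may differ between platforms and Foundation versions, so no particular output is claimed.

```swift
import Foundation

/// Hypothetical helper: the address Data exposes for its bytes, as an integer.
/// For small values this may point at a temporary copy rather than heap storage,
/// so the number is only meaningful for comparison, never for dereferencing.
func bufferAddress(_ data: Data) -> Int {
    data.withUnsafeBytes { Int(bitPattern: $0.baseAddress) }
}

for count in [5, 14, 15, 64] {
    let d1 = Data(repeating: 0x41, count: count)
    let d2 = Data(repeating: 0x41, count: count)
    let sameAddress = bufferAddress(d1) == bufferAddress(d2)
    print("count \(count): equal = \(d1 == d2), same address = \(sameAddress)")
}
```

Whatever the addresses turn out to be, the equality column should always read true, since the contents match.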