[Pitch] `Distinguishable` protocol for Quick Comparisons

I have some reservation about linking this to Equatable. It certainly is convenient because it avoids creating a new protocol and the semantics are often aligned with equality. But there's an implication that isIdentical is some sort of lesser test for equality, while it probably should not. For instance +0 and -0 in floating point values are equal (and easily determined in constant time), but should not be considered identical. I can come with other examples too.

If isKnownIdentical is not necessarily a shortcut for equality, then why is it defined inside of Equatable? Maybe it's convenient, but it doesn't really make sense, and it'll inevitably lead to some people defining it in term of equality when it should not.


Alternative: bitwise comparison by default, but customizable

What about doing it the other way around: do a bitwise comparison by default, but allow customization with a protocol. This would work similarly to how CustomStringConvertible changes the default for string conversions. Here's an implementation:

/// Type that provides a replacement implementation for checking whether two values are identical
/// when the default bitwise comparison is not appropriate.
protocol CustomKnownIdenticalComparable {
	/// Two values are identical when there is no observable difference between them.
	///
	/// This function can return `false` when it is too costly to determine whether two
	/// values are identical, or when it can't be determined due to uncertainty.
	///
	/// - Returns: `true` if two values are identical, `false`if not identical or undetermined.
	func isKnownIdentical(to other: Self) -> Bool
}

/// Compare two values checking if they are identical. If the type implements
/// `CustomKnownIdenticalComparable` then `isKnownIdentical(to:)` is called,
/// otherwise a bitwise comparison is performed (up to the specified byte limit).
///
/// Two values are identical when swapping the two makes no observable difference. This differs from
/// equality. For instance `Float` has a distinct `+0` and `-0` values which are equal but not identical.
///
/// This function can return `false` when it is too costly to determine whether two
/// values are identical, or when it can't be determined due to uncertainty.
///
/// - Returns: `true` if two values are identical, `false`if not identical or undetermined.
func areKnownIdentical<T>(_ a: T, _ b: T, bitwiseComparisonByteLimit: Int = .max) -> Bool {
	if let a = a as? any CustomKnownIdenticalComparable, let b = b as? any CustomKnownIdenticalComparable {
		_areKnownCustomIdentical(a, b)
	} else {
		areKnownBitwiseIdentical(a, b, byteLimit: bitwiseComparisonByteLimit)
	}
}

/// Compare two values checking if they are identical using a bitwise comparaison (ignoring any
/// conformance to `CustomKnownIdenticalComparable`.
///
/// This function can return `false` when it is too costly to determine whether two
/// values are identical (according to `byteLimit`) or when weak references are invalidated
/// concurrently during the comparison.
func areKnownBitwiseIdentical<T>(_ a: T, _ b: T, byteLimit: Int = .max) -> Bool {
	guard MemoryLayout<T>.size <= byteLimit else { return false }
	return withUnsafeBytes(of: a) { ptrA in
		withUnsafeBytes(of: b) { ptrB in
			memcmp(ptrA.baseAddress, ptrB.baseAddress, MemoryLayout<T>.size) == 0
		}
	}
}

/// Helper for calling `a.isKnownIdentical(to: b)`. `a` and `b` must be the same type.
private func _areKnownCustomIdentical<T, U>(_ a: T, _ b: U) -> Bool where T: CustomKnownIdenticalComparable, U: CustomKnownIdenticalComparable {
	let b = b as! T
	return a.isKnownIdentical(to: b)
}

Caching comparator

By itself, areKnownIdentical isn't going to be too performant (thanks to dynamic casts). But whether or not the type conforms to CustomKnownIdenticalComparable can be cached, avoiding unnecessary casts the next time. Here's an updated version of the protocol that goes along with a caching comparator:

protocol CustomKnownIdenticalComparable {
	func isKnownIdentical(to other: Self) -> Bool

	/// Returns `true` if two values of this type can be identical, otherwise returns `false`.
	/// This can be used to skip calls to `isKnownIdentical` for types that are never 
	/// identical. Default implantation returns `true`.
	static var canBeIdentical: Bool { get }
}
extension CustomKnownIdenticalComparable {
	static var canBeIdentical: Bool { true }
}

struct KnownIdenticalComparator<T> {
	let customCanBeIdentical = (T.self as? any CustomKnownIdenticalComparable.Type)?.canBeIdentical

	func areKnownIdentical(_ a: T, _ b: T, bitwiseComparisonByteLimit: Int = .max) -> Bool {
		switch customCanBeIdentical {
		case false:
			return false
		case true:
			let a = a as! any CustomKnownIdenticalComparable
			let b = b as! any CustomKnownIdenticalComparable
			return _areKnownCustomIdentical(a, b)
		default: // nil: no conformance to CustomKnownIdenticalComparable
			return areKnownBitwiseIdentical(a, b, byteLimit: bitwiseComparisonByteLimit)
		}
	}
}
1 Like

@lorentey Would you by any chance be able to forward some documentation that might explain this behavior more? Could you think of any recent SE proposals that might have experienced a similar situation and how that might have affected the review process (including why it was or was not a blocker on shipping)?

Cross quoting as this was spread across different posts:

Yes, but that's not good enough... e.g. those custom implementations will not participate in contains or other standard library algorithms that could benefit from it. People will have to re-implement their own versions of contains, etc, and that's not a small ask.

Different order of checks should not matter (please give an example if it does). It's true that isDefinitelyNotEqual could be implemented differently by different people (e.g. some would choose to compare the first bytes of Data value in addition to "count" check. But so is isDefinitelyEqual- it could also contain a custom logic unrelated to checking if the bits of the underlying references are equal. However the end result due to such differences is not catastrophic - with some implementations there would be more false positives (negatives?) and depending upon the end goal, just more calls to a full blown "isEqual" will follow, slightly affecting the resulting performance.

This should not be a show stopper! We've been through many of those naming choices.

1 Like

My personal POV here is that yes I do believe there would be some utility from copy-on-write collections shipping some flavor of isDefinitelyNotEqual. I just don't believe that shipping isDefinitelyNotEqual should necessarily need to block shipping the isIdentical/isDefinitelyEqual/isKnownIdentical.

I agree with the feedback from @lorentey. If we did ship isDefinitelyNotEqual my preference would be that it would ship as a separate member. As long as we're shipping it on a separate member… we could also then choose to discuss that as a separate design review that would not need to block a review on isIdentical.