[Pitch] `Distinguishable` protocol for Quick Comparisons

I have some reservations about linking this to Equatable. It certainly is convenient because it avoids creating a new protocol, and the semantics are often aligned with equality. But there's an implication that isIdentical is some sort of lesser test for equality, when it probably should not be. For instance, +0 and -0 floating-point values are equal (and easily determined to be so in constant time), but should not be considered identical. I can come up with other examples too.
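For a concrete illustration of that floating-point example:

let plusZero = 0.0
let minusZero = -0.0

// Equal according to `==` (as IEEE 754 requires)...
print(plusZero == minusZero)                        // true
// ...but observably different: the bit patterns differ, and dividing by them
// gives +inf versus -inf.
print(plusZero.bitPattern == minusZero.bitPattern)  // false
print(1.0 / plusZero, 1.0 / minusZero)              // inf -inf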

If isKnownIdentical is not necessarily a shortcut for equality, then why is it defined inside of Equatable? Maybe it's convenient, but it doesn't really make sense, and it'll inevitably lead to some people defining it in terms of equality when they should not.


Alternative: bitwise comparison by default, but customizable

What about doing it the other way around: do a bitwise comparison by default, but allow customization with a protocol? This would work similarly to how CustomStringConvertible changes the default for string conversions. Here's an implementation:

/// Type that provides a replacement implementation for checking whether two values are identical
/// when the default bitwise comparison is not appropriate.
protocol CustomKnownIdenticalComparable {
	/// Two values are identical when there is no observable difference between them.
	///
	/// This function can return `false` when it is too costly to determine whether two
	/// values are identical, or when it can't be determined due to uncertainty.
	///
	/// - Returns: `true` if two values are identical, `false` if not identical or undetermined.
	func isKnownIdentical(to other: Self) -> Bool
}

/// Compare two values checking if they are identical. If the type implements
/// `CustomKnownIdenticalComparable` then `isKnownIdentical(to:)` is called,
/// otherwise a bitwise comparison is performed (up to the specified byte limit).
///
/// Two values are identical when swapping the two makes no observable difference. This differs from
/// equality. For instance, `Float` has distinct `+0` and `-0` values which are equal but not identical.
///
/// This function can return `false` when it is too costly to determine whether two
/// values are identical, or when it can't be determined due to uncertainty.
///
/// - Returns: `true` if two values are identical, `false` if not identical or undetermined.
func areKnownIdentical<T>(_ a: T, _ b: T, bitwiseComparisonByteLimit: Int = .max) -> Bool {
	if let a = a as? any CustomKnownIdenticalComparable, let b = b as? any CustomKnownIdenticalComparable {
		_areKnownCustomIdentical(a, b)
	} else {
		areKnownBitwiseIdentical(a, b, byteLimit: bitwiseComparisonByteLimit)
	}
}

/// Compare two values checking if they are identical using a bitwise comparison (ignoring any
/// conformance to `CustomKnownIdenticalComparable`). Note that `memcmp` needs `import Foundation`
/// (or the platform's C library module) to be available.
///
/// This function can return `false` when it is too costly to determine whether two
/// values are identical (according to `byteLimit`) or when weak references are invalidated
/// concurrently during the comparison.
func areKnownBitwiseIdentical<T>(_ a: T, _ b: T, byteLimit: Int = .max) -> Bool {
	guard MemoryLayout<T>.size <= byteLimit else { return false }
	return withUnsafeBytes(of: a) { ptrA in
		withUnsafeBytes(of: b) { ptrB in
			memcmp(ptrA.baseAddress, ptrB.baseAddress, MemoryLayout<T>.size) == 0
		}
	}
}

/// Helper for calling `a.isKnownIdentical(to: b)`. `a` and `b` must be the same type.
private func _areKnownCustomIdentical<T, U>(_ a: T, _ b: U) -> Bool where T: CustomKnownIdenticalComparable, U: CustomKnownIdenticalComparable {
	let b = b as! T
	return a.isKnownIdentical(to: b)
}
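Usage might look like this (the `Box` and `Storage` types below are hypothetical, just to show both the custom and the default bitwise paths):

final class Storage { var values: [Int] = [] }

/// Hypothetical type that opts in to the custom check: two boxes are identical
/// when they share the same storage object.
struct Box: CustomKnownIdenticalComparable {
	var storage = Storage()
	func isKnownIdentical(to other: Box) -> Bool {
		storage === other.storage
	}
}

let a = Box()
let b = a          // copies the struct, shares the storage object
let c = Box()      // distinct storage

print(areKnownIdentical(a, b))   // true  - custom check, same storage
print(areKnownIdentical(a, c))   // false - custom check, different storage
print(areKnownIdentical(1, 1))   // true  - no conformance, bitwise comparison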

Caching comparator

By itself, areKnownIdentical isn't going to be particularly fast (because of the dynamic casts). But whether or not the type conforms to CustomKnownIdenticalComparable can be cached, avoiding unnecessary casts the next time. Here's an updated version of the protocol that goes along with a caching comparator:

protocol CustomKnownIdenticalComparable {
	func isKnownIdentical(to other: Self) -> Bool

	/// Returns `true` if two values of this type can be identical, otherwise returns `false`.
	/// This can be used to skip calls to `isKnownIdentical` for types that are never
	/// identical. The default implementation returns `true`.
	static var canBeIdentical: Bool { get }
}
extension CustomKnownIdenticalComparable {
	static var canBeIdentical: Bool { true }
}

struct KnownIdenticalComparator<T> {
	let customCanBeIdentical = (T.self as? any CustomKnownIdenticalComparable.Type)?.canBeIdentical

	func areKnownIdentical(_ a: T, _ b: T, bitwiseComparisonByteLimit: Int = .max) -> Bool {
		switch customCanBeIdentical {
		case false?:
			return false
		case true?:
			let a = a as! any CustomKnownIdenticalComparable
			let b = b as! any CustomKnownIdenticalComparable
			return _areKnownCustomIdentical(a, b)
		default: // nil: no conformance to CustomKnownIdenticalComparable
			return areKnownBitwiseIdentical(a, b, byteLimit: bitwiseComparisonByteLimit)
		}
	}
}
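A usage sketch (the `Ref` and `Buffer` types are hypothetical): the conformance lookup happens once, when the comparator is created, instead of on every comparison.

final class Buffer {}

/// Hypothetical conforming type: identical when both values wrap the same buffer object.
struct Ref: CustomKnownIdenticalComparable {
	var buffer = Buffer()
	func isKnownIdentical(to other: Ref) -> Bool {
		buffer === other.buffer
	}
}

let refComparator = KnownIdenticalComparator<Ref>()
let intComparator = KnownIdenticalComparator<Int>()

let r = Ref()
print(refComparator.areKnownIdentical(r, r))   // true, via the cached custom path
print(intComparator.areKnownIdentical(1, 1))   // true, via the default bitwise path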

@lorentey Would you by any chance be able to forward some documentation that might explain this behavior more? Could you think of any recent SE proposals that might have experienced a similar situation and how that might have affected the review process (including why it was or was not a blocker on shipping)?

Cross quoting as this was spread across different posts:

Yes, but that's not good enough... e.g. those custom implementations will not participate in contains or other standard library algorithms that could benefit from it. People will have to re-implement their own versions of contains, etc, and that's not a small ask.

Different order of checks should not matter (please give an example if it does). It's true that isDefinitelyNotEqual could be implemented differently by different people (e.g. some would choose to compare the first bytes of a Data value in addition to the "count" check). But the same is true of isDefinitelyEqual - it could also contain custom logic unrelated to checking whether the bits of the underlying references are equal. However, the end result of such differences is not catastrophic - with some implementations there would be more false positives (negatives?), and depending upon the end goal, just more calls to a full-blown "isEqual" will follow, slightly affecting the resulting performance.

This should not be a show stopper! We've been through many of those naming choices.


My personal POV here is that yes, I do believe there would be some utility in copy-on-write collections shipping some flavor of isDefinitelyNotEqual. I just don't believe that shipping isDefinitelyNotEqual should necessarily need to block shipping isIdentical/isDefinitelyEqual/isKnownIdentical.

I agree with the feedback from @lorentey. If we did ship isDefinitelyNotEqual my preference would be that it would ship as a separate member. As long as we're shipping it on a separate member… we could also then choose to discuss that as a separate design review that would not need to block a review on isIdentical.

I am not aware of this being documented anywhere; I'm not even sure if it's general knowledge amongst engineers working on ABI stable Swift code.

This is simply a consequence of the compiler being very eager to avoid having to dispatch through the protocol witness table, especially when the default implementation is marked @inlinable (which it almost always is, especially within the stdlib). "@inlinable" enables generic specialization (and sometimes it triggers inlining as well). The compiler is well within its rights to do that -- but types often end up with default implementations for protocol requirements by accident rather than as a deliberate design decision.

(This can matter if the default implementation is being replaced to fix a correctness issue -- only newly recompiled binaries will be guaranteed to get the corrected behavior.)
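A condensed hypothetical sketch of the scenario being described (library and client shown in one place):

// --- Library, version 1 ---
public protocol P {
	func f() -> String
}
extension P {
	// Default implementation. Because it is @inlinable, a client compiled
	// against version 1 may get this body specialized or inlined into its
	// own binary instead of dispatching through the witness table.
	@inlinable public func f() -> String { "default" }
}
public struct S: P {}   // relies on the default implementation

// --- Library, version 2 ---
// S later gains its own implementation:
//
//     extension S { public func f() -> String { "custom" } }
//
// A client recompiled against version 2 will call "custom", but a client
// binary built against version 1 may keep using the old inlined "default"
// body even when it runs against the version 2 library.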


My apologies, but I see no reason for the standard contains algorithm to invoke any of these identity tests. contains just needs to invoke the regular ==, exactly like it does today. The implementation of == is perfectly capable of taking identity shortcuts on its own, without exposing them as distinct entry points.
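For instance, a hypothetical copy-on-write type can take exactly this kind of shortcut inside its own ==, and contains picks it up for free:

final class IntBuffer {
	var elements: [Int]
	init(_ elements: [Int]) { self.elements = elements }
}

struct IntList: Equatable {
	var buffer: IntBuffer

	static func == (lhs: IntList, rhs: IntList) -> Bool {
		// Identity fast path: two lists sharing the same storage must be equal.
		if lhs.buffer === rhs.buffer { return true }
		// Otherwise fall back to the full elementwise comparison.
		return lhs.buffer.elements == rhs.buffer.elements
	}
}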

I expect the implementation you quoted to be generally slower than the simple loop we're currently shipping.

(For sure, it is a fine variant to ship in a package, under some other name -- but I don't think it would be a good idea to replace the standard contains with it. It's in a similar position as, say, a variant that sliced the array and distributed the search into tasks executing in parallel: it would make sense for a narrow set of use cases, and it'd actively hurt others.)


I believe there are benefits to having a distinct entry point that could give quick positive or negative comparison results in O(1) time and return "can't determine" when it can't.

A simple test
protocol QuickEquatable: Equatable {
    func quickEqual(_ rhs: Self) -> Bool?
}

struct S: QuickEquatable {
    var payload: [Int]
    func quickEqual(_ rhs: Self) -> Bool? {
        if payload.count != rhs.payload.count {
            return false
        }
        precondition(MemoryLayout<Self>.size == MemoryLayout<Int>.size)
        if unsafeBitCast(self, to: Int.self) == unsafeBitCast(rhs, to: Int.self) {
            return true
        }
        return nil
    }
}

extension Array where Element: QuickEquatable {
    func quickContains(_ element: Element) -> Bool {
        var notEqualCount = 0
        
        for v in self {
            if let equal = element.quickEqual(v) { // O(1)
                if equal { // identical, thus equal
                    // quick path -> yes
                    return true
                } else { // quick not equal
                    notEqualCount += 1
                }
            }
        }
        if notEqualCount == count {
            // quick path -> no
            return false
        }
        
        // slow path
        return contains(element)
    }
}

func test() {
    let payloadSize = 100000
    let arraySize = 1000
    
    var s = S(payload: [Int](repeating: 0, count: payloadSize))
    s.payload[s.payload.count - 1] = 1
    
    var array = (0 ..< arraySize).map { _ in
        S(payload: [Int](repeating: 0, count: payloadSize))
    }
    array.append(s)
    
    let (quickResult, quickTime) = measure {
        array.quickContains(s)
    }
    let (normalResult, normalTime) = measure {
        array.contains(s)
    }
    precondition(quickResult == normalResult)
    print("normalTime: \(normalTime), quickTime: \(quickTime), ratio: \(normalTime / quickTime)")
    print()
}
// normalTime: 0.07006704807281494, quickTime: 0.0003390312194824219, ratio: 206.66842475386778

Compared to normal contains, quickContains first tries to find the element quickly. Only when it fails to find the element quickly does it fall back to the normal (potentially slower) comparison.
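(The measure helper used in the test above isn't shown; a minimal sketch, assuming it just returns the closure's result together with the elapsed seconds:)

import Foundation

// Hypothetical timing helper: run the closure once and return its result
// along with the elapsed wall-clock time in seconds.
func measure<R>(_ body: () -> R) -> (R, TimeInterval) {
    let start = Date()
    let result = body()
    return (result, Date().timeIntervalSince(start))
}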


I would prefer to see this implemented as a global function. That way, autocomplete does not get polluted, but developers who want and know about this optimization still have easy access.

func areKnownIdentical<T: KnownIdenticalComparable>(_ lhs: T, _ rhs: T) -> Bool

The main benefit is that this allows working with optionals, where there are overloads of areKnownIdentical with T? on either side. If both values are nil, return true. If only one is nil, return false. If both are non-nil, call the protocol function (e.g. a.isKnownIdentical(to: b), or a static areKnownIdentical on T).
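A sketch of what those overloads could look like (the protocol shape here is an assumption, matching the signature above):

protocol KnownIdenticalComparable {
    func isKnownIdentical(to other: Self) -> Bool
}

func areKnownIdentical<T: KnownIdenticalComparable>(_ lhs: T, _ rhs: T) -> Bool {
    lhs.isKnownIdentical(to: rhs)
}

func areKnownIdentical<T: KnownIdenticalComparable>(_ lhs: T?, _ rhs: T?) -> Bool {
    switch (lhs, rhs) {
    case (nil, nil):
        return true                           // both nil: identical
    case (nil, _), (_, nil):
        return false                          // exactly one nil: not identical
    case let (lhs?, rhs?):
        return lhs.isKnownIdentical(to: rhs)  // both non-nil: defer to the protocol
    }
}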


I think the terms "product engineer" and "infra engineer" in this thread refer to "API user" and "API author". What's up with these terms?


Without asking you to get too far down into exploratory coding at this point… could you brainstorm under what conditions this potential quickContains would perform slower? What would the Array.Element look like to slow perf down?

FWIW if there are legit perf wins in the standard library that can be unlocked once we have this "identical" test across generic contexts this would be impactful to reference in a proposal design review.

I can think of one such case: when it has to iterate through the elements twice, for example:

    let payloadSize = 10
    let arraySize = 10_000_000
    
    var s = S(payload: [Int](repeating: 0, count: 2*payloadSize)) // different size!

    .....

    // warmup:
    _ = array.contains(s)

    let (quickResult, quickTime) = measure {
        array.quickContains(s)
    }
    let (normalResult, normalTime) = measure {
        array.contains(s)
    }
    // the rest as before

(I've added a warmup call, otherwise the second call was always the winner).

Result:

normalTime: 0.030986905097961426, quickTime: 0.03194892406463623, ratio: 0.9698888461868533

Note that it's only marginally slower.


Hmm… so if I understand correctly… it sounds like if we compile a protocol P with a default implementation for P.f and ship that in Swift 6.2, and then later in Swift 6.3 one of the types S that adopts P defines its own S.f in order to override the default implementation from P.f, this could lead to nondeterministic behavior where an app that was compiled with 6.2 might or might not use the custom implementation S.f when the 6.3 runtime is available? But compiling that app with 6.3 would deterministically get us the custom implementation S.f? Is that the correct idea?

But if I want to optimise for size, I'd have to create a memcmp2 so as not to have the check scattered around the code base (besides, it's quite easy to forget to do the check).

Could this be done this way:

/* some inline pragma magic here */
int memcmp(const void* s1, const void *s2, size_t n) {
    if (s1 == s2) { return 0; }
    return realmemcmp(s1, s2, n);
}

/* do not inline */
int realmemcmp(const void* s1, const void *s2, size_t n) {
    assert(s1 != s2);
    ...
}

so that with -Osize this is not inlined and with -Ospeed the outer memcmp is inlined and the inner realmemcmp is not? That would save callers the trouble of doing the check manually or introducing a wrapper. And if I'm absolutely sure that the two pointers are different, I'd call realmemcmp directly.