Background: Comparable and Equatable vs. IEEE 754
The Comparable
and Equatable
protocols want to imply a total order. This is mostly satisfied by FloatingPoint
types, except for the behavior with NaNs. A total order requires that for any a and b, either a ≤ b or a ≥ b is true. But IEEE 754 requires that if either a or b is NaN, both of those comparisons are false.
Why Do We Care
Aside from mathematical purity, there's some highly undesirable behavior that results from the current behavior of API that depends on Comparable and Equatable with floating-point values. Some examples of misbehavior:
> let foo = [2.0, .nan, 1.0].sorted()
foo: [Double] = 3 values {
[0] = 2
[1] = NaN
[2] = 1
}
> foo.contains(.nan)
$R0: Bool = false
Obviously, these results are quite counterintuitive and likely to cause bugs for unsuspecting users.
Proposed Remedy
Use the @_implements
feature to make it so the comparison operators yield a total order on floating-point data when used in a generic context constrained only to Comparable
or Equatable
. When the comparison operators are used in a concrete context, or in a generic context constrained more tightly than Comparable
(e.g. an algorithm generic over FloatingPoint
), they should continue to have the current (IEEE 754) behavior.
Details
On FloatingPoint
, BinaryFloatingPoint
, and the concrete floating point types, the operators would continue to work as they currently do. When constrained only to Comparable or Equatable, we would use the following implementations instead:
@implements(Equatable,==(_:_:))
static func _totalEquals(lhs: Self, rhs: Self) -> Bool {
// Normal equality, except all NaNs compare equal to each other.
return lhs == rhs || lhs.isNaN && rhs.isNaN
}
@implements(Comparable,<(_:_:))
static func _totalLess(lhs: Self, rhs: Self) -> Bool {
// Normal less than, except NaN ordered below everything else.
return lhs < rhs || lhs.isNaN && !rhs.isNaN
}
@implements(Comparable,<=(_:_:))
static func _totalLessEqual(lhs: Self, rhs: Self) -> Bool {
// Normal less-equals, except NaN ordered below everything else.
return lhs <= rhs || lhs.isNaN
}
We would also need to modify hashing for FloatingPoint numbers to hash all NaNs to the same value; this is simple and involves only a well-predicted branch for most use.
Benefits
- API on Comparable and Equatable is much better behaved; the defaults Just Work.
- Floating point algorithms ported from other languages Just Works, because in such a context you still get the IEEE 754 comparisons.
- No performance penalty for the most common operations (concrete comparisons), very slight performance penalty for the comparisons that we replace.
Drawbacks
- It's surprising to have a basic operator behave differently in a concrete or generic context. I personally believe that it's OK here, because users don't write a lot of generic code against
Comparable
; rather they will the existing APIs that depend on it. So all that most users see is that those APIs now give more sensible answers. Users writing code against more specific protocols would see no change in behavior. - Very small performance penalty. I think that's ok in the name of getting a sensible answer.
- It isn't quite possible to do this yet. An implementation is blocked by ([SR-8961] @_implements still not quite powerful enough · Issue #51466 · apple/swift · GitHub)
Alternatives Considered
-
OMG Floating-point is broken, make
<
and==
do the right thing.
Source breaking, and also breaks everyone's expectations coming from almost any other language. A non-starter. Contrary to many people's belief, this would not be a violation of IEEE 754; the standard says "you have to have an operation with these semantics", but does not specify how it needs to be spelled in source code. -
Use the IEEE 754 totalOrder predicate.
Unfortunately, this is really a totalOrder on canonical encodings, rather than numeric values. Doing this would lead to behavior just as confusing, in the presence of redundant encodings:
> [-0.0].contains(0)
$R0: Bool = false
> [20.0 as Decimal64].contains(2.0e1) // hypothetical IEEE decimal type
$R1: Bool = false
This predicate was well-intentioned by the 754 committee, but the semantics are totally wrong for our purposes.
- Do nothing.
I guess. The world hasn't quite caught fire yet.
Additional Notes
We should simultaneously deprecate the legacy isEqual(to:)
, isLess(than:)
and isLessThanOrEqualTo(:)
methods on FloatingPoint. These were originally added because we didn't yet have static operators, and they're occasionally useful as hooks, but the implementation ofFloatingPoint
comparisons should be made consistent with BinaryInteger
where we can, and have the operators be the primary thing.
We should consider introducing an intermediate protocol that refines Numeric
and is refined by both BinaryInteger
and FloatingPoint
, which would be Numeric + Comparable
. As a straw man, I would call it OrderedNumeric
. On it's own, this doesn't enable any new generic code to be written, but in the presence of @_implements
it can. This would be the protocol that abstracts over floating-point and integer, but where floating-point comparisons still have IEEE 754 semantics. I know that Xiaodi has needed to be able to describe this at some point in the past.