Comparable and FloatingPoint types

Nevin · October 11, 2018, 4:08pm

Here is a great example where distinguishing between “>” and “&>” makes a real-world difference. If we introduce “&>”, then numbers.sorted(by: >) will “do the right thing”, and anyone who actually wants to sort by “&>” can easily do so.
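
For illustration, a descending sort that stays well-defined in the presence of NaN can already be written with the existing `isTotallyOrdered(belowOrEqualTo:)` method on `FloatingPoint`. This is only a sketch: `&>` is a pitched spelling, not an existing operator, and the names `numbers` and `totalDescending` are made up for this example.

```swift
let numbers: [Double] = [2.0, .nan, 1.0, 3.0]

// Strict "greater than" under the IEEE 754 total order:
// a sorts before b exactly when a is NOT below-or-equal to b.
let totalDescending = numbers.sorted { !$0.isTotallyOrdered(belowOrEqualTo: $1) }

// A positive quiet NaN (such as Double.nan) sits above +infinity
// in the total order, so it comes first in a descending sort.
print(totalDescending)
```

With the ordinary `>`, which returns false for every comparison involving NaN, the predicate is not a strict weak ordering and the resulting order is unspecified.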

Ben_Cohen · October 11, 2018, 4:39pm

I'm a big +1 on this but wanted to chip in another piece of the justification.

The examples from the standard library are surprising and bad, but not terrifying. However, the possible bugs are potentially far worse, and could result not just in failure to sort or find, but in data corruption.

The rules of Comparable give certain latitude to implementors to make optimizations. For example:

let a = [Double.nan]
var b = a

// true, because Array has an optimization
// that when both arrays are sharing storage,
// there's no reason to compare each element
// because Equatable guarantees reflexivity
assert(a == b)

// force unsharing of storage
b.reserveCapacity(100)

// now they don't share storage, Array must do an
// element-wise comparison, which will return false
// because Double violates the reflexivity requirement
assert(a == b) // trap

How could this lead to data corruption? Among the various requirements are that:

equality implies substitutability – that is, if two items are equal, you can replace one with another
if a < b is false, and b < a is false, then a and b are equal

Now, luckily no algorithms in the standard library make use of this optimization (that I know of!). But they easily could. And in that case, you could find that e.g. sorting doesn't just fail to order the collection appropriately, but even throws out some elements and duplicates others.
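
A small sketch of how NaN breaks the second requirement under today's IEEE semantics (the constant names are illustrative only):

```swift
let x = 1.0
let nan = Double.nan

// Every ordered comparison involving NaN is false...
let neitherIsLess = !(x < nan) && !(nan < x)

// ...yet the two values are not equal, so an algorithm that infers
// equality from "neither is less" would treat them as interchangeable.
let areEqual = (x == nan)

print(neitherIsLess, areEqual)
```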

Tino · October 11, 2018, 9:07pm

I might have missed something (there are really many threads...), but couldn't we just introduce a second set of floating point types that strictly follow IEEE, and "fix" the existing ones?
My expectation is that porting numeric algorithms is a niche topic, and that those who need standard behavior have some knowledge about their datatypes — and we could "redress" them for having to import Numerics with some functions that are useful for their work.

taylorswift · October 11, 2018, 9:59pm

i agree there’s a problem but I think this is a solution that just makes it worse. the @_implements attribute isn’t something people know about (it’s still underscored!) and now you’re making it so that a private attribute is affecting the semantics of user programs.

Joe_Groff · October 11, 2018, 10:07pm

The attribute is a stopgap to give us the opportunity to get the right ABI for this now (if we decide to accept this proposal). We have ideas for how to expose this as a user-accessible feature.

Karl · October 11, 2018, 11:00pm

Oh joy, it's this can of worms again...

I'm strongly opposed to </>/== having different behaviour based on whether or not they are used in a generic context, or the particular constraints of context they are used in. I don't agree in principle with the idea of trying to second-guess the behaviour the programmer intended. The @_implements solution is trying to be too clever, and is likely to result in even more complexity and confusion.

I think the set of people who truly require strict IEEE semantics can be reasonably called "floating point experts", and are much smaller in number than those who use Float/Double in other contexts (like graphics) and would be better served by Swift's regular Comparable/Equatable semantics. Those experts will quickly realise that operators are not behaving as they expect and (should) immediately jump to the documentation, where we can make it very clear that they should use a different operator (like &> variant suggested above) to get the IEEE behaviour. It's a small extra thing to learn, but I think it gives most people the result they expect without being cute and trying to infer their intentions.

Contrary to what the pitch says, FP algorithms ported from other languages will still work. You will just need to perform an extra step and consider NaN behaviour as part of the porting, so it's not just a copy/paste. I don't consider that too bad - you also need to add var/let annotations, and you can't do heterogeneous numeric operations (like addition), as you can in many other languages. What is important is that you can still express what you want, and it's not overly verbose or awkward to learn how to do so.

Joe_Groff · October 12, 2018, 1:00am

I guess I should say that, from a blank slate, I too would rather have seen the unordered comparison operators be spelled differently. If the strictly ordered and unordered operators must be spelled the same, though, I think the proposal here is the best possible solution given that constraint.

Paul_Cantrell · October 12, 2018, 2:13am

Joe_Groff:

would supersede the need for @_implements, e.g.:

struct MyFloat {
  static func Equatable.==(l: MyFloat, r: MyFloat) -> Bool { /* total equality */ }
  static func Numeric.==(l: MyFloat, r: MyFloat) -> Bool { /* IEEE equality */ }
}

let totallyEqual = Equatable.==(x, y)
let floatallyEqual = Numeric.==(x, y)

This feature — with attendant tooling, e.g. the ability to click on an operator to see its declaration — would make me like this particular proposal much better.

I suppose it is the nicer, more generalized version of the > vs &> counterproposal I self-rejected upthread.

P.S. Thanks, Joe, for “totally / floatally,” which I am now going to use as a term of art for floating point bizarrities whenever possible, doubtless to the distress of all around me.

Letan · October 12, 2018, 11:02am

Obviously, source compatibility is a major concern. However, I wonder if this would not fall under being actively harmful?

I say this because many people aren’t fully aware of nan’s behaviour, which can cause some really hard-to-find bugs. They are nearly impossible to track down (for me at least) if you don’t know what you’re looking for.

EDIT: For a more concrete example of the possible harm, take the following perfectly appropriate implementation of something Equatable:

struct Point : Equatable {
  var x, y: Double
}

However, this struct violates the reflexivity rule in that a == a is not always true. This can lead to some really weird errors down the road. What's also really scary is that any type looking to define equality based on a floating point field is likely to make this mistake by forgetting to handle the nan case.
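
A minimal sketch of the violation, assuming today's IEEE semantics for `Double`'s `==` (the value `p` is illustrative only):

```swift
struct Point: Equatable {
    var x, y: Double
}

// The synthesized == compares x and y with Double's ==,
// and Double.nan == Double.nan is false.
let p = Point(x: .nan, y: 0)
let isReflexive = (p == p)

print(isReflexive)
```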

Jon_Hull · October 12, 2018, 12:50pm

I've always wanted to see Swift have a type somewhere that is basically Double without the concept of NaN (using optionals instead where required).

Joe_Groff · October 12, 2018, 4:32pm

Letan:

EDIT: For a more concrete example of the possible harm, take the following perfectly appropriate implementation of something Equatable:

struct Point : Equatable {
  var x, y: Double
}

However, this struct violates the reflexivity rule in that a == a is not always true. This can lead to some really weird errors down the road. What's also really scary is that any type looking to define equality based on a floating point field is likely to make this mistake by forgetting to handle the nan case.

Steve's proposal would improve this case. Since a struct's default Equatable conformance is derived from the Equatable conformances of its fields, and Double's Equatable conformance would guarantee that a == a, so would the struct's == implementation.

taylorswift · October 12, 2018, 5:45pm

could this be implemented like UnsafePointer<T> where the bitpatterns that spell out NaN become extra inhabitants for a nil case?

Tino · October 12, 2018, 10:05pm

I guess the only full solution would be either deprecating zero or division ;-):

Without nan, a lot of calculations would have to return an Optional, and you would have to deal with that instead.

You can define all the operators and functions for those (e.g. func *(lhs: Float?, rhs: Float?) -> Float?), so that calculations that could produce a nan continue to work, and only when you really need a non-nan value, you would be confronted with the fact that your result might be bogus.
But on the other hand, division by zero doesn't work that well with Int either, and although it's a constant danger, we are used to dealing with the problem.

I couldn't find any language that utilizes its type system to make calculations more safe (maybe Ada, but I haven't worked with it yet), but it's an interesting idea.

I think a custom number type that avoids some common pitfalls (not only nan - testing floats for equality is often done wrong, even if only "regular" floats are involved) would be an interesting addition - but it would need features Swift will possibly never have.

Jon_Hull · October 12, 2018, 11:20pm

Exactly. The difference is that Swift is well equipped to make/help you deal with optionals.

Yes, I would like to see it handle properly testing for equality as well.

Yeah, I think the optional system is smart enough that we could have any (or at least one) NaN pattern be its representation of nil, so it wouldn't take any extra space. We would probably want to pick a pattern that behaves on the chip the same way that optional does, so that we can minimize the number of extra checks we have to do on it.

Basically, it would be a thin wrapper over Double, and the optionality of operators/functions would match the patterns that the chip can give back. Swift would then force you to handle the optionals at some point (as opposed to being surprised by NaN).
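
A rough sketch of what such a wrapper might look like. The type name `CheckedDouble` and its API are hypothetical, and this version does not attempt the extra-inhabitant layout trick discussed above; it only shows the shape of the interface.

```swift
// Hypothetical NaN-free wrapper: construction fails for NaN, so every
// stored value is ordered and the synthesized == is reflexive.
struct CheckedDouble: Equatable {
    let value: Double

    init?(_ value: Double) {
        guard !value.isNaN else { return nil }
        self.value = value
    }

    // Division can produce NaN (0/0, inf/inf), so it returns an optional.
    static func / (lhs: CheckedDouble, rhs: CheckedDouble) -> CheckedDouble? {
        CheckedDouble(lhs.value / rhs.value)
    }
}

// Force-unwrapping is safe here because the literals are not NaN.
let zero = CheckedDouble(0)!
let one = CheckedDouble(1)!
let two = CheckedDouble(2)!

let undefined = zero / zero   // nil, where plain Double would give NaN
let half = one / two          // a wrapped 0.5
let infinite = one / zero     // non-nil: 1/0 is +infinity, not NaN
```

Note that infinities remain representable in this sketch; whether a real design should also exclude them is a separate question.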

anandabits · October 13, 2018, 11:26pm

This is true, although users may still need to implement Equatable manually from time to time. The conformance should be available, either by a named method that floating point Equatable.== forwards to or (perhaps better) by adding support for disambiguation.

Further, as the need to do this rather than use the concrete type's == is not obvious, the compiler should also warn if a concrete floating point type's == is used in a manual conformance to Equatable.

I agree with this. Is it too late to consider using ampersand operators for IEEE comparison behavior? It would be nice if the usual spellings always obeyed the conventional laws associated with them.

xwu · October 14, 2018, 12:03am

The proposal author has stated that such a change would be a non-starter; I would agree with him on that, and I doubt I'm alone on this. This is not at all like the situation with overflow operators because signed overflow is undefined behavior in C. Here there is multi-language concordance as to the behavior of the floating-point == operator. Moreover, how can one really contemplate changing the meaning of a fundamental mathematical operator in a breaking way this many years into the evolution of a commonly used language?

There are far bigger and more common pitfalls to using == with floating-point types, in the concrete or generic context, than the behavior of NaN. (For example, the assumption that a + b == c implies c - b == a, which does not hold in general even if a, b, and c are finite.) If the behavior of NaN is grounds for a warning, then one might as well argue that there should be a warning on every use of == (however it is defined) with floating-point types: after all, a user might in fact need to test for two values to be within some degree of rounding error of each other instead. But it is clearly not appropriate to use warnings this way, as there are real, correct uses for floating-point ==.
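
One finite case where the implication fails, sketched with values chosen so that the smaller operand is absorbed by rounding (the constant names are illustrative only):

```swift
let a = 1.0
let b = 1e20
let c = a + b   // rounds to exactly 1e20: a is too small to register

let sumHolds = (a + b == c)       // true by construction
let inverseHolds = (c - b == a)   // false: c - b is 0.0, not 1.0

print(sumHolds, inverseHolds)
```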

That said, I do support having a compiler warning for the use of a test of equality specifically with NaN (i.e. x == .nan); I cannot think of a scenario in which that is the correct thing to do currently, and however one defines == in whatever context, one ought to use x.isNaN.

[Updated thought: In fact, to make it educational, the compiler could offer two fix-its for x == .nan: x.isNaN and false. If accompanied by the right warning text that explains why x == .nan is always false, this would be a great didactic tool.]
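
A short sketch of the two spellings side by side (the compiler may warn on the first comparison; the constant names are illustrative only):

```swift
let x = Double.nan

// Always false, whatever x is, because NaN compares unequal to everything.
let comparedWithNaN = (x == .nan)

// The intended test.
let detectedNaN = x.isNaN

print(comparedWithNaN, detectedNaN)
```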

anandabits · October 14, 2018, 1:19pm

To be clear, I was talking specifically of uses in a manual conformance to Equatable, where using floating-point == will cause the custom conformance to violate the expected laws. This proposal goes out of its way to ensure that the conformances provided by floating point types themselves obey the expected laws.

Shouldn’t the proposal also provide some assistance to manual conformances involving floating point types? The naive implementation using == from the concrete type is unlikely to be what is intended. I think a warning is pretty well justified in this case.

Nevin · October 14, 2018, 8:13pm

The correct comparison is to *unsigned* integer overflow, which is well-defined in C (as modulo wrapping), yet Swift still chooses to make the standard arithmetic operators trap when it occurs.

This very thread was created—by one of the world’s foremost experts on floating-point math—to discuss changing the behavior of fundamental mathematical operators in a breaking way.

That is a pitfall of floating-point *rounding*. The equality operator does exactly what a reasonable person would expect, except when NaN is involved.

xwu:

If the behavior of NaN is grounds for a warning, then one might as well argue that there should be a warning on every use of == (however it is defined) with floating-point types: after all, a user might in fact need to test for two values to be within some degree of rounding error of each other instead. But it is clearly not appropriate to use warnings this way, as there are real, correct uses for floating-point ==.

That said, I do support having a compiler warning for the use of a test of equality specifically with NaN (i.e. x == .nan); I cannot think of a scenario in which that is the correct thing to do currently, and however one defines == in whatever context, one ought to use x.isNaN.

[Updated thought: In fact, to make it educational, the compiler could offer two fix-its for x == .nan: x.isNaN and false. If accompanied by the right warning text that explains why x == .nan is always false, this would be a great didactic tool.]

The fact that x == .nan is always the wrong thing to do, represents a glaring red flag that something is fundamentally wrong.

We can’t fix the IEEE-754 standard, but we *can* fix Swift. And we can do so in a way that still complies with IEEE-754, and still enables floating-point experts to write code that behaves the way they are used to—even if there is literally no circumstance in which that behavior is useful.

michelf · October 14, 2018, 8:27pm

So if I write a wrapper like this:

struct CGFloat: Equatable, Comparable, Hashable {
    var native: Float
}

What is the proper behavior? How do I implement the proper behavior?

xwu · October 14, 2018, 9:00pm

No, that is not the correct comparison. The currency integer type in Swift is the signed integer, and its overflow behavior is not defined in C. Unless you believe that Swift should have different behaviors for arithmetic operators in the case of signed and unsigned types, from there it follows that trapping on overflow by default in the case of signed arithmetic implies trapping on overflow by default in the case of unsigned arithmetic.

No, this thread was created by one of the world's foremost experts on floating-point math stating explicitly that such a change to the concrete floating-point operators is a non-starter.

0.3 - 0.2 == 0.1 evaluates to false. You can claim that the equality operator "does exactly what a reasonable person would expect" in this case, but I think it's fair to say that, for certain values of reasonable people, it does not, and this is very much going to ensnare people who are writing certain algorithms generic over Numeric, whatever the behavior of NaN.
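
A sketch of that pitfall and of the tolerance-based comparison such code usually needs instead. The helper `isClose` and its default tolerance are assumptions made for this example, not standard library API, and a fixed absolute tolerance is not appropriate for every magnitude.

```swift
let difference = 0.3 - 0.2
let exactlyEqual = (difference == 0.1)   // false: the operands are already rounded

// Hypothetical helper: absolute-tolerance comparison.
func isClose(_ a: Double, _ b: Double, tolerance: Double = 1e-9) -> Bool {
    abs(a - b) <= tolerance
}

let closeEnough = isClose(difference, 0.1)

print(exactlyEqual, closeEnough)
```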

No, it does not represent a glaring red flag. For any tool it is possible to find an example where its use is clearly and always wrong. The solution is to teach people not to do that with the tool, not to take the tool away.

I would not be so confident that a community composed of non-specialists in floating-point math can come up with a solution relating to floating-point math that's superior to one created by specialists. (The same claim has been thrown around that as a community we can "fix" Unicode bugs in Swift.) I see no reason to believe this to be the case and have plenty of reason to work from the presumption, until proved otherwise, that it is not the case.