Rationalizing FloatingPoint conformance to Equatable

David_Sweeris · October 26, 2017, 10:32pm

Oh, right... because we can't say "extension Numeric where Self.== doesn't trap or anything {}"... got it.

Is that why there's not a default implementation of +=, -=, etc?

- Dave Sweeris


On Oct 26, 2017, at 3:16 PM, Matthew Johnson <matthew@anandabits.com> wrote:

On Oct 26, 2017, at 5:12 PM, David Sweeris via swift-dev <swift-dev@swift.org> wrote:

On Oct 26, 2017, at 2:57 PM, Greg Parker via swift-dev <swift-dev@swift.org> wrote:

On Oct 26, 2017, at 11:47 AM, Xiaodi Wu via swift-dev <swift-dev@swift.org> wrote:

On Thu, Oct 26, 2017 at 1:30 PM, Jonathan Hull <jhull@gbis.com> wrote:
Now you are just being rude. We all want Swift to be awesome… let’s try to keep things civil.

Sorry if my reply came across that way! That wasn't at all the intention. I really mean to ask you those questions and am interested in the answers:

Unless I misunderstand, you're arguing that your proposal is superior to Rust's design because of a new operator that returns `Bool?` instead of `Bool`; if so, how is it that you haven't reproduced Rust's design problem, only with the additional syntax involved in unwrapping the result?

And if, as I understand, your argument is that your design is superior to Rust's *because* it requires unwrapping, then isn't the extent to which people will avoid using the protocol unintentionally also equally and unavoidably the same extent to which it makes Numeric more cumbersome?

You said it was impossible, so I gave you a very quick example showing that the current behavior was still possible. I wasn’t recommending that everyone should only ever use that example for all things.

For FloatingPoint, ‘(a &== b) == true’ would mimic the current behavior (bugs and all). It may not hold for all types.

No, the question was how it would be possible to have these guarantees hold for `Numeric`, not merely for `FloatingPoint`, as the purpose is to use `Numeric` for generic algorithms. This requires additional semantic guarantees on what you propose to call `&==`.

Would something like this work?

Numeric.== -> Bool
traps on NaN etc.

Numeric.==? -> Bool?
returns nil on NaN etc. You likely don't want this unless you know something about floating-point.

Numeric.&== -> Bool
is IEEE equality. You should not use this unless you are a floating-point expert.

The experts can get high performance or sophisticated numeric behavior. The rest of us who naïvely use == get a relatively foolproof floating-point model. (There is no difference among these three operators for fixed-size integers, of course.)
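A minimal sketch of how the two non-trapping spellings might look for floating-point types. The operator names `==?` and `&==` come from this thread and are hypothetical; neither exists in the standard library. The trapping form is written as a named function, `trappingEquals` (also hypothetical), because `==` itself cannot be redefined in a sketch like this.

```swift
// Hypothetical operators from the thread; not part of the standard library.
infix operator ==? : ComparisonPrecedence
infix operator &== : ComparisonPrecedence

extension FloatingPoint {
    /// IEEE 754 equality: NaN is unequal to everything, including itself.
    static func &== (lhs: Self, rhs: Self) -> Bool {
        return lhs == rhs  // today's `==` already has IEEE semantics
    }

    /// Optional equality: nil when either operand is NaN.
    static func ==? (lhs: Self, rhs: Self) -> Bool? {
        if lhs.isNaN || rhs.isNaN { return nil }
        return lhs == rhs
    }

    /// Stand-in for the proposed trapping `==`.
    static func trappingEquals(_ lhs: Self, _ rhs: Self) -> Bool {
        precondition(!lhs.isNaN && !rhs.isNaN, "NaN in equality comparison")
        return lhs == rhs
    }
}

let nan = Double.nan
print(nan &== nan)            // false
print((nan ==? nan) == nil)   // true
print((1.5 ==? 1.5) == true)  // true
```

For integer types all three spellings would return the same answer, which is the point made above about fixed-size integers.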

This is analogous to what Swift does with integer overflow. I would further argue the other Numeric operators like + should be extended to the same triple of trap or optional or just-do-it. We already have two of those three operators for integer addition after all.

Numeric.+ -> T
traps on FP NaN and integer overflow

Numeric.+? -> T?
returns nil on FP NaN and integer overflow

Numeric.&+ -> T
performs FP IEEE addition and integer wraparound
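For comparison, here is a rough sketch of the missing middle spelling for addition. `+` and `&+` already exist for integers; `+?` is the hypothetical operator from this thread, built here on the real `addingReportingOverflow(_:)` method. The floating-point overload is an assumption about what "returns nil on FP NaN" would mean.

```swift
// `+?` is the hypothetical optional-returning spelling from the thread.
infix operator +? : AdditionPrecedence

extension FixedWidthInteger {
    /// nil on overflow, instead of trapping (`+`) or wrapping (`&+`).
    static func +? (lhs: Self, rhs: Self) -> Self? {
        let (sum, overflow) = lhs.addingReportingOverflow(rhs)
        return overflow ? nil : sum
    }
}

extension FloatingPoint {
    /// nil when the IEEE result is NaN (assumed meaning of "nil on FP NaN").
    static func +? (lhs: Self, rhs: Self) -> Self? {
        let sum = lhs + rhs
        return sum.isNaN ? nil : sum
    }
}

let big = Int8.max
print(big &+ 1)              // -128 (wraparound)
print((big +? 1) == nil)     // true

let inf = Double.infinity
print((inf +? -inf) == nil)  // true, because inf + (-inf) is NaN
```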

Works for me (although I'd prefer it if we could stick to one side for the "modifier" symbols -- either "&+" and "?+", or "+&" and "+?", and likewise for "==" and its variants)

At a glance this looks like a reasonable solution to me as well.

Should `Numeric` have extensions that define the variants in terms of `==`, so that authors of custom types don't have to think about it if they don't want to?

Probably not. In this design `==` is allowed to have a precondition while the variants are not.

Jon_Hull · October 27, 2017, 6:09am

Now you are just being rude. We all want Swift to be awesome… let’s try to keep things civil.

Sorry if my reply came across that way! That wasn't at all the intention. I really mean to ask you those questions and am interested in the answers:

Thank you for saying that. I haven’t been sleeping well, so I am probably a bit grumpy.

Unless I misunderstand, you're arguing that your proposal is superior to Rust's design because of a new operator that returns `Bool?` instead of `Bool`; if so, how is it that you haven't reproduced Rust's design problem, only with the additional syntax involved in unwrapping the result?

Two things:

1) PartialEq was available in generic contexts and it provided the IEEE comparison. Our IEEE comparison (which I am calling ‘&==‘ for now) is not available in generic contexts beyond FloatingPoint. If we were to have this in a generic context beyond FloatingPoint, then we would end up with the same issue that Rust had.

What I'm saying is that we *must* have this available in generic contexts beyond FloatingPoint, such as on Numeric, for reasons I've described and which I'll elaborate on shortly.

I disagree pretty strongly with this.

I get that that is your point of view, but I really don’t think it is possible to have everything here at the same time. Nothing prevents you from adding this conformance in your own code (though I wouldn’t recommend it).

2) It is actually semantically different. This MostlyEquatable protocol returns nil when the guarantees of the relation would be violated… and the author has to decide what to do with that. Depending on the use case, the best course of action may be to: treat it as false, trap, throw, or branch. Swift coders are used to this type of decision when encountering optionals.
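To make the four options concrete, here is a small sketch. `MostlyEquatable` and `==?` are the hypothetical names used in this thread; the helper functions and the `RelationFailed` error are made up for illustration. The `Double` conformance assumes the "nil only when both sides are NaN" rule.

```swift
infix operator ==? : ComparisonPrecedence

// Hypothetical protocol from the thread, reduced to its one requirement.
protocol MostlyEquatable {
    static func ==? (lhs: Self, rhs: Self) -> Bool?
}

extension Double: MostlyEquatable {
    static func ==? (lhs: Double, rhs: Double) -> Bool? {
        if lhs.isNaN && rhs.isNaN { return nil }  // reflexivity would fail here
        return lhs == rhs
    }
}

struct RelationFailed: Error {}  // illustrative error type

// 1. Treat nil as false (what IEEE `==` effectively does today).
func equalOrFalse<T: MostlyEquatable>(_ a: T, _ b: T) -> Bool {
    return (a ==? b) ?? false
}

// 2. Trap when the relation does not hold.
func equalOrTrap<T: MostlyEquatable>(_ a: T, _ b: T) -> Bool {
    return (a ==? b)!
}

// 3. Throw, leaving the decision to the caller.
func equalOrThrow<T: MostlyEquatable>(_ a: T, _ b: T) throws -> Bool {
    guard let result = a ==? b else { throw RelationFailed() }
    return result
}

// 4. Branch on all three outcomes.
func describe<T: MostlyEquatable>(_ a: T, _ b: T) -> String {
    switch a ==? b {
    case .some(true): return "equal"
    case .some(false): return "different"
    case .none: return "not comparable"
    }
}

print(describe(1.0, 1.0))                // equal
print(describe(Double.nan, Double.nan))  // not comparable
```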

And if, as I understand, your argument is that your design is superior to Rust's *because* it requires unwrapping, then isn't the extent to which people will avoid using the protocol unintentionally also equally and unavoidably the same extent to which it makes Numeric more cumbersome?

It isn’t that unwrapping is meant to be a deterrent, it is that there are cases where the Equivalence relation may fail to hold, and the programmer needs to deal with those (when working in a generic context). Failure to do so leads to subtle bugs.

Numeric has to use ‘==?’ because there are cases where the relation will fail. I’d love for it to conform to Equatable, but it really doesn’t if you look at it honestly, because it can run into cases where reflexivity doesn’t hold, and we have to deal with those cases.

Well, it's another thing entirely if you want Numeric not to be Equatable (or, by that token, Comparable). Yes, it'd be correct, but that'd be a surprising and user-hostile design.

Yes, that is what I am saying. Numeric can’t actually conform to Equatable (without lying), so let’s be up front about it. It does, however, conform to this new idea of MostlyEquatable, so we can use that for our generic needs. MostlyEquatable semantically provides everything Equatable does… but with the extra possibility that the relation may not hold (it actually gives you additional information). Everything that is possible with Equatable is also possible with MostlyEquatable (just not with the same number of machine instructions).

Everything I have said here applies to Comparable as well, and I have a similar solution in mind that I didn’t want to clutter the discussion with.

I also want to point out that you still have full speed in both Equatable contexts and in FloatingPoint contexts. It is just in generic code that mixes the two that we have some inefficiency because of the differing guarantees. This is true of generic code in general.

As I said above, the typical ways to handle that nil would be: treat it as false, trap, throw, or branch. The current behavior is equivalent to "treat it as false”, and yes, that is the right thing for some algorithms (and you can still do that). But there are also lots of algorithms that need to trap or throw on NaN, or branch to handle it differently. The current behavior also silently fails, which is why the bugs are so hard to track down.

That is inherent to the IEEE definition of "quiet NaN": the operations specified in that standard are required to silently accept NaN.

Premature optimization is the root of all evil.

You said it was impossible, so I gave you a very quick example showing that the current behavior was still possible. I wasn’t recommending that everyone should only ever use that example for all things.

For FloatingPoint, ‘(a &== b) == true’ would mimic the current behavior (bugs and all). It may not hold for all types.

Oops, that should be ‘==?’ (which returns an optional). I am getting tired, it is time for bed.

No, the question was how it would be possible to have these guarantees hold for `Numeric`, not merely for `FloatingPoint`, as the purpose is to use `Numeric` for generic algorithms. This requires additional semantic guarantees on what you propose to call `&==`.

Well, they hold for FloatingPoint and anything which is actually Equatable. Those are the only things I can think of that conform to Numeric right now, but I can’t guarantee that someone won’t later add a type to Numeric which also fails to actually conform to Equatable in some different way.

To be fair, anything that breaks this would also break current algorithms on Numeric anyway.

This doesn't answer my question. If `(a ==? b) == true` is the only way to spell what's currently spelled `==` in a generic context, then `Numeric` must make such semantic guarantees as are necessary to guarantee that this spelling behaves in that way for all conforming types, or else it would not be possible to write generic numeric algorithms that operate on any `Numeric`-conforming type. What would those guarantees have to be?

You don’t have those guarantees now.

‘(a ==? b) == true’ is one possible way to get the current behavior for FloatingPoint. It should hold for all FloatingPoint. It should hold for all Numeric things which are FloatingPoint or Integer (or anything Equatable). But if someone comes up with a new exotic type *which doesn’t conform properly to Equatable*, then all bets are off. But then it would also break current code assuming the current IEEE behavior…

But let’s say you have an algorithm you are certain is free of NaNs (maybe you filter them at an earlier stage). Well then you could say '(a ==? b)!’. An easy argument could also be made for allowing ‘a ==! b’ so you don’t have to wrap/unwrap.

or you might use 'guard let’ to have an early exit when NaN == NaN is discovered.

There are also other ways to get the current behavior. For example, you could cast to FloatingPoint and use '&==‘ directly.

The whole point is that you have to put thought into how you want to deal with the optional case where the relation’s guarantees have failed.

If you need full performance, then you would have separate overrides on Numeric for members which conform to FloatingPoint (where you could use &==) and Equatable (where you could use ==). As you get more generic, you lose opportunities for optimization. That is just the nature of generic code. The nice thing about Swift is that you have an opportunity to specialize if you want to optimize more. Once things like conditional conformances come online, all of this will be nicer, of course.
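One way the "generic version plus optimized override" idea could be sketched is with overloaded generic functions. `MostlyEquatable`, `==?`, and `count(of:in:)` are hypothetical names. The second overload is picked at compile time because its constraints are a superset of the first; it is not dynamic dispatch.

```swift
infix operator ==? : ComparisonPrecedence

// Hypothetical protocol from the thread.
protocol MostlyEquatable {
    static func ==? (lhs: Self, rhs: Self) -> Bool?
}

extension Int: MostlyEquatable {
    static func ==? (lhs: Int, rhs: Int) -> Bool? { return lhs == rhs }
}

extension Double: MostlyEquatable {
    static func ==? (lhs: Double, rhs: Double) -> Bool? {
        if lhs.isNaN && rhs.isNaN { return nil }
        return lhs == rhs
    }
}

/// Generic version: works for any conforming type and handles nil explicitly.
func count<T: MostlyEquatable>(of target: T, in values: [T]) -> Int {
    return values.filter { ($0 ==? target) == true }.count
}

/// More constrained overload, selected statically for floating-point elements.
/// It uses the plain IEEE comparison and skips the optional entirely.
func count<T: MostlyEquatable & FloatingPoint>(of target: T, in values: [T]) -> Int {
    return values.filter { $0 == target }.count
}

print(count(of: 7, in: [7, 8, 7, 7]))      // 3
print(count(of: 2.0, in: [1.0, 2.0, 2.0])) // 2
```

Both overloads give the same answers here; only the cost of each comparison differs.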

This is a non-starter then. Protocols must enable useful generic code. What you're basically saying is that you do not intend for it to be possible to use methods on `Numeric` to ask about level 1 equivalence in a way that would not be prohibitively expensive. This, again, eviscerates the purpose of `Numeric`.

I don’t consider it “prohibitively expensive”. I mean, dictionaries return an optional. Lots of things return optionals. I have to deal with them all over the place in Swift code.

I think having the tradeoff of having quicker to write code vs more performant code is completely reasonable. Ideally everything would happen instantly, but we really can’t get away from making *some* tradeoffs here.

If I just need something that works, I can use ==? and handle the nil cases. If unwrapping an optional is untenable from a speed perspective in a particular case for some reason, then I think it is completely reasonable to have the author additionally write optimized versions specializing based on additional information which is known (e.g. FloatingPoint or Equatable).

No, it's not the cost of unwrapping the result, it's the cost of computing the result, which is much higher than the single machine instruction that is IEEE floating-point equivalence. The point of `Numeric` is to make it possible to write generic algorithms that do meaningful math with either integer or floating-point types. If the only way to write such an algorithm with reasonable performance is to specialize one version for integers and another for floating-point values, then `Numeric` serves no purpose as a protocol.

Well, the naive implementation of ==? for floats would be:

  static func ==? (lhs: Self, rhs: Self) -> Bool? {
    // nil only when both sides are NaN, i.e. where reflexivity would fail
    if lhs.isNaN && rhs.isNaN {return nil}
    return lhs &== rhs
  }

But we might very easily be able to play compiler tricks to speed that up in certain cases. For example, we could have some underscored subtype of Float or a compiler annotation for when the compiler can reason it won’t be NaN (e.g. constants or floats created from literals). In those cases, it could just use the machine version directly. At the very least, comparing against literals should be able to retain single-instruction status. The programmer shouldn’t have to worry about that though.

I don’t think it is reasonable to expect a single machine instruction in all generic contexts. Faster is better, but the nature of generic code is that you have to accept some inefficiency in exchange for being able to write code once across multiple types with varying guarantees. My main point was that much/all of the efficiency can be reclaimed where needed by doing extra programming work.

Also, even with ==? instead of ==, Numeric is far from useless. For example, we can generically create math formulas using +, -, and *. In fact, if Numeric’s only utility were ==, we would spell it Equatable.

Finally, once features from the generics manifesto come online, it might be possible to regain Equatable conformance in some cases and not others. So, for example, you would be able to write == against a literal, but would still have to use ==? when both sides could be NaN. That is for the future though...

Note that I am mostly talking about library code here. Once you build up a library of functions on Numeric that handle this correctly, you can use those functions as building blocks, and you aren’t even worrying about == for the most part. For example, if we build a version of index(of:) on collection which works for our MostlyEquatable protocol, then we can pass Numeric to it generically. Whether they decided it was important enough to put in an optimization for FloatingPoint or not, it doesn’t affect the way we call it. It could even have only a generic version for years, and then gain an optimization later if it became important.
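A rough sketch of such a building block follows. `MostlyEquatable`, `==?`, and `firstIndex(mostlyEqualTo:)` are hypothetical names; this version picks the "treat nil as no match" policy, which is only one of the possible choices.

```swift
infix operator ==? : ComparisonPrecedence

// Hypothetical protocol from the thread.
protocol MostlyEquatable {
    static func ==? (lhs: Self, rhs: Self) -> Bool?
}

extension Double: MostlyEquatable {
    static func ==? (lhs: Double, rhs: Double) -> Bool? {
        if lhs.isNaN && rhs.isNaN { return nil }
        return lhs == rhs
    }
}

extension Collection where Element: MostlyEquatable {
    /// Index of the first element that definitely equals `target`.
    /// Comparisons that yield nil are treated as "no match".
    func firstIndex(mostlyEqualTo target: Element) -> Index? {
        return firstIndex(where: { ($0 ==? target) == true })
    }
}

let samples: [Double] = [1.0, .nan, 3.0]
print(samples.firstIndex(mostlyEqualTo: 3.0) == 2)     // true
print(samples.firstIndex(mostlyEqualTo: .nan) == nil)  // true
```

Callers use the same method whether or not a faster floating-point path is added later.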

You cannot do this for most collection algorithms, because they are mostly protocol extension methods that can be shadowed but not overridden. But again, that's not what I'm talking about. I'm talking about writing _generic numeric algorithms_, not using numeric types with generic collection algorithms.

Well, for something like index(of:) it would actually be using FloatingPoint’s notion of '==?’. Working with Numeric would just fall out for free.

As for writing generic numeric algorithms, my point was that you can use the building blocks of other algorithms written for Numeric. But nothing is stopping you writing code on Numeric which does everything it does now (just using ==? and handling the possibility of nil). That may not always get you code which boils down to a single machine instruction, but that is true of generic code in general. If performance is critical, then you have the option to optimize on top of the generic version.

As I said above, there are also things the compiler can do here in the generic case, so I don’t think the situation is as dire as you say.

The point I'm making here, again, is that there are legitimate uses for `==` guaranteeing partial equivalence in the generic context. The approximation being put forward over and over is that generic code always requires full equivalence and concrete floating-point code always requires IEEE partial equivalence. That is _not true_. Some generic code (for instance, that which uses `Numeric`) relies on partial equivalence semantics and some floating-point code can nonetheless benefit from a notion of full equivalence.

I mean, it would be nice if Float could truly conform to Equatable, but it would also be nice if I didn’t have to check for null pointers. It would certainly be faster if instead of unwrapping optionals, I could just use pointers directly. It would even work most of the time… because I would be careful to remember to add checks where they were really important… until I forget, and then there is a bug! This kind of premature optimization has cost our economy literally trillions of dollars.

We have optionals for exactly this reason in Swift. It forces us to take those things which will "work fine most of the time”, and consider the case where it won’t. I know it is slightly faster not to consider that case, but that is exactly why this is a notorious source of bugs.

You write as though it's a foregone conclusion that Float cannot conform to Equatable. I disagree. My starting point is that Float *can*--and in fact *must*--conform to Equatable; the question I'm asking is, how must Equatable be designed such that this can be possible?

Equatable conformance (and Equivalence Relations in general) require Reflexivity. IEEE is not Reflexive. QED.

Reflexivity is actually a really important guarantee for writing generic code. Removing it as a guarantee would cripple Equatable. You couldn’t write index(of:). You couldn’t write contains(). You couldn’t write Dictionary. Hashing in general would break.
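The failure of reflexivity is easy to observe with today's standard library. This snippet uses only existing API; note that `index(of:)` from this thread was later renamed `firstIndex(of:)`.

```swift
let nan = Double.nan

// IEEE equality is not reflexive: NaN is not equal to itself.
print(nan == nan)                        // false

// Generic algorithms built on Equatable inherit the problem:
// the array plainly holds the value, yet it cannot be found.
print([nan].contains(nan))               // false
print([nan].firstIndex(of: nan) == nil)  // true
```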

The closest thing to your starting point is the MostlyEquatable protocol I have described. That provides the relation, but also allows for it to fail to hold. We are talking FloatingPoint here, but I honestly think it would be useful to a host of more complex types as well, which don’t quite fit into Equatable. We should also keep our current notion of Equatable around as well, so types which actually meet it (e.g. Int) don’t have to worry about a case which will never happen.


On Oct 26, 2017, at 8:16 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
On Thu, Oct 26, 2017 at 4:34 PM, Jonathan Hull <jhull@gbis.com> wrote:

On Oct 26, 2017, at 11:47 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
On Thu, Oct 26, 2017 at 1:30 PM, Jonathan Hull <jhull@gbis.com> wrote:

Both concepts must be exposed in a protocol-based manner to accommodate all use cases. It will not do to say that exposing both concepts will confuse the user, because the fact remains that both concepts are already and unavoidably exposed, but sometimes without a way to express the distinction in code or any documentation about it. Disappearing the notion of partial equivalence from protocols removes legitimate use cases.

On the contrary, I am saying we should make the difference explicit.

On Oct 26, 2017, at 11:01 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Thu, Oct 26, 2017 at 11:50 AM, Jonathan Hull <jhull@gbis.com> wrote:

On Oct 26, 2017, at 9:40 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Thu, Oct 26, 2017 at 11:38 AM, Jonathan Hull <jhull@gbis.com> wrote:

On Oct 26, 2017, at 9:34 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Thu, Oct 26, 2017 at 10:57 AM, Jonathan Hull <jhull@gbis.com> wrote:

On Oct 26, 2017, at 8:19 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Thu, Oct 26, 2017 at 07:52 Jonathan Hull <jhull@gbis.com> wrote:

On Oct 25, 2017, at 11:22 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Wed, Oct 25, 2017 at 11:46 PM, Jonathan Hull <jhull@gbis.com> wrote:
As someone mentioned earlier, we are trying to square a circle here. We can’t have everything at once… we will have to prioritize. I feel like the precedent in Swift is to prioritize safety/correctness with an option to ignore safety and regain speed.

I think the 3 point solution I proposed is a good compromise that follows that precedent. It does mean that there is, by default, a small performance hit for floats in generic contexts, but in exchange for that, we get increased correctness and safety. This is the exact same tradeoff that Swift makes for optionals! Any speed lost can be regained by providing a specific override for FloatingPoint that uses ‘&==‘.

My point is not about performance. My point is that `Numeric.==` must continue to have IEEE floating-point semantics for floating-point types and integer semantics for integer types, or else existing uses of `Numeric.==` will break without any way to fix them. The whole point of *having* `Numeric` is to permit such generic algorithms to be written. But since `Numeric.==` *is* `Equatable.==`, we have a large constraint on how the semantics of `==` can be changed.

It would also conform to the new protocol and have its Equatable conformance deprecated. Once we have conditional conformances, we can add Equatable back conditionally. Also, while we are waiting for that, Numeric can provide overrides of important methods when the conforming type is Equatable or FloatingPoint.

For example, if someone wants to write a generic function that works both on Integer and FloatingPoint, then they would have to use the new protocol which would force them to correctly handle cases involving NaN.

What "new protocol" are you referring to, and what do you mean about "correctly handling cases involving NaN"? The existing API of `Numeric` makes it possible to write generic algorithms that accommodate both integer and floating-point types--yes, even if the value is NaN. If you change the definition of `==` or `<`, currently correct generic algorithms that use `Numeric` will start to _incorrectly_ handle NaN.

#1 from my previous email (shown again here):

Currently, I think we should do 3 things:

1) Create a new protocol with a partial equivalence relation with signature of (T, T)->Bool? and automatically conform Equatable things to it
2) Deprecate Float, etc’s… Equatable conformance with a warning that it will eventually be removed (and conform Float, etc… to the partial equivalence protocol)
3) Provide an '&==‘ relation on Float, etc… (without a protocol) with the native Float IEEE comparison
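The three points could be sketched roughly as below. The protocol name `MostlyEquatable` is borrowed from elsewhere in this thread, and the operators are hypothetical. Swift cannot retroactively conform every Equatable type, so this sketch approximates point 1 with a default implementation that a type opts into by declaring conformance. Point 2 (deprecating the Equatable conformance of Float and friends) cannot be expressed in user code and is not shown.

```swift
infix operator ==? : ComparisonPrecedence
infix operator &== : ComparisonPrecedence

// (1) A partial equivalence relation with signature (T, T) -> Bool?.
protocol MostlyEquatable {
    static func ==? (lhs: Self, rhs: Self) -> Bool?
}

// Equatable types get `==?` for free; it never returns nil for them.
extension MostlyEquatable where Self: Equatable {
    static func ==? (lhs: Self, rhs: Self) -> Bool? { return lhs == rhs }
}

extension Int: MostlyEquatable {}
extension String: MostlyEquatable {}

extension Double: MostlyEquatable {
    // Floating-point supplies its own `==?`, which can fail.
    static func ==? (lhs: Double, rhs: Double) -> Bool? {
        if lhs.isNaN && rhs.isNaN { return nil }
        return lhs == rhs
    }

    // (3) The native IEEE comparison, offered outside any protocol.
    static func &== (lhs: Double, rhs: Double) -> Bool {
        return lhs == rhs
    }
}

let nan = Double.nan
print((nan ==? nan) == nil)  // true
print(nan &== nan)           // false
```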

In this case, #2 would also apply to Numeric. You can think of the new protocol as a failable version of Equatable, so in any case where it can’t meet Equatable’s rules, it returns nil.

Again, Numeric makes possible the generic use of == with floating-point semantics for floating-point values and integer semantics for integer values; this design would not.

Correct. I view this as a good thing, because another way of saying that is: “it makes possible cases where == sometimes conforms to the rules of Equatable and sometimes doesn’t." Under the solution I am advocating, Numeric would instead allow generic use of '==?’.

I suppose an argument could be made that we should extend ‘&==‘ to Numeric from FloatingPoint, but then we would end up with the Rust situation you were talking about earlier…

This would break any `Numeric` algorithms that currently use `==` correctly. There are useful guarantees that are common to integer `==` and IEEE floating-point `==`; namely, they each model equivalence of their respective types at roughly what IEEE calls "level 1" (as numbers, rather than as their representation or encoding). Breaking that utterly eviscerates `Numeric`.

Nope. They would continue to work as they always have, but would have a deprecation warning on them. The authors of those algorithms would have a full deprecation cycle to update the algorithms. Fixits would be provided to make conversion easier.

After the deprecation cycle, Numeric would no longer guarantee a common "level 1" comparison for conforming types.

It would, using ==?, you would just be forced to deal with the possibility of the Equality relation not holding. '(a ==? b) == true' would mimic the current behavior.

What are the semantic guarantees required of `==?` such that this would be guaranteed to be the current behavior? How would this be implementable without being so costly that, in practice, no generic numeric algorithms would ever use such a facility?

Moreover, if `(a ==? b) == true` guarantees the current behavior for all types, and all currently Equatable types will conform to this protocol, haven't you just reproduced the problem seen in Rust's `PartialEq`, only now with clumsier syntax and poorer performance?

Is it the _purpose_ of this design to make it clumsier and less performant so people don't use it? If so, to the extent that it is an effective deterrent, haven't you created a deterrent to the use of Numeric to an exactly equal extent?

Jon_Hull · October 27, 2017, 6:30am

One completely different idea, which I brought up a year or so ago, is to do what we do with pointers around this. That is you have your fast/unsafe IEEE Floats/Doubles/etc that have a scarier name. These do not conform to Equatable or Comparable, but have their own version of IEEE equality/comparison. Let’s spell it &== and &< to make it feel different so the users consider the possibility of NaN. They don’t have any notion of hashability.

Then you have your safe/friendly Swift Floating point type(s) which just have no concept of NaN at all (and probably a single notion of zero). You have a failable initializer from the IEEE versions. These types conform to Equatable/Hashable/Comparable. Care is taken with internal methods so that NaN can’t creep into the type.

How do we handle math functions which might fail? We do the same thing we do in the rest of Swift... those functions return an optional.
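A minimal sketch of what such a NaN-free type might look like. The name `SafeDouble` and its API are invented for illustration; they are not an existing or proposed type. The failable initializer is the only way in, so NaN cannot enter, and operations that could produce NaN return an optional.

```swift
/// Hypothetical NaN-free wrapper; name and API are illustrative only.
struct SafeDouble: Hashable, Comparable {
    let value: Double

    /// Failable conversion from the IEEE type: rejects NaN and
    /// folds -0.0 into 0.0 so there is a single notion of zero.
    init?(_ ieee: Double) {
        guard !ieee.isNaN else { return nil }
        value = (ieee == 0) ? 0 : ieee
    }

    static func < (lhs: SafeDouble, rhs: SafeDouble) -> Bool {
        return lhs.value < rhs.value
    }

    /// Math that might fail returns an optional instead of NaN.
    func squareRoot() -> SafeDouble? {
        return SafeDouble(value.squareRoot())
    }

    static func / (lhs: SafeDouble, rhs: SafeDouble) -> SafeDouble? {
        return SafeDouble(lhs.value / rhs.value)
    }
}

if let four = SafeDouble(4), let root = four.squareRoot() {
    print(root.value)  // 2.0
}
print(SafeDouble(.nan) == nil)                // true
print(SafeDouble(-1)?.squareRoot() == nil)    // true
```

Because NaN can never be stored, the synthesized `==` is reflexive and the type can honestly be Equatable, Hashable, and Comparable.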

When reading in data from the outside world or C code, you would use the IEEE versions and then either convert or do your calculations directly. They would probably also be used for things like Accelerate. But most code, where the values come from user input or literals, would never even have to touch the IEEE version.

The advantage here is that you get full speed all the time, even in generic contexts. You just can’t use the IEEE versions directly in generic contexts. You would have to convert them, which is a one-time cost (or use them non-generically).

Thanks,
Jon

xwu · October 27, 2017, 6:24am

Now you are just being rude. We all want Swift to be awesome… let’s try
to keep things civil.

Sorry if my reply came across that way! That wasn't at all the intention.
I really mean to ask you those questions and am interested in the answers:

Thank you for saying that. I haven’t been sleeping well, so I am probably
a bit grumpy.

Unless I misunderstand, you're arguing that your proposal is superior to
Rust's design because of a new operator that returns `Bool?` instead of
`Bool`; if so, how is it that you haven't reproduced Rust's design problem,
only with the additional syntax involved in unwrapping the result?

Two things:

1) PartialEq was available in generic contexts and it provided the IEEE
comparison. Our IEEE comparison (which I am calling ‘&==‘ for now) is not
available in generic contexts beyond FloatingPoint. If we were to have this
in a generic context beyond FloatingPoint, then we would end up with the
same issue that Rust had.

What I'm saying is that we *must* have this available in generic contexts
beyond FloatingPoint, such as on Numeric, for reasons I've described and
which I'll elaborate on shortly.

I disagree pretty strongly with this.

I get that that is your point of view, but I really don’t think it is
possible to have everything here at the same time. Nothing prevents you
from adding this conformance in your own code (though I wouldn’t recommend
it).

2) It is actually semantically different. This MostlyEquatable protocol
returns nil when the guarantees of the relation would be violated… and the
author has to decide what to do with that. Depending on the use case, the
best course of action may be to: treat it as false, trap, throw, or
branch. Swift coders are used to this type of decision when encountering
optionals.

And if, as I understand, your argument is that your design is superior to
Rust's *because* it requires unwrapping, then isn't the extent to which
people will avoid using the protocol unintentionally also equally and
unavoidably the same extent to which it makes Numeric more cumbersome?

It isn’t that unwrapping is meant to be a deterrent, it is that there are
cases where the Equivalence relation may fail to hold, and the programmer
needs to deal with those (when working in a generic context). Failure to
do so leads to subtle bugs.

Numeric has to use ‘==?’ because there are cases where the relation will
fail. I’d love for it to conform to Equatable, but it really doesn’t if you
look at it honestly, because it can run into cases where reflexivity
doesn’t hold, and we have to deal with those cases.

Well, it's another thing entirely if you want Numeric not to be Equatable
(or, by that token, Comparable). Yes, it'd be correct, but that'd be a
surprising and user-hostile design.

Yes, that is what I am saying. Numeric can’t actually conform to Equatable
(without lying), so let’s be up front about it. It does, however, conform
to this new idea of MostlyEquatable, so we can use that for our generic
needs. MostlyEquatable semantically provides everything Equatable does…
but with the extra possibility that the relation may not hold (it actually
gives you additional information). Everything that is possible with
Equatable is also possible with MostlyEquatable (just not with the same
number of machine instructions).

Everything I have said here applies to Comparable as well, and I have a
similar solution in mind that I didn’t want to clutter the discussion with.

I also want to point out that you still have full speed in both Equatable
contexts and in FloatingPoint contexts. It is just in generic code that
mixes the two that we have some inefficiency because of the differing
guarantees. This is true of generic code in general.

As I said above, the typical ways to handle that nil would be: treat it as
false, trap, throw, or branch. The current behavior is equivalent to
"treat it as false”, and yes, that is the right thing for some algorithms
(and you can still do that). But there are also lots of algorithms that
need to trap or throw on NaN, or branch to handle it differently. The
current behavior also silently fails, which is why the bugs are so hard to
track down.

That is inherent to the IEEE definition of "quiet NaN": the operations
specified in that standard are required to silently accept NaN.

Premature optimization is the root of all evil.

You said it was impossible, so I gave you a very quick example showing
that the current behavior was still possible. I wasn’t recommending that
everyone should only ever use that example for all things.

For FloatingPoint, ‘(a &== b) == true’ would mimic the current behavior
(bugs and all). It may not hold for all types.

Oops, that should be ‘==?’ (which returns an optional). I am getting
tired, it is time for bed.

No, the question was how it would be possible to have these guarantees
hold for `Numeric`, not merely for `FloatingPoint`, as the purpose is to
use `Numeric` for generic algorithms. This requires additional semantic
guarantees on what you propose to call `&==`.

Well, they hold for FloatingPoint and anything which is actually
Equatable. Those are the only things I can think of that conform to Numeric
right now, but I can’t guarantee that someone won’t later add a type to
Numeric which also fails to actually conform to equatable in some different
way.

To be fair, anything that breaks this would also break current algorithms
on Numeric anyway.

This doesn't answer my question. If `(a ==? b) == true` is the only way to
spell what's currently spelled `==` in a generic context, then `Numeric`
must make such semantic guarantees as are necessary to guarantee that this
spelling behaves in that way for all conforming types, or else it would not
be possible to write generic numeric algorithms that operate on any
`Numeric`-conforming type. What would those guarantees have to be?

You don’t have those guarantees now.

‘(a ==? b) == true’ is one possible way to get the current behavior for
FloatingPoint. It should hold for all FloatingPoint. It should hold for
all Numeric things which are FloatingPoint or Integer (or anything
Equatable). But if someone comes up with a new exotic type *which doesn’t
conform properly to Equatable*, then all bets are off. But then it would
also break current code assuming the current IEEE behavior…

But let’s say you have an algorithm you are certain is free of NaNs (maybe
you filter them at an earlier stage). Well then you could say '(a ==?
b)!’. An easy argument could also be made for allowing ‘a ==! b’ so you
don’t have to wrap/unwrap.

or you might use 'guard let’ to have an early exit when NaN == NaN is
discovered.

There are also other ways to get the current behavior. For example, you
could cast to FloatingPoint and use '&==‘ directly.

The whole point is that you have to put thought into how you want to deal
with the optional case where the relation’s guarantees have failed.

If you need full performance, then you would have separate overrides on
Numeric for members which conform to FloatingPoint (where you could use
&==) and Equatable (where you could use ==). As you get more generic, you
lose opportunities for optimization. That is just the nature of generic
code. The nice thing about Swift is that you have an opportunity to
specialize if you want to optimize more. Once things like conditional
conformances come online, all of this will be nicer, of course.

This is a non-starter then. Protocols must enable useful generic code.
What you're basically saying is that you do not intend for it to be
possible to use methods on `Numeric` to ask about level 1 equivalence in a
way that would not be prohibitively expensive. This, again, eviscerates the
purpose of `Numeric`.

I don’t consider it “prohibitively expensive”. I mean, dictionaries
return an optional. Lots of things return optionals. I have to deal with
them all over the place in Swift code.

I think having the tradeoff of having quicker to write code vs more
performant code is completely reasonable. Ideally everything would happen
instantly, but we really can’t get away from making *some* tradeoffs here.

If I just need something that works, I can use ==? and handle the nil
cases. If unwrapping an optional is untenable from a speed perspective in
a particular case for some reason, then I think it is completely reasonable
to have the author additionally write optimized versions specializing based
on additional information which is known (e.g. FloatingPoint or Equatable).

No, it's not the cost of unwrapping the result, it's the cost of computing
the result, which is much higher than the single machine instruction that
is IEEE floating-point equivalence. The point of `Numeric` is to make it
possible to write generic algorithms that do meaningful math with either
integer or floating-point types. If the only way to write such an algorithm
with reasonable performance is to specialize one version for integers and
another for floating-point values, then `Numeric` serves no purpose as a
protocol.

Well, the naive implementation of ==? for floats would be:

    static func ==? (lhs: Self, rhs: Self) -> Bool? {
        // nil signals that the relation cannot be honored (both operands are NaN)
        if lhs.isNaN && rhs.isNaN { return nil }
        return lhs &== rhs
    }

But we might very easily be able to play compiler tricks to speed that up
in certain cases. For example, we could have some underscored subtype of
Float or compiler annotation when the compiler can reason it won’t be NaN
(e.g. constants or floats created from literals). In those cases, it could
just use the machine version directly. At the very least, comparing against
literals should be able to retain single-instruction status. The
programmer shouldn’t have to worry about that though.

I don’t think it is reasonable to expect a single machine instruction in
all generic contexts. Faster is better, but the nature of generic code is
that you have to accept some inefficiency in exchange for being able to
write code once across multiple types with varying guarantees. My main
point was that much/all of the efficiency can be reclaimed where needed by
doing extra programming work.

Also, even with ==? instead of ==, Numeric is far from useless. For
example, we can generically create math formulas using +,-, and *. In
fact, if Numeric’s only utility was ==, we would spell it Equatable.

Finally, once features from the generics manifesto come online, it might
be possible to regain Equatable conformance in some cases and not others.
So, for example, you would be able to write == against a literal, but would
still have to use ==? when both sides could be NaN. That is for the future
though...

Note that I am mostly talking about library code here. Once you build up
a library of functions on Numeric that handle this correctly, you can use
those functions as building blocks, and you aren’t even worrying about ==
for the most part. For example, if we build a version of index(of:) on
collection which works for our MostlyEquatable protocol, then we can pass
Numeric to it generically. Whether they decided it was important enough to
put in an optimization for FloatingPoint or not, it doesn’t affect the way
we call it. It could even have only a generic version for years, and then
gain an optimization later if it became important.

You cannot do this for most collection algorithms, because they are mostly
protocol extension methods that can be shadowed but not overridden. But
again, that's not what I'm talking about. I'm talking about writing
_generic numeric algorithms_, not using numeric types with generic
collection algorithms.
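The "shadowed but not overridden" point can be seen in a small sketch. The method name `occurrences(of:)` and the function `genericOccurrences` are invented for illustration; the NaN-aware variant is just one possible floating-point-specific behavior, not anything proposed in this thread.

```swift
extension Sequence where Element: Equatable {
    // Generic version: relies on `==`, so NaN never matches anything.
    func occurrences(of target: Element) -> Int {
        var count = 0
        for element in self where element == target {
            count += 1
        }
        return count
    }
}

extension Sequence where Element: FloatingPoint {
    // Shadowing version (illustrative): additionally counts NaN as matching NaN.
    func occurrences(of target: Element) -> Int {
        var count = 0
        for element in self where element == target || (element.isNaN && target.isNaN) {
            count += 1
        }
        return count
    }
}

// A generic caller that only knows `Element: Equatable`.
func genericOccurrences<S: Sequence>(in sequence: S, of target: S.Element) -> Int
    where S.Element: Equatable {
    // Statically bound to the Equatable version, even when S.Element is Double.
    return sequence.occurrences(of: target)
}

let values: [Double] = [1.0, .nan, .nan]
let direct = values.occurrences(of: Double.nan)                  // FloatingPoint version
let viaGeneric = genericOccurrences(in: values, of: Double.nan)  // Equatable version
```

Called directly on `[Double]`, the more constrained extension is picked; called through the generic function, the `Equatable` version runs, because extension methods that are not protocol requirements are dispatched statically.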

Well, for something like index(of:) it would actually be using
FloatingPoint’s notion of '==?’. Working with Numeric would just fall out
for free.

As for writing generic numeric algorithms, my point was that you can use
the building blocks of other algorithms written for Numeric. But nothing is
stopping you writing code on Numeric which does everything it does now
(just using ==? and handling the possibility of nil). That may not always
get you code which boils down to a single machine instruction, but that is
true of generic code in general. If performance is critical, then you have
the option to optimize on top of the generic version.

As I said above, there are also things the compiler can do here in the
generic case, so I don’t think the situation is as dire as you say.

The point I'm making here, again, is that there are legitimate uses for
`==` guaranteeing partial equivalence in the generic context. The
approximation being put forward over and over is that generic code always
requires full equivalence and concrete floating-point code always requires
IEEE partial equivalence. That is _not true_. Some generic code (for
instance, that which uses `Numeric`) relies on partial equivalence
semantics and some floating-point code can nonetheless benefit from a
notion of full equivalence.

I mean, it would be nice if Float could truly conform to Equatable, but
it would also be nice if I didn’t have to check for null pointers. It
would certainly be faster if instead of unwrapping optionals, I could just
use pointers directly. It would even work most of the time… because I
would be careful to remember to add checks where they were really
important… until I forget, and then there is a bug! This kind of premature
optimization has cost our economy literally trillions of dollars.

We have optionals for exactly this reason in Swift. It forces us to take
those things which will "work fine most of the time”, and consider the case
where it won’t. I know it is slightly faster not to consider that case,
but that is exactly why this is a notorious source of bugs.

You write as though it's a foregone conclusion that Float cannot conform
to Equatable. I disagree. My starting point is that Float *can*--and in
fact *must*--conform to Equatable; the question I'm asking is, how must
Equatable be designed such that this can be possible?

Equatable conformance (and Equivalence Relations in general) require
Reflexivity. IEEE is not Reflexive. QED.
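For readers who want to see the non-reflexivity in today's Swift, here is a short sketch using only standard library calls (`firstIndex(of:)` is the current name of `index(of:)`). The variable names are illustrative.

```swift
let x = Double.nan

// IEEE equality is not reflexive: NaN compares unequal to itself.
let reflexive = (x == x)

// Algorithms built on `==` therefore cannot find NaN by value.
let samples: [Double] = [1.0, .nan, 3.0]
let foundByValue = samples.contains(Double.nan)
let indexByValue = samples.firstIndex(of: Double.nan)

// Asking explicitly with a predicate still works.
let foundByPredicate = samples.contains { $0.isNaN }
```

Here `reflexive` and `foundByValue` are both `false`, `indexByValue` is `nil`, and only the predicate-based search reports that a NaN is present.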

Without replying yet to the remainder of this response, as a matter of
defining what it is we're debating, what you state is both true and does
not preclude Float conforming to Equatable.

Yes, an equivalence relation requires reflexivity. Yes, Equatable
conformance should guarantee an equivalence relation. But, as I stated in
my initial message, one question to be answered is: "(A) Must
`Equatable.==` be a full equivalence relation?" Note the part about `==`.
That much is not settled. My take is: no, the equivalence relation
guaranteed by conformance to Equatable does not need to be spelled `==`.

···

On Fri, Oct 27, 2017 at 1:09 AM, Jonathan Hull <jhull@gbis.com> wrote:

On Oct 26, 2017, at 8:16 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
On Thu, Oct 26, 2017 at 4:34 PM, Jonathan Hull <jhull@gbis.com> wrote:

On Oct 26, 2017, at 11:47 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
On Thu, Oct 26, 2017 at 1:30 PM, Jonathan Hull <jhull@gbis.com> wrote:

Reflexivity is actually a really important guarantee to write generic
code. Removing it as a guarantee would cripple Equatable. You couldn’t
write index(of:). You couldn’t write contains(). You couldn’t write
Dictionary. Hashing, in general, would break.

The closest thing to your starting point is the MostlyEquatable protocol I
have described. That provides the relation, but also allows for it to fail
to hold. We are talking FloatingPoint here, but I honestly think it would
be useful to a host of more complex types as well, which don’t quite fit
into Equatable. We should also keep our current notion of Equatable around
as well, so types which actually meet it (e.g. Int) don’t have to worry
about a case which will never happen.

Both concepts must be exposed in a protocol-based manner to accommodate
all use cases. It will not do to say that exposing both concepts will
confuse the user, because the fact remains that both concepts are already
and unavoidably exposed, but sometimes without a way to express the
distinction in code or any documentation about it. Disappearing the notion
of partial equivalence from protocols removes legitimate use cases.

On the contrary, I am saying we should make the difference explicit.

On Oct 26, 2017, at 11:01 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Thu, Oct 26, 2017 at 11:50 AM, Jonathan Hull <jhull@gbis.com> wrote:

On Oct 26, 2017, at 9:40 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Thu, Oct 26, 2017 at 11:38 AM, Jonathan Hull <jhull@gbis.com> wrote:

On Oct 26, 2017, at 9:34 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Thu, Oct 26, 2017 at 10:57 AM, Jonathan Hull <jhull@gbis.com> wrote:

On Oct 26, 2017, at 8:19 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Thu, Oct 26, 2017 at 07:52 Jonathan Hull <jhull@gbis.com> wrote:

On Oct 25, 2017, at 11:22 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Wed, Oct 25, 2017 at 11:46 PM, Jonathan Hull <jhull@gbis.com> wrote:

As someone mentioned earlier, we are trying to square a circle
here. We can’t have everything at once… we will have to prioritize. I feel
like the precedent in Swift is to prioritize safety/correctness with an
option to ignore safety and regain speed.

I think the 3 point solution I proposed is a good compromise that
follows that precedent. It does mean that there is, by default, a small
performance hit for floats in generic contexts, but in exchange for that,
we get increased correctness and safety. This is the exact same tradeoff
that Swift makes for optionals! Any speed lost can be regained by
providing a specific override for FloatingPoint that uses ‘&==‘.

My point is not about performance. My point is that `Numeric.==`
must continue to have IEEE floating-point semantics for floating-point
types and integer semantics for integer types, or else existing uses of
`Numeric.==` will break without any way to fix them. The whole point of
*having* `Numeric` is to permit such generic algorithms to be written. But
since `Numeric.==` *is* `Equatable.==`, we have a large constraint on how
the semantics of `==` can be changed.
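As a small illustration of the kind of generic code at stake, here is a sketch of an algorithm written once against `Numeric` that relies on `==` having integer semantics for integers and IEEE semantics for floating point. The function name `countZeros` is invented for this example.

```swift
// Counts elements equal to zero, using whatever `==` means for T.
func countZeros<T: Numeric>(_ values: [T]) -> Int {
    var count = 0
    for value in values where value == 0 {
        count += 1
    }
    return count
}

let intValues: [Int] = [0, 1, 0, 2]
let doubleValues: [Double] = [0.0, -0.0, .nan, 1.5]

let intZeros = countZeros(intValues)
// IEEE semantics: -0.0 == 0.0 is true, and NaN == 0 is false.
let doubleZeros = countZeros(doubleValues)
```

Both calls return 2. The floating-point result depends on `==` comparing values as numbers (so negative zero counts) rather than comparing their encodings.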

It would also conform to the new protocol and have its Equatable
conformance deprecated. Once we have conditional conformances, we can add
Equatable back conditionally. Also, while we are waiting for that, Numeric
can provide overrides of important methods when the conforming type is
Equatable or FloatingPoint.

For example, if someone wants to write a generic function that works
both on Integer and FloatingPoint, then they would have to use the new
protocol which would force them to correctly handle cases involving NaN.

What "new protocol" are you referring to, and what do you mean about
"correctly handling cases involving NaN"? The existing API of `Numeric`
makes it possible to write generic algorithms that accommodate both integer
and floating-point types--yes, even if the value is NaN. If you change the
definition of `==` or `<`, currently correct generic algorithms that use
`Numeric` will start to _incorrectly_ handle NaN.

#1 from my previous email (shown again here):

Currently, I think we should do 3 things:

1) Create a new protocol with a partial equivalence relation with
signature of (T, T)->Bool? and automatically conform Equatable things to it
2) Deprecate the Equatable conformance of Float, etc., with a warning
that it will eventually be removed (and conform Float, etc. to the partial
equivalence protocol)
3) Provide an '&==‘ relation on Float, etc… (without a protocol)
with the native Float IEEE comparison
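A rough sketch of how these three pieces could fit together follows. The protocol name `MostlyEquatable` and the operators `==?` and `&==` are the spellings used in this thread; none of them exist in the standard library. In this sketch `Double` necessarily remains `Equatable` (the deprecation in step 2 cannot be expressed in user code), and `relation` and `matches` are helper names invented for illustration.

```swift
// Hypothetical operators from this thread; not part of the standard library.
infix operator ==? : ComparisonPrecedence
infix operator &== : ComparisonPrecedence

// 1) A partial equivalence relation: nil means "the relation does not hold here".
protocol MostlyEquatable {
    static func ==? (lhs: Self, rhs: Self) -> Bool?
}

// Equatable types get a default for free; for them the relation always holds.
extension MostlyEquatable where Self: Equatable {
    static func ==? (lhs: Self, rhs: Self) -> Bool? {
        return lhs == rhs
    }
}
extension Int: MostlyEquatable {}

// 3) The native IEEE comparison under a separate spelling.
extension Double {
    static func &== (lhs: Double, rhs: Double) -> Bool {
        return lhs == rhs
    }
}

// 2) Double conforms to the partial protocol with its own implementation.
extension Double: MostlyEquatable {
    static func ==? (lhs: Double, rhs: Double) -> Bool? {
        if lhs.isNaN && rhs.isNaN { return nil }
        return lhs &== rhs
    }
}

// Generic code sees only the optional result and must decide what nil means.
func relation<T: MostlyEquatable>(_ a: T, _ b: T) -> Bool? {
    return a ==? b
}

// One possible policy: "treat nil as false", which mimics today's behavior.
func matches<T: MostlyEquatable>(_ a: T, _ b: T) -> Bool {
    return relation(a, b) ?? false
}
```

With these definitions, `relation(3, 3)` yields `true`, `relation(Double.nan, Double.nan)` yields `nil`, and `matches(Double.nan, Double.nan)` yields `false`.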

In this case, #2 would also apply to Numeric. You can think of the
new protocol as a failable version of Equatable, so in any case where it
can’t meet equatable’s rules, it returns nil.

Again, Numeric makes possible the generic use of == with
floating-point semantics for floating-point values and integer semantics
for integer values; this design would not.

Correct. I view this as a good thing, because another way of saying
that is: “it makes possible cases where == sometimes conforms to the rules
of Equatable and sometimes doesn’t." Under the solution I am advocating,
Numeric would instead allow generic use of '==?’.

I suppose an argument could be made that we should extend ‘&==‘ to
Numeric from FloatingPoint, but then we would end up with the Rust
situation you were talking about earlier…

This would break any `Numeric` algorithms that currently use `==`
correctly. There are useful guarantees that are common to integer `==` and
IEEE floating-point `==`; namely, they each model equivalence of their
respective types at roughly what IEEE calls "level 1" (as numbers, rather
than as their representation or encoding). Breaking that utterly
eviscerates `Numeric`.

Nope. They would continue to work as they always have, but would have
a deprecation warning on them. The authors of those algorithms would have
a full deprecation cycle to update the algorithms. Fixits would be
provided to make conversion easier.

After the deprecation cycle, Numeric would no longer guarantee a
common "level 1" comparison for conforming types.

It would, using ==?, you would just be forced to deal with the
possibility of the Equality relation not holding. '(a ==? b) == true'
would mimic the current behavior.

What are the semantic guarantees required of `==?` such that this would
be guaranteed to be the current behavior? How would this be implementable
without being so costly that, in practice, no generic numeric algorithms
would ever use such a facility?

Moreover, if `(a ==? b) == true` guarantees the current behavior for all
types, and all currently Equatable types will conform to this protocol,
haven't you just reproduced the problem seen in Rust's `PartialEq`, only
now with clumsier syntax and poorer performance?

Is it the _purpose_ of this design to make it clumsier and less
performant so people don't use it? If so, to the extent that it is an
effective deterrent, haven't you created a deterrent to the use of Numeric
to an exactly equal extent?

xwu · October 27, 2017, 6:44am

One completely different idea, which I brought up a year or so ago, is to
do what we do with pointers around this. That is you have your fast/unsafe
IEEE Floats/Doubles/etc that have a scarier name. These do not conform to
Equatable or Comparable, but have their own version of IEEE
equality/comparison. Let’s spell it &== and &< to make it feel different so
the users consider the possibility of NaN. They don’t have any notion of
hashability.

As I wrote in my reply to Greg, IEEE equality and comparison is _the_ best
approximation of mathematical equality and comparison suitable for
floating-point types. If another one were superior, then floating-point
experts would have designated that as the standard.

Swift correctly exposes only one concept of equality for floating-point
types. It is and should be IEEE equality. People should be encouraged and
not scared to use it. NaN is and should continue to exist as a concept.
Yes, IEEE-compliant floating point is hard; the only thing harder than
IEEE-compliant floating point is non-IEEE-compliant floating point.

This thread is meant to discuss how to reconcile this scenario with the
semantics of Equatable.

Then you have your safe/friendly Swift Floating point type(s) which just
have no concept of NaN at all (and probably a single notion of zero). You
have a failable initializer from the IEEE versions. These types conform to
Equatable/Hashable/Comparable. Care is taken with internal methods so that
NaN can’t creep into the type.

How do we handle math functions which might fail? We do the same thing we
do in the rest of Swift... those functions return an optional.

When reading in data from the outside world or C code, you would use the
IEEE versions and then either convert or do your calculations directly.
They would probably also be used for things like accelerate. But most
code, where the values come from user input or literals, would never even
have to touch the IEEE version.

The advantage here is that you get full speed all the time, even in
generic contexts. You just can’t use the IEEE versions directly in generic
contexts. You would have to convert them, which is a one-time cost (or use
them non-generically).

Again, generics and protocol-based numerics are important; that's what
Numeric is all about. Any idea that doesn't make this possible is a
non-starter.

···

On Fri, Oct 27, 2017 at 1:30 AM, Jonathan Hull <jhull@gbis.com> wrote:

Jon_Hull · October 27, 2017, 10:06am

One completely different idea, which I brought up a year or so ago, is to do what we do with pointers around this. That is you have your fast/unsafe IEEE Floats/Doubles/etc that have a scarier name. These do not conform to Equatable or Comparable, but have their own version of IEEE equality/comparison. Let’s spell it &== and &< to make it feel different so the users consider the possibility of NaN. They don’t have any notion of hashability.

As I wrote in my reply to Greg, IEEE equality and comparison is _the_ best approximation of mathematical equality and comparison suitable for floating-point types. If another one were superior, then floating-point experts would have designated that as the standard.

We definitely have different world views. I see the handling of NaN as a legacy/compatibility issue due to committee/vendor politics from the 1980’s. I am pretty sure if they could do it over with modern tech, we would just have isNan() and NaN == NaN… or we might just have optionals instead.

Just to play devil’s advocate, there are actually much better and more accurate representations available using the same number of bits. The main issue is that there isn’t common hardware support for them. That is not what I am suggesting here however.

Swift correctly exposes only one concept of equality for floating-point types. It is and should be IEEE equality. People should be encouraged and not scared to use it. NaN is and should continue to exist as a concept. Yes, IEEE-compliant floating point is hard; the only thing harder than IEEE-compliant floating point is non-IEEE-compliant floating point.

What I am suggesting is identical to IEEE in every way except for NaN. It is just an IEEE value that has been filtered so that we can guarantee it isn’t NaN. It still uses all the same hardware instructions. Basically, it semantically converts NaN to nil… and that lets us conform to Equatable/Comparable honestly. It also still technically adheres to IEEE… it just never comes up because we are careful to filter/handle NaN before the user ever has to deal with it.

If you want/need to use NaN for some reason, you still have the IEEE types. What you can’t do is use them generically for ==.

Can you give me an example of where you would want NaN in a generic context (that might also contain Ints), but an optional Float (which had been filtered not to have NaN) wouldn’t meet your needs? Remember, this is a generic context, not one that is special casing Floats.

This thread is meant to discuss how to reconcile this scenario with the semantics of Equatable.

Yes it is… and this is one possible approach to doing it. We make it so Floats just don’t conform to Equatable, but we make a wrapper type that does conform (and can fully meet the guarantees).
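As a minimal sketch of what such a wrapper might look like: the type name `SafeDouble`, its `value` property, and `divided(by:)` are all hypothetical and chosen only for illustration. The sketch covers the failable initializer, the single notion of zero, and one operation that returns an optional; it does not attempt a `Numeric` conformance.

```swift
// Hypothetical NaN-free wrapper; the name and API are illustrative only.
struct SafeDouble: Hashable, Comparable {
    let value: Double

    // Failable initializer from the IEEE type: rejects NaN and
    // normalizes -0.0 to +0.0 so there is a single notion of zero.
    init?(_ ieee: Double) {
        guard !ieee.isNaN else { return nil }
        value = (ieee == 0) ? 0 : ieee
    }

    // Ordering is total here because NaN can never be stored.
    static func < (lhs: SafeDouble, rhs: SafeDouble) -> Bool {
        return lhs.value < rhs.value
    }

    // An operation that could produce NaN returns an optional instead.
    func divided(by other: SafeDouble) -> SafeDouble? {
        return SafeDouble(value / other.value)
    }
}

let zero = SafeDouble(0)!
let one = SafeDouble(1)!
let negativeZero = SafeDouble(-0.0)!

let undefined = zero.divided(by: zero)     // 0/0 is NaN, so this is nil
let rejected = SafeDouble(Double.nan)      // nil
let unique: Set<SafeDouble> = [zero, one, negativeZero]
```

Because NaN can never be stored, the synthesized `==` is reflexive for every value of the type, which is what lets it be used as a `Set` element or `Dictionary` key without surprises.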

Then you have your safe/friendly Swift Floating point type(s) which just have no concept of NaN at all (and probably a single notion of zero). You have a failable initializer from the IEEE versions. These types conform to Equatable/Hashable/Comparable. Care is taken with internal methods so that NaN can’t creep into the type.

How do we handle math functions which might fail? We do the same thing we do in the rest of Swift... those functions return an optional.

When reading in data from the outside world or C code, you would use the IEEE versions and then either convert or do your calculations directly. They would probably also be used for things like accelerate. But most code, where the values come from user input or literals, would never even have to touch the IEEE version.

The advantage here is that you get full speed all the time, even in generic contexts. You just can’t use the IEEE versions directly in generic contexts. You would have to convert them, which is a one-time cost (or use them non-generically).

Again, generics and protocol-based numerics are important; that's what Numeric is all about. Any idea that doesn't make this possible is a non-starter.

Why is everything “impossible” or a “non-starter”?

The wrapper version would adhere to Numeric and would be fully usable in generic contexts. You can wrap the IEEE Floats, and the NaNs get converted to optionals, where you can apply the generic algorithm. If your “generic” algorithm was depending on the bit representation of NaN or the fact that it breaks ==, then I would argue you really don’t want a generic algorithm after all… you have a Float specific algorithm.

Also, in terms of speed, we should find something with the semantics we want (that should be the main focus of our discussion)… and then we can figure out how to tweak it so it goes as fast as we need it to once we know what we are aiming for. Anything else is premature optimization.

···

On Oct 26, 2017, at 11:44 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
On Fri, Oct 27, 2017 at 1:30 AM, Jonathan Hull <jhull@gbis.com <mailto:jhull@gbis.com>> wrote:

Chris_Lattner · October 28, 2017, 4:07pm

I haven’t been following this thread closely, but I agree that solving this by making floating point numbers substantially harder to work with (e.g. by having to explicitly reason about NaN’s everywhere) is the wrong direction to go.

Does the intersection of NaN handling and Equality actually cause problems in practice for people?

-Chris

···

On Oct 26, 2017, at 11:44 PM, Xiaodi Wu via swift-dev <swift-dev@swift.org> wrote:

On Fri, Oct 27, 2017 at 1:30 AM, Jonathan Hull <jhull@gbis.com <mailto:jhull@gbis.com>> wrote:

David_Sweeris · October 27, 2017, 8:54pm

For a sufficiently non-mathematical definition of "logic"...

Logically speaking, NaN == NaN, but mathematically speaking it does not. NaN is, by definition, not a number, so asking if it's equal to itself or anything else is a mathematically meaningless question (same goes for < and >). If you're keeping track of why the answer is NaN, then there's at least a basis for discussing the matter... like in `1.0 == sin(x)/x`, where x is 0... Well, 0/0 is a classic example of an undefined result, but the limit of sin(x)/x as x approaches 0 does equal 1, so in some sense -- AFAIK a very non-rigorous sense -- you'd be not entirely wrong to say that `1 == sin(0)/0` could kinda sorta be considered to be "proximately related to `true`" or something. For `==` to know that, though, .nan would need to carry the relevant closure and its arguments as a payload so that some other bits of logic could know how we got to 0/0, and `==` would need in-hardware calculus to verify, in a timely manner, that the limit both exists and is the same from both sides (or I suppose the `/` op could do it... that'd save on the payload requirements, but you're still stuck doing calculus in hardware). Since that's prohibitively impractical, if `==` has to give a boolean answer, it is much more correct to say "NaN != NaN". One could argue that the semantics of Swift's `==` ought to be closer to "matches" than "equals", though, which I think is what Xiaodi is getting at.

- Dave Sweeris

···

On Oct 27, 2017, at 3:06 AM, Jonathan Hull via swift-dev <swift-dev@swift.org> wrote:

On Oct 26, 2017, at 11:44 PM, Xiaodi Wu <xiaodi.wu@gmail.com <mailto:xiaodi.wu@gmail.com>> wrote:

On Fri, Oct 27, 2017 at 1:30 AM, Jonathan Hull <jhull@gbis.com <mailto:jhull@gbis.com>> wrote:

xwu · October 30, 2017, 2:34am

One completely different idea, which I brought up a year or so ago, is to
do what we do with pointers around this. That is you have your fast/unsafe
IEEE Floats/Doubles/etc that have a scarier name. These do not conform to
Equatable or Comparable, but have their own version of IEEE
equality/comparison. Let’s spell it &== and &< to make it feel different so
the users consider the possibility of NaN. They don’t have any notion of
hashability.

As I wrote in my reply to Greg, IEEE equality and comparison is _the_ best
approximation of mathematical equality and comparison suitable for
floating-point types. If another one were superior, then floating-point
experts would have designated that as the standard.

We definitely have different world views. I see the handling of NaN as a
legacy/compatibility issue due to committee/vendor politics from the
1980’s. I am pretty sure if they could do it over with modern tech, we
would just have isNan() and NaN == NaN… or we might just have optionals
instead.

Just to play devil’s advocate, there are actually much better and more
accurate representations available using the same number of bits. The main
issue is that there isn’t common hardware support for them. That is not
what I am suggesting here however.

Swift correctly exposes only one concept of equality for floating-point
types. It is and should be IEEE equality. People should be encouraged and
not scared to use it. NaN is and should continue to exist as a concept.
Yes, IEEE-compliant floating point is hard; the only thing harder than
IEEE-compliant floating point is non-IEEE-compliant floating point.

What I am suggesting is identical to IEEE in every way except for NaN.

The most recent edition of IEEE 754 is 70 pages long and mentions NaN on 40
of them. What you suggest is not an IEEE-compliant design for floating
point.

It is just an IEEE value that has been filtered so that we can guarantee
it isn’t NaN. It still uses all the same hardware instructions. Basically,
it semantically converts NaN to nil… and that lets us conform to
Equatable/Comparable honestly. It also still technically adheres to IEEE…
it just never comes up because we are careful to filter/handle NaN before
the user ever has to deal with it.

If you want/need to use NaN for some reason, you still have the IEEE
types. What you can’t do is use them generically for ==.

Can you give me an example of where you would want NaN in a generic
context (that might also contain Ints), but an optional Float (which had
been filtered not to have NaN) wouldn’t meet your needs? Remember, this is
a generic context, not one that is special casing Floats.

I'm not sure I understand the question. If you've already filtered out NaN
values, then you wouldn't have any NaN values, so why would you need a
generic algorithm that handles NaN?

This thread is meant to discuss how to reconcile this scenario with the
semantics of Equatable.

Yes it is… and this is one possible approach to doing it. We make it so
Floats just don’t conform to Equatable, but we make a wrapper type that
does conform (and can fully meet the guarantees).

Then you have your safe/friendly Swift floating-point type(s) which just
have no concept of NaN at all (and probably a single notion of zero). You
have a failable initializer from the IEEE versions. These types conform to
Equatable/Hashable/Comparable. Care is taken with internal methods so
that NaN can’t creep into the type.

How do we handle math functions which might fail? We do the same thing
we do in the rest of Swift... those functions return an optional.
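A minimal sketch of the wrapper idea described above, assuming a hypothetical type named `Real` (neither the name nor the `dividing(by:)` method comes from the thread or the standard library). It filters NaN in a failable initializer, collapses the two IEEE zeros into one, and returns an optional from an operation that could otherwise produce NaN:

```swift
// Hypothetical NaN-free wrapper around Double; illustration only.
struct Real: Hashable, Comparable {
    let value: Double

    // Failable initializer from the IEEE type: NaN becomes nil.
    init?(_ ieee: Double) {
        guard !ieee.isNaN else { return nil }
        // Collapse -0.0 and +0.0 into a single zero.
        self.value = (ieee == 0) ? 0.0 : ieee
    }

    // With NaN excluded, `<` on the stored value is a total order.
    static func < (lhs: Real, rhs: Real) -> Bool {
        return lhs.value < rhs.value
    }

    // An operation that can fail (0/0, inf/inf) returns an optional
    // instead of letting NaN enter the type.
    func dividing(by other: Real) -> Real? {
        return Real(value / other.value)
    }
}
```

Because NaN can never be stored, the synthesized `==` is reflexive, which is the guarantee the wrapper is meant to provide.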

When reading in data from the outside world or C code, you would use the
IEEE versions and then either convert or do your calculations directly.
They would probably also be used for things like Accelerate. But most
code, where the values come from user input or literals, would never even
have to touch the IEEE version.

The advantage here is that you get full speed all the time, even in
generic contexts. You just can’t use the IEEE versions directly in generic
contexts. You would have to convert them, which is a one-time cost (or use
them non-generically).

Again, generics and protocol-based numerics are important; that's what
Numeric is all about. Any idea that doesn't make this possible is a
non-starter.

Why is everything “impossible” or a “non-starter”?

To have a focused discussion, we have to define the givens and the task to
be accomplished. Here, the given constraint is that Swift floating point
types are IEEE-compliant and conform to Equatable. The task is to design
Equatable.

···

On Fri, Oct 27, 2017 at 5:06 AM, Jonathan Hull <jhull@gbis.com> wrote:

On Oct 26, 2017, at 11:44 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
On Fri, Oct 27, 2017 at 1:30 AM, Jonathan Hull <jhull@gbis.com> wrote:

The wrapper version would adhere to Numeric and would be fully usable in
generic contexts. You can wrap the IEEE Floats, and the NaNs get converted
to optionals, where you can apply the generic algorithm. If your “generic”
algorithm was depending on the bit representation of NaN or the fact that
it breaks ==, then I would argue you really don’t want a generic algorithm
after all… you have a Float specific algorithm.

Also, in terms of speed, we should find something with the semantics we
want (that should be the main focus of our discussion)… and then we can
figure out how to tweak it so it goes as fast as we need it to once we know
what we are aiming for. Anything else is premature optimization.

Stephen_Canon · October 31, 2017, 4:07pm

[Replying to the thread as a whole]

There have been a bunch of suggestions for variants of `==` that either trap on NaN or return `Bool?`. I think that these suggestions result from people getting tunnel-vision on the idea of “make FloatingPoint equality satisfy desired axioms of Equatable / Comparable”. This is misguided. Our goal is (should be) to make a language usable by developers; satisfying axioms is only useful in as much as they serve that goal.

Trapping or returning `Bool?` does not make it easier to write correct concrete code, and it does not enable writing generic algorithms that operate on Comparable or Equatable. Those are the problems to be solved.

Why do they not help write correct concrete code? The overwhelming majority of cases in which IEEE 754 semantics lead to bugs are due to non-reflexivity of equality, so let’s focus on that. In the cases where this causes a bug, the user has code that looks like this:

  // Programmer fails to consider NaN behavior.
  if a == b {
  }

but the correct implementation would be:

  // Programmer has thought about how to handle NaN here.
  if a == b || (a.isNaN && b.isNaN) {
  }
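For reference, the two snippets above can be made concrete as follows. This is only a sketch of today's IEEE 754 behavior for `Double`; the helper name `nanAwareEqual` is an assumption introduced here, not a standard library API:

```swift
// Current behavior: IEEE equality is not reflexive for NaN.
let nan = Double.nan
let plainEqual = (nan == nan)

// Hypothetical helper spelling out the NaN-aware comparison.
func nanAwareEqual(_ a: Double, _ b: Double) -> Bool {
    // Equal under IEEE rules, or both operands are NaN.
    return a == b || (a.isNaN && b.isNaN)
}
```

Here `plainEqual` is `false`, while `nanAwareEqual(.nan, .nan)` is `true`.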

W.r.t ease of writing correct *concrete* code, the task is to make *this* specific case cleaner and more intuitive. What does this look like under other proposed notions of equality? Suppose we make comparisons with NaN trap:

  // Programmer fails to consider NaN behavior. This now traps if a or b is NaN.
  // That’s somewhat safer, but almost surely not the desired behavior.
  if a == b {
  }

  // Programmer considers NaNs. They now cannot use `==` until they rule out
  // either a or b is NaN. This actually makes the code *more* complicated and
  // less readable. Alternatively, they use `&==` or whatever we call the unsafe
  // comparison and it’s just like what we had before, except now they have a
  // “weird operator”.
  if (!a.isNaN && !b.isNaN && a == b) || (a.isNaN && b.isNaN) {
  }

Now what happens if we return `Bool?`?

  // Programmer fails to consider NaN behavior. Maybe the error when they
  // wrote a == b clues them in that they should. Otherwise they just throw in
  // a `!` and move on. They have the same bug they had before.
  if (a == b)! {
  }

  // Programmer considers NaNs. Unchanged from what we have currently,
  // except that we replace || with ?? (parenthesized, because ?? binds
  // more tightly than ==).
  if (a == b) ?? (a.isNaN && b.isNaN) {
  }
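A sketch of the `Bool?`-returning variant, written as a free function so that it compiles today. The names `maybeEqual` and `carefulEqual` are assumptions for illustration; the thread only proposes an operator:

```swift
// Hypothetical comparison that returns nil when either operand is NaN.
func maybeEqual(_ a: Double, _ b: Double) -> Bool? {
    if a.isNaN || b.isNaN { return nil }
    return a == b
}

// The "programmer considers NaNs" case: fall back to the NaN check
// only when the comparison itself has no answer.
func carefulEqual(_ a: Double, _ b: Double) -> Bool {
    return maybeEqual(a, b) ?? (a.isNaN && b.isNaN)
}
```

Force-unwrapping `maybeEqual(a, b)!` would trap whenever either operand is NaN, which is the "throw in a `!`" case described above.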

If we are going to do the work of introducing another notion of floating-point equality, it should directly solve non-reflexivity of equality *by making equality reflexive*. My preferred approach would be to simply identify all NaNs:

  // Programmer fails to consider NaN behavior. Now their code works!
  if a == b {
  }

  // Programmer thinks about NaNs, realizes they can simplify their existing code:
  if a == b {
  }

What are the downsides of this?

(a) it will sometimes confuse experts who expect IEEE 754 semantics.
(b) any code that uses `a != a` as an idiom for detecting NaNs will be broken.

(b) is by far the bigger risk. It *will* result in some bugs. Hopefully fewer than result from people failing to consider NaNs. The only real risk with (a) is that we get a biennial rant posted to Hacker News about Swift equality being broken, and the response is basically “read the docs, use &== if you want that behavior”.
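To make the trade-off concrete, here is a minimal sketch on a stand-in type. `Number` and its `ieee` property are assumptions introduced for illustration, and `&==` is declared locally because no such operator exists in the standard library:

```swift
// Hypothetical operator carrying IEEE 754 comparison semantics.
infix operator &==: ComparisonPrecedence

// Stand-in for a floating-point type whose `==` identifies all NaNs.
struct Number: Equatable {
    var ieee: Double
    init(_ ieee: Double) { self.ieee = ieee }

    // Reflexive equality: any NaN equals any other NaN.
    static func == (lhs: Number, rhs: Number) -> Bool {
        return lhs.ieee == rhs.ieee || (lhs.ieee.isNaN && rhs.ieee.isNaN)
    }

    // IEEE equality remains available under a separate spelling.
    static func &== (lhs: Number, rhs: Number) -> Bool {
        return lhs.ieee == rhs.ieee
    }
}

let x = Number(.nan)
```

With this definition `x == x` holds and `[x].contains(x)` finds the element, but `x != x` is now `false`, so the `a != a` idiom from downside (b) silently stops detecting NaN; `x.ieee.isNaN` or `!(x &== x)` would be needed instead.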

One specific response:

I see the handling of NaN as a legacy/compatibility issue due to committee/vendor politics from the 1980’s. I am pretty sure if they could do it over with modern tech, we would just have isNan() and NaN == NaN… or we might just have optionals instead.

With the exception of how they interact with non-floating-point types (comparisons, conversions to/from integers and strings), NaNs are just Maybes with fast hardware support. Integers and booleans and strings are outside the scope of IEEE 754, so it was not in the standard’s purview to do anything else for those operations. They are not some exotic legacy thing leftover from the 1980’s; they were quite ahead of their time.

– Steve

David_Sweeris · October 31, 2017, 7:16pm

[Replying to the thread as a whole]

There have been a bunch of suggestions for variants of `==` that either trap on NaN or return `Bool?`. I think that these suggestions result from people getting tunnel-vision on the idea of “make FloatingPoint equality satisfy desired axioms of Equatable / Comparable”. This is misguided. Our goal is (should be) to make a language usable by developers; satisfying axioms is only useful in as much as they serve that goal.

Trapping or returning `Bool?` does not make it easier to write correct concrete code, and it does not enable writing generic algorithms that operate on Comparable or Equatable. Those are the problems to be solved.

Why do they not help write correct concrete code? The overwhelming majority of cases in which IEEE 754 semantics lead to bugs are due to non-reflexivity of equality, so let’s focus on that. In the cases where this causes a bug, the user has code that looks like this:

  // Programmer fails to consider NaN behavior.
  if a == b {
  }

but the correct implementation would be:

  // Programmer has thought about how to handle NaN here.
  if a == b || (a.isNaN && b.isNaN) {
  }

W.r.t ease of writing correct *concrete* code, the task is to make *this* specific case cleaner and more intuitive. What does this look like under other proposed notions of equality? Suppose we make comparisons with NaN trap:

  // Programmer fails to consider NaN behavior. This now traps if a or b is NaN.
  // That’s somewhat safer, but almost surely not the desired behavior.
  if a == b {
  }

  // Programmer considers NaNs. They now cannot use `==` until they rule out
  // either a or b is NaN. This actually makes the code *more* complicated and
  // less readable. Alternatively, they use `&==` or whatever we call the unsafe
  // comparison and it’s just like what we had before, except now they have a
  // “weird operator”.
  if (!a.isNaN && !b.isNaN && a == b) || (a.isNaN && b.isNaN) {
  }

Now what happens if we return `Bool?`?

  // Programmer fails to consider NaN behavior. Maybe the error when they
  // wrote a == b clues them in that they should. Otherwise they just throw in
  // a `!` and move on. They have the same bug they had before.
  if (a == b)! {
  }

Same bug, yes, but at least it'd crash when the nil gets force-unwrapped.

  // Programmer considers NaNs. Unchanged from what we have currently,
  // except that we replace || with ??.
  if (a == b) ?? (a.isNaN && b.isNaN) {
  }

`(a == b) != false` would be more compact (assuming they didn't want to use `&==` for some reason), but, yes, that's still not as simple as `a == b`

If we are going to do the work of introducing another notion of floating-point equality, it should directly solve non-reflexivity of equality *by making equality reflexive*. My preferred approach would be to simply identify all NaNs:

  // Programmer fails to consider NaN behavior. Now their code works!
  if a == b {
  }

  // Programmer thinks about NaNs, realizes they can simplify their existing code:
  if a == b {
  }

What are the downsides of this?

  (a) it will sometimes confuse experts who expect IEEE 754 semantics.
  (b) any code that uses `a != a` as an idiom for detecting NaNs will be broken.

(b) is by far the bigger risk. It *will* result in some bugs. Hopefully fewer than result from people failing to consider NaNs. The only real risk with (a) is that we get a biennial rant posted to Hacker News about Swift equality being broken, and the response is basically “read the docs, use &== if you want that behavior”.

"<whatever> != <whatever>" is a pretty specific pattern... Could we warn or have a fix-it or something on that?

One specific response:

I see the handling of NaN as a legacy/compatibility issue due to committee/vendor politics from the 1980’s. I am pretty sure if they could do it over with modern tech, we would just have isNan() and NaN == NaN… or we might just have optionals instead.

With the exception of how they interact with non-floating-point types (comparisons, conversions to/from integers and strings), NaNs are just Maybes with fast hardware support. Integers and booleans and strings are outside the scope of IEEE 754, so it was not in the standard’s purview to do anything else for those operations. They are not some exotic legacy thing leftover from the 1980’s; they were quite ahead of their time.

Agreed. I wish int/string/etc supported that bit of semantics... it'd make dealing with floats more consistent with other types.

- Dave Sweeris

···

On Oct 31, 2017, at 9:07 AM, Stephen Canon via swift-dev <swift-dev@swift.org> wrote:

Jon_Hull · October 31, 2017, 9:37pm

[Replying to the thread as a whole]

There have been a bunch of suggestions for variants of `==` that either trap on NaN or return `Bool?`. I think that these suggestions result from people getting tunnel-vision on the idea of “make FloatingPoint equality satisfy desired axioms of Equatable / Comparable”. This is misguided. Our goal is (should be) to make a language usable by developers; satisfying axioms is only useful in as much as they serve that goal.

Trapping or returning `Bool?` does not make it easier to write correct concrete code, and it does not enable writing generic algorithms that operate on Comparable or Equatable. Those are the problems to be solved.

Why do they not help write correct concrete code? The overwhelming majority of cases in which IEEE 754 semantics lead to bugs are due to non-reflexivity of equality, so let’s focus on that. In the cases where this causes a bug, the user has code that looks like this:

  // Programmer fails to consider NaN behavior.
  if a == b {
  }

but the correct implementation would be:

  // Programmer has thought about how to handle NaN here.
  if a == b || (a.isNaN && b.isNaN) {
  }

W.r.t ease of writing correct *concrete* code, the task is to make *this* specific case cleaner and more intuitive. What does this look like under other proposed notions of equality? Suppose we make comparisons with NaN trap:

  // Programmer fails to consider NaN behavior. This now traps if a or b is NaN.
  // That’s somewhat safer, but almost surely not the desired behavior.
  if a == b {
  }

  // Programmer considers NaNs. They now cannot use `==` until they rule out
  // either a or b is NaN. This actually makes the code *more* complicated and
  // less readable. Alternatively, they use `&==` or whatever we call the unsafe
  // comparison and it’s just like what we had before, except now they have a
  // “weird operator”.
  if (!a.isNaN && !b.isNaN && a == b) || (a.isNaN && b.isNaN) {
  }

Now what happens if we return `Bool?`?

  // Programmer fails to consider NaN behavior. Maybe the error when they
  // wrote a == b clues them in that they should. Otherwise they just throw in
  // a `!` and move on. They have the same bug they had before.
  if (a == b)! {
  }

  // Programmer considers NaNs. Unchanged from what we have currently,
  // except that we replace || with ??.
  if (a == b) ?? (a.isNaN && b.isNaN) {
  }

I still like both of these better than the first case, since the programmer has to take an action, which means they are forced to deal with the possibility in some way. Yes, that action could be mindless (like adding !), but at least there is an indication it could trap when reading the code.

I like your suggestion of making it reflexive better though...

If we are going to do the work of introducing another notion of floating-point equality, it should directly solve non-reflexivity of equality *by making equality reflexive*. My preferred approach would be to simply identify all NaNs:

  // Programmer fails to consider NaN behavior. Now their code works!
  if a == b {
  }

  // Programmer thinks about NaNs, realizes they can simplify their existing code:
  if a == b {
  }

If you think this is possible/palatable, then this would be pretty ideal.

What are the downsides of this?

(a) it will sometimes confuse experts who expect IEEE 754 semantics.
(b) any code that uses `a != a` as an idiom for detecting NaNs will be broken.

(b) is by far the bigger risk. It *will* result in some bugs. Hopefully fewer than result from people failing to consider NaNs. The only real risk with (a) is that we get a biennial rant posted to Hacker News about Swift equality being broken, and the response is basically “read the docs, use &== if you want that behavior”.

Maybe we can warn on 'a != a' as David suggests? It is definitely a specific pattern that shouldn’t be used anywhere else, so we could write a pretty specific warning.

One specific response:

I see the handling of NaN as a legacy/compatibility issue due to committee/vendor politics from the 1980’s. I am pretty sure if they could do it over with modern tech, we would just have isNan() and NaN == NaN… or we might just have optionals instead.

With the exception of how they interact with non-floating-point types (comparisons, conversions to/from integers and strings), NaNs are just Maybes with fast hardware support. Integers and booleans and strings are outside the scope of IEEE 754, so it was not in the standard’s purview to do anything else for those operations. They are not some exotic legacy thing leftover from the 1980’s; they were quite ahead of their time.

I hope I didn’t offend. For what it is worth, I agree that NaN was ahead of its time, and that NaNs are essentially hardware-supported optionals. My point was that if we were to do it from scratch today, we might just use optionals, since we have those now.

The bit about a political/legacy issue was specifically about NaN != NaN, and not about NaN itself. I seem to remember reading that there was a disagreement over whether to do NaN == NaN with .isNan() or NaN != NaN, and that the latter was chosen because one of the larger vendors wanted compatibility with their existing line, and pushed it through committee. I can’t find the article now, so I could be misremembering.

Thanks,
Jon

···

On Oct 31, 2017, at 9:07 AM, Stephen Canon <scanon@apple.com> wrote:

David_Sweeris · October 31, 2017, 10:56pm

One more thought (and it’s crazy enough that I’m not even sure it’s worth posting): do Swift’s `Equatable` semantics require that `(a == b) != (a != b)` always evaluate to `true`? Because it seems like the arguments for having `.nan == .nan` return `false` would apply for `!=` as well. Without getting into trapping, faulting or returning a `Bool?` / `Maybe` / `Tern`, I can’t think of anything else that’d get a developer’s attention faster than the same value being both not equal and not not equal to itself.

I mean, is that likely to cause any more bugs than having `.nan == .nan` return `true`?

I *think* yes, but I tend to use `.isNaN`, so I’m not sure.

- Dave Sweeris

···

On Oct 31, 2017, at 09:07, Stephen Canon via swift-dev <swift-dev@swift.org> wrote:


xwu · November 1, 2017, 2:34am

[Replying to the thread as a whole]

There have been a bunch of suggestions for variants of `==` that either
trap on NaN or return `Bool?`. I think that these suggestions result from
people getting tunnel-vision on the idea of “make FloatingPoint equality
satisfy desired axioms of Equatable / Comparable”. This is misguided. Our
goal is (should be) to make a language usable by developers; satisfying
axioms is only useful in as much as they serve that goal.

Trapping or returning `Bool?` does not make it easier to write correct
concrete code, and it does not enable writing generic algorithms that
operate on Comparable or Equatable. Those are the problems to be solved.

Why do they not help write correct concrete code? The overwhelming
majority of cases in which IEEE 754 semantics lead to bugs are due to
non-reflexivity of equality, so let’s focus on that. In the cases where
this causes a bug, the user has code that looks like this:

        // Programmer fails to consider NaN behavior.
        if a == b {
        }

but the correct implementation would be:

        // Programmer has thought about how to handle NaN here.
        if a == b || (a.isNaN && b.isNaN) {
        }

W.r.t ease of writing correct *concrete* code, the task is to make *this*
specific case cleaner and more intuitive. What does this look like under
other proposed notions of equality? Suppose we make comparisons with NaN
trap:

        // Programmer fails to consider NaN behavior. This now traps if a or b is NaN.
        // That’s somewhat safer, but almost surely not the desired behavior.
        if a == b {
        }

        // Programmer considers NaNs. They now cannot use `==` until they rule out
        // either a or b is NaN. This actually makes the code *more* complicated and
        // less readable. Alternatively, they use `&==` or whatever we call the unsafe
        // comparison and it’s just like what we had before, except now they have a
        // “weird operator”.
        if (!a.isNaN && !b.isNaN && a == b) || (a.isNaN && b.isNaN) {
        }

Now what happens if we return `Bool?`?

        // Programmer fails to consider NaN behavior. Maybe the error when they
        // wrote a == b clues them in that they should. Otherwise they just throw in
        // a `!` and move on. They have the same bug they had before.
        if (a == b)! {
        }

        // Programmer considers NaNs. Unchanged from what we have currently,
        // except that we replace || with ??.
        if (a == b) ?? (a.isNaN && b.isNaN) {
        }

If we are going to do the work of introducing another notion of
floating-point equality, it should directly solve non-reflexivity of
equality *by making equality reflexive*. My preferred approach would be to
simply identify all NaNs:

        // Programmer fails to consider NaN behavior. Now their code works!
        if a == b {
        }

        // Programmer thinks about NaNs, realizes they can simplify their existing code:
        if a == b {
        }

What are the downsides of this?

        (a) it will sometimes confuse experts who expect IEEE 754 semantics.
        (b) any code that uses `a != a` as an idiom for detecting NaNs will be broken.

(b) is by far the bigger risk. It *will* result in some bugs. Hopefully
fewer than result from people failing to consider NaNs. The only real risk
with (a) is that we get a biennial rant posted to Hacker News about Swift
equality being broken, and the response is basically “read the docs, use
&== if you want that behavior”.

One of my premises for this discussion was that concrete NaN != NaN is
desirable, correct, and an absolute must-have; the question here was how to
write correct *generic* code given that Equatable currently guarantees a ==
a for all a. Do you disagree with the premise?

···

On Tue, Oct 31, 2017 at 11:07 AM, Stephen Canon <scanon@apple.com> wrote:

One specific response:

> I see the handling of NaN as a legacy/compatibility issue due to
committee/vendor politics from the 1980’s. I am pretty sure if they could
do it over with modern tech, we would just have isNan() and NaN == NaN… or
we might just have optionals instead.

With the exception of how they interact with non-floating-point types
(comparisons, conversions to/from integers and strings), NaNs are just
Maybes with fast hardware support. Integers and booleans and strings are
outside the scope of IEEE 754, so it was not in the standard’s purview to
do anything else for those operations. They are not some exotic legacy
thing leftover from the 1980’s; they were quite ahead of their time.

– Steve

Chris_Lattner · November 1, 2017, 5:11am

+100. Swift isn’t the first language to face the problems of floating point, nor is it the first to try to shoehorn it into a framework like Equatable. Despite weird cases involving NaNs, I haven’t seen a significant example of harm that it causes in practice, nor have I seen a proposal that makes the state of the art *better* than it currently is. IMO, better involves reducing existing pain without introducing new pains that are more significant than the old ones.

-Chris

···

On Oct 31, 2017, at 9:07 AM, Stephen Canon via swift-dev <swift-dev@swift.org> wrote:

[Replying to the thread as a whole]

There have been a bunch of suggestions for variants of `==` that either trap on NaN or return `Bool?`. I think that these suggestions result from people getting tunnel-vision on the idea of “make FloatingPoint equality satisfy desired axioms of Equatable / Comparable”. This is misguided. Our goal is (should be) to make a language usable by developers; satisfying axioms is only useful in as much as they serve that goal.

Trapping or returning `Bool?` does not make it easier to write correct concrete code, and it does not enable writing generic algorithms that operate on Comparable or Equatable. Those are the problems to be solved.

xwu · November 1, 2017, 2:26am

[Replying to the thread as a whole]

There have been a bunch of suggestions for variants of `==` that either
trap on NaN or return `Bool?`. I think that these suggestions result from
people getting tunnel-vision on the idea of “make FloatingPoint equality
satisfy desired axioms of Equatable / Comparable”. This is misguided. Our
goal is (should be) to make a language usable by developers; satisfying
axioms is only useful in as much as they serve that goal.

Trapping or returning `Bool?` does not make it easier to write correct
concrete code, and it does not enable writing generic algorithms that
operate on Comparable or Equatable. Those are the problems to be solved.

Why do they not help write correct concrete code? The overwhelming
majority of cases in which IEEE 754 semantics lead to bugs are due to
non-reflexivity of equality, so let’s focus on that. In the cases where
this causes a bug, the user has code that looks like this:

   // Programmer fails to consider NaN behavior.
   if a == b {
   }

but the correct implementation would be:

   // Programmer has thought about how to handle NaN here.
   if a == b || (a.isNaN && b.isNaN) {
   }

W.r.t ease of writing correct *concrete* code, the task is to make *this*
specific case cleaner and more intuitive. What does this look like under
other proposed notions of equality? Suppose we make comparisons with NaN
trap:

   // Programmer fails to consider NaN behavior. This now traps if a or b is NaN.
   // That’s somewhat safer, but almost surely not the desired behavior.
   if a == b {
   }

   // Programmer considers NaNs. They now cannot use `==` until they rule out
   // either a or b is NaN. This actually makes the code *more* complicated and
   // less readable. Alternatively, they use `&==` or whatever we call the unsafe
   // comparison and it’s just like what we had before, except now they have a
   // “weird operator”.
   if (!a.isNaN && !b.isNaN && a == b) || (a.isNaN && b.isNaN) {
   }

Now what happens if we return `Bool?`?

   // Programmer fails to consider NaN behavior. Maybe the error when they
   // wrote a == b clues them in that they should. Otherwise they just throw in
   // a `!` and move on. They have the same bug they had before.
   if (a == b)! {
   }

   // Programmer considers NaNs. Unchanged from what we have currently,
   // except that we replace || with ??.
   if (a == b) ?? (a.isNaN && b.isNaN) {
   }

If we are going to do the work of introducing another notion of
floating-point equality, it should directly solve non-reflexivity of
equality *by making equality reflexive*. My preferred approach would be to
simply identify all NaNs:

   // Programmer fails to consider NaN behavior. Now their code works!
   if a == b {
   }

   // Programmer thinks about NaNs, realizes they can simplify their existing code:
   if a == b {
   }

What are the downsides of this?

   (a) it will sometimes confuse experts who expect IEEE 754 semantics.
   (b) any code that uses `a != a` as an idiom for detecting NaNs will be broken.

(b) is by far the bigger risk. It *will* result in some bugs. Hopefully
fewer than result from people failing to consider NaNs. The only real risk
with (a) is that we get a biennial rant posted to Hacker News about Swift
equality being broken, and the response is basically “read the docs, use
&== if you want that behavior”.

One more thought — and it’s crazy enough that I’m not even sure it’s worth
posting — does Swift’s `Equatable` semantics require that `(a == b) != (a
!= b)` *always* evaluate to `true`?

Yes. `!=` is an extension method that cannot be overridden, guaranteed to
return false if `==` returns true.
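A small sketch of what this means in practice. The type `Token` and its deliberately inconsistent `!=` overload are assumptions made up for illustration; the point is only to show which implementation generic code reaches:

```swift
struct Token: Equatable {
    var raw: Int

    static func == (lhs: Token, rhs: Token) -> Bool {
        return lhs.raw == rhs.raw
    }

    // Deliberately inconsistent overload, for illustration only:
    // it claims every pair of tokens differs.
    static func != (lhs: Token, rhs: Token) -> Bool {
        return true
    }
}

// Generic code only knows `T: Equatable`. Because `!=` is not a protocol
// requirement, this call resolves to the standard library's extension
// method, which is defined as !(a == b).
func differsGenerically<T: Equatable>(_ a: T, _ b: T) -> Bool {
    return a != b
}

let token = Token(raw: 1)
let concreteResult = (token != token)
let genericResult = differsGenerically(token, token)
```

In concrete code the overload on `Token` is chosen, so `concreteResult` is `true`; in the generic function the overload is never consulted, so `genericResult` is `false`.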

···

On Tue, Oct 31, 2017 at 5:56 PM, David Sweeris <davesweeris@mac.com> wrote:

On Oct 31, 2017, at 09:07, Stephen Canon via swift-dev <swift-dev@swift.org> wrote:

Because it seems like the arguments for having `.nan == .nan` return
`false` would apply for `!=` as well. Without getting into trapping,
faulting or returning a `Bool?` / `Maybe` / `Tern`, I can’t think of
anything else that’d get a developer’s attention faster than the same value
being both not equal and not not equal to itself.

I mean, is that likely to cause any more bugs than having `.nan == .nan`
return `true`?

I *think* yes, but I tend to use `.isNaN`, so I’m not sure.

- Dave Sweeris

Ben_Cohen · November 1, 2017, 4:16pm

[Replying to the thread as a whole]

There have been a bunch of suggestions for variants of `==` that either trap on NaN or return `Bool?`. I think that these suggestions result from people getting tunnel-vision on the idea of “make FloatingPoint equality satisfy desired axioms of Equatable / Comparable”. This is misguided. Our goal is (should be) to make a language usable by developers; satisfying axioms is only useful in as much as they serve that goal.

Trapping or returning `Bool?` does not make it easier to write correct concrete code, and it does not enable writing generic algorithms that operate on Comparable or Equatable. Those are the problems to be solved.

+100. Swift isn’t the first language to face the problems of floating point, nor is it the first to try to shoehorn it into a framework like Equatable.

Java and C# do not have this problem with their generic algorithms (albeit possibly because of limitations in their languages that Swift doesn’t have). Swift is setting itself up as a major language with confusing and unjustifiable behavior by comparison. That some other languages are also bad at this doesn’t seem relevant.

Despite weird cases involving NaNs, I haven’t seen a significant example of harm that it causes in practice,

Sorting an array containing NaNs yields an arbitrary ordering, and there are weird edge cases such as `==` behaving differently depending on the identity of the buffer. Do those not seem like problems in practice? Users do encounter these problems; that’s what led to this discussion.
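
To make the sorting concern concrete, here is a small sketch (the helper name `isIncomparable` is hypothetical, not from the thread). It illustrates why NaN breaks the strict weak ordering that sorting relies on: NaN is neither less than nor greater than any value, so it looks "equivalent" to everything, yet the values it is equivalent to are not equivalent to each other.

```swift
// Hypothetical helper: two values are "incomparable" under `<` when
// neither orders before the other. Sorting treats such values as equivalent.
func isIncomparable(_ a: Double, _ b: Double) -> Bool {
    return !(a < b) && !(b < a)
}

let nan = Double.nan

// NaN is incomparable with both 1 and 2 ...
let nanVersusOne = isIncomparable(nan, 1)
let nanVersusTwo = isIncomparable(nan, 2)
// ... but 1 and 2 are clearly ordered, so this "equivalence" is not transitive.
let oneVersusTwo = isIncomparable(1, 2)

// The order produced here is unspecified; only the element count is preserved.
let result = [3, nan, 1, 2].sorted()
```

No expected order is shown for `result` on purpose, since the predicate no longer satisfies the requirements of the sort.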

nor have I seen a proposal that makes the state of the art *better* than it currently is.

I agree trapping and optional bool solutions aren’t good.

Steve’s preferred approach later in the email – that NaN == NaN and that &== be used for IEEE when needed – seems significantly better than the current situation. I hadn’t thought this was on the table, hence the more elaborate suggestions about generic vs concrete contexts. But if this is acceptable in the eyes of at least some FP experts, that’s certainly the best option from my perspective.

···

On Oct 31, 2017, at 10:11 PM, Chris Lattner via swift-dev <swift-dev@swift.org> wrote:
On Oct 31, 2017, at 9:07 AM, Stephen Canon via swift-dev <swift-dev@swift.org> wrote:

IMO, better involves reducing existing pain without introducing new pains that are more significant than the old ones.

-Chris

David_Sweeris · November 1, 2017, 3:23am

Wait, what? So if I have a `Password` type, and want to trigger extra logging if the `!=` function is called too many times within a second or something, that won't get called in generic code? That seems... unintuitive...
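
A minimal sketch of the behavior being asked about, assuming a hypothetical `Password` type and a hypothetical generic function `differs`. The custom `!=` is made deliberately inconsistent with `==` purely so that the two call paths can be told apart.

```swift
// Hypothetical type used only to show which `!=` gets called.
struct Password: Equatable {
    var text: String

    static func == (lhs: Password, rhs: Password) -> Bool {
        return lhs.text == rhs.text
    }

    // Deliberately inconsistent with `==` so the call paths are
    // distinguishable. `!=` is not a requirement of Equatable, so this
    // overload is only picked when the static type is known to be Password.
    static func != (lhs: Password, rhs: Password) -> Bool {
        return true
    }
}

// Generic code only sees `T: Equatable`, so `!=` here resolves to the
// standard library's extension, which is defined in terms of `==`.
func differs<T: Equatable>(_ a: T, _ b: T) -> Bool {
    return a != b
}

let p = Password(text: "hunter2")
let concreteResult = p != p        // uses Password's own `!=`
let genericResult = differs(p, p)  // uses Equatable's `!=`
```

In the concrete call the type-specific `!=` wins overload resolution; in the generic call it is never consulted, which is why any logging placed inside it would not fire there.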

- Dave Sweeris

···

On Oct 31, 2017, at 7:26 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Tue, Oct 31, 2017 at 5:56 PM, David Sweeris <davesweeris@mac.com> wrote:

On Oct 31, 2017, at 09:07, Stephen Canon via swift-dev <swift-dev@swift.org> wrote:

[Replying to the thread as a whole]

There have been a bunch of suggestions for variants of `==` that either trap on NaN or return `Bool?`. I think that these suggestions result from people getting tunnel-vision on the idea of “make FloatingPoint equality satisfy desired axioms of Equatable / Comparable”. This is misguided. Our goal is (should be) to make a language usable by developers; satisfying axioms is only useful in as much as they serve that goal.

Trapping or returning `Bool?` does not make it easier to write correct concrete code, and it does not enable writing generic algorithms that operate on Comparable or Equatable. Those are the problems to be solved.

Why do they not help write correct concrete code? The overwhelming majority of cases in which IEEE 754 semantics lead to bugs are due to non-reflexivity of equality, so let’s focus on that. In the cases where this causes a bug, the user has code that looks like this:

   // Programmer fails to consider NaN behavior.
   if a == b {
   }

but the correct implementation would be:

   // Programmer has thought about how to handle NaN here.
   if a == b || (a.isNaN && b.isNaN) {
   }
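
As a runnable sketch of the two snippets above, the NaN-aware check can be wrapped in a helper (the name `equalOrBothNaN` is hypothetical) and compared with plain `==`:

```swift
// Hypothetical helper wrapping the NaN-aware comparison shown above.
func equalOrBothNaN<T: FloatingPoint>(_ a: T, _ b: T) -> Bool {
    return a == b || (a.isNaN && b.isNaN)
}

let x = Double.nan
let one = 1.0
let ieeeResult = (x == x)                 // IEEE 754: NaN is not equal to itself
let awareResult = equalOrBothNaN(x, x)    // both NaN, so treated as equal
let mixedResult = equalOrBothNaN(x, one)  // only one NaN, so not equal
```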

W.r.t ease of writing correct *concrete* code, the task is to make *this* specific case cleaner and more intuitive. What does this look like under other proposed notions of equality? Suppose we make comparisons with NaN trap:

   // Programmer fails to consider NaN behavior. This now traps if a or b is NaN.
   // That’s somewhat safer, but almost surely not the desired behavior.
   if a == b {
   }

   // Programmer considers NaNs. They now cannot use `==` until they rule out
   // either a or b is NaN. This actually makes the code *more* complicated and
   // less readable. Alternatively, they use `&==` or whatever we call the unsafe
   // comparison and it’s just like what we had before, except now they have a
   // “weird operator”.
   if (!a.isNaN && !b.isNaN && a == b) || (a.isNaN && b.isNaN) {
   }
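
A sketch of the trapping variant, written as ordinary functions because the standard `==` cannot be replaced in an example. Both names (`trappingEquals`, `nanAwareEquals`) are hypothetical; the point is only that the careful caller has to guard against NaN before the comparison can be used at all.

```swift
// Hypothetical trapping comparison: traps if either operand is NaN.
func trappingEquals<T: FloatingPoint>(_ a: T, _ b: T) -> Bool {
    precondition(!a.isNaN && !b.isNaN, "NaN passed to trappingEquals")
    return a == b
}

// NaN-aware caller: must rule out NaN before it may call the trapping
// comparison, which is the extra complexity described above.
func nanAwareEquals<T: FloatingPoint>(_ a: T, _ b: T) -> Bool {
    if a.isNaN || b.isNaN {
        return a.isNaN && b.isNaN
    }
    return trappingEquals(a, b)
}
```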

Now what happens if we return `Bool?`?

   // Programmer fails to consider NaN behavior. Maybe the error when they
   // wrote a == b clues them in that they should. Otherwise they just throw in
   // a `!` and move on. They have the same bug they had before.
   if (a == b)! {
   }

   // Programmer considers NaNs. Unchanged from what we have currently,
   // except that we replace || with ?? (and parenthesize the comparison,
   // because ?? binds more tightly than ==).
   if (a == b) ?? (a.isNaN && b.isNaN) {
   }
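
The `Bool?` variant can be sketched the same way, again as a hypothetical function (`optionalEquals`) rather than a redefinition of `==`. Note that in Swift `??` has higher precedence than `==`, so an operator spelling of this comparison would need parentheses around the comparison; a function call sidesteps that.

```swift
// Hypothetical comparison returning nil when either operand is NaN.
func optionalEquals<T: FloatingPoint>(_ a: T, _ b: T) -> Bool? {
    if a.isNaN || b.isNaN { return nil }
    return a == b
}

let n = Double.nan
let unit = 1.0

// NaN-aware use: the caller supplies the answer for the nil case.
let bothNaN = optionalEquals(n, n) ?? (n.isNaN && n.isNaN)
let oneNaN = optionalEquals(n, unit) ?? (n.isNaN && unit.isNaN)
let noNaN = optionalEquals(unit, unit) ?? false
```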

If we are going to do the work of introducing another notion of floating-point equality, it should directly solve non-reflexivity of equality *by making equality reflexive*. My preferred approach would be to simply identify all NaNs:

   // Programmer fails to consider NaN behavior. Now their code works!
   if a == b {
   }

   // Programmer thinks about NaNs, realizes they can simplify their existing code:
   if a == b {
   }

What are the downsides of this?

   (a) it will sometimes confuse experts who expect IEEE 754 semantics.
   (b) any code that uses `a != a` as an idiom for detecting NaNs will be broken.

(b) is by far the bigger risk. It *will* result in some bugs. Hopefully fewer than result from people failing to consider NaNs. The only real risk with (a) is that we get a biennial rant posted to Hacker News about Swift equality being broken, and the response is basically “read the docs, use `&==` if you want that behavior”.
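
A rough sketch of what "identify all NaNs" could look like, using a hypothetical wrapper type `ReflexiveDouble`, since the behavior of `Double` itself cannot be changed in an example. The `&==` operator is declared here only for illustration, spelled as in the thread; it is not a standard library operator.

```swift
// IEEE 754 comparison operator, spelled as in the thread. Illustrative
// declaration only; no such operator exists in the standard library.
infix operator &== : ComparisonPrecedence

// Hypothetical wrapper whose `==` identifies all NaNs with each other.
struct ReflexiveDouble: Equatable {
    var value: Double
    init(_ value: Double) { self.value = value }

    // Reflexive equality: any NaN equals any other NaN.
    static func == (lhs: ReflexiveDouble, rhs: ReflexiveDouble) -> Bool {
        return lhs.value == rhs.value || (lhs.value.isNaN && rhs.value.isNaN)
    }

    // IEEE 754 equality, for callers who ask for it explicitly.
    static func &== (lhs: ReflexiveDouble, rhs: ReflexiveDouble) -> Bool {
        return lhs.value == rhs.value
    }
}

let r = ReflexiveDouble(.nan)
let reflexive = (r == r)        // equality is now reflexive
let ieee = (r &== r)            // IEEE semantics remain available
let idiomBroken = (r != r)      // downside (b): `a != a` no longer detects NaN
let stillWorks = r.value.isNaN  // `isNaN` remains the reliable check
```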

One more thought — and it’s crazy enough that I’m not even sure it’s worth posting — does Swift’s `Equatable` semantics require that `(a == b) != (a != b)` always evaluate to `true`?

Yes. `!=` is an extension method that cannot be overridden

Greg_Titus · November 1, 2017, 4:51pm

The common (and correct!) wisdom in _any_ programming language that uses IEEE floating point is that checking equality of two floating point values is almost always a terrible idea. Usually what you want in any real world code is to check for a difference less than some epsilon value, which depends upon context. There are just too many issues with values that aren’t exactly representable, rounding errors during computations, et cetera, for perfectly normal floats even if you completely left aside equality rules for NaN.
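
For illustration, a minimal tolerance-based comparison might look like the following. The helper name `approximatelyEqual` and the tolerance value are assumptions for the sake of the example; as noted above, the right epsilon depends on context.

```swift
// Hypothetical helper; the right tolerance depends entirely on context.
func approximatelyEqual(_ a: Double, _ b: Double, tolerance: Double) -> Bool {
    return abs(a - b) <= tolerance
}

let sum: Double = 0.1 + 0.2
let exact = (sum == 0.3)  // false: neither 0.1 nor 0.2 is exactly representable
let close = approximatelyEqual(sum, 0.3, tolerance: 1e-9)
```

A comparison like this is not transitive, so it is not a drop-in replacement for `==` in an `Equatable` conformance either.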

I completely understand the desire in this thread to make floating point really satisfy the axioms of Equatable, but the fact is, even if you did, using a generic algorithm that depends upon equatability with floating point types is almost always just a programming error waiting to happen. It’s implicit in the representation and use of floating point values themselves, no matter what particular implementation you decide on for == or &==.

If you really want to make the language better for developers, provide and emphasize fixed point or infinite precision or rational types for doing various things instead, and encourage them to shun floats as much as possible. If you really need to change anything about the standard library of Swift, my preferred solution would be to continue to provide `==(lhs: Float, rhs: Float)` and `!=` but NOT declare conformance to `Equatable` at all so that generic algorithms involving floats would fail to compile.
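
What that last suggestion would mean for callers can be sketched with a hypothetical wrapper type (`Reading`) that offers `==` without conforming to `Equatable`. Generic conveniences that require `Equatable` become unavailable, and the caller has to spell out the comparison.

```swift
// Hypothetical wrapper that offers `==` but does not conform to Equatable.
struct Reading {
    var value: Double

    static func == (lhs: Reading, rhs: Reading) -> Bool {
        return lhs.value == rhs.value
    }
}

let readings = [Reading(value: 1.0), Reading(value: 2.5)]
let target = Reading(value: 2.5)

// readings.contains(target)  // would not compile: Reading is not Equatable
// The caller has to choose a comparison explicitly instead.
let found = readings.contains { $0 == target }
```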

- Greg


Chris_Lattner · November 3, 2017, 5:39am

C++ has exactly this problem, std::sort on a std::vector<double>. I haven’t seen the world burn down in practice.

-Chris

···

On Nov 1, 2017, at 9:16 AM, Ben Cohen via swift-dev <swift-dev@swift.org> wrote:

On Oct 31, 2017, at 10:11 PM, Chris Lattner via swift-dev <swift-dev@swift.org> wrote:

On Oct 31, 2017, at 9:07 AM, Stephen Canon via swift-dev <swift-dev@swift.org> wrote:

[Replying to the thread as a whole]

There have been a bunch of suggestions for variants of `==` that either trap on NaN or return `Bool?`. I think that these suggestions result from people getting tunnel-vision on the idea of “make FloatingPoint equality satisfy desired axioms of Equatable / Comparable”. This is misguided. Our goal is (should be) to make a language usable by developers; satisfying axioms is only useful in as much as they serve that goal.

Trapping or returning `Bool?` does not make it easier to write correct concrete code, and it does not enable writing generic algorithms that operate on Comparable or Equatable. Those are the problems to be solved.

+100. Swift isn’t the first language to face the problems of floating point, nor is it the first to try to shoehorn it into a framework like Equatable.

Java and C# do not have this problem with their generic algorithms (albeit possibly because of limitations in their languages that Swift doesn’t have). Swift is setting itself up as a major language with confusing and unjustifiable behavior by comparison. That some other languages are also bad at this doesn’t seem relevant.