[pitch] Comparison Reform

As I am thinking about it more, this means that for == and <

NaN == NaN
-0 == +0
+Inf < NaN

Since this would break from IEEE,

Yeah, as Steve mentioned, it's a huge deal to break from IEEE rules. Both
the existing design and Dave's proposed design jump through huge hoops to
preserve the behavior of these operators and reconcile them with the rest
of Swift. The biggest weakness of my alternative idea as written up earlier
is that it cannot preserve the spelling of these operators but requires a
minor adjustment (`&`). It's simply a no-go to move `==` to
`compare(with: other, using: .ieee)`. No one will write numerics algorithms
like that.

···

On Mon, Apr 24, 2017 at 9:06 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:

I think we should also consider taking the opportunity to make == and <
work with a default tolerance. That is, 'a == b' would check that
abs(a - b) < epsilon for some reasonable choice of epsilon that works for
common usage. I know this would make the hash function harder to write,
but I think it is worthwhile.

Then, as I mentioned before, people who need strict IEEE conformance would
explicitly declare that need by using 'compare(with: other, using: .IEEE)'
or whatever we want to call that metric. We can even provide metrics for
different IEEE levels, if desired.

We could also provide a function ‘tolerance(_: Self) -> Comparable.Metric’
on FloatingPoint which returns a comparison metric that defines equality as
abs(a - b) < tolerance, for those who want to specify a specific tolerance
for their use-case/algorithm...
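A rough sketch of what such a `tolerance(_:)` factory could look like, using the `Metric` function type from the design sketched later in this thread (all names here are illustrative, not settled API):

```swift
// Illustrative sketch only: a tolerance-based comparison metric.
// ComparisonResult and Metric follow the design proposed below.
enum ComparisonResult: Int {
    case ascending = -1, equal = 0, descending = 1
}

extension Double {
    typealias Metric = (Double, Double) -> ComparisonResult

    /// Returns a metric that treats values within `epsilon` of each
    /// other as equal, and otherwise orders them normally.
    static func tolerance(_ epsilon: Double) -> Metric {
        return { lhs, rhs in
            if abs(lhs - rhs) < epsilon { return .equal }
            return lhs < rhs ? .ascending : .descending
        }
    }
}

let metric = Double.tolerance(0.001)
assert(metric(1.0, 1.0004) == .equal)
assert(metric(1.0, 2.0) == .ascending)
assert(metric(2.0, 1.0) == .descending)
```

Note that this metric is not transitive, which is exactly the objection Steve raises below.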

Thanks,
Jon

> On Apr 23, 2017, at 7:18 AM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:
>
> There is one more option which hasn’t really been considered:
>
> • == and != are tied to the Equatable protocol, which is essentially the
> == operation.
> • <, <=, >, >= are tied to the Comparable protocol (which is kept the
> same except for minor changes/additions listed below)
> • Hashable still requires Equatable
> • There is a new ComparisonMetric concept which lets an algorithm
> specify exactly how the comparison is done (see below)
>
>
> Tl;dr: There are different definitions of ‘comparison’ which make sense
> in different domains… so let’s make it explicit so it doesn’t surprise
> anyone.
>
> The question then becomes: which metric should be the default (i.e. the
> one defined by ‘<’ and ‘==’)? And the answer is: the one which lets us use
> floats/doubles in dictionaries and sets. People and algorithms which need
> full IEEE correctness can use a different metric which specifically
> guarantees it. They can even build their own metric if needed.
>
>
> ====The Design====
> // (Note: I wrote this code in mail, so it may not compile)
>
>
> //This defines the result of a comparison. It would ideally be nested in
> //the protocol below if that becomes possible.
> enum ComparisonResult : Int {
>     case ascending = -1
>     case equal = 0
>     case descending = 1
> }
>
> protocol Comparable {
>     //Give ourselves an easy way to refer to this function type
>     typealias Metric = (Self, Self) -> ComparisonResult
>
>     var defaultMetric: Metric { get }
>     static func == (lhs: Self, rhs: Self) -> Bool
>     static func < (lhs: Self, rhs: Self) -> Bool
> }
>
> extension Comparable {
>     //Not shown: We would define <=, etc… plus ≤, ≥, and ≠ (because,
>     //hey, it is my proposal)
>
>     func compare(with other: Self, using metric: Metric) -> ComparisonResult {
>         return metric(self, other)
>     }
>
>     func compare(with other: Self) -> ComparisonResult {
>         return defaultMetric(self, other)
>     }
>
>     static func <=> (lhs: Self, rhs: Self) -> Int {
>         return lhs.defaultMetric(lhs, rhs).rawValue
>     }
>
>     var defaultMetric: Metric {
>         return { lhs, rhs in
>             if lhs == rhs {
>                 return .equal
>             } else if lhs < rhs {
>                 return .ascending
>             }
>             return .descending
>         }
>     }
> }
>
> ============
>
> Then for Double, we would make a second metric for IEEE compliance (or
> multiple for different levels)
>
> extension Double : Comparable {
>
>     static func < (lhs: Double, rhs: Double) -> Bool {
>         //define using best for dictionaries / sets / layman understanding
>     }
>
>     static func == (lhs: Double, rhs: Double) -> Bool {
>         //define using best for dictionaries / sets / layman understanding
>     }
>
>     static var IEEEcompare: Metric {
>         //return function here that does full IEEE comparison
>     }
>
> }
>
> Then we can call ‘myDouble.compare(with: otherDouble, using:
.IEEEcompare)’ when needed.
>
>
> Thanks,
> Jon
>
>
>
>> On Apr 22, 2017, at 9:58 PM, Chris Lattner via swift-evolution <swift-evolution@swift.org> wrote:
>>
>> On Apr 22, 2017, at 6:06 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
>>> but my quick reaction to `&==` is that it would make me quite nervous
>>> to have `==` not bound to 754-equals as it is in essentially every other
>>> language. In particular, I worry about the risk of people porting
>>> numerical code that depends on isnan(x) <—> !(x < y) in non-obvious ways
>>> that they are unlikely to test. I’ll try to follow up with more detailed
>>> thoughts tomorrow.
>>>
>>> Indeed, it makes me a little nervous too. That said, `==` being either
>>> bound or not bound to 754 depending on the context is what makes me even
>>> more nervous.
>>>
>>> I was once adamantly against a new spelling for `==`, but on
>>> reconsideration it's clear to me that few if any numerical recipes can be
>>> ported verbatim from C-like languages and we should probably not
>>> encourage people to do so. Already, `+` needs to be rewritten as `&+`,
>>> `<<` probably should be rewritten as `&<<` (I still haven't had enough
>>> time to think about this), and the bitwise operators have differing
>>> precedences that require careful proofreading.
>>
>>
>> I haven’t been following this proposal or discussion closely, but it
>> seems to me that there are a few workable approaches with different
>> tradeoffs:
>>
>> 1. The strictly correct but user-hostile approach:
>>
>> * == and != are tied to the Equatable protocol, which is essentially
>> the == operation.
>> * <, <=, >, >= are tied to the Comparable protocol, which is
>> essentially the <=> operation.
>> * Hashable doesn’t require Equatable; it requires a related
>> StrictlyEquatable protocol.
>> * StrictlyEquatable refines Equatable (with no other requirements; it
>> is just a marker protocol), in which case FP types can’t conform to it
>> and thus can’t participate as dictionary keys.
>>
>> => This approach sucks because you can’t have Set<Float> or
>> Dictionary<Float, String>.
>>
>> 2. The strictly correct but somewhat user-hostile approach:
>>
>> * == and != are tied to the Equatable protocol, which is essentially
>> the == operation.
>> * <, <=, >, >= are tied to the Comparable protocol, which is
>> essentially the <=> operation.
>> * Hashable doesn’t require Equatable; it requires a related
>> StrictlyEquatable protocol.
>> * StrictlyEquatable doesn’t refine Equatable: it has a different
>> requirement, and FP types can therefore implement both Equatable and
>> StrictlyEquatable.
>>
>> => This approach is suboptimal because implementing your own type
>> requires you to implement both the <=> operation and the
>> StrictlyEquatable protocol.
>>
>> 3. The user-friendly but incorrect model:
>>
>> * == and != are tied to the Equatable protocol, which is essentially
>> the == operation.
>> * <, <=, >, >= are tied to the Comparable protocol, which is
>> essentially the <=> operation.
>> * Hashable is defined in terms of Equatable.
>>
>> => This is easy (types just have to define <=>), but fails for FP types.
>>
>>
>> I don’t think that this proposal is acceptable as written. I think it
>> is really bad that abstracting a concrete algorithm would change its
>> behavior so substantially. I don’t care about SNaNs, but I do care about
>> the difference between +0/-0 and secondarily that of NaN handling. It
>> seems really bad that generalizing something like:
>>
>> func doThing(a : Double, b : Double) -> Bool {
>>     ….
>>     return a != b
>> }
>>
>> to:
>>
>> func doThing<T : FloatingPoint>(a : T, b : T) -> Bool {
>>     ….
>>     return a != b
>> }
>>
>> would change behavior (e.g. when a is -0.0 and b is +0.0). Likewise,
>> "T : Equatable".
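For reference, the IEEE-754 facts Chris's `doThing` example relies on are checkable today (a small editorial sketch, not part of any proposal):

```swift
// Current IEEE-754 comparison behavior in Swift, as relied on above.
let zerosEqual = (-0.0) == (+0.0)          // signed zeros compare equal
let nansEqual  = Double.nan == Double.nan  // NaN compares unequal to itself

assert(zerosEqual == true)   // so doThing(-0.0, +0.0) returns false today
assert(nansEqual == false)   // and doThing(.nan, .nan) returns true today
```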
>>
>> -Chris
>>
>>
>> _______________________________________________
>> swift-evolution mailing list
>> swift-evolution@swift.org
>> https://lists.swift.org/mailman/listinfo/swift-evolution
>
> _______________________________________________
> swift-evolution mailing list
> swift-evolution@swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

It’s possible to tinker with IEEE comparisons and maybe improve them or at least work to make them compatible in some way with the expectations of Comparable. Making them have a tolerance by default is out of the question, however: as soon as you add a tolerance, you give up transitivity of equality, which is a much, much worse violation of the axioms than anything inflicted by the presence of NaN in the IEEE 754 model.
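Steve's transitivity point can be made concrete with an arbitrary epsilon (illustrative only; `approxEqual` is a hypothetical stand-in for a tolerant `==`):

```swift
// Why tolerance-by-default breaks transitivity of equality.
func approxEqual(_ a: Double, _ b: Double, epsilon: Double = 0.5) -> Bool {
    return abs(a - b) < epsilon
}

let a = 0.0, b = 0.4, c = 0.8
assert(approxEqual(a, b))    // |0.0 - 0.4| < 0.5
assert(approxEqual(b, c))    // |0.4 - 0.8| < 0.5
assert(!approxEqual(a, c))   // |0.0 - 0.8| >= 0.5: a == b and b == c, but a != c
```

Since Hashable and Dictionary assume that equal values form equivalence classes, a non-transitive `==` cannot back them.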

– Steve

···

On Apr 24, 2017, at 10:06 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:

As I am thinking about it more, this means that for == and <

NaN == NaN
-0 == +0
+Inf < NaN

Since this would break from IEEE, I think we should also consider taking the opportunity to make == and < work with a default tolerance. That is, 'a == b' would check that abs(a - b) < epsilon for some reasonable choice of epsilon that works for common usage. I know this would make the hash function harder to write, but I think it is worthwhile.

Hi all together,
I was just looking quickly over an article from Wolfram which covers new
books about Mathematica, and I tripped over this one:

From the reviews:

"This book is an extraordinary reinvention of computer arithmetic and
elementary numerical methods from the ground up. Unum arithmetic is an
extension of floating point in which it is also possible to represent the
open intervals *between* two floating point numbers. This leads to
arithmetic that is algebraically much cleaner, without rounding error,
overflow, underflow, or negative zero, and with clean and consistent
treatment of positive and negative infinity and NaN. These changes are not
just marginal technical improvements. As the book fully demonstrates, they
lead to what can only be described as a radical re-foundation of elementary
numerical analysis, with new methods that are free of rounding error, fully
parallelizable, fully portable, easier for programmers to master, and often
more economical of memory, bandwidth, and power than comparable floating
point methods. The book is exceptionally well written and produced and is
illustrated on every page with full-color diagrams that perfectly
communicate the material. Anyone interested in computer arithmetic or
numerical methods must read this book. It is surely destined to be a
classic."
—David Jefferson, Center for Advanced Scientific Computing, Lawrence
Livermore National Laboratory

I haven't read it myself (as I said, I just stumbled over it), but *MAYBE*
it covers the NaN problem in depth and the current state of the art on how
to deal with it.
Maybe someone has free access to an online library (maybe via some
university enrollment) and can have a look at it?

- Björn

···

On Sun, Apr 23, 2017 at 4:40 PM, Chris Lattner via swift-evolution <swift-evolution@swift.org> wrote:

> On Apr 22, 2017, at 11:46 PM, David Waite <david@alkaline-solutions.com> wrote:
>
>> On Apr 22, 2017, at 10:58 PM, Chris Lattner via swift-evolution <swift-evolution@swift.org> wrote:
>>
>> I don’t think that this proposal is acceptable as written. I think it
>> is really bad that abstracting a concrete algorithm would change its
>> behavior so substantially. I don’t care about SNaNs, but I do care about
>> the difference between +0/-0 and secondarily that of NaN handling. It
>> seems really bad that generalizing something like:
>>
>> func doThing(a : Double, b : Double) -> Bool {
>>     ….
>>     return a != b
>> }
>>
>> to:
>>
>> func doThing<T : FloatingPoint>(a : T, b : T) -> Bool {
>>     ….
>>     return a != b
>> }
>>
>> would change behavior (e.g. when a is -0.0 and b is +0.0). Likewise,
>> "T : Equatable”.
>
> Did I misunderstand the proposal? If T : FloatingPoint is not included
> in level 1 comparisons, then you cannot have classes of generic floating
> point algorithms.

Sorry about that, my mistake; I meant “T : Comparable”.

-Chris
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

>
>>
>>
>> >
>> >>
>> >> >> > That is to say, I would expect the standard library to supply an
>> >> >> > alternative implementation of equality for Array<T where T :
>> >> >> > >.
>> >> >>
>> >> >> And also for Dictionary? What do you expect to happen when Double is
>> >> >> used as a dictionary key and it happens to be NaN?
>> >> >
>> >> > The proposal is very clear that `Dictionary` and `sort` will always
>> >> > use level 2 comparison.
>> >>
>> >> Yes, but I'm not asking about my own proposal :-). My question is, what
>> >> are *your* expectations for dictionaries or sets, specifically for the
>> >> insertion of NaN keys and for comparison of whole collections?
>> >
>> > My expectations for comparison of collections clearly differ from
>> > yours.
>>
>> Maybe, in that you *have* strong expectations when it comes to FP. I'm
>> still trying to figure out what mine should be.
>>
>> > For me, I think it is a reasonable expectation that, for all `x` and
>> > `y` where `x == y`: `[x] == [y]`, `[[x]] == [[y]]`, `[x, x, x] == [y,
>> > y, y]`, etc.
>>
>> It seems pretty obvious when you put it that way. Here's an interesting
>> question: how much of that is because of the way array literals let you
>> “see through” the value being compared to its elements?
>>
>
> I don't think it's the visual representation. More to do with
> expectations as to what an array is.
>
> If `a == b`, then `thingWrappingA == thingWrappingB`, for values of
> "thing" that are "wrapper-like" (I will blithely leave the term
> undefined) rather than, say, stateful class instances.

wrapper-like probably means "Monad"

>> Also, does the same go for != ? That's telling w.r.t. NaN.
>
> Yes, I think so. I'd have to think harder to be more sure of that.

Well, that's the hard one, so please do. We can easily shift the
proposal so -0.0 == +0.0, but NaNs are harder to account for.

Right, more thought is necessary. It was obviated when I was writing my
proposed design, because NaN != NaN simply traps, and [NaN] != [NaN] would
too.

>> > To put it more strongly, I think that anything short of that is rather
>> > inexplicable.
>>
>> That's somewhat mitigated if we have to accept x != x (i.e. for NaN).
>>
>> > With respect to dictionary keys or sets, it's a subtly different
>> > question, I think. When it comes to questions like "does this
>> > dictionary have this key" or "does this set already have this element,"
>> > I think what the user is really asking about is a closer approximation
>> > to identity than to equality. I get that this gets complicated if we're
>> > talking about sets of class instances, so let me move away from the
>> > word identity and re-phrase.
>> >
>> > If a user asks if a set (or even an array) contains something, it's
>> > not exactly identical to asking if a set contains an element _equal
>> > to_ something. See:
>> >
>> > * "Does the set of cars in your parking lot contain my car?" =>
>> >   `parkingLot.contains(myCar)`
>> > * "Does the set of cars in your parking lot contain a car that is
>> >   equal to my car?" => `parkingLot.contains { $0 == myCar }`
>>
>> I don't think this example is a good illustration of it, but I am
>> familiar with the principle.
>>
>> > Put another way, one can simultaneously hold the thought that (1) a
>> > thing is itself, obviously; but (2) a thing is not necessarily _equal
>> > to_ itself.
>>
>> x === y but x != y
>>
>> ?
>>
>> IMO it may be possible, but not without head explosion. Sure, it's the
>> case for NaN, but that's just because of how IEEE defined it; there is
>> nothing in daily experience that acts that way. IMO it is much easier
>> to understand x == y but x !== y, that is, “these two forks are
>> identical, but they're not the same fork.”
>
> Yet both these scenarios really do happen with stdlib types.
>
> I'm of the strong belief that it's only potentially head-exploding for
> some users because the language tries to paper over scenarios where it
> might happen and doesn't provide a way even to express this idea in code
> ("Binary operator '===' cannot be applied to two 'Int' operands").

I don't think this has anything to do with the language. IMO this is
about what “equal” means. People have an innate sense of equality that
doesn't admit x != x, at least not without some thread atomically
changing x's value behind our backs.

I don't disagree. This is why I proposed to migrate FP comparison operators
to &==, &!=, &<, etc. The result is that `x == x` is either always true or
it traps, and `x != x` is either never true or it traps. We do not insist
that a thing might not be "==" to itself.
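Modeled as plain functions (since `&==` is a proposed spelling, not current Swift), the two behaviors Xiaodi describes look roughly like this sketch:

```swift
// Quiet comparison: today's IEEE behavior, where NaN != NaN silently.
func quietEqual(_ a: Double, _ b: Double) -> Bool {
    return a == b
}

// Trapping comparison: refuses to answer for unordered operands.
func trappingEqual(_ a: Double, _ b: Double) -> Bool {
    precondition(!a.isNaN && !b.isNaN, "unordered comparison")
    return a == b
}

let x = 1.5
assert(quietEqual(x, x))
assert(trappingEqual(x, x))       // `x == x` is always true...
assert(!quietEqual(.nan, .nan))   // ...while quiet comparison returns false
// trappingEqual(.nan, .nan) would trap rather than return false
```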

> If, as I suggest, we put front and center the idea that _some comparisons
> might fail or return nil_, the resulting design is not in itself
> guaranteed to explode heads--I think.

I don't know that it's possible to put together a coherent understanding
of what a == b means in such a world.

>> >> >> >> >> This is a bump in the rug – push it down in one place, it pops
>> >> >> >> >> up in another. I feel like this proposal at least moves the
>> >> >> >> >> bump to where fewer people will trip over it. I think it highly
>> >> >> >> >> likely that the intersection of developers who understand
>> >> >> >> >> enough about floating point to write truly correct concrete
>> >> >> >> >> code, but won’t know about or discover the documented
>> >> >> >> >> difference in generic code, is far smaller than the set of
>> >> >> >> >> people who hit problems with the existing behavior.
>> >> >> >> >
>> >> >> >> > So, to extend this analogy, I'd rather say that the bump is not
>> >> >> >> > in the rug [Comparable] but rather in a section of the floor
>> >> >> >> > [FP NaN]. The rug might overlie the bump, but the bump will
>> >> >> >> > always be there and people will find it as they walk even if
>> >> >> >> > they don't immediately see it.
>> >> >> >>
>> >> >> >> Correct.
>> >> >> >>
>> >> >> >> > If we don't want people to trip over the bump while walking on
>> >> >> >> > the rug, one very good alternative, IMHO, is to shape the rug so
>> >> >> >> > that it doesn't cover the bump.
>> >> >> >>
>> >> >> >> At what cost?
>> >> >> >>
>> >> >> >> More specifically: why is it the right behavior, for our
>> >> >> >> audience, to trap when Equatable comparison happens to encounter
>> >> >> >> NaN? Will this not simply "crash" programs in the field that
>> >> >> >> otherwise would have "just worked?"
>> >> >> >
>> >> >> > No, as I propose it, programs in the field would be automatically
>> >> >> > migrated to an alternative set of comparison operators `&==`,
>> >> >> > `&<`, etc. that would work exactly as `==`, `<`, etc. do today.
>> >> >>
>> >> >> Meaning, for floating point NaN &== NaN is false, and if you want to
>> >> >> write numeric code that accounts for NaN, you use &==.
>> >> >>
>> >> >> OK, so... Is &== a protocol requirement, or a protocol extension, or
>> >> >> neither? If so, to which protocol is it attached?
>> >> >
>> >> > Please allow me to refer you to a Gist:
>> >> > An alternative design for Comparable · GitHub
>> >> >
>> >> > In brief, it would be a protocol requirement on Comparable with a
>> >> > default implementation. The rationale for its being on Comparable is
>> >> > given in the text.
>> >>
>> >> I'm sorry, I've seen your gist; I still can't find this rationale.
>> >>
>> >
>> > "These operators will be defined on Comparable and not on FloatingPoint.
>> > A key rationale for this design is to permit types other than
>> > FloatingPoint, including third-party types, to distinguish between
>> > signaling and quiet comparison of values when some values may be
>> > unordered with respect to others. (Another rationale for this design is
>> > that &< corresponds to what is currently spelled as <, which can be
>> > used as a predicate for Sequence.sorted.)"
>>
>> Thanks.
>>
>> >> > I am not married to its being a requirement vs. an extension, but my
>> >> > initial thought here is that there might be reason to provide an
>> >> > alternative implementation in a conforming type, say for performance
>> >> > reasons on Float.
>> >> >
>> >> >> > I would quibble with the notion that all such generic algorithms
>> >> >> > currently "just work,"
>> >> >>
>> >> >> I never claimed they do! They don't, because Equatable.== for
>> >> >> floating point is not an equivalence relation. That's part of what
>> >> >> we aim to fix.
>> >> >>
>> >> >> You are proposing to fix that same problem a different way, one that
>> >> >> leaves NaNs a bit out-in-the-cold (not necessarily bad), but also
>> >> >> explicitly modifies generic algorithms so they continue to silently
>> >> >> produce unspecified results (bad!)
>> >> >
>> >> > To clarify, no, I would not have the stdlib's generic algorithms
>> >> > continue to produce unspecified results. I propose changes to them
>> >> > which align their behavior with what you and Ben have proposed.
>> >>
>> >> OK, I guess I must've misunderstood your earlier statements.
>> >>
>> >> So this, IMO, is not tenable, at least in its current form:
>> >>
>> >> ,----
>> >> > The default implementation for contains, elementsEqual, split, and
>> >> > starts(with:) on Sequence where Iterator.Element : Equatable, and for
>> >> > index(of:) on Collection where Iterator.Element : Equatable, will
>> >> > (notionally) use the following predicate:
>> >> >
>> >> >     {
>> >> >         ($0 &== $1) || (
>> >> >             ($0 <=> $1) == .none &&
>> >> >             ($0 <=> $0) == .none &&
>> >> >             ($1 <=> $1) == .none)
>> >> >     }
>> >> `----
>> >>
>> >> The first problem here is that I can't figure out what this means and I
>> >> doubt any normal user could either.
>> >
>> > I think it is a little ill-served by the notation. However the concept
>> > is simple. Sometimes, a thing cannot be compared even to itself.
>>
>> As noted above, I disagree that that is a simple concept. Is it even
>> mathematically well-founded? What you are expressing, at least on the
>> surface, *seems* to be different from the mathematical notion of
>> incomparability, which—despite its confusing name—is actually simple
>> <https://en.wikipedia.org/wiki/Comparability>: x is incomparable to y if
>> neither x > y nor x < y. The most common case is when x == y.
>
> Hmm, not an expert, but I don't see that definition. MathWorld
> <http://mathworld.wolfram.com/ComparableElements.html> confirms that the
> Wikipedia formulation is correct:
>
> x is incomparable to y if neither x ***>=*** y nor x ***<=*** y.
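Translated into a predicate over IEEE doubles, the quoted definition reads as follows (an illustrative sketch):

```swift
// x and y are incomparable iff neither x <= y nor y <= x holds.
func incomparable(_ x: Double, _ y: Double) -> Bool {
    return !(x <= y) && !(y <= x)
}

assert(!incomparable(1.0, 2.0))    // ordinary values are comparable
assert(!incomparable(1.0, 1.0))    // equal values are comparable
assert(incomparable(.nan, 42.0))   // NaN is unordered with everything,
assert(incomparable(.nan, .nan))   // including itself
```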

Ah, right. But this is where it gets tricky to interpret, because IIUC
in orderings in general, there is no concept of equality; there's just
"x precedes y or it doesn't." So when these articles use <= or <, it is
a stand-in for "the ordering predicate". In ancient times, I wrote an
article exploring some of this
(http://web.archive.org/web/20120422220137/http://cpp-next.com/archive/2010/02/order-i-say/).
I always go back to that when I need to ground myself, though using your
own article for that has its perils. Please check that out and let me
know if you think I'm off-base, because the notion of incomparability in
that article is central to my understanding (and is, I believe,
consistent with Weak ordering - Wikipedia, FWIW).

IIUC, you're talking about a _strict_ partial order, which is
_irreflexive_. Such a relation is totally silent, however, as to whether x
and y are equivalent if `x !<> y`. As a standalone predicate, it isn't
sufficient to order a set which contains a pair of incomparable values in
the way that you and Ben propose. For instance, suppose I'm trying to sort
an array with 42 and NaN using a generic algorithm that can't just test
`isNaN`. So, to the algorithm, these values are an opaque `a` and `b`.
Since !(a < b) and !(b < a), there's no information I can get from the
predicate to tell me which one is the "problem value" (is it a, or is it b:
which one is NaN?).

In the context of set theory, I'm reading that comparability is defined in
terms of _weak_ partial orders (which are reflexive). I guess I've just
assumed we're using this definition of comparability because IEEE floating
point appears to adopt it. With a _weak_ partial order as a predicate, you
*can* order a set which contains a pair of incomparable values:

* Given: 42 ("a") and NaN ("b").
* First, we note that !(a <= b) && !(b <= a). So, we have a pair of
incomparable values.
* Then, we ask which value is not a member of the partially ordered set on
which the <= relation is defined.
* We find that (a <= a) == true but (b <= b) == false.
* We conclude that b is the "problem value" (NaN) which is not a member of
the partially ordered set.
* We decide to send it to the back of the line.
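The steps above can be sketched as a sort comparator (illustrative only; `sortedUnorderedLast` is a hypothetical name, and the reflexivity test `a <= a` plays the role of "is a member of the partially ordered set"):

```swift
// Use reflexivity of IEEE `<=` to detect unordered values (NaN)
// and send them to the back of the sorted result.
func sortedUnorderedLast(_ values: [Double]) -> [Double] {
    return values.sorted { a, b in
        let aOrdered = a <= a          // false only when a is NaN
        let bOrdered = b <= b
        if aOrdered && !bOrdered { return true }   // ordered before unordered
        if !aOrdered { return false }              // NaN sinks to the back
        return a < b
    }
}

let result = sortedUnorderedLast([42, .nan, 7])
assert(result[0] == 7 && result[1] == 42 && result[2].isNaN)
```

The comparator is a valid strict weak ordering: all NaNs fall into one equivalence class, ordered after every non-NaN value.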

>> In a strict-weak ordering, every element is incomparable with itself. The
>> complexity of your predicate suggests that you mean something much more
>> complicated. Is there any reason not to require that when (x <=> y) ==
>> nil, (x <=> x) == nil? That would simplify things a lot.
>>
>> Also, IMO it is confusing to say, “cannot be compared to itself,” when
>> in practice that means “you can compare it to itself, and you get nil.”
>> This is as much a result of comparison as any other.
>
> Well, x is said to be comparable to x if either x <= x or x >= x.
> Otherwise, x and x are incomparable.
>
> I propose we model this by returning a result of type `Comparison?`, such
> that this scenario produces a nil value rather than a `Comparison` case.
> In that sense, one might say that there is no comparison result.
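A minimal sketch of that shape (the `Comparison` case names here are illustrative, not the proposal's exact spelling):

```swift
// nil models "no comparison result" for unordered operands.
enum Comparison { case less, equal, greater }

func compare(_ x: Double, _ y: Double) -> Comparison? {
    if x < y { return .less }
    if x > y { return .greater }
    if x == y { return .equal }
    return nil   // unordered: at least one operand is NaN
}

assert(compare(1, 2) == .less)
assert(compare(2, 2) == .equal)
assert(compare(.nan, .nan) == nil)
```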
>
>> > The prime example we speak of is NaN != NaN. This is the only major
>> > concept that the design I propose here would require a user to
>> > understand.
>> >
>> > In this notation, if `x` cannot be compared to itself, `x <=> x` is nil.
>> > For `contains` and other methods that are asking "do you have a thing"
>> > more than they are asking "do you have a thing that is equivalent to a
>> > thing," we'll regard all values that can't be compared even to
>> > themselves as "the same"; therefore, if `x <=> x` is nil and `y <=> y`
>> > is nil, then a collection with an element `x` will be regarded as
>> > "containing" `y`.
>> >
>> >> How would this change in semantics be reflected in the documentation
>> >> for these algorithms? How would you describe their requirements and
>> >> results? All these algorithms currently have simple, understandable
>> >> descriptions.
>> >>
>> >
>> > The description would be augmented by an explanation along the lines of
>> > this: "Some types can represent values that do not compare equal to
>> > themselves; one example is floating point NaN ("not a number"). For the
>> > purposes of { contains | split | etc. }, every value not equal to itself
>> > is considered indistinguishable from every other value not equal to
>> > itself."
>> >
>> >> Secondarily, I can't expect any generic algorithm author to understand
>> >> what he's getting with this.
>> >>
>> >
>> > Obviously, we'd have to have real-world data to back it up, but the
>> > beauty of `Comparison?` being an optional enum is that all an
>> > author has to do is handle all the cases. The compiler can even
>> > help with that. In other words, all your generic algorithm author
>> > has to do is to decide *something* for the question, "What should I
>> > do when x <=> y returns nil?"
>>
>> Is that an easy question to answer? It doesn't look that way, to me.
>
> For behaviors such as "always sort NaN greater than everything else" or
> "always sort NaN less than everything else," the way to answer the
> question would be to inspect whether `x` is comparable to `x` and/or `y`
> is comparable to `y`.

That's fine if you know what the behavior should be. I'm not yet
convinced it's easy to answer the question of what the behavior should
be.

>> >> > Any automatically migrated third-party generic code would indeed
>> >> > continue to exhibit the same behavior as in Swift 3--but not only
>> >> > do I not consider that to be a problem, I consider it to be a
>> >> > requirement of source compatibility which is absolutely essential.
>> >>
>> >> Well, we have not promised to preserve unspecified behaviors, so
>> >> technically we can handle this either way. And we all seem to be agreed
>> >> that any code that uses, e.g., the default sort(), *will* change its
>> >> behavior with respect to NaN. So this is really a matter of degree.
>> >>
>> >> > It would not, however, be _invisible_ to the reader of the generic
>> >> > algorithm. The use of my proposed `&==` in a generic context should
>> >> > stand out and prompt re-evaluation. That is to say, by using a
>> >> > different spelling, we would have a visible hint in the code that a
>> >> > generic algorithm may produce unspecified results with NaN.
>> >>
>> >> That's a very nice feature.
>> >>
>> >> >>> but the result is that they would behave exactly as they do today
>> >> >>> and therefore would at least be no more broken.
>> >> >>
>> >> >> If that's all we achieve, we should do nothing.
>> >> >
>> >> > I should hope that it's not all we achieve. But, consider the
>> >> > following two alternatives: migrated code exhibits identical
>> >> > behavior to Swift 3, or migrated code silently exhibits different
>> >> > behavior that is "fixed."
>> >>
>> >> I think we're both agreed that the latter *will* happen with
>> >>
>> >>     sort(floatsContainingNaN)
>> >>
>> >> or
>> >>
>> >>     floatsContainingNaN.contains(.nan)
>> >
>> > Well, I do not agree that `[+0.0].contains(-0.0)` should return `false`,
>> > but we can discuss that on the side.
>>
>> I'm not wedded to any particular answer there. The only reason I think
>> we proposed it is that it corresponds to an IEEE comparison level.
>>
>> > Otherwise, the key difference here is that code that's _correctly
>> > written_ for FP *would never use* a statement like
>> > `floatsContainingNaN.contains(.nan)` because with its current behavior
>> > it'd be equivalent to `false` (or at least, it's not reliably `true`).
>> > The same cannot be said for `Array<Double>.==`, which can be profitably
>> > used when the user knows that `[.nan] != [.nan]`.
>>
>> Okay, take elementsEqual then. I don't see why array1 == array2 is any
>> different from array1.elementsEqual(array2).
>
> Indeed, `elementsEqual` requires special consideration. I had the same
> thought while writing the Gist and commented: "One consequence of this
> design is that `[Double.nan].contains(.nan)` returns true, as ordinarily
> expected. Consideration may be given to improving the naming of
> `elementsEqual` so as not to suggest that all elements of the receiver
> and argument literally compare `.some(.equal)`."
>
> I am not suggesting this particular name, but
> `array1.elementsIndistinguishable(from: array2)` expresses something of
> the idea.

I'm very skeptical of introducing a notion of indistinguishability
that's distinct from equality.

I think this goes back to the above about what we want equality to be. If
equality is to be distinguished from incomparability (as it is in floating
point), then we must have some notion of what happens when two values
cannot be compared. The only solution I know of is to say that if neither
value can be compared with itself, then we can treat them as [some word
here less unwieldy than indistinguishable].
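For the standard library's floating-point types, the self-comparison test in question is concrete and cheap — NaN is the only value that fails it:

```swift
let x = Double.nan

// IEEE 754: NaN compares unordered with every value, itself included.
print(x == x)   // false
print(x != x)   // true

// So "compares equal to itself" is a runtime test for the property:
func isSelfComparable(_ v: Double) -> Bool { return v == v }
print(isSelfComparable(x))     // false
print(isSelfComparable(1.0))   // true
```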

>> >> > I am very disturbed by the possibility of the latter. It is the
>> >> > only part of this proposal that keeps me up at night.
>> >>
>> >> I agree it's concerning, but I don't think fixing just part of this
>> >> problem is going to help either ;-).
>> >>
>> >> > As it turns out, some people really do understand how floating
>> >> > point comparison works, and they might have even carefully written
>> >> > code that behaves correctly, relying on the current behavior when
>> >> > things are compared. Please don't "fix" that code. If an array of
>> >> > type [Float] starts to distinguish between +0.0 and -0.0 as you
>> >> > propose, I'm quite sure that there is at least some code of my own
>> >> > that will be quite broken.
>> >>
>> >> yep, sadly. But IIUC that's inevitable.
>> >>
>> >> >> > Standard library changes to `sort` and other functions will make
>> >> >> > them "just work" with no distinguishable difference to the end
>> >> >> > user as compared to this proposal here.
>> >> >>
>> >> >> I'm sorry, I don't know what "this proposal here" means. Is that
>> >> >> yours or the one Ben and I offered? It's certainly different from
>> >> >> the results of our proposal.
>> >> >>
>> >> >> The big problem with our proposal, AFAICT, is that
>> >> >>
>> >> >> floatsIncludingNaNs.sort()
>> >> >>
>> >> >> works but
>> >> >>
>> >> >> floatsIncludingNaNs.sort(>)
>> >> >>
>> >> >> does not. That is a real problem, but it *is* a difference from
>> >> >> the current behavior, where neither one works.
>> >> >>
>> >> >
>> >> > Hmm, I get the sense that some of my replies to you have been lost.
>> >> > I have explicitly proposed a design where `floatsIncludingNaNs.sort()`
>> >> > produces the same behavior as what is proposed by you and Ben. I'd
>> >> > like to refer you again to the fleshed out Gist:
>> >> >
>> >> > An alternative design for Comparable · GitHub
>> >>
>> >> Sorry I misread your earlier remarks. I don't think I missed them.
>> >>
>> >> >> > It would be an improvement over how the algorithms work today
>> >> >> > with NaN.
>> >> >> >
>> >> >> > The major difference to the end user between what I propose and
>> >> >> > this proposal here will surface when _new_ code is written that
>> >> >> > uses `==` in the generic context, when working with types whose
>> >> >> > values may compare unordered. Since I propose `<=>` to return a
>> >> >> > value of type `Comparison?`, using the revised operator `==` is an
>> >> >> > assertion that the result of comparison is not unordered. A user is
>> >> >> > welcome to use `&==` or a custom predicate if that is not their
>> >> >> > intention.
>> >> >>
>> >> >> The problem with this is that there's still no *simple* way to get
>> >> >> an equivalence relation or a total order over all Doubles, including
>> >> >> NaNs.
>> >> >
>> >> > There is. Given two values x and y, `x &< y || (y <=> y) == nil` is
>> >> > identical to the `<` that you propose.
>> >>
>> >> Err, I rest my case?
>> >>
>> >
>> > It'd be easier with some more syntactic sugar, I guess.
>>
>> I don't think so. The problem, IMO, is that you're expressing something
>> hard to understand that falls outside anyone's experience (except maybe
>> their experience with NaN).
>>
>> > But my point above is that it is not difficult to describe what is
>> > happening in prose. Here, simply, if a value is not equal to itself,
>> > then it's ordered after all values that are equal to themselves.
>> >
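The rule quoted above — a value unequal to itself is ordered after all self-equal values — can be sketched as a concrete predicate. Plain `<` and a self-equality check stand in here for the proposal's `&<` and nil-returning `<=>`:

```swift
// Total order over Double: a value that compares unordered with
// itself (NaN) sorts after every self-equal value.
func totalLess(_ x: Double, _ y: Double) -> Bool {
    let xOrdered = (x == x)   // false only for NaN
    let yOrdered = (y == y)
    if xOrdered && yOrdered { return x < y }
    return xOrdered && !yOrdered   // ordered values precede NaN
}

print([2.0, .nan, 1.0].sorted(by: totalLess))   // [1.0, 2.0, nan]
```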
>> >> >> Now, I'm totally willing to have the discussion about how NaNs have
>> >> >> no business being used as dictionary keys, or sort keys, or searched
>> >> >> for, or any of the other things we do with day-to-day values. That's
>> >> >> not something I really have an opinion on, yet.
>> >> >
>> >> > I would not assert that NaN has no business being used here; again,
>> >> > my alternative design accommodates all of these use cases.
>> >> >
>> >> > Where we differ is that, in the case of a generic algorithm, my
>> >> > alternative design would result in the author of that algorithm
>> >> > either explicitly accommodating the presence of unordered values or
>> >> > asserting their absence.
>> >>
>> >> That is a nice property, except that IIUC you're talking about
>> >> granting an exception for the most common forms of algorithms.
>> >
>> > Not that I'm aware of? All algorithms will need to be modified to deal
>> > with what happens if `(x <=> y) == nil`. I've just listed the ways in
>> > which stdlib functions will do so.
>>
>> An algorithm written in terms of any of the algorithms whose semantics
>> you're talking about adjusting (sort, contains, elementsEqual...) will
>> pick up that algorithm's decision without its author making any explicit
>> choice.
>
> Sure. However, consider that `sort`, `contains`, `elementsEqual` all have
> corresponding `sort(by:)`, `contains(where:)`, `elementsEqual(_:by:)`.
> When I write `sort()`, I'm also saying, "Let the stdlib provide me with a
> default sort." I don't think that anyone who writes `sort()` is ignorant
> of the fact that they must accept a default predicate since they are
> _choosing_ not to specify one themselves.

I don't think this is consistent with your claim that algorithm authors
“either explicitly accommodate the presence of unordered values or
assert their absence.” In your model I can call sort without making any
kind of explicit accommodation or assertion.

Let me rephrase: "...either accommodate or assert, etc., _during comparison
operations_". One cannot, of course, stop libraries from providing
facilities to abstract away that decision when it comes to performing
common tasks.

>> >> Aside: I object to characterizing these things as unordered.
>> >
>> > Sorry, just using the terminology I read about. For instance, IEEE
>> > says that NaN compares _unordered_ to itself.
>> >
>> >> For the purposes of the algorithms that are written to accommodate
>> >> them, they must be very much ordered, if not with respect to each
>> >> other, at *least* with respect to other values in the space. In
>> >> other words, there needs to be an underlying strict-weak order or the
>> >> algorithms won't work.
>> >
>> > Not a fixed one, though, unless I'm misunderstanding? It would be
>> > trivial in my design for `sort` to order NaN values to the _end of
>> > the array_, as opposed to ordering them greater than all other
>> > values, with the result that NaN comes last whether you sort
>> > ascending or descending. Not that I'm proposing this behavior, but
>> > I don't *think* that it breaks anything.
>>
>> Well, that's just like saying you can sort with different predicates.
>>
>> I'm not sure it's OK to break the invariant that, in the absence of
>> equal elements,
>>
>> x.sorted(by: <).elementsEqual(x.sorted(by: >).reversed())
>>
>
> Aha. A harebrained idea strikes: `elementsEqual` could ignore NaN.
>
> [.nan, .nan, 42, 42].elementsEqual([42, .nan, 42, .nan]) // true
>
> Just thinking out loud; not seriously proposing it.

It's interesting... but I'm not sure why it's right, any more than you
would want

  [.nan, .nan, 42, 42] == [42, .nan, 42, .nan] // true

Yes, it gets silly rather quickly. One would have to try to justify saying
that NaNs, in some ways, are elements that are not elements. (Which is
bonkers, and then one notes that NaNs, in some ways, are numbers that are
not numbers...)

> I do not have a problem with observing the invariant you state by
> sorting all NaN values always greater than or less than non-NaN
> values.

This suggests to me that you haven't thought about the strictest
possible requirements we could place on comparison that would still work
for FP. I think that's a necessary part of doing this design.

That is one approach; I'm rather thinking about what requirements we could
place on Comparison that permit the widest possible array of useful generic
uses without unduly complicating the mental model for the most common types
or making heads explode.

>> > Put another way, as far as I can tell, values like NaN only need to
>> > have a specified sequence with respect to other values _for the
>> > purposes of any particular operation at hand_. Therefore, I've
>> > proposed a design where it's the generic algorithm and not the type
>> > that makes the decision for how these values are placed in
>> > sequence.
>>
>> Is that a known-useful property, or just a flexibility that is plausibly
>> useful, someday?
>>
>
> This is what permits a generic `Comparable.max` and `Comparable.min` to
> both avoid spitting out NaN.
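The concrete floating-point API already illustrates the difference: IEEE 754's maxNum operation, exposed in Swift as the static `Double.maximum`, skips a single quiet-NaN operand, while a generic `max` written against `Comparable` has no such rule (the `naiveMax` below is an illustrative reimplementation, not the stdlib's):

```swift
// IEEE 754 maxNum: if exactly one operand is NaN, return the other.
print(Double.maximum(1.0, .nan))   // 1.0
print(Double.maximum(.nan, 1.0))   // 1.0

// A generic max has no NaN rule, so its result can depend on
// argument order.
func naiveMax<T: Comparable>(_ x: T, _ y: T) -> T {
    return y >= x ? y : x
}
print(naiveMax(1.0, Double.nan))   // 1.0 (.nan >= 1.0 is false)
print(naiveMax(Double.nan, 1.0))   // nan (1.0 >= .nan is false)
```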

I am not convinced that's necessary.

> It also enables potentially more advanced algorithms: a partially ordered
> set S can have more than one maximal element; a mathematically rigorous
> algorithm could be provided (in a third-party library, say) to return all
> of them. Any such work in this domain requires the baseline Comparable
> protocol to recognize this concept of incomparability. It cannot be
> retroactively bolted on by a third party.
>
>> >> > It is not an avoidable problem--this is the bump in the rug that
>> >> > cannot be smoothed out.
>> >> >
>> >> > I would posit that it is not possible to write an arbitrary generic
>> >> > algorithm that (a) compares floating point values; (b) doesn't
>> >> > account for NaN; and (c) behaves correctly, where correctly here
>> >> > means that it returns what an average user would expect who is not
>> >> > thinking of floating point comparison foibles.
>> >>
>> >> Existence proof: any algorithm that, internally, uses binary search
>> >> over a sorted collection of Comparable values, or stores Comparable
>> >> values in a tree-based dictionary, but does not care exactly how two
>> >> distinct values sort, will work just fine
>> >>
>> >> ...provided average users expect the existence of a distinct -0.0
>> >> ;-)
>> >>
>> >
>> > Ha, that's not what I was trying to get at. I fully expect that there
>> > will be _some_ algorithms that will work out in this way. But, it is
>> > not possible to say that any particular design of Comparable will
>> > smooth over wrinkles with NaN such that generic algorithms in general
>> > can ignore the possibility of NaN and yet handle them "correctly."
>>
>> I guess if I agree that min and max should never return NaN unless
>> that's the only value in the sequence, I have to agree with that
>> statement.
>>
>> >> > For instance, generic `max` produces what to the average user is
>> >> > nonsense if NaN compares greater than everything.
>> >> >
>> >> >> I am, however, concerned that ordinary valid computations can lead
>> >> >> to NaN and that allowing the appearance of a NaN to turn into a trap
>> >> >> much later in the program, where it is finally compared with
>> >> >> something, is not a behavior that would work for ordinary users.
>> >>
>> >> That is my central worry with anything that turns operations on NaN
>> >> into a trap. I'd very much appreciate hearing your thoughts.
>> >>
>> >
>> > There are a limited number of ordinary operations that actually
>> > generate NaN.
>> >
>> > If using arithmetic operators, there's 0/0 (which, as everyone already
>> > knows, traps if you're doing integer math--you'll notice I lobbied to
>> > remove `/` from protocols refined by both Integer and FloatingPoint,
>> > so now it's not even possible to accidentally do this in generic code
>> > unless you unwisely make your own `Divisible` protocol).
>>
>> Thanks for that. But people *will* divide Doubles outside of a generic
>> context, and store the results. It happens a *lot*.
>>
>
> Sure. Is your principal worry that people will run into this behavior
> while writing simple code in the context of using concrete Doubles?

Yes.

> It is an inescapable possibility with trapping `==`. It is also
> clearly different from precedent in a lot of languages to name the
> default FP comparison operator `&==`. That said, conceptually it seems
> similar to making `+` trap on overflow.

The difference is that with `+` the trap happens as close as possible to
the source of the problem, not after nonsense has propagated all over
the program, been saved to disk, etc.

Your summary below about bad decisions => comparison => better off trapping
is pretty good, I think. The way I think of it is indeed that _comparison_
of NaN is itself the (or, at least one) source of the problem. Telling
someone they have NaN GB remaining on their file transfer is tacky but
pretty harmless. Dividing by NaN to say that NaN GB is NaN% of their total
disk space is silly and still pretty harmless.

Evaluating whether NaN is greater than or less than some limit, however,
could get you into trouble. For instance, you might refuse to transfer a
file that's NaN GB, because NaN is not less than the available disk space.
This could be the right decision or it could be the wrong decision, but
certainly it's not, in general, safe to proceed without knowing that you
didn't have the information to make the decision in the first place. Given
my day job, I'm thinking again about drug dosing.
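A minimal sketch of that failure mode with concrete Doubles (the dose-limit names are illustrative, and the NaN stands in for an upstream bug):

```swift
let maximumSafeDose = 100.0
let computedDose = 0.0 / 0.0   // NaN from a buggy upstream calculation

// Every ordered comparison involving NaN is false, so a negated
// comparison silently approves the bad value...
if !(computedDose > maximumSafeDose) {
    print("approved")       // reached: NaN is "not greater" than the limit
}

// ...while the equivalent direct comparison silently rejects it.
if computedDose <= maximumSafeDose {
    print("also approved")  // never reached: NaN is not <= the limit either
}
```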

> Were the designers of that feature simply not worried about people
> adding lots of Int8 values?
>
> Anyway, I look forward to Steve's expanded thoughts on this particular
> point.
>
>> > Other than that, you'd have to be already working with infinite values
>> > and the arithmetic operators, or you'd have to invoke some of the
>> > trigonometric and transcendental functions--but those are specific to
>> > floating point and not accounting for NaN would be pretty silly there.
>> >
>> > In general, I really think there's something valuable to trapping when
>> > you try to propagate NaN through code that doesn't expect it.
>>
>> Oh, I agree. The problem is that nobody is talking about stopping
>> propagation... unless you happen to compare the value somewhere along
>> the way.
>>
>
> How convenient that (quoting from Wikipedia) it is recorded for all the
> world that "[f]loating-point operations **other than ordered
> comparisons** normally propagate a quiet NaN" [emphasis mine]. It is true
> that the IEEE default is "non-stop" exception handling; the consequence
> of a design such as mine would be that such IEEE floating-point defaults
> would need their own distinct notation.
>
> Simply, all of the following cannot be true:
>
> (a) `==` gives the same answer in all contexts (i.e., if `a == b` in
> generic contexts, then `a == b` in non-generic contexts)
> (b) `==` behaves according to IEEE default "non-stop" exception handling
> rules for FP values and gives the IEEE-prescribed result
> (c) `==` is useful for a generic algorithm that works with both FP and
> non-FP values and avoids unspecified behavior with FP values
>
> Currently, (a) and (b) are both true. I propose (a) and (c), providing
> (b) with a different spelling. You propose (b) and (c).

Well, I'm withdrawing my proposal and claiming agnosticism at this
point. We're back into the design phase on this one. But I understand
what you mean.

>> > After all, _something_ has gone wrong if you're copying "NaN GB" of
>> > data, or you're at "NaN%" progress on a task. And those are innocuous
>> > examples because it's probable that the value is being used for user
>> > interface display and not much else. With other uses, certainly
>> > nothing good can come of an unaccounted-for NaN.
>>
>> Now, wait just a darn-tootin' minute there, pardner. Well, if that were
>> strictly true, we'd just make them always trap, right? Non-signaling
>> NaNs exist because somebody thought it was “good” to be able to finish a
>> calculation and get *some* results out of it even if other parts of the
>> results come from unaccounted-for nonsense.
>>
>
> Hmm, I have always assumed that the most pressing motivation is to permit
> you to compute `sin(cos(tan(atan(acos(asin(x))))))` and deal with NaN at
> the end instead of checking after every function call, which would make
> writing out a mathematical formula like this quite untenable.

I think there are at least two uses for NaN. That is one of them.
Another is, e.g., in linear algebra, where you may be able to solve for
some of the variables in a system and the appearance of NaN in one place
should not necessarily invalidate the entire calculation. IIUC there
are more esoteric uses too, but I don't know much about them.

I was just reading earlier about the world of pain that arises from using
NaN in matrix math. Different implementations of BLAS (which backs
essentially every linear algebra library) do different things with NaN.
Some, for example, make the assumption that x * 0 == 0 for all values of x.
Your results can be different from implementation to implementation as a
consequence. R incurs a large performance penalty by scanning through every
element before forwarding matrices to BLAS and also ships with its own copy
of the library. The practical upshot is that, without checking for it
yourself, NaN really *can* screw up your linear algebra computations, and
in an environment-dependent way.
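The identity those BLAS implementations assume is exactly one that IEEE arithmetic does not guarantee:

```swift
// x * 0 == 0 holds only for finite x.
print(0.0 * 1e308)            // 0.0
print(0.0 * Double.infinity)  // nan
print(0.0 * Double.nan)       // nan
```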

> It is true that by allowing propagation, you can serialize NaN into some
> long-term storage format for arbitrary periods of time before dealing
> with it, but my understanding is that it's actually recommended for
> placeholders in deserialized data to be __signaling NaNs__. The standard
> is pretty terse on the topic (unless I'm missing some text), but writes:
> "Signaling NaNs afford representations for uninitialized variables and
> arithmetic-like enhancements (such as complex-affine infinities or
> extremely wide range) that are not in the scope of this standard."

You don't need to go through serialization to put off comparison
arbitrarily long.

>> > Is it really the correct thing to supply some "default" answer,
>> > however explainable, when a user unintentionally asks, "Is my number x
>> > less than not-a-number?" As a healthcare provider, would it be OK for
>> > me to prescribe NaN doses of medication? An EMR written in Swift might
>> > tell me it's not less than the minimum dose!
>>
>> Well, I'm afraid you haven't really addressed my concern, which is that
>> NaNs may propagate a long way before we find out they have appeared, if
>> we find out at all, and because of that propagation, trapping might be
>> worse than continuing.
>
> I think that this concern would have to be addressed with either
> empirical data or expertise more than I can offer. It seems that you're
> arguing that no function should ever halt execution when it encounters
> NaN.

I'm not making an argument; I'm saying “I don't know, and I have a
concern.”

I do sympathize with the concern. Hopefully, Steve will have insights here.

>> >> Again, I'm concerned that NaNs will arise as the result of computations
>> >> involving external inputs, and that programs that would otherwise
>> >> harmlessly propagate NaN values (just as non-signaling NaN was
>> >> designed!) will trap in the field, very far from the original source
>> >> of the problem, which means bugs will be very hard to correct.
>> >>
>> >
>> > Is it that common to get NaN as a result of external inputs?
>>
>> I honestly don't know.
>>
>> > JSON, for example, doesn't even permit NaN.
>>
>> Yeah, but you only need to encode zero in your JSON to produce a NaN
>> from a simple division.
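That is, the serialized data need not contain a NaN at all; a legitimately encoded zero is enough to mint one at the first division (names here are illustrative):

```swift
// JSON has no encoding for NaN, but a decoded zero produces one
// immediately.
let decodedTotal = 0.0   // e.g. a sum over an empty dataset
let decodedCount = 0.0
let average = decodedTotal / decodedCount
print(average.isNaN)     // true: 0.0 / 0.0 is NaN
```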
>>
>> > I agree that it is possible to propagate NaN in a useful way, and
>> > indeed I would propose to expand the option to propagate values that
>> > are unordered with respect to themselves by having operators defined
>> > on Comparable. However, my disagreement with you here is that we
>> > should not assume that _unintentional_ propagation is _harmless_
>> > propagation.
>>
>> I didn't say it was harmless. I'm just not sure whether it's as harmful
>> as trapping far from the source of the problem would be. I honestly
>> don't know the answers here. I've used very little FP in the
>> application programming I've done, so I don't have a good read on what
>> goes wrong for people.
>
> In my limited experience, generally what goes wrong is that feeding NaN
> into algorithms often leads to nonsensical results. Other than simply
> displaying the result, if you want to actually use it to do interesting
> work, you end up taking decisions you do not really intend on the basis
> of those results.

...which would suggest that NaN shouldn't propagate at all. I see your
point that if nearly all the bad effects come from decisions, and decisions
are rooted in comparison, trapping comparison could be enough to prevent
it.

Anyway, I think I want to propose something much simpler based on what
Chris L. has been pressing me to think about.

Sketch: suppose we take it as a given that == and < for all FP types
have to have IEEE level 1 semantics in all contexts? Then the only
problematic value is NaN. We can say it lies outside the domain of FP
Equatable/Comparable conformance. Then we can ask how we want important
generic components (sort, min, Dictionary, etc.) to behave in the
presence of NaN and consider what they need to do in order to provide
that behavior. These components can document their behavior w.r.t. NaN,
and if/when it emerges that there's a coherent generalization of this
NaN handling, we can standardize it in a protocol or protocol
requirements. Until then it can remain an implementation detail of the
standard library.

I realize this doesn't force anyone to think about NaN, but that might
be a feature rather than a bug. Note also that this direction doesn't
presume we need a ternary comparison.

I don't have time to go into this in any more detail right now (need to
focus on String work), but I wanted to sketch it here in case anyone is
inspired to carry this direction forward and see where it leads. That
could be a huge contribution.

Well, in composing this reply I've started to think along these lines.
Suppose we leave the requirements for Comparable and Equatable essentially
as they are.

Since < defines an irreflexive relation, when paired with == we already
have the information necessary to synthesize the <= relation, which we can
use to determine whether a value is incomparable with respect to itself or
others.

To make this design work more intuitively for users of generic algorithms,
add a requirement with a default implementation to Equatable: `var
isEquatableValue: Bool { get }`. Provide a default implementation:

public var isEquatableValue: Bool { return true }

FloatingPoint would instead provide:

public var isEquatableValue: Bool { return !isNaN }

Any arbitrary type could instead provide a different implementation:

public var isEquatableValue: Bool { return self == self }

We can have `sort` arrange things as you and Ben propose by evaluating the
predicate `areInIncreasingOrder($0, $1) || ($0.isEquatableValue &&
!$1.isEquatableValue)`. This scheme would be performant *if* the compiler
can elide some of these comparisons where `isEquatableValue` is always
true.

We don't force anybody to think about NaN: I'll agree that this is both a
feature and a bug. But, it may provide a way for generic algorithms to do
something sensible for all types without sacrificing performance for the
most common use cases.
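A minimal sketch of the `isEquatableValue` idea. The standalone protocol and the exact spelling of the sort predicate are illustrative — the message above proposes adding the requirement to Equatable itself:

```swift
// Hypothetical stand-in for an isEquatableValue requirement on Equatable.
protocol EquatableValue {
    var isEquatableValue: Bool { get }
}

extension EquatableValue {
    // Default: every value participates in equality.
    var isEquatableValue: Bool { return true }
}

extension Double: EquatableValue {
    // Floating point opts out for NaN.
    var isEquatableValue: Bool { return !isNaN }
}

// Combines an ordinary predicate with the equatable-value check so
// that NaN sorts after all ordinary values.
func nanLastIncreasing(_ a: Double, _ b: Double) -> Bool {
    return a < b || (a.isEquatableValue && !b.isEquatableValue)
}

print([2.0, .nan, 1.0].sorted(by: nanLastIncreasing))   // [1.0, 2.0, nan]
```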

···

On Sun, Apr 23, 2017 at 4:32 PM, Dave Abrahams <dabrahams@apple.com> wrote:

on Sun Apr 23 2017, Xiaodi Wu <xiaodi.wu-AT-gmail.com> wrote:
> On Sun, Apr 23, 2017 at 7:54 AM, Dave Abrahams <dabrahams@apple.com> wrote:
>> on Sun Apr 23 2017, Xiaodi Wu <xiaodi.wu-AT-gmail.com> wrote:
>> > On Sat, Apr 22, 2017 at 11:00 PM, Dave Abrahams <dabrahams@apple.com> wrote:

Allow me to put it even more strongly: I consider reinventing how
floating point works to be a massive undertaking comparable to
supplanting the Unicode standard with something better. I have no doubt
that it could be done by somebody, somewhere, someday, but it would be
easy to do something much worse than what IEEE has done, and getting it
right would occupy so much of this group's time that we couldn't hope to
accomplish anything else, if we even had the expertise—I know I don't!
Doing anything in this area that is not firmly rooted in existing
standards and practices is not an option I'm willing to pursue.

--
-Dave

The only other alternative I can think of is to include a notion of unordered to our comparisons…

What if we made <=> return an optional comparison result, with nil meaning unordered (which is how Ruby does it)? For == and <, we still require a strict total ordering (i.e. they would not be optional). Most of the time, <=> and <, == would have the same comparison (<=> could have a default implementation based on == and <), but in the case of floating point it would be slightly different: <=> would return nil for anything involving NaN (in full accordance with IEEE 754), while < and == would somehow jam it into a strict total order (preferably with NaN == NaN so we don’t break collections).
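That idea can be sketched with a free function standing in for the proposed `<=>` operator (the enum name is illustrative, chosen to avoid clashing with Foundation's `ComparisonResult`):

```swift
enum ComparisonOrder {
    case ascending, equal, descending
}

// nil means the operands are unordered, which for Double means a NaN
// was involved (in accordance with IEEE 754).
func spaceship(_ a: Double, _ b: Double) -> ComparisonOrder? {
    if a.isNaN || b.isNaN { return nil }
    if a < b { return .ascending }
    if a > b { return .descending }
    return .equal
}

print(spaceship(1.0, 2.0) == .ascending)   // true
print(spaceship(.nan, 1.0) == nil)         // true
```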

Thanks,
Jon

···


> On Apr 23, 2017, at 7:18 AM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:
>
> There is one more option which hasn’t really been considered:
>
> • == and != are tied to the Equatable protocol, which is essentially the == operation.
> • <, <=, >, >= are tied to the Comparable protocol (which is kept the same except for minor changes/additions listed below)
> • Hashable still requires Equatable
> • There is a new ComparisonMetric concept which lets an algorithm specify exactly how the comparison is done (see below)
>
>
> Tl;dr: There are different definitions of ‘comparison’ which make sense in different domains… so let’s make it explicit so it doesn’t surprise anyone.
>
> The question then becomes, which metric should be the default (i.e. the one defined by ‘<‘ and ‘==‘), and the answer is: the one which lets us use floats/doubles in dictionaries and sets. People and algorithms which need full IEEE correctness can use a different metric which specifically guarantees it. They can even build their own metric if needed.
>
>
> ====The Design====
> // (Note: I wrote this code in mail, so it may not compile)
>
>
> //This defines the result of a comparison. It would ideally be nested in the protocol below if that becomes possible.
> enum ComparisonResult : Int {
>     case ascending = -1
>     case equal = 0
>     case descending = 1
> }
>
> protocol Comparable {
>     typealias Metric = (Self, Self) -> ComparisonResult // Give ourselves an easy way to refer to this function type
>
>     var defaultMetric: Metric { get }
>     static func < (lhs: Self, rhs: Self) -> Bool
> }
>
> extension Comparable {
>     // Not shown: We would define <=, etc… plus ≤, ≥, and ≠ (because, hey, it is my proposal)
>
>     func compare(with other: Self, using metric: Metric) -> ComparisonResult {
>         return metric(self, other)
>     }
>
>     func compare(with other: Self) -> ComparisonResult {
>         return self.defaultMetric(self, other)
>     }
>
>     static func <=> (lhs: Self, rhs: Self) -> Int {
>         return lhs.defaultMetric(lhs, rhs).rawValue
>     }
>
>     var defaultMetric: Metric {
>         return { lhs, rhs in
>             if lhs == rhs {
>                 return .equal
>             } else if lhs < rhs {
>                 return .ascending
>             }
>             return .descending
>         }
>     }
> }
>
> ============
>
> Then for Double, we would make a second metric for IEEE compliant (or multiple for different levels)
>
> extension Double : Comparable {
>
>     static func < (lhs: Double, rhs: Double) -> Bool {
>         // define using best for dictionaries / sets / layman understanding
>     }
>
>     static func == (lhs: Double, rhs: Double) -> Bool {
>         // define using best for dictionaries / sets / layman understanding
>     }
>
>     static var IEEEcompare: Comparable.Metric {
>         // return function here that does full IEEE comparison
>     }
>
> }
>
> Then we can call ‘myDouble.compare(with: otherDouble, using: .IEEEcompare)’ when needed.
>
>
> Thanks,
> Jon
>
>
>
>> On Apr 22, 2017, at 9:58 PM, Chris Lattner via swift-evolution <swift-evolution@swift.org> wrote:
>>
>> On Apr 22, 2017, at 6:06 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
>>> but my quick reaction to `&==` is that it would make me quite nervous to have `==` not bound to 754-equals as it is in essentially every other language. In particular, I worry about the risk of people porting numerical code that depends on isnan(x) <—> !(x < y) in non-obvious ways that they are unlikely to test. I’ll try to follow up with more detailed thoughts tomorrow.
>>>
>>> Indeed, it makes me a little nervous too. That said, `==` being either bound or not bound to 754 depending on the context is what makes me even more nervous.
>>>
>>> I was once adamantly against a new spelling for `==`, but on reconsideration it's clear to me that few if any numerical recipes can be ported verbatim from C-like languages and we should probably not encourage people to do so. Already, `+` needs to be rewritten as `&+`, `<<` probably should be rewritten as `&<<` (I still haven't had enough time to think about this), and the bitwise operators have differing precedences that require careful proofreading.
>>
>>
>> I haven’t been following this proposal or discussion closely, but it seems to me that there are a few workable approaches with different tradeoffs:
>>
>> 1. The strictly correct but user hostile approach:
>>
>> * == and != are tied to the Equatable protocol, which is essentially the == operation.
>> * <, <=, >, >= are tied to the Comparable protocol, which is essentially the <=> operation.
>> * Hashable doesn’t require equatable, it requires a related StrictlyEquatable protocol.
>> * StrictlyEquatable refines Equatable (with no other requirements, it is just a marker protocol), in which case FP types can’t conform to it, and thus can’t participate as dictionary keys
>>
>> => This approach sucks because you can’t have Set<Float>, or Dictionary<Float, String>.
>>
>> 2. The strictly correct but somewhat user hostile approach:
>>
>> * == and != are tied to the Equatable protocol, which is essentially the == operation.
>> * <, <=, >, >= are tied to the Comparable protocol, which is essentially the <=> operation.
>> * Hashable doesn’t require equatable, it requires a related StrictlyEquatable protocol.
>> * StrictlyEquatable doesn’t refine Equatable: it has a different requirement, and FP types can therefore implement both Equatable and StrictlyEquatable.
>>
>> => This approach is suboptimal because implementing your own type requires you to implement the <=> operation, as well as the StrictlyEquatable protocol, both.
>>
>> 3. The user friendly but incorrect model:
>>
>> * == and != are tied to the Equatable protocol, which is essentially the == operation.
>> * <, <=, >, >= are tied to the Comparable protocol, which is essentially the <=> operation.
>> * Hashable is defined in terms of Equatable.
>>
>> => This is easy (types just have to define <=>), but fails for FP types.
>>
>>
>> I don’t think that this proposal is acceptable as written. I think it is really bad that abstracting a concrete algorithm would change its behavior so substantially. I don’t care about SNaNs, but I do care about the difference between +0/-0 and secondarily that of NaN handling. It seems really bad that generalizing something like:
>>
>> func doThing(a : Double, b : Double) -> Bool {
>> ….
>> return a != b
>> }
>>
>> to:
>>
>> func doThing<T : FloatingPoint> (a : T, b : T) -> Bool {
>> ….
>> return a != b
>> }
>>
>> would change behavior (e.g. when a is -0.0 and b is +0.0). Likewise, "T : Equatable".
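One concrete illustration of the -0.0/+0.0 divergence, using the standard library's `isTotallyOrdered(belowOrEqualTo:)` (the IEEE 754 totalOrder predicate, which a level-2 comparison would follow):

```swift
let a = -0.0, b = +0.0
print(a == b)            // true: IEEE 754 equality treats the two zeros as equal
print(a.sign == b.sign)  // false: yet they are distinguishable values
// The total-order predicate does distinguish them:
print(a.isTotallyOrdered(belowOrEqualTo: b))  // true: -0.0 orders below +0.0
print(b.isTotallyOrdered(belowOrEqualTo: a))  // false
```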
>>
>> -Chris
>>
>>
>> _______________________________________________
>> swift-evolution mailing list
>> swift-evolution@swift.org
>> https://lists.swift.org/mailman/listinfo/swift-evolution
>


Ah, yes, you are correct. Thank you, I now understand why no one bakes in a tolerance…

A couple of quick questions:

How much flexibility do we have around how we treat NaN (especially if there is a way to specify complete 754 compliance using alternate syntax)? For example, NaN != NaN is used in some C algorithms to identify NaN, but we also have isNaN now. How important is that assumption to the whole system?
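A minimal sketch of the idiom in question:

```swift
let nan = Double.nan
print(nan == nan)  // false: a quiet NaN compares unequal to everything, itself included
print(nan != nan)  // true: the classic C test for NaN
print(nan.isNaN)   // true: Swift's explicit, self-documenting alternative
```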

What is the importance of signaling NaN in the context of Swift (besides interoperability with computer systems from the 1980s)?

It strikes me that (quiet) NaN is basically a legacy form of optional (although we treat nil == nil). It is a shame we can’t merge the two concepts somehow.

My idea of making (quiet) NaN the binary representation of nil for ‘Double?’ and then migrating Double to Double? (which would then give us a non-optional Double where we guaranteed a lack of NaN) was shot down immediately, but I wasn’t really given a reason why. What I liked about that is that it would be easy to define comparable on non-optional Double, and ‘Double?' would have the same propagation properties as NaN. It would also work well with legacy C and ObjC code (just using Double? instead of Double via an overlay).
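A rough sketch of the idea, purely for illustration (the function name is hypothetical; the actual proposal would encode nil in the NaN bit patterns of `Double?` itself rather than convert):

```swift
// Hypothetical bridging: any NaN maps to nil, so an unwrapped
// Double is guaranteed NaN-free and trivially Comparable.
func asNonNaN(_ raw: Double) -> Double? {
    return raw.isNaN ? nil : raw
}

print(asNonNaN(1.0) == 1.0)        // true: ordinary values pass through
print(asNonNaN(Double.nan) == nil) // true: every NaN collapses to nil
```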

I am just wondering if there is a different way of looking at this problem that we have overlooked...

Thanks,
Jon

···

On Apr 25, 2017, at 7:34 AM, Stephen Canon <scanon@apple.com> wrote:

On Apr 24, 2017, at 10:06 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:

As I am thinking about it more, this means that for == and <

NaN == NaN
-0 == +0
+Inf < NaN

Since this would break from IEEE, I think we should also consider taking the opportunity to make == and < work with a default tolerance. That is, 'a == b' would check that (a - b) < epsilon for some reasonable choice of epsilon that works for common usage. I know this would make the hash function harder to write, but I think it is worthwhile.

It’s possible to tinker with IEEE comparisons and maybe improve them or at least work to make them compatible in some way with the expectations of Comparable. Making them have a tolerance by default is out of the question, however: as soon as you add a tolerance, you give up transitivity of equality, which is a much, much worse violation of the axioms than anything inflicted by the presence of NaN in the IEEE 754 model.
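The loss of transitivity is easy to demonstrate (the tolerance value below is arbitrary):

```swift
let epsilon = 0.5

// Hypothetical tolerant equality: equal when within epsilon.
func tolerantEqual(_ a: Double, _ b: Double) -> Bool {
    return abs(a - b) < epsilon
}

let x = 0.0, y = 0.4, z = 0.8
print(tolerantEqual(x, y))  // true
print(tolerantEqual(y, z))  // true
print(tolerantEqual(x, z))  // false: x == y and y == z, but x != z
```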

– Steve

I have read it, and it is a truly brilliant work. I would love to see some (or all) of it make it into Swift (most likely Swift 5 or 6). The author is related to a friend of mine, so I can see if he is available to answer questions if there is interest...

···

On Apr 27, 2017, at 5:14 AM, Björn Forster via swift-evolution <swift-evolution@swift.org> wrote:

Hi all together,
I was just looking quickly over an article from Wolfram which covers new books covering Mathematica and tripped over this book:

https://www.crcpress.com/The-End-of-Error-Unum-Computing/Gustafson/p/book/9781482239867

From the reviews:
"This book is an extraordinary reinvention of computer arithmetic and elementary numerical methods from the ground up. Unum arithmetic is an extension of floating point in which it is also possible to represent the open intervals between two floating point numbers. This leads to arithmetic that is algebraically much cleaner, without rounding error, overflow, underflow, or negative zero, and with clean and consistent treatment of positive and negative infinity and NaN. These changes are not just marginal technical improvements. As the book fully demonstrates, they lead to what can only be described as a radical re-foundation of elementary numerical analysis, with new methods that are free of rounding error, fully parallelizable, fully portable, easier for programmers to master, and often more economical of memory, bandwidth, and power than comparable floating point methods. The book is exceptionally well written and produced and is illustrated on every page with full-color diagrams that perfectly communicate the material. Anyone interested in computer arithmetic or numerical methods must read this book. It is surely destined to be a classic."
—David Jefferson, Center for Advanced Scientific Computing, Lawrence Livermore National Laboratory

I haven't read it myself, as said I stepped just over it, but *MAYBE* it covers the NaN problem in depth and the current state of art how to deal with it.
Maybe someone has free access to an online library (maybe via some university enrollment) and can have a look at it?

- Björn

On Sun, Apr 23, 2017 at 4:40 PM, Chris Lattner via swift-evolution <swift-evolution@swift.org> wrote:

> On Apr 22, 2017, at 11:46 PM, David Waite <david@alkaline-solutions.com> wrote:
>
>> On Apr 22, 2017, at 10:58 PM, Chris Lattner via swift-evolution <swift-evolution@swift.org> wrote:
>>
>> I don’t think that this proposal is acceptable as written. I think it is really bad that abstracting a concrete algorithm would change its behavior so substantially. I don’t care about SNaNs, but I do care about the difference between +0/-0 and secondarily that of NaN handling. It seems really bad that generalizing something like:
>>
>> func doThing(a : Double, b : Double) -> Bool {
>> ….
>> return a != b
>> }
>>
>> to:
>>
>> func doThing<T : FloatingPoint> (a : T, b : T) -> Bool {
>> ….
>> return a != b
>> }
>>
>> would change behavior (e.g. when a is -0.0 and b is +0.0). Likewise, "T : Equatable”.
>
> Did I misunderstand the proposal? If T : FloatingPoint is not included in level 1 comparisons, then you cannot have classes of generic floating point algorithms.

Sorry about that, my mistake, I meant “T: Comparable"

-Chris

>
>>
>>
>> >
>> >>
>> >> >> > That is to say, I would expect the standard library to supply an
>> >> >> > alternative implementation of equality for Array<T where T :
>> >> >> > >.
>> >> >>
>> >> >> And also for Dictionary? What do you expect to happen when Double is
>> >> >> used as a dictionary key and it happens to be NaN?
>> >> >
>> >> > The proposal is very clear that `Dictionary` and `sort` will always
>> >> > use level 2 comparison.
>> >>
>> >> Yes, but I'm not asking about my own proposal :-). My question is, what
>> >> are *your* expectations for dictionaries or sets, specifically for the
>> >> insertion of NaN keys and for comparison of whole collections?
>> >
>> > My expectations for comparison of collections clearly differ from
>> > yours.
>>
>> Maybe, in that you *have* strong expectations when it comes to FP. I'm
>> still trying to figure out what mine should be.
>>
>> > For me, I think it is a reasonable expectation that, for all `x` and
>> > `y` where `x == y`, `[x] == [y]`, `[[x]] == [[y]]`, `[x, x, x] == [y,
>> > y, y]`, etc.
>>
>> It seems pretty obvious when you put it that way. Here's an interesting
>> question: how much of that is because of the way array literals let you
>> “see through” the value being compared to its elements?
>>
>
> I don't think it's the visual representation. More to do with expectations
> as to what an array is.
>
> If `a == b`, then `thingWrappingA == thingWrappingB`, for values of "thing"
> that are "wrapper-like" (I will blithely leave the term undefined) rather
> than, say, stateful class instances.

wrapper-like probably means "Monad"

>> Also, does the same go for != ? That's telling w.r.t. NaN.
>
> Yes, I think so. I'd have to think harder to be more sure of that.

Well, that's the hard one, so please do. We can easily shift the
proposal so -0.0 == +0.0, but NaNs are harder to account for.

Right, more thought is necessary. It was obviated when I was writing my
proposed design, because NaN != NaN simply traps, and [NaN] != [NaN] would
too.

More thought was obviated? That's a neat trick! ;-)

> Hmm, not an expert, but I don't see that definition. MathWorld
> <http://mathworld.wolfram.com/ComparableElements.html> confirms that the
> Wikipedia formulation is correct:
>
> x is incomparable to y if neither x ***>=*** y nor x ***<=*** y.

Ah, right. But this is where it gets tricky to interpret, because IIUC
in orderings in general, there is no concept of equality; there's just
"x precedes y or it doesn't." So when these articles use <= or <, it is
a stand-in for "the ordering predicate". In ancient times, I wrote an
article exploring some of this
(http://web.archive.org/web/20120422220137/http://cpp-next.com/archive/2010/02/order-i-say/).
I always go back to that when I need to ground myself, though using your
own article for that has its perils. Please check that out and let me
know if you think I'm off-base, because the notion of incomparability in
that article is central to my understanding (and is, I believe,
consistent with Weak ordering - Wikipedia, FWIW).

IIUC, you're talking about a _strict_ partial order, which is
_irreflexive_.

That may be another name for the same thing. The terminology in this
area is... problematic. But look for “incomparable” at
Weak ordering - Wikipedia and you'll see that I'm
consistent with that.

Such a relation is totally silent, however, as to whether x and y are
equivalent if `x !<> y`.

Correct; that's what I meant by “in orderings in general, there's no
concept of equality.” However, if `x !<> y` in a strict weak ordering,
`x` and `y` are part of an equivalence class, each element of which has
the same relationships with all the other elements of the set.

As a standalone predicate, it isn't sufficient to order a set which
contains a pair of incomparable values in the way that you and Ben
propose. For instance, suppose I'm trying to sort an array with 42 and
NaN using a generic algorithm that can't just test `isNaN`.
So, to the algorithm, these values are an opaque `a` and `b`. Since
!(a < b) and !(b < a), there's no information I can get from the
predicate to tell me which one is the "problem value" (is it a, or is
it b: which one is NaN?).

I'm sorry, I don't follow. Your statement that our ordering of incomparable
values is insufficient seems to lack context, so I can't evaluate it.
I'll note that in a strict weak ordering, incomparable values are not
problematic, but Floating Point doesn't have a strict weak ordering.

In the context of set theory, I'm reading that comparability is defined in
terms of _weak_ partial orders (which are reflexive).

Where are you reading that?

As my article notes, trying to analyze the space can be really
confusing, because *strict weak* partial orders are *not* reflexive.

I guess I've just assumed we're using this definition of comparability
because IEEE floating point appears to adopt it.

What makes you say that?

With a _weak_ partial order as a predicate, you *can* order a set
which contains a pair of incomparable values:

* Given: 42 ("a") and NaN ("b").
* First, we note that !(a <= b) && !(b <= a). So, we have a pair of
incomparable values.
* Then, we ask which value is not a member of the partially ordered set on
which the <= relation is defined.
* We find that (a <= a) == true but (b <= b) == false.
* We conclude that b is the "problem value" (NaN) which is not a member of
the partially ordered set.
* We decide to send it to the back of the line.
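The steps above can be sketched directly:

```swift
let a = 42.0, b = Double.nan
print(a <= b || b <= a)  // false: a and b are incomparable under <=
print(a <= a)            // true: a is a member of the partially ordered set
print(b <= b)            // false: b is the "problem value" (NaN)
```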

Yes, I understand that. But I don't think I get the point yet.

> I am not suggesting this particular name, but
> `array1.elementsIndistinguishable(from: array2)` expresses something of
the
> idea.

I'm very skeptical of introducing a notion of indistinguishability
that's distinct from equality.

I think this goes back to the above about what we want equality to be. If
equality is to be distinguished from incomparability (as it is in floating
point), then we must have some notion of what happens when two values
cannot be compared.

That two elements are incomparable doesn't mean you can't compare them.
It means that when you do compare them, you'll find they have no
relative ordering, like siblings in a DAG. See

More later, if I can find time...

···

on Sun Apr 23 2017, Xiaodi Wu <xiaodi.wu-AT-gmail.com> wrote:

On Sun, Apr 23, 2017 at 4:32 PM, Dave Abrahams <dabrahams@apple.com> wrote:

on Sun Apr 23 2017, Xiaodi Wu <xiaodi.wu-AT-gmail.com> wrote:
> On Sun, Apr 23, 2017 at 7:54 AM, Dave Abrahams <dabrahams@apple.com> wrote:
>> on Sun Apr 23 2017, Xiaodi Wu <xiaodi.wu-AT-gmail.com> wrote:
>> > On Sat, Apr 22, 2017 at 11:00 PM, Dave Abrahams <dabrahams@apple.com> wrote:

--
-Dave

I still think this is a naming conflict more than anything else.

The first expectation is that equatable and comparable provides strict equality and strict total ordering, and that those are exposed via operators. The other expectation is that floating point abides by the IEEE rules which have neither of these, and are exposed via operators.

Either:
1. the operators need to do different things in different contexts
2. we need different methods/operators to indicate these different concepts
3. Equatable and comparable need to be altered to no longer require strict equality and strict total ordering (and all generic algorithms based on equatable/comparable need to be checked that they still work)
4. floating point numbers need to explicitly not be equatable/comparable to prevent their usage in generic algorithms requiring strict behavior.
5. We break away from IEEE behavior and provide only strict equality and strict total ordering

(I tried to sort these roughly in order of feasibility)

Are there any other options?

-DW

···

On Apr 24, 2017, at 9:50 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

on Mon Apr 24 2017, Xiaodi Wu <xiaodi.wu-AT-gmail.com> wrote:

On Mon, Apr 24, 2017 at 9:06 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:

As I am thinking about it more, this means that for == and <

NaN == NaN
-0 == +0
+Inf < NaN

Since this would break from IEEE,

Yeah, as Steve mentioned, it's a huge deal to break from IEEE rules.

Allow me to put it even more strongly: I consider reinventing how
floating point works to be a massive undertaking comparable to
supplanting the Unicode standard with something better. I have no doubt
that it could be done by somebody, somewhere, someday, but it would be
easy to do something much worse than what IEEE has done, and getting it
right would occupy so much of this group's time that we couldn't hope to
accomplish anything else, if we even had the expertise—I know I don't!
Doing anything in this area that is not firmly rooted in existing
standards and practices is not an option I'm willing to pursue.

--
-Dave

The only other alternative I can think of is to include a notion of
unordered to our comparisons…

What if we made <=> return an optional comparison result, with nil meaning
unordered (which is how Ruby does it)?

Have you seen my Gist? It's a whole design based on that idea.

For == and <, we still require a strict total ordering (i.e. they would
not be optional). Most of the time, <=> and <, == would have the same
comparison (<=> could have a default implementation based on == and <), but
in the case of floating point it would be slightly different: <=> would
return nil for anything involving NaN (in full accordance with IEEE 754),
while < and == would somehow jam it into a strict total order (preferably
with NaN == NaN so we don’t break collections).
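As a sketch, the optional-returning comparison could look like this (the free-function spelling and names are just for illustration):

```swift
enum ComparisonResult { case ascending, equal, descending }

// nil means "unordered", which IEEE 754 assigns to any comparison with NaN.
func compare(_ lhs: Double, _ rhs: Double) -> ComparisonResult? {
    if lhs.isNaN || rhs.isNaN { return nil }
    if lhs < rhs { return .ascending }
    if lhs > rhs { return .descending }
    return .equal
}

print(compare(1.0, 2.0) == .ascending)  // true
print(compare(.nan, 1.0) == nil)        // true: unordered
print(compare(3.0, 3.0) == .equal)      // true
```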

As I say above, nan == nan is a non-starter for the numerics crowd, and I
have to agree. NaN *must not be equal to* NaN, or all sorts of numerics
recipes are likely to break in weird ways when ported to Swift.

···

On Tue, Apr 25, 2017 at 12:52 AM, Jonathan Hull <jhull@gbis.com> wrote:

Thanks,
Jon

On Apr 24, 2017, at 8:25 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Mon, Apr 24, 2017 at 9:06 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:

As I am thinking about it more, this means that for == and <

NaN == NaN
-0 == +0
+Inf < NaN

Since this would break from IEEE,

Yeah, as Steve mentioned, it's a huge deal to break from IEEE rules. Both
the existing design and Dave's proposed design jump through huge hoops to
preserve the behavior of these operators and reconcile them with the rest
of Swift. The biggest weakness of my alternative idea as written up earlier
is that it cannot preserve the spelling of these operators but requires a
minor adjustment (`&`). It's simply a no-go to move `==` to `compare(with:
other, using: .ieee)`. No one will write numerics algorithms like that.

I think we should also consider taking the opportunity to make == and <

work with a default tolerance. That is, 'a == b' would check that (a - b)
< epsilon for some reasonable choice of epsilon that works for common
usage. I know this would make the hash function harder to write, but I
think it is worthwhile.

Then, as I mentioned before, people who need strict IEEE conformance
would explicitly declare that need by using 'compare(with: other, using:
.IEEE)' or whatever we want to call that metric. We can even provide
metrics for different IEEE levels, if desired.

We could also provide a function ‘tolerance(_:Self) -> Comparable.Metric’
on FloatingPoint which returns a comparison metric that defines equality as
(a - b) < tolerance, for those who want to specify a specific tolerance for
their use-case/algorithm...

Thanks,
Jon

> On Apr 23, 2017, at 7:18 AM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:
>
> There is one more option which hasn’t really been considered:
>
> • == and != are tied to the Equatable protocol, which is essentially
the == operation.
> • <, <=, >, >= are tied to the Comparable protocol (which is kept the
same except for minor changes/additions listed below)
> • Hashable still requires Equatable
> • There is a new ComparisonMetric concept which lets an algorithm
specify exactly how the comparison is done (see below)
>
>
> Tl;dr: There are different definitions of ‘comparison’ which make sense
in different domains… so let’s make it explicit so it doesn’t surprise
anyone.
>
> The question then becomes, which metric should be the default (i.e. the
one defined by ‘<‘ and ‘==‘), and the answer is: the one which lets us use
floats/doubles in dictionaries and sets. People and algorithms which need
full IEEE correctness can use a different metric which specifically
guarantees it. They can even build their own metric if needed.
>
>
> ====The Design====
> // (Note: I wrote this code in mail, so it may not compile)
>
>
> //This defines the result of a comparison. It would ideally be nested
in the protocol below if that becomes possible.
> enum ComparisonResult : Int {
> case ascending = -1
> case equal = 0
> case descending = 1
> }
>
> protocol Comparable {
> typealias Metric = (Self, Self) -> ComparisonResult //Give
ourselves an easy way to refer to this function type
>
> var defaultMetric: Metric { get }
> static func <(lhs: Self, rhs: Self) -> Bool
> }
>
> extension Comparable {
> //Not shown: We would define <=, etc… plus ≤,≥,and ≠ (because,
hey, it is my proposal)
>
> func compare(with other: Self, using metric: Metric) ->
ComparisonResult {
> return metric(self, other)
> }
>
> func compare(with other: Self) -> ComparisonResult {
> return self.defaultMetric(self, other)
> }
>
> static func <=> (lhs: Self, rhs: Self) -> Int {
> return self.defaultMetric(lhs, rhs).rawValue
> }
>
> var defaultMetric: Metric {
> return { lhs, rhs in
> if lhs == rhs {
> return .equal
> } else if lhs < rhs {
> return .ascending
> }
> return .descending
> }
> }
> }
>
> ============
>
> Then for Double, we would make a second metric for IEEE compliant (or
multiple for different levels)
>
> extension Double : Comparable {
>
> static func < (lhs: Self, rhs: Self) -> Bool {
> //define using best for dictionaries / sets / layman
understanding
> }
>
> static func == (lhs: Self, rhs: Self) -> Bool {
> //define using best for dictionaries / sets / layman
understanding
> }
>
> static var IEEEcompare:Comparable.Metric {
> //return function here that does full IEEE comparison
> }
>
> }
>
> Then we can call ‘myDouble.compare(with: otherDouble, using:
.IEEEcompare)’ when needed.
>
>
> Thanks,
> Jon
>
>
>
>> On Apr 22, 2017, at 9:58 PM, Chris Lattner via swift-evolution <swift-evolution@swift.org> wrote:
>>
>> On Apr 22, 2017, at 6:06 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
>>> but my quick reaction to `&==` is that it would make me quite nervous
to have `==` not bound to 754-equals as it is in essentially every other
language. In particular, I worry about the risk of people porting numerical
code that depends on isnan(x) <—> !(x < y) in non-obvious ways that they
are unlikely to test. I’ll try to follow up with more detailed thoughts
tomorrow.
>>>
>>> Indeed, it makes me a little nervous too. That said, `==` being
either bound or not bound to 754 depending on the context is what makes me
even more nervous.
>>>
>>> I was once adamantly against a new spelling for `==`, but on
reconsideration it's clear to me that few if any numerical recipes can be
ported verbatim from C-like languages and we should probably not encourage
people to do so. Already, `+` needs to be rewritten as `&+`, `<<` probably
should be rewritten as `&<<` (I still haven't had enough time to think
about this), and the bitwise operators have differing precedences that
require careful proofreading.
>>
>>
>> I haven’t been following this proposal or discussion closely, but it
seems to me that there are a few workable approaches with different
tradeoffs:
>>
>> 1. The strictly correct but user hostile approach:
>>
>> * == and != are tied to the Equatable protocol, which is essentially
the == operation.
>> * <, <=, >, >= are tied to the Comparable protocol, which is
essentially the <=> operation.
>> * Hashable doesn’t require equatable, it requires a related
StrictlyEquatable protocol.
>> * StrictlyEquatable refines Equatable (with no other requirements, it
is just a marker protocol), in which case FP types can’t conform to it, and
thus can’t participate as dictionary keys
>>
>> => This approach sucks because you can’t have Set<Float>, or
Dictionary<Float, String>.
>>
>> 2. The strictly correct but somewhat user hostile approach:
>>
>> * == and != are tied to the Equatable protocol, which is essentially
the == operation.
>> * <, <=, >, >= are tied to the Comparable protocol, which is
essentially the <=> operation.
>> * Hashable doesn’t require equatable, it requires a related
StrictlyEquatable protocol.
>> * StrictlyEquatable doesn’t refine Equatable: it has a different
requirement, and FP types can therefore implement both Equatable and
StrictlyEquatable.
>>
>> => This approach is suboptimal because implementing your own type
requires you to implement the <=> operation, as well as the
StrictlyEquatable protocol, both.
>>
>> 3. The user friendly but incorrect model:
>>
>> * == and != are tied to the Equatable protocol, which is essentially
the == operation.
>> * <, <=, >, >= are tied to the Comparable protocol, which is
essentially the <=> operation.
>> * Hashable is defined in terms of Equatable.
>>
>> => This is easy (types just have to define <=>), but fails for FP
types.
>>
>>
>> I don’t think that this proposal is acceptable as written. I think it
is really bad that abstracting a concrete algorithm would change its
behavior so substantially. I don’t care about SNaNs, but I do care about
the difference between +0/-0 and secondarily that of NaN handling. It
seems really bad that generalizing something like:
>>
>> func doThing(a : Double, b : Double) -> Bool {
>> ….
>> return a != b
>> }
>>
>> to:
>>
>> func doThing<T : FloatingPoint> (a : T, b : T) -> Bool {
>> ….
>> return a != b
>> }
>>
>> would change behavior (e.g. when a is -0.0 and b is +0.0). Likewise,
"T : Equatable".
>>
>> -Chris
>>
>>

The only other alternative I can think of is to include a notion of unordered to our comparisons…

What if we made <=> return an optional comparison result, with nil meaning unordered (which is how Ruby does it)? For == and <, we still require a strict total ordering (i.e. they would not be optional). Most of the time, <=> and <, == would have the same comparison (<=> could have a default implementation based on == and <), but in the case of floating point it would be slightly different: <=> would return nil for anything involving NaN (in full accordance with IEEE 754), while < and == would somehow jam it into a strict total order (preferably with NaN == NaN so we don’t break collections).

There are many potential NaN values; in addition to signalling/quiet, there are payload bits. There could be more than a single unordered value.

<=> returning an optional enum might be appropriate, but there are still likely algorithms that would be written assuming that the operators defined in terms of comparable provide strict total ordering behavior:

1. [3.0, 1.0, Float.nan, 2.0].sorted(by: <)
$R8: [Float] = 4 values {
  [0] = 1
  [1] = 3
  [2] = NaN
  [3] = 2
}

-DW

···

On Apr 24, 2017, at 11:52 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:

Thanks,
Jon

On Apr 24, 2017, at 8:25 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Mon, Apr 24, 2017 at 9:06 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:
As I am thinking about it more, this means that for == and <

NaN == NaN
-0 == +0
+Inf < NaN

Since this would break from IEEE,

Yeah, as Steve mentioned, it's a huge deal to break from IEEE rules. Both the existing design and Dave's proposed design jump through huge hoops to preserve the behavior of these operators and reconcile them with the rest of Swift. The biggest weakness of my alternative idea as written up earlier is that it cannot preserve the spelling of these operators but requires a minor adjustment (`&`). It's simply a no-go to move `==` to `compare(with: other, using: .ieee)`. No one will write numerics algorithms like that.

I think we should also consider taking the opportunity to make == and < work with a default tolerance. That is, 'a == b' would check that (a - b) < epsilon for some reasonable choice of epsilon that works for common usage. I know this would make the hash function harder to write, but I think it is worthwhile.

Then, as I mentioned before, people who need strict IEEE conformance would explicitly declare that need by using 'compare(with: other, using: .IEEE)' or whatever we want to call that metric. We can even provide metrics for different IEEE levels, if desired.

We could also provide a function ‘tolerance(_:Self) -> Comparable.Metric’ on FloatingPoint which returns a comparison metric that defines equality as (a - b) < tolerance, for those who want to specify a specific tolerance for their use-case/algorithm...

Thanks,
Jon

> On Apr 23, 2017, at 7:18 AM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:
>
> There is one more option which hasn’t really been considered:
>
> • == and != are tied to the Equatable protocol, which is essentially the == operation.
> • <, <=, >, >= are tied to the Comparable protocol (which is kept the same except for minor changes/additions listed below)
> • Hashable still requires Equatable
> • There is a new ComparisonMetric concept which lets an algorithm specify exactly how the comparison is done (see below)
>
>
> Tl;dr: There are different definitions of ‘comparison’ which make sense in different domains… so let’s make it explicit so it doesn’t surprise anyone.
>
> The question then becomes, which metric should be the default (i.e. the one defined by ‘<‘ and ‘==‘), and the answer is: the one which lets us use floats/doubles in dictionaries and sets. People and algorithms which need full IEEE correctness can use a different metric which specifically guarantees it. They can even build their own metric if needed.
>
>
> ====The Design====
> // (Note: I wrote this code in mail, so it may not compile)
>
>
> //This defines the result of a comparison. It would ideally be nested in the protocol below if that becomes possible.
> enum ComparisonResult : Int {
> case ascending = -1
> case equal = 0
> case descending = 1
> }
>
> protocol Comparable {
> typealias Metric = (Self, Self) -> ComparisonResult //Give ourselves an easy way to refer to this function type
>
> var defaultMetric: Metric { get }
> static func <(lhs: Self, rhs: Self) -> Bool
> }
>
> extension Comparable {
>     //Not shown: We would define <=, etc… plus ≤, ≥, and ≠ (because, hey, it is my proposal)
>     //Note: the default metric below assumes Comparable still refines Equatable, as it does today
>
>     func compare(with other: Self, using metric: Metric) -> ComparisonResult {
>         return metric(self, other)
>     }
>
>     func compare(with other: Self) -> ComparisonResult {
>         return self.defaultMetric(self, other)
>     }
>
>     static func <=> (lhs: Self, rhs: Self) -> Int {
>         return lhs.defaultMetric(lhs, rhs).rawValue
>     }
>
>     var defaultMetric: Metric {
>         return { lhs, rhs in
>             if lhs == rhs {
>                 return .equal
>             } else if lhs < rhs {
>                 return .ascending
>             }
>             return .descending
>         }
>     }
> }
>
> ============
>
> Then for Double, we would make a second metric for IEEE compliant (or multiple for different levels)
>
> extension Double : Comparable {
>
>     static func < (lhs: Double, rhs: Double) -> Bool {
>         //define using whatever is best for dictionaries / sets / layman understanding
>     }
>
>     static func == (lhs: Double, rhs: Double) -> Bool {
>         //define using whatever is best for dictionaries / sets / layman understanding
>     }
>
>     static var IEEEcompare: Comparable.Metric {
>         //return a function here that does the full IEEE comparison
>     }
>
> }
>
> Then we can call ‘myDouble.compare(with: otherDouble, using: .IEEEcompare)’ when needed.
>
>
> Thanks,
> Jon
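To make the pitched `Metric` idea concrete, a caller-supplied tolerance metric might look like the following sketch. (This is against the pitched API above, not existing Swift: `Metric`, `compare(with:using:)`, and `.IEEEcompare` are the pitched names.)

```swift
// A hypothetical tolerance metric under the pitched design:
let withinBillionth: (Double, Double) -> ComparisonResult = { a, b in
    if abs(a - b) < 1e-9 { return .equal }
    return a < b ? .ascending : .descending
}

// Usage under the pitched design:
// myDouble.compare(with: otherDouble, using: withinBillionth)
// myDouble.compare(with: otherDouble, using: .IEEEcompare)
```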
>
>
>
>> On Apr 22, 2017, at 9:58 PM, Chris Lattner via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:
>>
>> On Apr 22, 2017, at 6:06 PM, Xiaodi Wu <xiaodi.wu@gmail.com <mailto:xiaodi.wu@gmail.com>> wrote:
>>> but my quick reaction to `&==` is that it would make me quite nervous to have `==` not bound to 754-equals as it is in essentially every other language. In particular, I worry about the risk of people porting numerical code that depends on isnan(x) <—> !(x < y) in non-obvious ways that they are unlikely to test. I’ll try to follow up with more detailed thoughts tomorrow.
>>>
>>> Indeed, it makes me a little nervous too. That said, `==` being either bound or not bound to 754 depending on the context is what makes me even more nervous.
>>>
>>> I was once adamantly against a new spelling for `==`, but on reconsideration it's clear to me that few if any numerical recipes can be ported verbatim from C-like languages and we should probably not encourage people to do so. Already, `+` needs to be rewritten as `&+`, `<<` probably should be rewritten as `&<<` (I still haven't had enough time to think about this), and the bitwise operators have differing precedences that require careful proofreading.
>>
>>
>> I haven’t been following this proposal or discussion closely, but it seems to me that there are a few workable approaches with different tradeoffs:
>>
>> 1. The strictly correct but user hostile approach:
>>
>> * == and != are tied to the Equatable protocol, which is essentially the == operation.
>> * <, <=, >, >= are tied to the Comparable protocol, which is essentially the <=> operation.
>> * Hashable doesn’t require equatable, it requires a related StrictlyEquatable protocol.
>> * StrictlyEquatable refines Equatable (with no other requirements, it is just a marker protocol), in which case FP types can’t conform to it, and thus can’t participate as dictionary keys
>>
>> => This approach sucks because you can’t have Set<Float>, or Dictionary<Float, String>.
>>
>> 2. The strictly correct but somewhat user hostile approach:
>>
>> * == and != are tied to the Equatable protocol, which is essentially the == operation.
>> * <, <=, >, >= are tied to the Comparable protocol, which is essentially the <=> operation.
>> * Hashable doesn’t require equatable, it requires a related StrictlyEquatable protocol.
>> * StrictlyEquatable doesn’t refine Equatable: it has a different requirement, and FP types can therefore implement both Equatable and StrictlyEquatable.
>>
>> => This approach is suboptimal because implementing your own type requires you to implement the <=> operation, as well as the StrictlyEquatable protocol, both.
>>
>> 3. The user friendly but incorrect model:
>>
>> * == and != are tied to the Equatable protocol, which is essentially the == operation.
>> * <, <=, >, >= are tied to the Comparable protocol, which is essentially the <=> operation.
>> * Hashable is defined in terms of Equatable.
>>
>> => This is easy (types just have to define <=>), but fails for FP types.
>>
>>
>> I don’t think that this proposal is acceptable as written. I think it is really bad that abstracting a concrete algorithm would change its behavior so substantially. I don’t care about SNaNs, but I do care about the difference between +0/-0 and secondarily that of NaN handling. It seems really bad that generalizing something like:
>>
>> func doThing(a : Double, b : Double) -> Bool {
>> ….
>> return a != b
>> }
>>
>> to:
>>
>> func doThing<T : FloatingPoint> (a : T, b : T) -> Bool {
>> ….
>> return a != b
>> }
>>
>> would change behavior (e.g. when a is -0.0 and b is +0.0). Likewise, "T : Equatable".
>>
>> -Chris
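As a check on the concern above: in today's Swift the concrete and generic versions do agree, because both resolve to IEEE `==`; the worry is that the proposal would make them diverge. A quick sketch:

```swift
// Both versions use IEEE equality today, so they agree; the proposal's
// generic semantics would break this agreement.
func concrete(_ a: Double, _ b: Double) -> Bool { return a != b }
func generic<T: FloatingPoint>(_ a: T, _ b: T) -> Bool { return a != b }

assert(concrete(-0.0, 0.0) == generic(-0.0 as Double, 0.0))              // both false: -0.0 == +0.0
assert(concrete(Double.nan, Double.nan) == generic(Double.nan, Double.nan))  // both true: NaN != NaN
```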
>>
>>
>> _______________________________________________
>> swift-evolution mailing list
>> swift-evolution@swift.org <mailto:swift-evolution@swift.org>
>> https://lists.swift.org/mailman/listinfo/swift-evolution
>
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

I mentioned unums on the list about a year ago. Steve Canon replied with some thoughts: https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20160509/016889.html.

···

On Apr 27, 2017, at 4:26 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:

I have read it, and it is a truly brilliant work. I would love to see some (or all) of it make it into Swift (most likely Swift 5 or 6). The author is related to a friend of mine, so I can see if he is available to answer questions if there is interest...

On Apr 27, 2017, at 5:14 AM, Björn Forster via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

Hi all together,
I was just looking quickly over an article from Wolfram covering new books about Mathematica and tripped over this book:

https://www.crcpress.com/The-End-of-Error-Unum-Computing/Gustafson/p/book/9781482239867

From the reviews:
"This book is an extraordinary reinvention of computer arithmetic and elementary numerical methods from the ground up. Unum arithmetic is an extension of floating point in which it is also possible to represent the open intervals between two floating point numbers. This leads to arithmetic that is algebraically much cleaner, without rounding error, overflow, underflow, or negative zero, and with clean and consistent treatment of positive and negative infinity and NaN. These changes are not just marginal technical improvements. As the book fully demonstrates, they lead to what can only be described as a radical re-foundation of elementary numerical analysis, with new methods that are free of rounding error, fully parallelizable, fully portable, easier for programmers to master, and often more economical of memory, bandwidth, and power than comparable floating point methods. The book is exceptionally well written and produced and is illustrated on every page with full-color diagrams that perfectly communicate the material. Anyone interested in computer arithmetic or numerical methods must read this book. It is surely destined to be a classic."
—David Jefferson, Center for Advanced Scientific Computing, Lawrence Livermore National Laboratory

I haven't read it myself; as said, I just stumbled over it, but *MAYBE* it covers the NaN problem in depth and the current state of the art on how to deal with it.
Maybe someone has free access to an online library (maybe via some university enrollment) and can have a look at it?

- Björn

On Sun, Apr 23, 2017 at 4:40 PM, Chris Lattner via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

> On Apr 22, 2017, at 11:46 PM, David Waite <david@alkaline-solutions.com <mailto:david@alkaline-solutions.com>> wrote:
>
>> On Apr 22, 2017, at 10:58 PM, Chris Lattner via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:
>>
>> I don’t think that this proposal is acceptable as written. I think it is really bad that abstracting a concrete algorithm would change its behavior so substantially. I don’t care about SNaNs, but I do care about the difference between +0/-0 and secondarily that of NaN handling. It seems really bad that generalizing something like:
>>
>> func doThing(a : Double, b : Double) -> Bool {
>> ….
>> return a != b
>> }
>>
>> to:
>>
>> func doThing<T : FloatingPoint> (a : T, b : T) -> Bool {
>> ….
>> return a != b
>> }
>>
>> would change behavior (e.g. when a is -0.0 and b is +0.0). Likewise, "T : Equatable”.
>
> Did I misunderstand the proposal? If T : FloatingPoint is not included in level 1 comparisons, then you cannot have classes of generic floating point algorithms.

Sorry about that, my mistake, I meant “T: Comparable"

-Chris
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

Yeah, I'll try to get to this at some point, but now that the weekend is
over I think I'm running out of steam as well...

···

On Mon, Apr 24, 2017 at 7:16 PM, Dave Abrahams <dabrahams@apple.com> wrote:

on Sun Apr 23 2017, Xiaodi Wu <xiaodi.wu-AT-gmail.com> wrote:

> On Sun, Apr 23, 2017 at 4:32 PM, Dave Abrahams <dabrahams@apple.com> > wrote:
>
>>
>> on Sun Apr 23 2017, Xiaodi Wu <xiaodi.wu-AT-gmail.com> wrote:
>>
>> > On Sun, Apr 23, 2017 at 7:54 AM, Dave Abrahams <dabrahams@apple.com> > >> wrote:
>> >
>> >>
>> >> on Sun Apr 23 2017, Xiaodi Wu <xiaodi.wu-AT-gmail.com> wrote:
>> >>
>> >> > On Sat, Apr 22, 2017 at 11:00 PM, Dave Abrahams < > dabrahams@apple.com> > >> >> wrote:
>> >> >
>> >> >>
>> >> >> >> > That is to say, I would expect the standard library to supply an
>> >> >> >> > alternative implementation of equality for Array<T where T :
>> >> >> >> > FloatingPoint>.
>> >> >> >>
>> >> >> >> And also for Dictionary? What do you expect to happen when Double is
>> >> >> >> used as a dictionary key and it happens to be NaN?
>> >> >> >
>> >> >> > The proposal is very clear that `Dictionary` and `sort` will always
>> >> >> > use level 2 comparison.
>> >> >>
>> >> >> Yes, but I'm not asking about my own proposal :-). My question is, what
>> >> >> are *your* expectations for dictionaries or sets, specifically for the
>> >> >> insertion of NaN keys and for comparison of whole collections?
>> >> >
>> >> > My expectations for comparison of collections clearly differ from
>> >> > yours.
>> >>
>> >> Maybe, in that you *have* strong expectations when it comes to FP. I'm
>> >> still trying to figure out what mine should be.
>> >>
>> >> > For me, I think it is a reasonable expectation that, for all `x` and
>> >> > `y` where `x == y`, `[x] == [y]`, `[[x]] == [[y]]`, `[x, x, x] ==
>> >> > [y, y, y]`, etc.
>> >>
>> >> It seems pretty obvious when you put it that way. Here's an interesting
>> >> question: how much of that is because of the way array literals let you
>> >> “see through” the value being compared to its elements?
>> >>
>> >
>> > I don't think it's the visual representation. More to do with expectations
>> > as to what an array is.
>> >
>> > If `a == b`, then `thingWrappingA == thingWrappingB`, for values of "thing"
>> > that are "wrapper-like" (I will blithely leave the term undefined) rather
>> > than, say, stateful class instances.
>>
>> wrapper-like probably means "Monad"
>>
>> >> Also, does the same go for != ? That's telling w.r.t. NaN.
>> >
>> > Yes, I think so. I'd have to think harder to be more sure of that.
>>
>> Well, that's the hard one, so please do. We can easily shift the
>> proposal so -0.0 == +0.0, but NaNs are harder to account for.
>>
>>
> Right, more thought is necessary. It was obviated when I was writing my
> proposed design, because NaN != NaN simply traps, and [NaN] != [NaN] would
> too.

More thought was obviated? That's a neat trick! ;-)

>> > Hmm, not an expert, but I don't see that definition. MathWorld
>> > <http://mathworld.wolfram.com/ComparableElements.html> confirms that the
>> > Wikipedia formulation is correct:
>> >
>> > x is incomparable to y if neither x ***>=*** y nor x ***<=*** y.
>>
>> Ah, right. But this is where it gets tricky to interpret, because IIUC
>> in orderings in general, there is no concept of equality; there's just
>> "x precedes y or it doesn't." So when these articles use <= or <, it is
>> a stand-in for "the ordering predicate". In ancient times, I wrote an
>> article exploring some of this
>> (http://web.archive.org/web/20120422220137/http://cpp-next.com/archive/2010/02/order-i-say/).
>> I always go back to that when I need to ground myself, though using your
>> own article for that has its perils. Please check that out and let me
>> know if you think I'm off-base, because the notion of incomparability in
>> that article is central to my understanding (and is, I believe,
>> consistent with Weak ordering - Wikipedia, FWIW).
>
> IIUC, you're talking about a _strict_ partial order, which is
> _irreflexive_.

That may be another name for the same thing. The terminology in this
area is... problematic. But look for “incomparable” at
Weak ordering - Wikipedia and you'll see that I'm
consistent with that.

> Such a relation is totally silent, however, as to whether x and y are
> equivalent if `x !<> y`.

Correct; that's what I meant by “in orderings in general, there's no
concept of equality.” However, if `x !<> y` in a strict weak ordering,
`x` and `y` are part of an equivalence class, each element of which has
the same relationships with all the other elements of the set.
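The equivalence-class point can be seen with an ordinary strict weak ordering; for example, comparing integers by magnitude (a sketch, not from the thread): -3 and 3 are incomparable, and they relate identically to every other element.

```swift
// Ordering Int by magnitude is a strict weak ordering: -3 and 3 are
// incomparable, and they compare the same way against every other element.
let byMagnitude: (Int, Int) -> Bool = { abs($0) < abs($1) }

assert(!byMagnitude(-3, 3) && !byMagnitude(3, -3))  // incomparable pair
for x in [-5, -1, 0, 2, 7] {
    assert(byMagnitude(-3, x) == byMagnitude(3, x))
    assert(byMagnitude(x, -3) == byMagnitude(x, 3))
}
```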

> As a standalone predicate, it isn't sufficient to order a set which
> contains a pair of incomparable values in the way that you and Ben
> propose. For instance, suppose I'm trying to sort an array with 42 and
> NaN using a generic algorithm that can't just test `isNaN`.
> So, to the algorithm, these values are an opaque `a` and `b`. Since
> !(a < b) and !(b < a), there's no information I can get from the
> predicate to tell me which one is the "problem value" (is it a, or is
> it b: which one is NaN?).

I'm sorry, I don't follow. Your statement that our ordering of incomparable
values is insufficient seems to lack context, so I can't evaluate it.
I'll note that in a strict weak ordering, incomparable values are not
problematic, but Floating Point doesn't have a strict weak ordering.
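A minimal demonstration of why `<` on floating point is not a strict weak ordering (incomparability fails to be transitive):

```swift
let a = 1.0, b = Double.nan, c = 2.0
assert(!(a < b) && !(b < a))  // a and b are incomparable
assert(!(b < c) && !(c < b))  // b and c are incomparable
assert(a < c)                 // yet a and c are ordered, so incomparability is
                              // not transitive: `<` is not a strict weak ordering
```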

> In the context of set theory, I'm reading that comparability is defined in
> terms of _weak_ partial orders (which are reflexive).

Where are you reading that?

As my article notes, trying to analyze the space can be really
confusing, because *strict weak* partial orders are *not* reflexive.

> I guess I've just assumed we're using this definition of comparability
> because IEEE floating point appears to adopt it.

What makes you say that?

> With a _weak_ partial order as a predicate, you *can* order a set
> which contains a pair of incomparable values:
>
> * Given: 42 ("a") and NaN ("b").
> * First, we note that !(a <= b) && !(b <= a). So, we have a pair of
> incomparable values.
> * Then, we ask which value is not a member of the partially ordered set on
> which the <= relation is defined.
> * We find that (a <= a) == true but (b <= b) == false.
> * We conclude that b is the "problem value" (NaN) which is not a member of
> the partially ordered set.
> * We decide to send it to the back of the line.
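The steps above can be checked directly with IEEE `<=` on `Double`:

```swift
let a = 42.0, b = Double.nan
assert(!(a <= b) && !(b <= a))  // an incomparable pair
assert(a <= a)                  // reflexive: a belongs to the partially ordered set
assert(!(b <= b))               // not reflexive: b is the NaN
```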

Yes, I understand that. But I don't think I get the point yet.

>> > I am not suggesting this particular name, but
>> > `array1.elementsIndistinguishable(from: array2)` expresses something
of
>> the
>> > idea.
>>
>> I'm very skeptical of introducing a notion of indistinguishability
>> that's distinct from equality.
>
> I think this goes back to the above about what we want equality to be. If
> equality is to be distinguished from incomparability (as it is in floating
> point), then we must have some notion of what happens when two values
> cannot be compared.

That two elements are incomparable doesn't mean you can't compare them.
It means that when you do compare them, you'll find they have no
relative ordering, like siblings in a DAG. See
Comparability - Wikipedia

More later, if I can find time...

+1

What I am arguing for is #2. We have two different expectations and we need to be explicit about which is being used. Furthermore, I am arguing that if one of them is going to be the default (and use the ‘==‘ and ‘<‘ symbols), it needs to be the strict equality/total ordering version, since that is what every other Swift type is modeling, and IEEE is only applicable to floating point.
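For what it's worth, IEEE 754 already defines a strict-total-ordering predicate, `totalOrder`, which Swift exposes as `isTotallyOrdered(belowOrEqualTo:)`; it orders -0 before +0 and NaN after +infinity, exactly the kind of total order being asked for here:

```swift
// isTotallyOrdered(belowOrEqualTo:) implements IEEE 754's totalOrder predicate.
assert((-0.0).isTotallyOrdered(belowOrEqualTo: 0.0))            // -0 orders before +0
assert(Double.infinity.isTotallyOrdered(belowOrEqualTo: .nan))  // NaN orders after +inf
assert(!(Double.nan < .infinity) && !(Double.infinity < .nan))  // while `<` treats them as unordered
```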

···

On Apr 24, 2017, at 9:44 PM, David Waite via swift-evolution <swift-evolution@swift.org> wrote:

I still think this is a naming conflict more than anything else.

The first expectation is that equatable and comparable provides strict equality and strict total ordering, and that those are exposed via operators. The other expectation is that floating point abides by the IEEE rules which have neither of these, and are exposed via operators.

Either:
1. the operators need to do different things in different contexts
2. we need different methods/operators to indicate these different concepts
3. Equatable and comparable need to be altered to no longer require strict equality and strict total ordering (and all generic algorithms based on equatable/comparable need to be checked that they still work)
4. floating point numbers need to explicitly not be equatable/comparable to prevent their usage in generic algorithms requiring strict behavior.
5. We break away from IEEE behavior and provide only strict equality and strict total ordering

(I tried to sort these roughly in order of feasibility)

Are there any other options?

-DW

On Apr 24, 2017, at 9:50 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

on Mon Apr 24 2017, Xiaodi Wu <xiaodi.wu-AT-gmail.com> wrote:

On Mon, Apr 24, 2017 at 9:06 PM, Jonathan Hull via swift-evolution < >>> swift-evolution@swift.org> wrote:

As I am thinking about it more, this means that for == and <

NaN == NaN
-0 == +0
+Inf < NaN

Since this would break from IEEE,

Yeah, as Steve mentioned, it's a huge deal to break from IEEE rules.

Allow me to put it even more strongly: I consider reinventing how
floating point works to be a massive undertaking comparable to
supplanting the Unicode standard with something better. I have no doubt
that it could be done by somebody, somewhere, someday, but it would be
easy to do something much worse than what IEEE has done, and getting it
right would occupy so much of this group's time that we couldn't hope to
accomplish anything else, if we even had the expertise—I know I don't!
Doing anything in this area that is not firmly rooted in existing
standards and practices is not an option I'm willing to pursue.

--
-Dave
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

The only other alternative I can think of is to include a notion of unordered to our comparisons…

What if we made <=> return an optional comparison result, with nil meaning unordered (which is how Ruby does it)?

Have you seen my Gist? It's a whole design based on that idea.

The main difference is that I am suggesting using <=> vs. < and == as the different spellings for our different expectations. There would also be no trapping. < and == would always somehow represent a strict total ordering. <=> would adhere to the letter of 754 (including NaN being unordered… which is different than just unequal).

For == and <, we still require a strict total ordering (i.e. they would not be optional). Most of the time, <=> and <, == would have the same comparison (<=> could have a default implementation based on == and <), but in the case of floating point it would be slightly different: <=> would return nil for anything involving NaN (in full accordance with IEEE 754), while < and == would somehow jam it into a strict total order (preferably with Nan == NaN so we don’t break collections).

As I say above, nan == nan is a non-starter for the numerics crowd, and I have to agree. NaN *must not be equal to* NaN, or all sorts of numerics recipes are likely to break in weird ways when ported to Swift.

The numerics crowd should use <=>. You made a very convincing case before when talking about &+ that we already can’t expect direct cut and paste with C code.

Worst case, I suppose we could have < and == use the current definitions. It isn’t ideal, but then nothing about this entire subject is ideal…
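A sketch of the optional-`<=>` idea discussed above, using Foundation's `ComparisonResult` (the operator and its `Double` overload are hypothetical, not an existing API):

```swift
import Foundation  // for ComparisonResult

// Hypothetical operator: nil means "unordered", per IEEE 754.
infix operator <=> : ComparisonPrecedence

func <=> (lhs: Double, rhs: Double) -> ComparisonResult? {
    if lhs.isNaN || rhs.isNaN { return nil }  // NaN is unordered with everything
    if lhs == rhs { return .orderedSame }     // IEEE ==, so -0.0 <=> +0.0 is .orderedSame
    return lhs < rhs ? .orderedAscending : .orderedDescending
}

assert((Double.nan <=> 1.0) == nil)
assert((1.0 <=> 2.0) == .orderedAscending)
assert((-0.0 <=> 0.0) == .orderedSame)
```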

Thanks,
Jon

···

On Apr 24, 2017, at 10:54 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
On Tue, Apr 25, 2017 at 12:52 AM, Jonathan Hull <jhull@gbis.com <mailto:jhull@gbis.com>> wrote:

On Apr 24, 2017, at 8:25 PM, Xiaodi Wu <xiaodi.wu@gmail.com <mailto:xiaodi.wu@gmail.com>> wrote:

On Mon, Apr 24, 2017 at 9:06 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:
As I am thinking about it more, this means that for == and <

NaN == NaN
-0 == +0
+Inf < NaN

Since this would break from IEEE,

Yeah, as Steve mentioned, it's a huge deal to break from IEEE rules. Both the existing design and Dave's proposed design jump through huge hoops to preserve the behavior of these operators and reconcile them with the rest of Swift. The biggest weakness of my alternative idea as written up earlier is that it cannot preserve the spelling of these operators but requires a minor adjustment (`&`). It's simply a no-go to move `==` to `compare(with: other, using: .ieee)`. No one will write numerics algorithms like that.

I think we should also consider taking the opportunity to make == and < work with a default tolerance. That is, 'a == b' would check that (a - b) < epsilon for some reasonable choice of epsilon that works for common usage. I know this would make the hash function harder to write, but I think it is worthwhile.

Then, as I mentioned before, people who need strict IEEE conformance would explicitly declare that need by using 'compare(with: other, using: .IEEE)' or whatever we want to call that metric. We can even provide metrics for different IEEE levels, if desired.

We could also provide a function ‘tolerance(_:Self) -> Comparable.Metric’ on FloatingPoint which returns a comparison metric that defines equality as (a - b) < tolerance, for those who want to specify a specific tolerance for their use-case/algorithm...

Thanks,
Jon
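One property of the default-tolerance idea quoted above that bears on the hashing difficulty: approximate equality is not transitive, so it cannot satisfy `Equatable`'s expectations as-is. A quick check (assuming `abs(a - b) < epsilon` is what is meant by `(a - b) < epsilon`):

```swift
let epsilon = 0.001
func approxEqual(_ a: Double, _ b: Double) -> Bool { return abs(a - b) < epsilon }

assert(approxEqual(0.0, 0.0009))     // a ≈ b
assert(approxEqual(0.0009, 0.0018))  // b ≈ c
assert(!approxEqual(0.0, 0.0018))    // but a !≈ c: not transitive
```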

> [snip]

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution

I still think this is a naming conflict more than anything else.

The first expectation is that equatable and comparable provides strict
equality and strict total ordering, and that those are exposed via
operators. The other expectation is that floating point abides by the IEEE
rules which have neither of these, and are exposed via operators.

Either:
1. the operators need to do different things in different contexts

Right, this is the proposal that started the thread. A number of people
object to this idea.

2. we need different methods/operators to indicate these different concepts

Right, I and others have proposed designs for this. A number of people also
object to this idea.

3. Equatable and comparable need to be altered to no longer require strict
equality and strict total ordering (and all generic algorithms based on
equatable/comparable need to be checked that they still work)

If you read the documentation on Comparable, it actually says: "A
conforming type may contain a subset of values which are treated as
exceptional." It goes on to describe floating point semantics. So this is
the status quo.

4. floating point numbers need to explicitly not be equatable/comparable
to prevent their usage in generic algorithms requiring strict behavior.

I think we can all agree that not being able to compare arrays of floating
point numbers would be rather limiting.

5. We break away from IEEE behavior and provide only strict equality and
strict total ordering

(I tried to sort these roughly in order of feasibility)

Right, 5 is totally a non-starter.

Are there any other options?

Your list is pretty exhaustive. Even though I've tried to promote designs
along some of these lines in the past, I can safely say that for myself I
dislike all of these choices.

As to (3), one point to be made is that it is actually possible to make all
generic algorithms work sensibly with the current design of Comparable. You
can make `sort`, for example, behave exactly as Dave and Ben propose using
what we have today. An issue to consider is that a naive implementation of
the additional checks necessary for that result would penalize types that
are currently well-behaved.

There are judicious tweaks to the design of Equatable and/or Comparable
that could help the compiler elide any additional checks when a type is
well-behaved. For example, if `Comparable` had a requirement named
`isExceptional` (for exceptional values), then `sort` could evaluate
`areInIncreasingOrder($0, $1) || ($1.isExceptional && !$0.isExceptional)` in order to sort NaN
greater than everything else, as desired. A type that has no exceptional
values would implement `var isExceptional: Bool = false`, and the compiler
could elide the additional checks.
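A concrete sketch of that predicate against today's `sorted(by:)`, using `isNaN` as a stand-in for the hypothetical `isExceptional` requirement (this is not how the standard library's sort is actually implemented):

```swift
// Sketch only: `isNaN` stands in for the hypothetical `isExceptional`
// requirement discussed above.
let values: [Float] = [3.0, .nan, 1.0, 2.0]

// `a` sorts before `b` if it is ordered less, or if `b` is exceptional
// (NaN) and `a` is not — pushing NaN above every ordered value while
// preserving a strict weak ordering.
let result = values.sorted { a, b in
    a < b || (b.isNaN && !a.isNaN)
}
print(result)   // [1.0, 2.0, 3.0, nan]
```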

(Another issue to think about with (3) is that few people who are writing
their own generic algorithms would be motivated to actually look into
this issue or know that they should even think about it; it is, after all,
the status quo and we know that generic algorithms *aren't* taking NaN into
account. The issue is that for every "motivation" we build into an
alternative design, it represents a roadblock or a nuisance to someone else
who, for example, just wants their numerics algorithms to work.)

···

On Mon, Apr 24, 2017 at 11:44 PM, David Waite <david@alkaline-solutions.com> wrote:

-DW

> On Apr 24, 2017, at 9:50 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
>
>
> on Mon Apr 24 2017, Xiaodi Wu <xiaodi.wu-AT-gmail.com> wrote:
>
>> On Mon, Apr 24, 2017 at 9:06 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:
>>
>>> As I am thinking about it more, this means that for == and <
>>>
>>> NaN == NaN
>>> -0 == +0
>>> +Inf < NaN
>>>
>>> Since this would break from IEEE,
>>
>> Yeah, as Steve mentioned, it's a huge deal to break from IEEE rules.
>
> Allow me to put it even more strongly: I consider reinventing how
> floating point works to be a massive undertaking comparable to
> supplanting the Unicode standard with something better. I have no doubt
> that it could be done by somebody, somewhere, someday, but it would be
> easy to do something much worse than what IEEE has done, and getting it
> right would occupy so much of this group's time that we couldn't hope to
> accomplish anything else, if we even had the expertise—I know I don't!
> Doing anything in this area that is not firmly rooted in existing
> standards and practices is not an option I'm willing to pursue.
>
> --
> -Dave

The only other alternative I can think of is to include a notion of
unordered to our comparisons…

What if we made <=> return an optional comparison result, with nil meaning
unordered (which is how Ruby does it)? For == and <, we still require a
strict total ordering (i.e. they would not be optional). Most of the time,
<=> and <, == would have the same comparison (<=> could have a default
implementation based on == and <), but in the case of floating point it
would be slightly different: <=> would return nil for anything involving
NaN (in full accordance with IEEE 754), while < and == would somehow jam it
into a strict total order (preferably with NaN == NaN so we don’t break
collections).
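A minimal sketch of that shape — all names here are hypothetical, with a plain function `spaceship` standing in for the proposed optional-returning `<=>` operator:

```swift
// Hypothetical optional-result comparison: nil means "unordered",
// matching IEEE 754's treatment of NaN. Names are illustrative only.
enum Ordering: Int {
    case ascending = -1, equal = 0, descending = 1
}

func spaceship(_ lhs: Double, _ rhs: Double) -> Ordering? {
    if lhs.isNaN || rhs.isNaN { return nil }   // unordered, per IEEE 754
    if lhs < rhs { return .ascending }
    if lhs > rhs { return .descending }
    return .equal                              // note: treats -0.0 and +0.0 as equal
}

spaceship(1.0, 2.0)    // .ascending
spaceship(.nan, 1.0)   // nil — the comparison is unordered
```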

There are many potential NaN values; in addition to signalling/quiet,
there are payload bits. There could be more than a single unordered value.

Right, strict total order is almost certainly not what you want for generic
use, as NaN ceases to equal NaN in all scenarios because of payload bits.
If you throw in IEEE decimal types (not yet offered in Swift), total order
would distinguish between different representations of the same decimal
value, IIUC.
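Worth noting that Swift's `FloatingPoint` protocol already exposes IEEE 754's totalOrder predicate, and it exhibits exactly this behavior — a quick sketch:

```swift
// IEEE 754 totalOrder as exposed on FloatingPoint today: it distinguishes
// -0.0 from +0.0 and places (quiet, positive) NaN above +infinity.
print((-0.0).isTotallyOrdered(belowOrEqualTo: 0.0))            // true
print((0.0).isTotallyOrdered(belowOrEqualTo: -0.0))            // false
print(Double.nan.isTotallyOrdered(belowOrEqualTo: .infinity))  // false: NaN sorts above +inf
print(Double.infinity.isTotallyOrdered(belowOrEqualTo: .nan))  // true
```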

<=> returning an optional enum might be appropriate, but there are still
likely algorithms that would be written assuming that the operators defined
in terms of comparable provide strict total ordering behavior:

1. [3.0, 1.0, Float.nan, 2.0].sorted(by: <)
$R8: [Float] = 4 values {
  [0] = 1
  [1] = 3
  [2] = NaN
  [3] = 2
}

-DW

···

On Tue, Apr 25, 2017 at 2:41 AM, David Waite <david@alkaline-solutions.com> wrote:

On Apr 24, 2017, at 11:52 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:

Thanks,
Jon

On Apr 24, 2017, at 8:25 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Mon, Apr 24, 2017 at 9:06 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:
As I am thinking about it more, this means that for == and <

NaN == NaN
-0 == +0
+Inf < NaN

Since this would break from IEEE,

Yeah, as Steve mentioned, it's a huge deal to break from IEEE rules. Both
the existing design and Dave's proposed design jump through huge hoops to
preserve the behavior of these operators and reconcile them with the rest
of Swift. The biggest weakness of my alternative idea as written up earlier
is that it cannot preserve the spelling of these operators but requires a
minor adjustment (`&`). It's simply a no-go to move `==` to `compare(with:
other, using: .ieee)`. No one will write numerics algorithms like that.

I think we should also consider taking the opportunity to make == and <
work with a default tolerance. That is, 'a == b' would check that (a - b)
< epsilon for some reasonable choice of epsilon that works for common
usage. I know this would make the hash function harder to write, but I
think it is worthwhile.

Then, as I mentioned before, people who need strict IEEE conformance
would explicitly declare that need by using 'compare(with: other, using:
.IEEE)' or whatever we want to call that metric. We can even provide
metrics for different IEEE levels, if desired.

We could also provide a function ‘tolerance(_:Self) -> Comparable.Metric’
on FloatingPoint which returns a comparison metric that defines equality as
(a - b) < tolerance, for those who want to specify a specific tolerance for
their use-case/algorithm...
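A sketch of what such a tolerance-based metric could look like. All names are hypothetical (the `Metric` typealias here is just a closure type, not an actual API), and it compares |a − b| rather than the raw difference, since a signed difference would wrongly equate any `a` far below `b`:

```swift
// Hypothetical tolerance-based equality metric.
typealias Metric = (Double, Double) -> Bool

func tolerance(_ epsilon: Double) -> Metric {
    return { a, b in abs(a - b) < epsilon }
}

let approxEqual = tolerance(1e-9)
print(approxEqual(0.1 + 0.2, 0.3))   // true, even though 0.1 + 0.2 != 0.3 exactly
print(0.1 + 0.2 == 0.3)              // false under IEEE equality
```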

Thanks,
Jon

> On Apr 23, 2017, at 7:18 AM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:
>
> There is one more option which hasn’t really been considered:
>
> • == and != are tied to the Equatable protocol, which is essentially
the == operation.
> • <, <=, >, >= are tied to the Comparable protocol (which is kept the
same except for minor changes/additions listed below)
> • Hashable still requires Equatable
> • There is a new ComparisonMetric concept which lets an algorithm
specify exactly how the comparison is done (see below)
>
>
> Tl;dr: There are different definitions of ‘comparison’ which make sense
in different domains… so let’s make it explicit so it doesn’t surprise
anyone.
>
> The question then becomes, which metric should be the default (i.e. the
one defined by ‘<‘ and ‘==‘), and the answer is: the one which lets us use
floats/doubles in dictionaries and sets. People and algorithms which need
full IEEE correctness can use a different metric which specifically
guarantees it. They can even build their own metric if needed.
>
>
> ====The Design====
> // (Note: I wrote this code in mail, so it may not compile)
>
>
> //This defines the result of a comparison. It would ideally be nested
> //in the protocol below if that becomes possible.
> enum ComparisonResult : Int {
>     case ascending = -1
>     case equal = 0
>     case descending = 1
> }
>
> protocol Comparable {
>     //Give ourselves an easy way to refer to this function type
>     typealias Metric = (Self, Self) -> ComparisonResult
>
>     var defaultMetric: Metric { get }
>     static func < (lhs: Self, rhs: Self) -> Bool
> }
>
> extension Comparable {
>     //Not shown: We would define <=, etc… plus ≤, ≥, and ≠ (because,
>     //hey, it is my proposal)
>
>     func compare(with other: Self, using metric: Metric) -> ComparisonResult {
>         return metric(self, other)
>     }
>
>     func compare(with other: Self) -> ComparisonResult {
>         return self.defaultMetric(self, other)
>     }
>
>     static func <=> (lhs: Self, rhs: Self) -> Int {
>         return lhs.defaultMetric(lhs, rhs).rawValue
>     }
>
>     var defaultMetric: Metric {
>         return { lhs, rhs in
>             if lhs == rhs {
>                 return .equal
>             } else if lhs < rhs {
>                 return .ascending
>             }
>             return .descending
>         }
>     }
> }
>
> ============
>
> Then for Double, we would make a second metric for IEEE compliance (or
> multiple for different levels)
>
> extension Double : Comparable {
>
>     static func < (lhs: Double, rhs: Double) -> Bool {
>         //define using best for dictionaries / sets / layman understanding
>     }
>
>     static func == (lhs: Double, rhs: Double) -> Bool {
>         //define using best for dictionaries / sets / layman understanding
>     }
>
>     static var IEEEcompare: Comparable.Metric {
>         //return function here that does full IEEE comparison
>     }
>
> }
>
> Then we can call ‘myDouble.compare(with: otherDouble, using: .IEEEcompare)’
> when needed.
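Jonathan's sketch can be made to compile as a standalone rendering — renamed (`MyComparable` / `MyComparisonResult`) to avoid shadowing the standard library, with all names remaining hypothetical:

```swift
// A compilable rendering of the ComparisonMetric sketch above.
enum MyComparisonResult: Int {
    case ascending = -1, equal = 0, descending = 1
}

protocol MyComparable {
    static func < (lhs: Self, rhs: Self) -> Bool
    static func == (lhs: Self, rhs: Self) -> Bool
}

extension MyComparable {
    typealias Metric = (Self, Self) -> MyComparisonResult

    // Default metric derived from the == and < requirements.
    static var defaultMetric: Metric {
        return { lhs, rhs in
            if lhs == rhs { return .equal }
            if lhs < rhs { return .ascending }
            return .descending
        }
    }

    func compare(with other: Self, using metric: Metric) -> MyComparisonResult {
        return metric(self, other)
    }

    func compare(with other: Self) -> MyComparisonResult {
        return Self.defaultMetric(self, other)
    }
}

// Double already supplies == and <, so the conformance is free; the IEEE
// metric is only gestured at, as in the quoted sketch.
extension Double: MyComparable {
    static var ieeeCompare: Metric {
        return { lhs, rhs in
            // IEEE 754 leaves NaN unordered; this stand-in sorts it descending.
            if lhs.isNaN || rhs.isNaN { return .descending }
            return Double.defaultMetric(lhs, rhs)
        }
    }
}

print((1.0).compare(with: 2.0))                                    // ascending
print(Double.nan.compare(with: 1.0, using: Double.ieeeCompare))    // descending
```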
>
>
> Thanks,
> Jon
>
>
>
>> On Apr 22, 2017, at 9:58 PM, Chris Lattner via swift-evolution <swift-evolution@swift.org> wrote:
>>
>> On Apr 22, 2017, at 6:06 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
>>> but my quick reaction to `&==` is that it would make me quite nervous
to have `==` not bound to 754-equals as it is in essentially every other
language. In particular, I worry about the risk of people porting numerical
code that depends on isnan(x) <—> !(x < y) in non-obvious ways that they
are unlikely to test. I’ll try to follow up with more detailed thoughts
tomorrow.
>>>
>>> Indeed, it makes me a little nervous too. That said, `==` being
either bound or not bound to 754 depending on the context is what makes me
even more nervous.
>>>
>>> I was once adamantly against a new spelling for `==`, but on
reconsideration it's clear to me that few if any numerical recipes can be
ported verbatim from C-like languages and we should probably not encourage
people to do so. Already, `+` needs to be rewritten as `&+`, `<<` probably
should be rewritten as `&<<` (I still haven't had enough time to think
about this), and the bitwise operators have differing precedences that
require careful proofreading.
>>
>>
>> I haven’t been following this proposal or discussion closely, but it
seems to me that there are a few workable approaches with different
tradeoffs:
>>
>> 1. The strictly correct but user hostile approach:
>>
>> * == and != are tied to the Equatable protocol, which is essentially
the == operation.
>> * <, <=, >, >= are tied to the Comparable protocol, which is
essentially the <=> operation.
>> * Hashable doesn’t require equatable, it requires a related
StrictlyEquatable protocol.
>> * StrictlyEquatable refines Equatable (with no other requirements, it
is just a marker protocol), in which case FP types can’t conform to it, and
thus can’t participate as dictionary keys
>>
>> => This approach sucks because you can’t have Set<Float>, or
Dictionary<Float, String>.
>>
>> 2. The strictly correct but somewhat user hostile approach:
>>
>> * == and != are tied to the Equatable protocol, which is essentially
the == operation.
>> * <, <=, >, >= are tied to the Comparable protocol, which is
essentially the <=> operation.
>> * Hashable doesn’t require equatable, it requires a related
StrictlyEquatable protocol.
>> * StrictlyEquatable doesn’t refine Equatable: it has a different
requirement, and FP types can therefore implement both Equatable and
StrictlyEquatable.
>>
>> => This approach is suboptimal because implementing your own type
requires you to implement the <=> operation, as well as the
StrictlyEquatable protocol, both.
>>
>> 3. The user friendly but incorrect model:
>>
>> * == and != are tied to the Equatable protocol, which is essentially
the == operation.
>> * <, <=, >, >= are tied to the Comparable protocol, which is
essentially the <=> operation.
>> * Hashable is defined in terms of Equatable.
>>
>> => This is easy (types just have to define <=>), but fails for FP
types.
>>
>>
>> I don’t think that this proposal is acceptable as written. I think it
is really bad that abstracting a concrete algorithm would change its
behavior so substantially. I don’t care about SNaNs, but I do care about
the difference between +0/-1 and secondarily that of NaN handling. It
seems really bad that generalizing something like:
>>
>> func doThing(a : Double, b : Double) -> Bool {
>> ….
>> return a != b
>> }
>>
>> to:
>>
>> func doThing<T : FloatingPoint> (a : T, b : T) -> Bool {
>> ….
>> return a != b
>> }
>>
>> would change behavior (e.g. when a is -0.0 and b is +0.0). Likewise,
"T : Equatable".
>>
>> -Chris
>>
>>

Have Double and Float do #5, and new types IEEEDouble and IEEEFloat
implement IEEE behavior and not be suitable for use with generic algorithms
when NaNs are present.

···

On Mon, Apr 24, 2017 at 11:44 PM, David Waite via swift-evolution <swift-evolution@swift.org> wrote:

I still think this is a naming conflict more than anything else.

The first expectation is that equatable and comparable provides strict
equality and strict total ordering, and that those are exposed via
operators. The other expectation is that floating point abides by the IEEE
rules which have neither of these, and are exposed via operators.

Either:
1. the operators need to do different things in different contexts
2. we need different methods/operators to indicate these different concepts
3. Equatable and comparable need to be altered to no longer require strict
equality and strict total ordering (and all generic algorithms based on
equatable/comparable need to be checked that they still work)
4. floating point numbers need to explicitly not be equatable/comparable
to prevent their usage in generic algorithms requiring strict behavior.
5. We break away from IEEE behavior and provide only strict equality and
strict total ordering

(I tried to sort these roughly in order of feasibility)

Are there any other options?

There have been a bunch of updates since then (currently under peer review), which deal with implementation on current systems. Reading the new/updated paper now…

Here is a video of the author speaking about some of the general ideas:

I doubt we would get rid of Double/Float, but I would love to see it used as a core type in Swift 5. In addition to the increase in accuracy/expressible range, and the simplification of exception handling, apparently the results when used in neural networks are amazing. It also allows you to simplify a bunch of numerical algorithms, because you don’t have to worry about some of the things that go wrong with traditional floats/doubles.

Thanks,
Jon

···

On Apr 27, 2017, at 2:35 PM, Matthew Johnson <matthew@anandabits.com> wrote:

I mentioned unums on the list about a year ago. Steve Canon replied with some thoughts: https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20160509/016889.html

On Apr 27, 2017, at 4:26 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:

I have read it, and it is a truly brilliant work. I would love to see some (or all) of it make it into Swift (most likely Swift 5 or 6). The author is related to a friend of mine, so I can see if he is available to answer questions if there is interest...

On Apr 27, 2017, at 5:14 AM, Björn Forster via swift-evolution <swift-evolution@swift.org> wrote:

Hi all together,
I was just looking quickly over an article from Wolfram which covers new books covering Mathematica and tripped over this book:

https://www.crcpress.com/The-End-of-Error-Unum-Computing/Gustafson/p/book/9781482239867

From the reviews:
"This book is an extraordinary reinvention of computer arithmetic and elementary numerical methods from the ground up. Unum arithmetic is an extension of floating point in which it is also possible to represent the open intervals between two floating point numbers. This leads to arithmetic that is algebraically much cleaner, without rounding error, overflow, underflow, or negative zero, and with clean and consistent treatment of positive and negative infinity and NaN. These changes are not just marginal technical improvements. As the book fully demonstrates, they lead to what can only be described as a radical re-foundation of elementary numerical analysis, with new methods that are free of rounding error, fully parallelizable, fully portable, easier for programmers to master, and often more economical of memory, bandwidth, and power than comparable floating point methods. The book is exceptionally well written and produced and is illustrated on every page with full-color diagrams that perfectly communicate the material. Anyone interested in computer arithmetic or numerical methods must read this book. It is surely destined to be a classic."
—David Jefferson, Center for Advanced Scientific Computing, Lawrence Livermore National Laboratory

I haven't read it myself, as said I stepped just over it, but *MAYBE* it covers the NaN problem in depth and the current state of art how to deal with it.
Maybe someone has free access to an online library (maybe via some university enrollment) and can have a look at it?

- Björn

On Sun, Apr 23, 2017 at 4:40 PM, Chris Lattner via swift-evolution <swift-evolution@swift.org> wrote:

> On Apr 22, 2017, at 11:46 PM, David Waite <david@alkaline-solutions.com> wrote:
>
>> On Apr 22, 2017, at 10:58 PM, Chris Lattner via swift-evolution <swift-evolution@swift.org> wrote:
>>
>> I don’t think that this proposal is acceptable as written. I think it is really bad that abstracting a concrete algorithm would change its behavior so substantially. I don’t care about SNaNs, but I do care about the difference between +0/-1 and secondarily that of NaN handling. It seems really bad that generalizing something like:
>>
>> func doThing(a : Double, b : Double) -> Bool {
>> ….
>> return a != b
>> }
>>
>> to:
>>
>> func doThing<T : FloatingPoint> (a : T, b : T) -> Bool {
>> ….
>> return a != b
>> }
>>
>> would change behavior (e.g. when a is -0.0 and b is +0.0). Likewise, "T : Equatable”.
>
> Did I misunderstand the proposal? If T : FloatingPoint is not included in level 1 comparisons, then you cannot have classes of generic floating point algorithms.

Sorry about that, my mistake, I meant “T: Comparable"

-Chris
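Chris's -0.0/+0.0 point is observable in current Swift: IEEE `==` already erases a distinction that other operations can still see — a quick sketch of today's behavior:

```swift
// Current behavior: IEEE equality treats -0.0 and +0.0 as equal, even
// though the two values remain distinguishable.
let a = -0.0
let b = +0.0
print(a == b)               // true under IEEE ==
print(a.sign == b.sign)     // false: .minus vs .plus
print(1.0 / a == 1.0 / b)   // false: -infinity vs +infinity
```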