SE-0259: Approximate Equality for Floating Point

jayton · April 25, 2019, 9:42am

On a point of order, this doesn’t mean that it has to be spelled ==. There’s something to be said for a.isAlmostEqual(to: b) and a.isEqual(to: b) (or isApproximately and isExactly); in addition to fixing the Equatable API contract violation¹, it would prompt developers to think more clearly about which one is appropriate.

That said, I don’t think this direction would be very popular, or that it should block this review. If I had been paying attention during the pitch, I would have pushed for this to at least be an alternative considered.

¹I thought Equatable had a caveat about “exceptional values”, but on closer inspection that’s only Comparable, while Equatable requires reflexivity and substitutability for user-defined types. Does this mean that user-defined FloatingPoint implementations are in violation by default, or do they inherit immunity from the protocol?

nuclearace · April 25, 2019, 11:36am

tl;dr: I don't think isApproximately is really any better than isAlmost. They both convey the intended functionality in a way I think most people with a working level of English would be able to understand. And they also both start with isA, so someone looking for either of those names is likely to find the method.

So, hopping on the bikeshedding a bit. I'm not really convinced isApproximately is any better than isAlmostEqual. Now, approximate(ly) isn't exactly an obscure English term that would require lookup for most first- or, I suspect, second-language English speakers, and it's even in the name of the proposal. However, I think it's slightly less favorable for a couple of simple reasons:

almost is more common than approximately in casual/spoken English, and is thus slightly more likely to be understood by someone still getting familiar with English.
Given that they both accurately describe the intended functionality, I don't really see a reason to choose a more verbose word.

All this being said, it really doesn't matter what the name for this is. I wouldn't lose sleep if it was called isApproximately or isAlmost or isNear or even isReallyCloseToButNotQuite. They all communicate the behavior of something that is sorely needed in the stdlib.

yxckjhasdkjh · April 25, 2019, 11:45am

I think that maybe tools like SwiftLint could add linting rules for such cases. I don't know how ubiquitous SwiftLint is in Swift projects, but if it's used a lot, that could already help avoid these mistakes.

+1 for the proposal.

cukr · April 25, 2019, 12:01pm

Just my 2¢
If I were to take the names literally, and compare two numbers that are exactly the same, I would expect isAlmost and isReallyCloseToButNotQuite to return false, and isApproximately and isNear to return true.

That's why I would like isApproximately more than isAlmost

scanon · April 25, 2019, 12:21pm

As ever, technically correct is the best kind of correct. I've made this point several times myself, but ...

You wouldn't actually be able to get rid of ==, because we need Equatable conformance to support Hashable, and approximatelyEqual / almostEqual can't be used for that purpose (because it breaks the axioms in a much worse way than FloatingPoint.== does). So now you need to define a third notion of equality, and you have three different operations flying around. This helps no one.
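
One way to see why an approximate comparison cannot back Equatable and Hashable is that it is not transitive. The sketch below uses a hypothetical helper, roughlyEqual, written only for this illustration (it is not the proposal's implementation), with a deliberately loose tolerance so the effect is easy to see:

```swift
// Hypothetical helper for illustration only; not the proposal's implementation.
// Compares the difference against a tolerance scaled by the larger magnitude.
func roughlyEqual(_ a: Double, _ b: Double, relativeTolerance: Double) -> Bool {
    let scale = max(abs(a), abs(b))
    return abs(a - b) <= scale * relativeTolerance
}

let low = 1.00, middle = 1.04, high = 1.08
let looseTolerance = 0.05

print(roughlyEqual(low, middle, relativeTolerance: looseTolerance))   // true
print(roughlyEqual(middle, high, relativeTolerance: looseTolerance))  // true
print(roughlyEqual(low, high, relativeTolerance: looseTolerance))     // false
```

Since Hashable requires equal values to produce equal hashes, a hash consistent with this relation would have to give low, middle, and high the same hash even though low and high do not compare equal, and the same chaining applies to any run of nearby values.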

jayton · April 25, 2019, 12:24pm

You win again, reality.

scanon · April 25, 2019, 12:26pm

You would have gotten away with it too, if it wasn't for those pesky kids and their pragmatic considerations.

nrith · April 25, 2019, 3:07pm

And I like isKindaSorta() even more.

Nevin · April 25, 2019, 4:47pm

In the interest of clarity at the point of use, perhaps we should consider adding “relative” to the second argument label of isAlmostEqual:

if x.isAlmostEqual(to: y, relativeTolerance: t) { … }

GarthSnyder · April 26, 2019, 12:09am

It's isAlmostEqual(to:) in the proposal, and I believe the to: label is in fact needed for that function name according to the API guidelines.

One of the reasons I like isApproximately is that it gets rid of the argument label. I have no animus against labels generally, but for a single argument and for candidate names that are in other ways similarly appealing, I'd just as soon not have a label.

Equal and to: can both be dropped here without loss of clarity. isApproximately(), isAbout(), isNearly(), and isAlmost() are all approximately equivalent.

Jon_Shier · April 26, 2019, 12:22am

What is your evaluation of the proposal?

+1, functionality enabling greater correctness is always appreciated in a language.

Is the problem being addressed significant enough to warrant a change to Swift?

Given the complexity of correct implementation, yes.

Does this proposal fit well with the feel and direction of Swift?

isAlmostEqual(to:) seems a bit... informal to me, but otherwise looks good.

If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?

I've never used a language that had this built in.

How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

Read the proposal and other reviews.

dlbuckley · April 26, 2019, 6:47am

Solid +1 from me. I have something almost equal (hah, get it?) in my code base already and use it quite often.

MattSeaman · April 26, 2019, 7:35pm

+1 in concept, with a few comments:

This is definitely a common source of error and definitely warrants a change to Swift. As to whether it fits with the feel of Swift, I don't particularly like how there is a separate function for comparison with zero. I realize the alternatives section explains that there are good implementation reasons for this, but if those limitations could somehow be overcome, I think that would avoid a lot of the confusion this would otherwise create for developers. I'm thinking that for the average developer, and especially those learning Swift in school as their first language, using the standard comparison function with 0 will be a common practice because people simply won't read the documentation and will expect it to "just work". Heck, I would probably use it with 0 had I not read this proposal. It would be really great (and in my opinion important) if it indeed did "just work".

Additionally, I personally think having an operator (perhaps ~~?) would be nice to fit better with == but I also know that's listed in alternatives and I can live without it. That could potentially also be added in the future.

Tino · April 27, 2019, 7:49am

I guess this is one of the rare cases where no one denies that the core idea of the proposal is a good addition - so it's just haggling about the details ;-)

The proposal states that there is no sensible default for an absolute tolerance, and has a good rationale for this claim - but many people are used to thinking in terms of absolute tolerance, so I'd strongly recommend adding that functionality as well:
It's only a tiny addition, and it would also increase usability of the method with relative tolerance, as autocompletion would make people aware that the meaning might be different from their expectation.

Absolute tolerance is not bad per se, because when it's specified by the user, the library does not need to guess the scaling of the input.
At the same time, I'd say that relative tolerance isn't automatically a good choice when the parameter is determined by the library - which not only is unaware of the scaling of input data, but also has no clue about what precision is needed in any real application.
Therefore, I fully support the choice of not adding a ≊ operator.

Still, the aspect of conciseness is very desirable for a method that is intended to replace wrong use of the handy ==, so I agree with those who prefer a shorter name; actually, without a default tolerance, I'd go with x.equals(y, tolerance: 0.01).

tomkeith · April 27, 2019, 2:37pm

Thanks Steve, this is a great proposal.

I do have some (possibly naive) questions.

Default relative tolerance.

The default value of tolerance is sqrt(.ulpOfOne); this value
comes from the common numerical analysis wisdom that if you don't
know anything about a computation, you should assume that roughly
half the bits may have been lost to rounding.

This seems to suggest that the amount of precision loss you can expect is related to the width of the mantissa. I would expect a fixed amount of loss (measured in bits) regardless of whether the computation is done in Float versus Double.
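
For concreteness, the default can be evaluated for each type. This is a minimal sketch using only standard-library properties; the variable names are chosen just for this illustration:

```swift
// The proposed default tolerance, evaluated for the two common types.
let floatDefault = Float.ulpOfOne.squareRoot()    // sqrt(2^-23), about 2^-11.5
let doubleDefault = Double.ulpOfOne.squareRoot()  // sqrt(2^-52), exactly 2^-26

print(floatDefault)            // roughly 3.45e-4
print(doubleDefault)           // roughly 1.49e-8

// The binary exponent shows how many leading bits must agree, approximately.
print(floatDefault.exponent)   // -12
print(doubleDefault.exponent)  // -26
```

So the default does scale with the width of the significand: it tolerates disagreement in about half of Float's 23 fraction bits and half of Double's 52, rather than a fixed number of bits for both types.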

“Safe” relative tolerance.

This is generally a
pretty safe choice of tolerance--if two values that agree to half
their bits but are not meaningfully almost equal, the computation
is likely ill-conditioned and should be reformulated.

There are two ways of looking at “safe.” My typical use is assertion checking, where I want to verify that a calculated value is equal to an expected value. So I prefer the tolerance to be on the strict side. Then if the assertion triggers falsely, I can simply loosen it. But if the tolerance is too loose to begin with, the assertion might quietly fail to find an error.

Absolute tolerance.

Wouldn’t absolute tolerance be useful for numbers other than zero? As you say,

If this value is the result of floating-point additions or
subtractions, use a tolerance of .ulpOfOne * n * scale, where
n is the number of terms that were summed and scale is the
magnitude of the largest term in the sum.

It would be nice to have something like

func isAlmostEqual( to other: Self, absoluteTolerance: Self ) -> Bool

rather than having to subtract the value you are checking against so you can use isAlmostZero().
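
A minimal sketch of what such an overload might look like, assuming the signature suggested above. It is hypothetical and not part of the proposal; the handling of non-finite values is one possible choice among several:

```swift
extension FloatingPoint {
    // Hypothetical overload using the signature suggested above; not part of the proposal.
    func isAlmostEqual(to other: Self, absoluteTolerance: Self) -> Bool {
        precondition(absoluteTolerance >= 0, "absoluteTolerance must be non-negative")
        // Assumption: non-finite inputs only match exactly, so an infinity
        // equals itself and NaN equals nothing.
        guard self.isFinite && other.isFinite else { return self == other }
        return (self - other).magnitude <= absoluteTolerance
    }
}

let expected = 100.0
let measured = 100.0000004
print(measured.isAlmostEqual(to: expected, absoluteTolerance: 1e-6))  // true
print(100.1.isAlmostEqual(to: expected, absoluteTolerance: 1e-6))     // false
```

The caller supplies the scale through the tolerance itself, which is why no default value is offered in this sketch.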

Tom

gwendal.roue · April 27, 2019, 3:23pm

I concur. Especially if (a-b).isAlmostZero(absoluteTolerance: ...) is a naive and wrong way to do it (it's easy to grow a small amount of paranoia when dealing with floats).

James_Dempsey · April 27, 2019, 4:41pm

What is your evaluation of the proposal?

+1 overall

Two pieces of feedback:

I prefer isApproximately to isAlmostEqual. I believe it reads better.

I really wish there was a way to eliminate the isAlmostZero case. I believe that a fair number of users will naively use the wrong API, passing in zero, which according to the proposal will almost never give the desired result.

If it remains two methods, I think it would be very useful for there to be a warning if zero is passed into the non-zero-specific method.

Instead of a generalized single hybrid method, might it be possible to special case zero, so the tolerance is an absolute tolerance in just that case?

In both approaches there is a special case for zero:

I see two main users of the API:

A basic user that passes in a value and uses the default tolerance.
An advanced user that needs to understand the difference between relative and absolute tolerances to use the zero and non-zero versions correctly.

With a second zero-specific method, the basic user needs to know both methods exist. The advanced user needs to understand which method uses which kind of tolerance (made clearer by the argument labels).

With a single method with zero special-cased, the basic user can use one method that always gets the default behavior. The advanced user needs to understand the special case of the type of tolerance with zero (not as clear as argument labels).

To me, a single method with zero special-cased makes things less complicated for the basic user of the API and makes the API a little more difficult to use for an advanced user of the API (needing to know about the special case for the kind of tolerance zero uses).

But, I am definitely not an expert in floating point. I mainly worry that users would fall into the trap of using the wrong method when checking for approximate equality for zero.

Is the problem being addressed significant enough to warrant a change to Swift?
Yes

Does this proposal fit well with the feel and direction of Swift?
Yes

If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?
No - but have used languages without this feature where I had to implement this myself, so understand its benefit.

How much effort did you put into your review? A glance, a quick reading, or an in-depth study?
More than a quick reading. Less than an in-depth study.

nnnnnnnn · April 27, 2019, 6:15pm

This is a big improvement over (I think) every language I've ever used — having an API to actually provide the correct comparison is so much better than everyone rolling their own.

I tried this out and it works as advertised:

let x = 1.0 / 10.0       // 0.10000000000000001
let y = 0.5 - 0.4        // 0.099999999999999978
let z = x - y            // 0.000000000000000027755575615628914

x == y                   // false
x.isAlmostEqual(to: y)   // true
z == 0                   // false
z.isAlmostZero()         // true

Like others, I'm concerned about having two separate methods for this purpose, primarily because what seems like the more general method doesn't work when applied to zero:

z.isAlmostEqual(to: 0)   // false

@scanon — would it be possible to add additional checks to isAlmostEqual(to:) so that it falls back to isAlmostZero when one of the inputs is almost zero? Adding these checks makes the check in the code above return true.

if other.isAlmostZero(absoluteTolerance: tolerance) { 
    return self.isAlmostZero(absoluteTolerance: tolerance) 
}
if self.isAlmostZero(absoluteTolerance: tolerance) { 
    return other.isAlmostZero(absoluteTolerance: tolerance) 
}

We could keep isAlmostZero as an optimized version for people who only need to perform that check.
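
To show the fallback in context, here is a self-contained sketch. The free functions almostZero, almostEqual, and almostEqualWithZeroFallback are hypothetical stand-ins written for this illustration, restricted to Double; the proposal's actual implementation handles more edge cases (infinities, subnormals) and may differ in detail:

```swift
// Hypothetical stand-ins for illustration; not the proposal's implementation.
func almostZero(_ x: Double,
                absoluteTolerance: Double = Double.ulpOfOne.squareRoot()) -> Bool {
    return abs(x) < absoluteTolerance
}

// Purely relative comparison: the tolerance is scaled by the larger magnitude.
func almostEqual(_ a: Double, _ b: Double,
                 tolerance: Double = Double.ulpOfOne.squareRoot()) -> Bool {
    let scale = max(abs(a), abs(b))
    return abs(a - b) <= scale * tolerance
}

// The same comparison with the zero fallback checks added in front.
func almostEqualWithZeroFallback(_ a: Double, _ b: Double,
                                 tolerance: Double = Double.ulpOfOne.squareRoot()) -> Bool {
    if almostZero(b, absoluteTolerance: tolerance) {
        return almostZero(a, absoluteTolerance: tolerance)
    }
    if almostZero(a, absoluteTolerance: tolerance) {
        return almostZero(b, absoluteTolerance: tolerance)
    }
    return almostEqual(a, b, tolerance: tolerance)
}

let residue = (1.0 / 10.0) - (0.5 - 0.4)                 // tiny but nonzero
print(almostEqual(residue, 0))                            // false
print(almostEqualWithZeroFallback(residue, 0))            // true
print(almostEqualWithZeroFallback(1e-10, 1e-20))          // true
```

The last line shows a trade-off of this design, at least in this sketch: the same tolerance is reused as an absolute threshold, so any two values that both fall below it compare as almost equal, even when they differ by ten orders of magnitude.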

nnnnnnnn · April 27, 2019, 6:37pm

Another reason to choose "approximately" for the name of these is that it's easy to interpret "almost" in an unfortunate way:

x.isAlmostEqual(to: y) → "x is close to y, but not exactly y"

jcrang · April 28, 2019, 3:13pm

I'm quite a strong -1 on this proposal.

TL;DR I think this proposal adds a function that overstates its correctness and will lead to further confusion. It's consciously imperfect, and such functions should have a very high bar of entry into the standard library.

As the proposal admits, floating point comparisons are difficult and prone to error. I disagree with the proposal’s core idea that "we can define approximate equality functions that are better than what most people will come up with without assistance from the standard library", rather the final solution is "the best we can do with floating point numbers that can vary massively in scale without any additional context from the caller". It turns a non-simple problem into something deceptively simple. Providing it as a standard library function implies correctness and may deter users from understanding its quirks under the assumption they're using correct functions.

I find the handling of 0 a particular concern. The big selling point of this function is that it'll do something reasonable given any two floating point numbers. The caveat being if either operand is 0 it won't. If we've already conceded that we don't know the scale of the operands to this function, there's no reason to expect them to be non-zero. One solution proposed in this thread is to require the programmer to know their inputs and fall back to an absolute tolerance if they know they can be zero. We're now back in the realm of "rolling your own", which is what this proposal seeks to avoid. Python's similar function math.isclose provides an absolute tolerance for near-zero comparisons which, if nothing else, documents this quirk.
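
For comparison, the rule documented for Python's math.isclose is abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol), with defaults rel_tol = 1e-09 and abs_tol = 0.0. A rough Swift transcription of that rule follows; the function name isClose and its argument labels are hypothetical and exist only in this sketch:

```swift
// Hypothetical Swift transcription of the rule documented for Python's math.isclose.
func isClose(_ a: Double, _ b: Double,
             relativeTolerance: Double = 1e-9,
             absoluteTolerance: Double = 0.0) -> Bool {
    if a == b { return true }                          // exact matches, including equal infinities
    if a.isInfinite || b.isInfinite { return false }   // an infinity is close only to itself
    let difference = abs(a - b)
    let relativeBound = relativeTolerance * max(abs(a), abs(b))
    return difference <= max(relativeBound, absoluteTolerance)
}

print(isClose(1.0, 1.0 + 1e-12))                        // true
print(isClose(1e-17, 0.0))                              // false: the default absolute tolerance is 0
print(isClose(1e-17, 0.0, absoluteTolerance: 1e-9))     // true
```

With the default absolute tolerance of zero, comparisons against 0 fail in the same way as a purely relative check; the extra parameter makes the caller opt in to a threshold near zero explicitly.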

Perhaps I'm not in the field where this function really shines. The only time I've wanted to compare floating point numbers for equality is when writing unit tests. Many unit test frameworks will provide functions for this express purpose, often allowing you to choose between relative, absolute, and units in the last place (ULPs). In most cases, unit tests will be comparing against a known, constant, expected value and so a scale-agnostic comparison is not necessary. In my opinion such functions belong quite happily in a unit test framework.

Further, the times I can see this function being useful outside unit testing are slim. Or, at least, times it's better than an alternative are few:

If you know the scale and accuracy of your inputs, use an appropriate absolute tolerance that takes into account that as well as accuracy lost in calculation.
If you're displaying to a user, use a rounding factor that they'd expect (e.g. to one decimal place) and compare.
Prefer using inequalities (>, <) where appropriate.

To me the following snippet is a demonstration of code that works as expected, but would be confusing to new programmers once they're told x.isAlmostEqual is the correct function to use:

[(10.0, 10.1), (1.0, 1.1), (0.0, 0.000001)].map {
    (x, y) in x.isAlmostEqual(to: y, tolerance: 0.01) // [true, false, false]
}

10.0 is almost equal to 10.1. Exactly what I wanted!
1.0 is not almost equal to 1.1; that's odd, it's the same difference as 10.0 to 10.1
0 is not almost equal to 0.000001; that's really odd. They're basically exactly the same!

I think it's fair to be confused in this case, just as it's fair to be confused when apparently equal floating point numbers aren't equal. isAlmostEqual is not the function they were looking for. Instead, they wanted a function that compared within an absolute tolerance, something this proposal singles out as being "wrong more often than it is right". We've just given them yet another function to learn the appropriate use of, and delayed the introduction of how floating point numbers are represented.
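
The arithmetic behind those three results can be spelled out. This sketch assumes the relative comparison checks the difference against the tolerance multiplied by the larger magnitude, which is an assumption about how the prototype scales its tolerance rather than a copy of its code:

```swift
let pairs: [(Double, Double)] = [(10.0, 10.1), (1.0, 1.1), (0.0, 0.000001)]
let tolerance = 0.01

// Relative check: the allowed difference grows and shrinks with the operands.
let relativeResults = pairs.map { pair -> Bool in
    let difference = abs(pair.0 - pair.1)
    let threshold = max(abs(pair.0), abs(pair.1)) * tolerance
    return difference <= threshold
}
print(relativeResults)   // [true, false, false]

// Absolute check: one fixed allowed difference for every pair.
let absoluteResults = pairs.map { pair in abs(pair.0 - pair.1) <= 0.2 }
print(absoluteResults)   // [true, true, true]
```

For the first pair the threshold is about 0.101, for the second about 0.011, and for the third 0.00000001, so the same difference of roughly 0.1 passes at magnitude 10 and fails at magnitude 1.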

Again, perhaps I'm not the intended audience. I certainly feel like I'm in the minority. Could people who are positive about this provide real world examples when they're trying to compare two floating point numbers where this would be better than an alternative above?

It's a real problem, but not one I consider to be solvable with such a simple API.

In spirit, yes. In implementation, no.

I've used similar functions in unit testing frameworks. I've seen that Python includes math.isclose but never used it.

A fair amount. I've read the proposal and this thread. I've tried the prototype implementation.