SE-0259: Approximate Equality for Floating Point

Moximillian · April 30, 2019, 4:30pm

For me, the questions ends up being: given we need relative and absolute tolerance parameters, is it acceptable in service of pragmatism to provide default value to both, which does the right thing in enough cases? Or are there too many cases where errors occur, caused by default values that one or both defaults should not be provided.

In the end it should still be just one function, not two.

scanon · April 30, 2019, 4:35pm

Throwing would be an really heavy-weight approach, and wouldn't even really solve the problem (it just changes the discussion to "under what conditions do you throw?")

Joe_Groff · April 30, 2019, 4:45pm

For prior art as to how to handle zero and infinity, we could look at APL dialects, which generally support relative tolerant comparison by default, and have long been successfully used for numerics-heavy applications. In J, for example:

Comparisons with 0

A [...] consequence is that comparisons with 0 are exact.

Comparisons with Infinity

Tolerant comparison is meaningless with infinite arguments. By definition, x=!.t y if (|x-y) <:!.0 t * x>.&|y . If x and y are both infinite with the same sign (both _ or both __ ), the phrase x-y is a NaN error . Otherwise, if x or y is infinite or if both are infinite but with different signs,
(|x-y) <:!.0 t * x>.&|y _ <:!.0 t * _ _ <:!.0 _ 1
That is, infinity or negative infinity is "tolerantly equal" to every other number except itself.

lorentey · April 30, 2019, 6:35pm

Tino:

Still, the imho best way to resolve the issues this proposals aims to tackle will be possible when generics become more powerful and allow us an easy way to define custom numeric types:
typealias DrivingDistance = CustomDouble<precision: 0.01> // one centimeter
let a = 4000 as DrivingDistance // four kilometers
let b = 4000 + 0.008 as DrivingDistance // eight millimeters more
a == b // true
But that might never happen, so adding some simple methods (I'd still include helpers for absolute tolerance ;-) now will at least help those who are already aware of the problems with float equality.

We can do this today by defining a tiny wrapper struct around a Double. Note though that (however we do it) the resulting type cannot conform to Equatable, because its equality relation isn't transitive. (This also rules out Comparable and Hashable conformances.) We can force Equatable to limp through NaN's isolated reflexivity violation, but a general lack of transitivity would be fatal.

nnnnnnnn · April 30, 2019, 10:46pm

It feels like we're talking past each other — I'm not trying to be clever or design a magic API, just push the hoops that people will need to jump through to use these APIs down into a single method.

For example, let's say I want to add a method to collections of floating-point values, called firstIndex(ofApproximately value: Element). The implementation would necessarily need to look something like this:

extension Collection where Element: FloatingPoint {
    func firstIndex(ofApproximately value: Element) -> Index? {
        if value.isAlmostZero() {
            return firstIndex(where: { $0.isAlmostZero() })
        } else {
            return firstIndex(where: { $0.isAlmostEqual(to: value) })
        }
    }
}

I need to check whether the value that's passed in is approximately zero, since there's no guarantee that the caller knows for sure, and then call the right comparison method. Is there a different or better way to write this extension method?

What I'd like instead is for this check to be inside the comparison method itself:

extension FloatingPoint {
    public func isAlmostEqual(
        to other: Self,
        relativeTolerance tolerance: Self = Self.ulpOfOne.squareRoot(),
        absoluteToleranceNearZero: Self = Self.ulpOfOne.squareRoot()
    ) -> Bool {
        switch (self.isAlmostZero(absoluteTolerance: absoluteToleranceNearZero),
            other.isAlmostZero(absoluteTolerance: absoluteToleranceNearZero)) {
        case (true, true): return true
        case (true, false), (false, true): return false
        case (false, false): break
        }

        // Remainder of method
    }
}

With that, I don't have the complexity in my firstIndex(ofApproximately value: Element), because I get the expected behavior out of the single comparison method. In addition, both kinds of tolerances are there in the method signature, so I "know what I don't know" and can visit the documentation for more information if I need to customize those values. Are there values that would end in a different result for this version of the method, compared to how I'm using the two methods in my Collection extension above?

gian_enrico_conti · May 4, 2019, 6:21am

only a minor note.
when we have to check if we are inside "tolerance", we are again in the same problem: to test we must write something like abs(n1 - n2) < Tolerance. ...

so also for tolerance we should have toleranceForTolerance?

we are back to the same problem., unless we state Tolerance will approximate the comparison.

I personally will stay with jayton Jens Ayton ...

and let programmers (that must have read FP implementation issues in their courses...) find the correct solution for their domain.

MrBee · May 5, 2019, 1:05am

Huge +1 from me. This is a great feature for floating point computation in Swift.

However, as the others have pointed it out, the naming is a bit awkward. I agree it should be renamed to isApproximately(to: tolerance:) or isApproximatelyEqual(to: tolerance:) for more verbose API. And I think isAlmostZero is unnecessary, we could just simply use to: 0, although the internal implementation is different. It's just for ease of use.

The tolerance: parameter should also has default common value, like 0.0000001 (a milionth). So for common use case we simply call number.isApproximately(to: 1.1). Simple and easy, also very beginner friendly.

I also expect this API –if approved and implemented– is applied to == operator for floating point types in Swift by default, using default tolerance: parameter. This will completely makes every Swift code use correct floating point computation. Manual usage only necessary if one need custom parameters value.

Joe_Groff · May 6, 2019, 12:51am

I think there are two operations in this space, like many have noted, but the way I would classify them is “comparing two unknown values” and “comparing an unknown value against a fixed constant”. The former is where you want scaled tolerant comparison, and where the properties Steve noted are important. In a case like yours, searching an array for a specific value, it seems more like the latter—you have a fixed value you’re comparing multiple values against, and you’re more likely to also be able to define an absolute interval of close-enough values you’d accept. Code like yours that conditionally handles the zero case seems like a code smell to me; it would be better to have it take a range, or a value and absolute scale.

scanon · May 6, 2019, 12:57am

Just to follow-up on this, I asked the core team to send this back for revision. I'll put up a new pitch tomorrow that responds to the feedback in this thread in more detail, and has a more uniform handling of zero. (And Ben will post a more official announcement.)

Ben_Cohen · May 6, 2019, 4:31pm

Proposal Returned for Revision

As Steve mentioned, he's asked the core team to return the proposal for revision so he can account for feedback during the review regarding handling of zero.

solak · May 6, 2019, 5:59pm

I've just gotten here late and have 2¢ to add which I hope will be brief and worth your time.

My experience in this area derives from using APL's forward-looking implementation before any other language. APL does fuzzy comparison with a global variable called Comparison Tolerance, which implemented a relative-difference model. The value could be changed locally (dynamic scoping was still OK back then) and often had to be because as discussed above, every context has a different origin for its values and therefore different accuracy of the computation so far. Since it was only a relative tolerance, zero compared equal only to zero. The resulting system was better than not having it, but was still unsatisfactory.

Suggestion: What I have considered since then is the following: Instead of fuzzy comparison with a specified fuzz, have a fuzzy numerical data type. Each value is implemented as a pair: the value and its uncertainty. Doing computation on such values computes the new value from the input values by the conventional means, then computes the uncertainty based on the input values and uncertainties. Then the comparison tolerance is not something the user/programmer has to guess at, but is inherent in the computation that has been done so far. There's no need for relative versus absolute tolerance. If the values plus-or-minus their uncertainties produce areas that overlap, then they might be equal.

I've never built this because it entails writing an entire math library for this new type (analogous to implementing complex numbers based on real numbers), so you may wish to have a salt shaker available while considering my proposal.

scanon · May 6, 2019, 6:09pm

This is interval arithmetic, there's an enormous literature about it that's been in development for 30-or-so years, and it sadly doesn't work in practice for most non-trivial problems that are not set up just so, because without applying algebraic / transcendental transforms to error bounds that are non-computable in general, the error bounds you're maintaining grow exponentially, such that the entire space is "almost equal".

As a super-simple example of why this doesn't work, consider the expression x - x. If we have an error bound of δ on x, we compute that the result is ±2δ. That's nonsense, of course; x - x is exactly zero, regardless of the error bound on x. But if the expression is more complex, we cannot prove that the LHS and RHS are exactly equal algebraically (because real arithmetic is not computable), so we're stuck with the pessimistic bound. These bounds are carried forward by the computation being performed and almost universally grow exponentially in the number of steps in the computation.

solak · May 6, 2019, 6:17pm

Thanks. I should have guessed that someone else would have had the same idea and explored it already.

Tino · May 6, 2019, 6:33pm

Did I hear someone shout „unums!“? ;-)

Nevin · May 7, 2019, 6:28pm

Did I hear someone shout “non-ascii operators”?

…no?

Too bad! Here’s a half-baked idea:

if x ≈ y {
  // Default tolerance
}

if x ≈ y ± 1e-6 {
  // Absolute tolerance
}

if x ≈ y ± 5% {
  // Relative tolerance
}

Essentially, “±” is a range-formation operator, postfix “%” creates a RelativeTolerance<T> instance, and “≈” calls contains.

David_Sweeris · May 7, 2019, 6:48pm

Nevin:

Did I hear someone shout “non-ascii operators”?

…no?

Too bad! Here’s a half-baked idea:
if x ≈ y {
  // Default tolerance
}

if x ≈ y ± 1e-6 {
  // Absolute tolerance
}

if x ≈ y ± 5% {
  // Relative tolerance
}
Essentially, “±” is a range-formation operator, postfix “%” creates a RelativeTolerance<T> instance, and “≈” calls contains.

+1 on including this as syntactic sugar around however it's actually spelled since, tragically (as far as these things go), unicode isn't as "uni" as we'd all like.

Come to think of it, I believe =~ and +/- are valid ascii operators and not otherwise is use. I don't think +- is taken yet, either, or postfix %

SDGGiesbrecht · May 7, 2019, 8:41pm

Not as half‐baked as you think; I have been using that for over two years: ≈, ±

cukr · May 7, 2019, 8:55pm

Good thing about ascii operators is that you don't have to copy-paste them on most keyboard layouts. For example I cannot type either of them on my mac or windows, and they have very unintuitive shortcuts on my ubuntu (alt+shift+7 and 9)
On mac you can use ctrl+cmd+space thingie, and you can set up compose key on linux(not enabled by default on ubuntu), but I don't think there's a standard way of typing non-ascii on windows, unfortunately.

Non-ascii operators would be awesome if I could type them :( Maybe we should think of some universal solution for typing non-ascii operators?

fclout · May 7, 2019, 11:09pm

I know that you're not entirely serious, but a problem with the relative tolerance expressed as a percentage is that the recommended default, the square root of the ULP of one, falls way outside the range of values that you reasonably want to express as a percentage. You'd be looking at something like 0.00000149%.

David_Sweeris · May 7, 2019, 11:42pm

That's a function of the OS and is way outside the scope of this project. Or any project, really, aside from macOS, Windows, and Linux themselves.

Additionally, since Swift is a multi-platform project, we'd still have to wait until unicode characters are easy to type on all the platforms we support. This probably means waiting for unicode support in USB HID class drivers (the USB Consortium has been asked by at least one keyboard firmware dev... the response was, um, "not encouraging", IIRC).