Generic "math functions"

In order to better support writing numerical algorithms, we should provide bindings for <math.h>-style elementary functions that are generic over scalar types (including allowance for future complex types) and SIMD vectors.

Motivation

BinaryFloatingPoint (and the protocols it refines) provides a powerful set of abstractions for writing numerical code, but it does not include the transcendental operations defined by the C math library, which are instead imported by the platform overlay as a set of overloaded concrete free functions. This makes it impossible to write generic code that uses, say, exp or sin without defining your own trampolines or converting everything to and from Double, neither of which is very satisfying.

For example, suppose that we want to write a sigmoid function that works for any floating-point type. We "should" be able to write something like the following:

func sigmoid<T>(_ x: T) -> T where T: FloatingPoint {
  return 1/(1 + exp(-x))
}

but that isn't valid, because the FloatingPoint protocol only binds IEEE 754 operations, which exp is not. Instead we can currently do something like:

  return 1/(1 + T(exp(-Double(x))))

but that's messy, inefficient if T is less precise than Double, and inaccurate if T is more precise than Double. We can and should do better in Swift.
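To make the "trampoline" workaround concrete, here is a rough sketch of what it tends to look like today. The protocol name `ExpProviding` is made up for illustration and is not part of the proposal; the conformances just forward to the concrete `exp` overloads that `import Foundation` is assumed to make visible.

```swift
import Foundation  // assumed to re-export the platform's C math functions

// Hypothetical trampoline protocol; the name is illustrative only.
protocol ExpProviding: FloatingPoint {
    static func exp(_ x: Self) -> Self
}

// Each conformance forwards to the concrete overload for that type.
// The module qualifier avoids recursing into the static method itself.
extension Float: ExpProviding {
    static func exp(_ x: Float) -> Float { Foundation.exp(x) }
}
extension Double: ExpProviding {
    static func exp(_ x: Double) -> Double { Foundation.exp(x) }
}

// Generic sigmoid: no round trip through Double is needed.
func sigmoid<T: ExpProviding>(_ x: T) -> T {
    return 1 / (1 + T.exp(-x))
}

let atZeroDouble = sigmoid(0.0 as Double)
let atZeroFloat = sigmoid(0 as Float)
print(atZeroDouble, atZeroFloat)
```

This works, but every project ends up writing its own version of the protocol, which is the duplication a standard set of bindings would remove.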

This issue has come up a few times in past discussion, and was recently mentioned in a fast.ai article about using Swift for numerical computing. I finally have a little bit of breathing room to tackle it, so I would like to push on getting this functionality into Swift 5.1. A link to the draft proposal can be found below, and I'll publish a full draft implementation in the next couple of days.

33 Likes

This sounds great, and is definitely something I’ve been waiting for!

I think you made the right choices regarding alternatives considered, with the obvious caveat that the name Mathable deserves bikeshedding.

As I’m sure you’re aware, @xwu’s Numeric Annex calls the equivalent protocol “Math”, and also contains a refined protocol “Real”.

Bikeshedding `Mathable`

I might suggest one of the following:

Math
MathProtocol
MathFunctions
ElementaryFunctions

• Math is appealing for its brevity, but may not be specific enough. And as a noun it tends to imply that a conforming type “is a math”, which doesn’t make sense. (Compare how Array, e.g., “is a collection”.)

• MathProtocol avoids the “is a” problem, at the price of adding the somewhat-awkward “protocol” suffix. The standard library does use that occasionally, so it’s at least plausible.

• MathFunctions would (I believe) be the first time a standard-library protocol was plural, which might seem strange. It is intended to imply “has math functions”.

• ElementaryFunctions is great for mathematical precision, but probably isn’t as clear to non-mathematicians. It is also rather verbose, and has the same plural-noun problem as above.

At present, either MathProtocol or MathFunctions would be at the top of my list.

2 Likes

This is a hugely welcome addition, because I've also found myself writing code very similar to this to get myself generic FP algorithms.

One thing that I really love about this proposal that hadn't occurred to me is the use of the associated type to avoid having the implementations clutter the type namespace itself. I had been adding the static functions directly to the types and then using the free functions to delegate to them, but this is much more elegant.
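For anyone who hasn't seen the pattern, this is roughly how I picture the associated-type namespace. Treat it as a sketch only: `MathImplementations`, `Value`, and `genericExp` are names I invented here, and the real proposal may spell all of this differently.

```swift
import Foundation  // assumed to re-export the platform's C math functions

// Hypothetical requirements for the nested "namespace" type.
protocol MathImplementations {
    associatedtype Value
    static func exp(_ x: Value) -> Value
    static func cos(_ x: Value) -> Value
}

// The scalar type only gains one nested type, not a pile of static methods.
protocol Mathable {
    associatedtype Math: MathImplementations where Math.Value == Self
}

extension Double: Mathable {
    enum Math: MathImplementations {
        typealias Value = Double
        static func exp(_ x: Double) -> Double { Foundation.exp(x) }
        static func cos(_ x: Double) -> Double { Foundation.cos(x) }
    }
}

// A free function (given a distinct name here to stay clear of the C import)
// that delegates to the namespaced implementation.
func genericExp<T: Mathable>(_ x: T) -> T {
    return T.Math.exp(x)
}

print(genericExp(1.0), Double.Math.cos(0))
```

Autocomplete on `Double.` then shows a single `Math` entry instead of every elementary function.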

Regarding naming, what about MathCapable? That keeps the -able naming convention of the protocol but might roll off the tongue a little better.

2 Likes

This is delightful; I decided to look into possible designs for such a protocol with NumericAnnex but am not wedded to Math and Real.

The bikeshedding exercise can and will surely take place later, but I'll throw Mathematical into the ring for the protocol.


The bigger design point where I do have a specific opinion is with respect to the idea of the namespaced members under Math. While I can appreciate that square root is an IEEE required operation and these others are not, and while certainly I agree with the idea that adding a whole slew of global functions is suboptimal, this is (IMO) ultimately a poor justification for why a member function for computing the cosine should be in a different namespace from a member function for computing the square root.

I think we'd do well to preserve the design where member functions are spelled out (a la squareRoot) and found on their corresponding floating-point type, with global functions that can be opted into with import Math using the shorter customary names (a la sqrt). I do not think that users will find it 'polluting' to have access to a member named cosine on a floating-point type.

I have tried out such a variant of such a design in NumericAnnex and find it to be pretty usable.
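A stripped-down sketch of that shape, for illustration: the protocol name `TrigonometricValue` is hypothetical (it is not what NumericAnnex calls it), and the generic `cos` below stands in for what a separate, opt-in module might export.

```swift
import Foundation  // assumed to re-export the platform's C math functions

// Hypothetical protocol: the spelled-out member lives on the type itself,
// next to the existing squareRoot().
protocol TrigonometricValue {
    func cosine() -> Self
}

extension Double: TrigonometricValue {
    func cosine() -> Double { Foundation.cos(self) }
}

// Stand-in for the short customary name that a separate module could provide.
func cos<T: TrigonometricValue>(_ x: T) -> T {
    return x.cosine()
}

// Generic code can use either spelling.
func cosOfAny<T: TrigonometricValue>(_ x: T) -> T {
    return cos(x)
}

print((0.0 as Double).cosine(), cosOfAny(0.0 as Double))
```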


Finally--although this may exceed what we can feasibly do for Swift at the moment--my hesitancy for putting forward NumericAnnex for consideration as part of such a proposal has been that it--and this proposal--shims the C library. However, across platforms, there is substantial variability in the precision of these functions. Even after relaxing the expected precision on tests, I have sometimes been surprised that a test passing on macOS will fail on Linux unless I relax the precision still more. Whatever we can do to make a guarantee about the minimum precision in ulps guaranteed by Swift, and where necessary re-implementing instead of shimming to provide that precision, would be pretty great.

11 Likes

+1 on the idea of adding these. I'm with @xwu on the naming issues, though; I think member functions is the right way to go, with the global functions under a separate import. I'd be less strong on that if it weren't for squareRoot() already existing.

I'd also be okay with the member functions only existing (i.e. being provided by an extension) if the MathFunctions (or whatever it's called) module is imported, but I do think they should be member functions.

It would also be possible to make sqrt part of the proposal, with a conditional default implementation that calls squareRoot.
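Something along these lines, as a sketch (the protocol name is a placeholder, and only the one requirement is shown):

```swift
// Hypothetical protocol; only the sqrt requirement is shown.
protocol MathFunctions {
    static func sqrt(_ x: Self) -> Self
}

// Conditional default: any conforming FloatingPoint type gets sqrt for free
// by forwarding to the existing IEEE 754 squareRoot().
extension MathFunctions where Self: FloatingPoint {
    static func sqrt(_ x: Self) -> Self {
        return x.squareRoot()
    }
}

// Conformances need no body because the default applies.
extension Float: MathFunctions {}
extension Double: MathFunctions {}

print(Double.sqrt(9), Float.sqrt(2))
```

A non-IEEE type (a complex number, say) would still have to supply its own implementation, since the default only applies under the FloatingPoint constraint.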

1 Like

+1

I have been using something similar since Swift 1, under the names [SomeLevel]Arithmetic. Most of this falls under RealArithmetic, though division is RationalArithmetic and exponentiation is WholeArithmetic. The standard library independently adopted AdditiveArithmetic, which I had been using for years in the exact same form. Having a hierarchy like this is useful when it comes to conforming outside types. Exponents are meaningful on an infinite-precision integer, but trigonometry is not. Division works on an infinite-precision fraction, but logarithms do not.


As for the names taken directly from C’s math.h:

I like the names that are straight from mathematical notation, like sin(θ) and log(x). But the ones that are not look categorically out of place to me: waterPistol.sqrt(), comicBookFist.pow(). Maybe I just do not like things that have the odour of a workaround for a technical limitation, but the API Design Guidelines seem to be saying more-or-less the same thing. I would prefer they actually receive Swift names like x.to(power: y) or x.raised(to: y).

6 Likes

Replace Mathable with Mathematical and you have my axe :wink:

7 Likes

Thanks for the detailed feedback, and thanks also for exploring some of the design space for us =)

If we were designing the FloatingPoint protocols today, I might group many of the other methods and properties in a similar fashion: x.IEEE754.significand, etc.; I think it's a nice way to bind together related functionality in a way that autocomplete makes easier to discover. We are not going to change to that style now, of course, but I don't think the weight of precedent is so large that we can't adopt it for future additions where it makes sense.

Square root, admittedly, is a little bit of an edge case--the IEEE 754 standard and modern CPUs view it as a basic operation, which is qualitatively different from most other math functions, but historically it's been presented in most programming languages as just another function, more similar to cos( ) than to +. Ultimately, I don't think that this is a big deal either way, and we could have both names without any real fuss.

We have the existing name for squareRoot, but more generally I don't see a lot of value in having two names for every operation. sine and cosine aren't too bad, but it starts to get pretty silly: inverseHyperbolicTangent and logOfAbsoluteValueOfGamma? How do those add value compared to using the terse but standard names atanh and lgamma, which are cryptic to a novice but trivially searchable if someone wants to learn about them?

In the short term, this is something that we simply have to deal with. It's annoying, but not a dealbreaker (in particular, no worse than the status quo). I view it as a bug to be fixed, but one of significantly lower priority than getting better library bindings for the existing functionality.

3 Likes

Yes, this is a perfectly reasonable workaround. I dislike having multiple names for a single thing, but this may be the least painful solution.

1 Like

The API guidelines also say "don’t optimize terms for the total beginner at the expense of conformance to existing culture." These functions have the same names in nearly every single programming language. That doesn't mean that we have to call them those names, but the burden for doing something else is large. In particular, for a non-English speaker, it will be much, much easier to find documentation (possibly for some other programming language) for pow(x, y) in their native language than it will be for x.raised(to: y).

5 Likes

For operations like sin(_:) and cos(_:), there's no question to me that it's best to stick with the existing terms of art. That said, for two argument functions such as pow, I think it's doing a disservice to users not to use Swift's features (like parameter labels, perhaps) to clarify which of the arguments are which. Even something like pow(x, exponent: y) would be more clear, and still pretty familiar.
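To show what I mean, here is a minimal sketch of a labeled wrapper. The `exponent:` label is just the one floated above, not anything from the proposal, and the wrapper only forwards to the existing C function.

```swift
import Foundation  // assumed to re-export the platform's C math functions

// Hypothetical labeled spelling; it forwards to the unlabeled C pow(base, exponent).
func pow(_ base: Double, exponent: Double) -> Double {
    return Foundation.pow(base, exponent)
}

// At the call site there is no doubt which argument is the exponent.
let eight = pow(2, exponent: 3)
let nine = pow(3, exponent: 2)
print(eight, nine)
```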

Is there an implementation yet, or a full list of the added declarations? I don't recognize several of the functions listed in the proposal.

11 Likes

Which functions do you have questions about?

pow was right on the borderline for me, but argument labels would definitely help with the atan2 problem mentioned in the proposal.

This list:

cos, sin, tan, acos, asin, atan, atan2, exp, exp2, exp10, expm1, log, log2, log10, logp1, pow, root, cosh, sinh, tanh, acosh, asinh, atanh, erf, erfc, gamma, lgamma.

Right. Which ones would you like clarified?

And I agree with those words. I guess where you end up with them depends on your interpretation of “existing culture” and “term of art”. To me, the prestigious Oxford English Dictionary is existing culture, but the ASCII byte table is a reduction of that existing culture for the sake of a primitive machine. A term of art is something artfully and thoughtfully designed to represent something in the best possible way. √x and x² are the real terms of art here. When they are unfeasible to the machine, something in line with the surrounding language gets chosen. sqrt fits into C right alongside strcat and strcmp, but it looks out of place among append(contentsOf:) and lexicographicallyPrecedes(_:).

I speak several languages so I tried it.

The Wikipedia article for Potenz (German for exponent) lists alternate notations to try when searching for the function in an arbitrary programming language. It actually suggests ↑ first. ^, ^^, ** are then followed by ⋅, another non-ASCII operator, and then the list ends with expt. pow does not even appear in the list proper, only in a note afterward.

Under Δύναμη (Greek for exponent), pow is in the list, but only after ↑, ^, ^^, **, ⋅ and Power.

Under חזקה (Hebrew for exponent), only ^ and ** are mentioned.

Exponentiation in French offers no alternative to the traditional superscript notation.

Given the native word and fishing for the English equivalent to use as the search term instead, Wiktionary helps German, French and Greek speakers find the full word “power”. (Hebrew speakers get no assistance from it at all.) The word itself, being absent from the function name, is also absent from any HTML heading or navigation elements on any C documentation site, so they fall way down the search results list after suggestions for other programming languages using variants of Power() (not to mention a lot of extremely irrelevant hits...)

Going in reverse (seeing pow already used and wondering what it means) and using the Google search strings "pow deutsch", "pow français", "pow ελληνικά", "pow עברית" produces pages about prisoners of war, pow-wows and Super Mario’s P.O.W. block.

Which do you think you would figure out faster by searching? x.élevé(àPuissance: y) or pui(x, y)? Four of the first five hits already give the right answer when the first option is transformed into a search string. The second is so short it could be an acronym for almost anything, and the search provides a wide array of completely unrelated things; no two hits have to do with the same thing (though the correct answer admittedly does not exist yet to be found among them in the first place).

I think x.raised(toPower: y), x.raised(toPowerOf: y) or x.toPower(of: y) seems like a much more self-documenting and discoverable name. (But you are right that raised(to:) is probably vague.) There is vastly more information about the general math concept out there in various languages than there is about the specific programming contexts, and none of the general materials help you figure out pow.

11 Likes

I’m going to assume you know most of them, and you’re probably talking about some or all of these:

expm1, logp1, gamma, lgamma, erf, erfc

The last two (erf and erfc) are the error function and its complement.

The first two exist for numerical-precision reasons. When x ≈ 0 we have exp(x) ≈ 1, thus expm1 is defined as (exp(x) - 1) and implemented so the small bits are not lost on floating-point types. Similarly, to go the other way, we have logp1(x) := log(x+1).
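A quick way to see the difference for yourself, using the existing C functions (note that C spells the second one `log1p`, whereas the list above uses `logp1`). The variable names are mine, and the exact digits printed will depend on your platform's math library.

```swift
import Foundation  // assumed to re-export the platform's C math functions

let x = 1e-10

// Naive forms: the intermediate value is close to 1, so it can only keep
// a handful of x's significant digits before the final step.
let naiveExp = exp(x) - 1
let naiveLog = log(1 + x)

// Dedicated functions compute the same quantities without that intermediate.
let accurateExp = expm1(x)
let accurateLog = log1p(x)

print("exp(x) - 1 :", naiveExp, " expm1(x):", accurateExp)
print("log(1 + x) :", naiveLog, " log1p(x):", accurateLog)
```

For an x this small both true results differ from x by only about x²/2, so comparing each output against x shows which digits survived.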

The gamma function is basically the factorial shifted by one and extended to as much of the complex plane as possible, and lgamma is its natural logarithm.
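The "shifted by one" part is easy to check with the existing C binding, which is spelled `tgamma`. The `factorial` helper below is my own throwaway function for comparison, not part of any library.

```swift
import Foundation  // assumed to re-export the platform's C math functions

// Throwaway reference implementation of n! for small non-negative n.
func factorial(_ n: Int) -> Double {
    return (1...max(n, 1)).reduce(1.0) { $0 * Double($1) }
}

// For non-negative integers n, gamma(n + 1) should agree with n!
// up to rounding in the library's implementation.
for n in 0...5 {
    print(n, factorial(n), tgamma(Double(n) + 1))
}
```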

4 Likes

To be clear, I'm mainly talking about comprehension, so the relevant search is "I see pow, what does that do?" People who are using pow generally already know what it does.

"pow funktion" and "pow fonction" both turn up immediately clear results (significantly better than "raised to") on German and French google searches. My Greek and Hebrew isn't good enough to reach any conclusions.

Overall this sounds great to me.

  • I think adding argument labels to atan2(y:x:) would be valuable. This is a place where we can improve over the C function without affecting name recognition or ease of conversion from C code (since the arguments can stay in the same order and we should produce a fixit to add labels). I prefer not to have argument labels for pow, since I personally find that one obvious, but it wouldn't be a problem for me either way.

  • I don't find the Math namespace under the floating point types compelling. I would have expected either x.sin() or Double.sin(x), not Double.Math.sin(x). Not a big deal either way.
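On the first point, a sketch of the labeled spelling. This is a hypothetical wrapper for illustration; it keeps C's argument order and forwards to the existing function.

```swift
import Foundation  // assumed to re-export the platform's C math functions

// Hypothetical labeled spelling; same argument order as C's atan2(y, x).
func atan2(y: Double, x: Double) -> Double {
    return Foundation.atan2(y, x)
}

// Angle of the point (x: 0, y: 1), measured from the positive x-axis.
let quarterTurn = atan2(y: 1, x: 0)
print(quarterTurn, Double.pi / 2)
```

Because the order is unchanged, converting C code only requires adding the labels, which is what a fixit could do automatically.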

6 Likes