[Pitch] Comprehensive Number

Happy Pi Day!

In celebration of Pi Day, I'm thrilled to share with you all a mathematical delight—a Pi Calculator written in the Swift language. This nifty tool not only calculates Pi to your desired number of digits but also showcases the power and elegance of Swift for complex numerical computations.

Number does just that! Focusing on ease of use, API clarity, and Swift's progressive disclosure, Number's simple set of structs and enums models the various kinds of numeric concepts, from the everyday to the scientific. Here's just a slice of the whole proposal:


Number

Introduction

Numeric types are messy. Swift has a multitude of Numeric types: UInt8, UInt16, UInt32, UInt64, Int8, Int16, Int32, Int64, Float, Float16, Double, as well as types that take on platform-specific meaning (UInt, Int) or may not even exist on some platforms or architectures, such as Float80. All of this before we even look outside the standard library to types such as CGFloat, Decimal, or NSNumber. These types are not interoperable and often require explicit conversions or coercions to perform arithmetic operations or to pass them as arguments to functions.
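
For example, in today's Swift even trivial mixed-type arithmetic requires a manual conversion:

let count: Int = 3
let scale: Double = 1.5
// let total = count * scale    // error: binary operator '*' cannot be applied
//                              // to operands of type 'Int' and 'Double'
let total = Double(count) * scale   // explicit conversion required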

This proposal introduces a comprehensive Number type with String-like simplicity for numeric values, dramatically simplifying the use of numbers and enabling advanced scientific computation.

[...]

Proposed solution

Number itself is a String-like type, built around a simple enumeration responsible for marshalling work to a series of specialized components.

enum Number {
    case real(RealNumber)
    case imaginary(ImaginaryNumber)
    case complex(ComplexNumber)
}

String does this with Character and StaticString. String does not care what the underlying Characters are, whether Latin script, Arabic, Kanji, or Emoji; one can quickly and easily compose a String containing all of them. Number embraces this paradigm: whether actually a NaturalNumber, IrrationalNumber, or ComplexNumber, Number's operations return a Number containing the exact result, every time.
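
For instance, a hypothetical use (assuming Number can be initialized from an integer and that dividing integers yields an exact rational, per the exactness guarantee above):

let third = Number(1) / Number(3)   // an exact rational, not 0.3333…
let one = third + third + third     // exactly 1
print(one == Number(1))             // true: no floating-point drift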

[...]


You can test Number today by downloading it from GitHub, or by adding its package with:

.package(url: "https://github.com/hexleytheplatypus/swift-se0000-number.git", from: "1.0.1"),

And adding Number to your dependencies:

.product(name: "SE0000_Number", package: "swift-se0000-number"),

Then importing it:

import SE0000_Number
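
Put together, a minimal Package.swift might look like this (the PiDemo package and target names are just placeholders):

// swift-tools-version:5.9
import PackageDescription

let package = Package(
    name: "PiDemo",
    dependencies: [
        .package(url: "https://github.com/hexleytheplatypus/swift-se0000-number.git", from: "1.0.1"),
    ],
    targets: [
        .executableTarget(
            name: "PiDemo",
            dependencies: [
                .product(name: "SE0000_Number", package: "swift-se0000-number"),
            ]
        ),
    ]
)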

Check out the computation of the pi approximation in the test file.

I believe that by enhancing Swift's numerical capabilities, we can unlock new potentials and applications. We can make Swift the go-to language for not only app development but also for scientific and mathematical computing.

Happy Pi Day once again! Don't forget to check out the whole proposal, comment and ask questions here or with a direct message.

4 Likes

It's an interesting idea, although a big departure from convention, so it might take some getting used to.

Replacing incorrect uses of Int

I certainly do like the idea of fixing various APIs that use signed types when in fact negative values are absolutely invalid. And this Number type seems like it moots all the arguments for using signed types for values that are not in fact signed. So big positive energy there, as Casey Liss would say.

However, I'm not sure how plausible it is to actually change all those APIs, irrespective of how good the replacement is. Especially since many are in Foundation (distributed separately from the compiler & stdlib, and with the requirement to be compatible across even major compiler version bumps). I believe changing Int to UInt for collection indices and counts has already been unilaterally rejected by the Swift team, repeatedly.

"Policy" aside, from a technical perspective the replacement WholeNumber (or whatever) would need to be at least as fast and efficient (size-wise) as Int, at least in release builds, otherwise I think this particular application is untenable. Bounds checking is already a performance limiter for some applications, and counter-proposals like "don't use standard collections for performance sensitive code" or "use special APIs to bypass bounds checks" go against the current grain and Swift's larger philosophy of 'safety'.

Performance

It's the Swift team's policy that proposals generally don't have to address runtime performance, and may ignore known performance problems (on the presumption they can be fixed as implementation details). I don't think that's wise in general (see also: macros), but especially in this case reasonable performance needs to be assured. A change as pervasive as redoing essentially the entire numerics system of the language needs to be certain of good performance, and to have that good performance from the outset (Swift is no longer a seedling language; noticeable regression of standard library performance is untenable).

As such, it'd be good to see benchmark results in the proposal, and performance examined more broadly.

Of course, if the proposal reduces its ambition somewhat (re. replacing Int etc) then performance expectations can be appropriately reduced. I would certainly hate to see something like this rejected from consideration just because it's not quite as fast as the primitive numeric types, because it obviously has great value otherwise and might well be worth the trade-off in other use-cases.

Vectorisation

I'd say this particular aspect of performance and implementation can be largely deferred, because it can't possibly be pivotal to acceptable performance. Most numbers in most programs are small, standalone scalars and don't benefit from SIMD except by loop unrolling - only that aspect (ensuring this doesn't interfere with or preclude existing compiler optimisations) need be addressed.

It also seems pretty straightforward to add SIMD enhancements to the underlying arbitrary precision number types, for [the relatively rare] cases where they are large enough to benefit (65 bits or larger, I'd guess). Therefore I'm not worried about that particular aspect being punted to a later time.

Async

Re. the proposal's vague mention of async, it's not clear to me how that could plausibly improve performance. Is it merely an attempt to make it easier to do numeric calculations in parallel by [ab]using async let? Because otherwise I wouldn't think numeric calculations contain any suspension points internally, so nominally have no need nor benefit from being async. And being async introduces significant overheads to basic execution.

Full safety (re. "Throwing Operations")

Re. cases like division by zero, it seems like the proposal as it stands already eliminates enough categories of errors (overflow & underflow etc) as to make throwing operations viable for the remainder. Swift has previously (repeatedly) rejected the notion of having basic arithmetic operators throw, on the premise that it would make them too inconvenient to work with. But if throws only applied to a limited subset of operators, like division, then perhaps folks will find it more palatable (even without further assistance from the compiler, such as implicit conversion to non-throwing for known-good inputs).
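
To illustrate those ergonomics with a minimal toy (a hand-rolled Fraction, not Number's actual implementation), only division gets marked throws:

struct Fraction {
    var num, den: Int

    enum Error: Swift.Error { case divisionByZero }

    // Addition always has a representable result, so it never throws.
    static func + (l: Fraction, r: Fraction) -> Fraction {
        Fraction(num: l.num * r.den + r.num * l.den, den: l.den * r.den)
    }

    // Division is the one partial operation, so only it throws.
    static func / (l: Fraction, r: Fraction) throws -> Fraction {
        guard r.num != 0 else { throw Error.divisionByZero }
        return Fraction(num: l.num * r.den, den: l.den * r.num)
    }
}

let half = Fraction(num: 1, den: 2)
let zero = Fraction(num: 0, den: 1)
let sum = half + half             // no try needed
let quotient = try? half / zero   // nil instead of a trap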

And being able to say that Swift Numbers are completely safe (i.e. won't randomly crash your whole app!) would be a huge boon both for the language (PR-wise) and for all its users.

Protocol conformances

The proposal makes no mention of things like Codable. Those should be covered, at least insofar as there should be a clear roadmap even if the implementation details are omitted.

Precision Tracking

Adding 'precision tracking' seems like more trouble than it's worth, at least on first blush. I think it has to be taken as a given, and understood by all programmers, that the language is only as good as its inputs. e.g. if the user provides pre-rounded input (let alone outright wrong input), that can't possibly be the responsibility of the basic numeric types. I can see how a facility to track semantic precision could be helpful to end users, but I worry that it'd spill that complexity into use-cases that don't have a need for it. It might be better to leave that kind of metadata management to a higher layer, as it is today.

That said, it might just be that this is so far from typical programming language convention that I'm being a bit of a luddite about it. I do welcome more discussion on this topic, particularly from anyone that has actually had to deal with / implement this sort of thing in real-world code (I have but like twenty years ago, so it's all too faded from memory).

And if it did track that, presumably it would need more nuance than a boolean ("precise or not"). Precision varies differently depending on the arithmetic operations applied (re. additive vs compounding errors, etc).
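
A plain Double illustration of the difference (standard floating-point behaviour, nothing specific to Number):

// Additive: a single rounding step, error stays around one ulp.
print(0.1 + 0.2)           // 0.30000000000000004

// Compounding: every multiplication rounds again, so relative error
// grows with the number of operations.
var product = 1.0
for _ in 0..<60 { product *= 0.1 }
print(product)             // drifts from the correctly rounded 1e-60
print(1e-60)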

Technically irrelevant

Kudos for using Monodraw (or at least, that's my intuition based on the style). Monodraw is one of the most delightful programs I've ever used. I can't believe someone actually made something so niche that's so bloody well done.

4 Likes

Thanks for covering so much! I've got a bit for you here, so I'll go point by point.

But first things first, absolutely Monodraw! It is a wonderful app and I highly recommend it!

  • On Future Direction - Precision Tracking:
    You're correct; it's not so much intended to prevent wrong inputs as to let you know when you're dealing with a number that was rounded at a specific place. (Not rounded; Double always to 16 places; Pi to the requested 42 digits, etc.) This has really occupied my mind this morning, and I think I have a good concept to implement it quickly, so I might just do it this weekend and test whether it's truly a useful future direction. (See the sketch after this list.)

  • On protocol conformances: noted, I should be more explicit with that list. I'll note for future readers here (and I'll add this into the proposal) that all the types in the Number hierarchy are Comparable, Codable, Hashable, and Sendable, with the Number struct itself also being Numeric. That should cover the majority. (See the document again later today for a more elaborate listing.)

  • On Future Direction - Throwing Operations:
    Division is the only one I have a current suggestion for, as all of the other operations yield a result. Although, I do think it might have a place for some of the more specific uses. .factorial() could exist on Number proper if it threw or returned nil for values which are not a NaturalNumber or WholeNumber -- but that's a deep rabbit hole, hence it's a possible future direction with no plans at the moment.

  • On Future Direction - Async Operations:
    There's some potential benefit in async let, but that's likely defeated by the overhead cost, as you point out. The value in async operations is actually in allowing a long-running scientific calculation not to block. For example, .pi(1_000_000_000) would block for so long that it's impractical, and splitting the work among CPUs may make it faster. A more usual example is an app where a computation takes 2.5 seconds before the answer arrives; this way the computation never blocks the main thread and the app can continue to take input.

  • On Future Direction - SIMD/Vectorization:
    You're 100% spot-on here again! I need to make clear in the proposal this is only a benefit for supremely large numbers that exceed particular sizes.

  • On Performance:
    I do have some bare-bones benchmarking coming along; I'll make sure to add references/links to it in the proposal. In a naive, entirely unoptimized build with no inlining, things are looking decent so far:

    [benchmark chart]

  • Compile-Time Performance Sidebar:
    @philippe_hausler pointed out that during SE-0329, adding new operators inherently impacted compile time, especially for codebases the size of the compatibility suite. This will have to be tested against the compatibility test suite; Number adds at least 10 new "+" operators for its types, and likely more for interoperability. I have suggested that some interoperability could be sacrificed if an issue arises, but testing the reality of this is the first step.

  • On Replacing Int:
    This is HUGE and a very difficult task. UInt has always better represented the intent and limitations of Array<Element>.count, but I've understood it to be mostly a two-pronged issue. First, when the language was created, the question was "What is the basic number type new programmers will use?" Int became the answer, and it's a decent choice because of the second prong: "Coercions are potentially unsafe". As you rightly note, WholeNumber to Number is ALWAYS safe. In fact, every operation of a specific number type performed with another is safe, because the result is always a Number, and the places where you'd request a specific type back from a Number are always failable initializers. Safety is a core tenet of this proposal, so much so that none of the code is unsafe (a few fatalErrors still exist while the pitch occurs, but they'll be removed before/during review).
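
On the precision-tracking bullet above, one possible shape for that metadata (entirely speculative; none of these names are in the proposal):

// Sketch: record where, if anywhere, a value was rounded.
enum Precision: Hashable {
    case exact                    // produced only by exact operations
    case rounded(digits: Int)     // e.g. a converted Double ≈ 16, pi(42) = 42
}

// Combining values would propagate the weakest precision. (min is a
// simplification; real error propagation is operation-dependent.)
func combine(_ a: Precision, _ b: Precision) -> Precision {
    switch (a, b) {
    case (.exact, let other), (let other, .exact):
        return other
    case let (.rounded(x), .rounded(y)):
        return .rounded(digits: min(x, y))
    }
}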

Thanks for the support! Please feel free to delve even deeper or ask any questions, it's been a lot of work and I'm really excited to talk about Number.

1 Like

Something is very wrong with that graph; computers are fast, but they cannot do anything at a rate of 1ps per element. I assume that the input size isn't actually scaling and so we're seeing a constant time divided by ever larger values or something similar.

4 Likes

If WholeNumber (or other specific subtypes) were explicitly used, such as for collection indices or bounds, then wouldn't you also have to consider underflow (results that are negative)?

It sounds like the intent is to require explicit casts (or equivalent) from Number (the universal result type for arithmetic) to the specific subtype expected. This seems to force an uneasy trade-off between different use-cases. e.g.

var a: Number
var b: WholeNumber

…

a = x - y // Never throws, seems convenient.

// …but:
b = x - y // ❌ No implicit conversion to WholeNumber, so…

b = try .init(x - y) // Hmmm…
b = try x - y // Much nicer than having to type-cast, but
              // now we can't have `a = x - y`…?

I suppose there could be overloads for the relevant operators based on the return type, although that introduces its own ergonomic issues by requiring all use sites to choose the desired return type.

Maybe the throws vs not distinction might help here? If you write try x - y it would presumably only match overloads which are throws… although if there's still more than one such overload, it doesn't really solve the ergonomics issues.
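
For what it's worth, Swift can already disambiguate overloads like that from the annotated result type; a toy sketch (these stand-in types and bodies are mine, not the proposal's):

struct Number { var v: Int }
struct WholeNumber { var v: UInt }

enum ArithmeticError: Error { case negativeResult }

func - (lhs: Number, rhs: Number) -> Number {
    Number(v: lhs.v - rhs.v)
}

func - (lhs: Number, rhs: Number) throws -> WholeNumber {
    guard lhs.v >= rhs.v else { throw ArithmeticError.negativeResult }
    return WholeNumber(v: UInt(lhs.v - rhs.v))
}

let x = Number(v: 3), y = Number(v: 5)
let a: Number = x - y              // resolves to the non-throwing overload
let b: WholeNumber? = try? x - y   // resolves to the throwing overload (nil here)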

Re. use of init for conversions, we'd also have to consider ergonomics with optional chaining - Swift has a lot of use of init to convert between types, which is frankly really annoying and results in lots of individual reinvention of asOtherType computed property extensions. It feels gross to admit it but this is an area where Java is superior to Swift.

I suppose if an API were presented that's sufficiently effortless and had no negative consequences, then that would be very welcome. I'm just not immediately sure what that would look like, or if it's possible.

The cases you're describing are a mix of "not technically a problem" (taking a long time for an entirely locally-compute-bound activity is not technically an issue from Swift Concurrency's perspective) and things that already & conventionally require additional work (e.g. to spin up a background calculation thread, which likely entails reporting progress & supporting cancellation, expressing that in the user interface, etc).

I welcome a revolutionary improvement in how we can do those things, but I also would never presume that something as simple as a numeric type need explicitly address these aspects of their use.

Of course, having async variants for specific, known-long-running operations is a slightly different proposition. It's fine to have an async overload of .pi(…), for example. It just wouldn't be good if it were only available as async.

Note also that parallelism is entirely possible within a sync API; func pi(_ digits: Number) -> Number is perfectly within its rights to spin up helper threads to help execute its work.
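
A quick sketch of that last point: a synchronous function that parallelises internally (Leibniz series over Double just to keep it short; Number's arbitrary-precision internals would differ):

import Dispatch

// Synchronous API; callers never see the parallelism.
func approximatePi(terms: Int, chunks: Int = 8) -> Double {
    var partials = [Double](repeating: 0, count: chunks)
    partials.withUnsafeMutableBufferPointer { buffer in
        DispatchQueue.concurrentPerform(iterations: chunks) { chunk in
            var sum = 0.0
            var k = chunk
            while k < terms {   // this chunk handles every `chunks`-th term
                sum += (k % 2 == 0 ? 4.0 : -4.0) / Double(2 * k + 1)
                k += chunks
            }
            buffer[chunk] = sum   // disjoint slots, so no data race
        }
    }
    return partials.reduce(0, +)
}

print(approximatePi(terms: 10_000_000))   // ≈ 3.141592…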

Perhaps I'm misinterpreting the chart - though thanks for providing that, that's exactly the sort of data I'd like to see included in the proposal - but it seems that integer equality is an order of magnitude slower, and integer addition is two orders of magnitude slower?

My intuition is that such a gap is untenable for standard library use. Could be wrong, though (pathological microbenchmarks of course overlook the effect of operation interleaving in real-world applications, which can significantly hide nominal inefficiencies).

I realise that's very early numbers from as-yet-unoptimised code. I'm just saying, I think it needs more optimisation.

Yeah, this has come up quite a few times before, in various contexts. I view that as an existing problem that already needs to be addressed, so (in principle) it shouldn't affect this proposal other than perhaps an ordering constraint (fix operator performance first, then apply this proposal).

1 Like

Effect on ABI stability
This feature primarily adds to the standard library.

  • However, if Number is to become the default type inference for IntegerLiterals and FloatLiterals (and possibly a FractionLiteral; see Future directions), and values such as Array.count are to move from Int to WholeNumber, this change would be breaking and migration diagnostics are required.
  • This proposal's position is that such changes land with Swift 6.0, removing nondescript declarations of Int which do not actually describe the API intent, i.e. Array.count: Int. Number usage would make these sites more expressive while simplifying operations between these types.

We cannot replace current uses of Int in our API with WholeNumber, because that is ABI breaking and we cannot break ABI. APIs like Array.count: WholeNumber would have to exist in addition to the current status quo.

I thought so myself. I've not used collections-benchmark before; it may be entirely the wrong tool, I'm not sure. The benchmarking code is incredibly simple:

self.add(title: "Number.+", input: Int.self) { input in
    let lhs = identity(Number(input))
    let rhs = lhs
    return { timer in
        blackHole(lhs + rhs)
    }
}

My own embarrassment aside, I'll leave the chart inline for the sake of possibly helping someone in the future.

If anyone can give it a look, I've added a link below. I'd love to understand this issue myself.
Benchmark Package Link

@Alejandro You're certainly right on how that change would be ABI breaking. It's been my understanding that ABI breaking changes can be deemed worthwhile across a major version number change. For Number, Swift 6.0 is probably not on the table given the branch date is tomorrow (March 15th 2024).

However, I consider this an actual benefit: we get to live with Number for a while before considering the ABI-breaking change for Swift 7.0. If the community finds Number truly useful, we'll be clamoring for the change. In the intervening time,

let count = WholeNumber(array.count)

is entirely supported!

Outside the Standard Library, various SDK APIs could benefit from Number too. All of these changes will take time beyond this initial pitch.

Generally speaking, the Swift project has not taken any ABI breaks yet. It's up to the Language Steering Group to determine whether to accept a proposal, and I'm unsure if platform vendors (like Apple) necessarily have to take the accepted changes. It is theoretically possible to take ABI breaks between major versions, but we're not doing that with Swift 6 (because Swift 5 was the first release with ABI stability on Apple platforms), and it's unclear if that's desirable at all with a hypothetical Swift 7.

3 Likes

Setting aside various practical implementation and performance issues, any change like this would be not only ABI-breaking but also source-breaking, on a scale that's unprecedented for even breaking changes in Swift. AFAIK the only transition that would even come close would be the Swift 2 -> Swift 3 change, which some people are still sore about. So it's pretty unlikely that we'd see this sort of widespread adoption in stdlib API.

(Which is not to say that you shouldn't build this thing, just that widespread adoption throughout existing API in the standard library is probably off the table.)

4 Likes

I'm sorry @wadetregaskis! I seem to have scrolled right past this when I last responded to @scanon about the chart.

Let me hop in here and respond now.

Operations have differing return types. For WholeNumber, your example of a negative result would be an Integer proper in Number's type hierarchy, which is why WholeNumber returns Integer for all of its subtraction functions. Here's the signature for WholeNumber minus a NaturalNumber, since the natural could be bigger:

static func - (lhs: WholeNumber, rhs: NaturalNumber) -> Integer
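
So, for example (with hypothetical literal support):

let w: WholeNumber = 3
let n: NaturalNumber = 5
let d: Integer = w - n   // -2; the wider return type absorbs the negative result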

On the async stuff, you're right about probably wanting a specific use case and offering both async and non-async variants in most cases -- again, if we go that direction in the future.

Lastly, with performance: yeah, I've been considering implementing a straight pass-through of smaller types, growing to an array of storage only if an overflow is incurred. This is great, but it muddles up the code some, so I wanted community input first. It's worth noting String does something similar in _StringGuts, using minimal storage for speed first and only more if needed. What do you think, should we implement this pass-through?
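
Here's a minimal sketch of that fast path (names and layout are mine, not _StringGuts' or Number's actual storage):

enum WholeStorage {
    case small(UInt64)     // fits in one machine word: the fast path
    case large([UInt64])   // overflowed: little-endian words, grown on demand

    static func + (l: WholeStorage, r: WholeStorage) -> WholeStorage {
        if case let (.small(a), .small(b)) = (l, r) {
            let (sum, overflow) = a.addingReportingOverflow(b)
            // On overflow, `sum` is the wrapped low word and the carry is 1.
            return overflow ? .large([sum, 1]) : .small(sum)
        }
        fatalError("multi-word addition elided in this sketch")
    }
}

let z = WholeStorage.small(.max) + WholeStorage.small(1)   // .large([0, 1]), i.e. 2^64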

1 Like

There is definitely ABI cost to that, and some precedent: a lot changed in String between Swift 2 and Swift 4, and the Swift overlay dramatically changed the CoreGraphics.framework API from Swift 2 to Swift 3. It was painful, exceedingly so for new developers. But in the end the ease of use was deemed worthwhile, and it sort of set the expectations for Swift API design.

To agree with @scanon, I don't think we should avoid doing things just because they are hard. While I understand the concerns about landing breaking features in the Standard Library, I do think time solves this problem. Some primitives may never be wholesale rewritten again, but we should leave room for modern variants of APIs, allowing habits to change over time before deprecating an existing one. I'm not suggesting this per se, but I want to leave room for us to make it possible, rather than shut the door on this Future Direction altogether.

extension Array {
    var objectCount: WholeNumber {
        return WholeNumber(self.count)
    }
}

(It's worth noting I'll be moving this to a dedicated section of the Future Directions in the proposal.)

I think there may be a misunderstanding about the implications of ABI breaks. Source breaks impact developers by forcing them to change their code when they adopt new versions of the language. ABI breaks cause existing Swift applications to stop working for end users.

Well-justified source breaks are on the table for Swift 6, but as far as I know the only situation in which an ABI break would be possible is when switching CPU architectures, since there's no existing code to be compatible with in that situation.

It's difficult to imagine a situation where it would be worth attempting to explain to nontechnical users that none of their existing software will work if they upgrade their OS.

6 Likes

Even then, barely. "No existing Swift code will compile" is very much not on the table.

9 Likes

+1

Well, that's just a matter of magnitude. Apple OS updates break existing software all the time. APIs get removed, APIs get outright broken (sometimes intentionally, even knowing the ramifications), etc.

It's not that Apple never break compatibility, it's just not normally done quite so much at once.

Linux is even worse in this regard - heck, a minor update to any random package can break the entire system. The strategy employed by most Linux users is to just not update things, as much as possible. And to just do clean installs for OS updates and rebuild the world from scratch. Not that I'm bitter from too many years of being forced to use Linux professionally. :stuck_out_tongue_closed_eyes:

(not that there is a stable Swift ABI for Linux, just hypothesising as if there were)

Windows actually has a history of pretty serious backwards compatibility, with plenty of program-specific hacks preserved across years if not decades, by Microsoft. But Swift doesn't offer a stable ABI there [yet], so formally there's no issues there.

I don't think it's unreasonable to consider breakages on a case-by-case basis. I think @Hexley is being pretty reasonable here, simply pointing out what we could have in exchange; offering the option. The likely answer may well be 'no', but there's still value in actually knowing what's being denied.

2 Likes

I figure that's "implementation details" that pivot on performance. I'm all for optimisation in principle, but it ultimately comes down to how much performance benefit there is vs implementation complexity (or other costs). I don't have a good intuition for what the right trade-off is here; I don't know the current implementation of Number.

I do think, tangentially, that it's worth being very cautious about introducing any extra bits of data beyond the core values themselves (e.g. a "isPrecise" bit alluded to earlier). Unless some pretty clever stuff is done internally to 'steal' that bit from the value's bits - e.g. first internal word is actually 63 bits instead of 64, to make space - then the cost of doubling the type's effective size¹ is almost certainly not worth it. Even with those sorts of clever optimisations, it might not be (given the cost of bitmasking and such).


¹ Meaning the aligned size (or "stride" as Swift calls it) will go from e.g. 64-bits to 128-bits. And even for standalone instances, it means things like using two registers instead of one to pass a simple numeric value.

It's happened quite a few times to me, when I was not able to open old apps:
first InterfaceLib apps stopped working, then Carbon apps, then 32-bit apps, then apps which were not from "App Store or identified developers". The explanation is easy to do: a question mark on the app icon, and a "This app is not compatible with this OS version" alert when the app is launched.


I wonder about something else though. Our disks are huge. What if we store a few different runtime versions (e.g. 5) and, depending upon the app in question, pick the compatible runtime library? Then ABI breaking is no longer an issue (for the supported range of library versions), or is it?

Hexley, you are perfectly welcome, as any member of the Swift community would be, to develop this idea as a package that people can use. It is quite possible that other people in the community will find it useful. However, it is very unlikely that it would ever be integrated into the standard library, and I can say with total confidence that there is no path for it to supplant the existing numeric types in the pervasive way that you are imagining. The Language Steering Group is not going to spend any time considering that; it is off the table.

We do not need to have an earnest philosophical discussion in this thread about the conceptual limits of source and binary stability, and I will simply close the thread if it continues.

14 Likes

Looks very impressive!

While I'm against of changing current stdlib api in such intensive way, I'm delighted with the solution in general. It would also be great to hear about application of this solution in specific domain areas with highlighting of benefits and advantages (in comparison to builtin Numerics family). May be you have some articles about it.