Decimal IEEE 754-2019

Dear all,
I wrote down some (very high-level) considerations about the implementation of "Decimal" (GitHub issue #236).

Please let me know what you think about it.


Protocol name and definition

As far as the name is concerned, I would simply call it Decimal rather than DecimalFloatingPoint: it's simpler and shorter. As far as I can see, the only argument for calling it DecimalFloatingPoint is symmetry with BinaryFloatingPoint.

Here is a sketch of the basic scheme of the Decimal protocol.


Besides FloatingPoint and ExpressibleByFloatLiteral's properties and methods, I think it would be nice to implement some other useful operations and constants, like:

@available( ... )
public protocol Decimal: FloatingPoint, ExpressibleByFloatLiteral {
    /* Euler's number, approximately equal to 2.71828 */
    var e: Self { get }

    /* Constant value 0 */
    var zero: Self { get }

    /* Constant value 1 */
    var one: Self { get }

    /* Returns the integer part of self ÷ other */
    func integerDivision(_ other: Self) -> Self
}

Implementation details

Table 3.6 of the IEEE 754-2019 document (p. 23) contains the specifications needed to implement any decimal floating-point number of size k bits, with k a multiple of 32. For practical reasons, I would stop at 128.

However, it is necessary to decide how the fields carrying the information of the floating-point number (sign, exponent, and mantissa) will be implemented. From my point of view, implementing the structure at a low level (in C) would have the advantage of being able to (spatially) optimize it, like this:

typedef struct __attribute__((__packed__)) _decimal32 {
    unsigned int sign:1;
    unsigned int exponent:11;
    unsigned int mantissa:20;
} decimal32;

And then write the high-level APIs in Swift, keeping the structure, the basic operations, and the comparisons in C. The C packed structures would be incredibly useful for implementing Decimal128.

The other option is to write it entirely in Swift, as the BinaryFloatingPoint protocol does:

public protocol BinaryFloatingPoint: FloatingPoint, ExpressibleByFloatLiteral {

  /// A type that represents the encoded significand of a value.
  associatedtype RawSignificand: UnsignedInteger
  /// A type that represents the encoded exponent of a value.
  associatedtype RawExponent: UnsignedInteger

  // ...
}


I don't know whether the use of C within the library itself is discouraged. For this reason, I leave this decision to those more knowledgeable than me.


To summarize, these are the key points discussed above:

  • Name of the protocol;
  • Properties and methods;
  • Implementation of the types: C vs. Swift data structures.

cc @icharleshu


Foundation Decimal has some binary- and API-compatibility requirements that make direct replacement with IEEE 754 Decimal not possible. We have to continue to work with binaries that use the existing Decimal type on ABI-stable platforms. What is probably possible is to bridge the two types so that IEEE 754 Decimal128 becomes the representation of Decimal for new code.

However, the first step in that is to bring up Decimal128 as its own type outside of Foundation. There are already a few Swift implementations kicking around, but they could all use some cleanup and more test coverage. There's also Intel's BSD-licensed C implementation, which is generally of reasonably high quality, but it's entirely in C (a bit of a bummer), and it's definitely not the most readable codebase, which makes it not great for people to learn from (that's OK, but it would be nice to make it more approachable).

Protocol naming details are pretty superficial, and not really worth worrying about too much in the short term. No design survives contact with the enemy--build an implementation, use it for a few real projects, and see what it needs. However, some notes on what you suggest:

  1. The constant e is significantly less common in computing than the exp(_:) function itself, which already lets us spell e as exp(1). If you provide e then people will sometimes write pow(.e, x) when they really want exp(x), which is generally both less accurate and less efficient (and hence a footgun). I'm open to re-evaluation if/when someone comes forward with a compelling need for the constant (this is why it's not defined on [Binary]FloatingPoint).
  2. zero and one are already required by the Numeric conformance (also, e, zero, and one should all be static var; they're properties of the type, not of values).
  3. integerDivision is interesting, but it has a major complication: the integer part of the quotient is generally not representable as Self (it can be emax - emin digits wide, which is much wider than the significand of any fixed-width Decimal type). I think that it's possible to design an API like this that is useful despite this limitation, but one would have to think carefully about what uses it's actually going to be put to and how to document the limitations.
  4. Decimal as a protocol name isn't an option because there's no good source migration strategy from existing use of the Decimal type.

Decimal32 is fun, but as far as I know, no one has done anything interesting with it. Decimal64 and 128 are the two types that are really worth implementing (and 128 is by far the more important of the two).

Using C bitfields is highly undesirable. Someone will port your C code to another compiler/platform and it will be wrong, because layout and ordering of bitfields in C is entirely implementation-defined. We know that in Swift it will always be Clang's definition, but better to just define accessors in Swift and call it a day. Once you write them, it's just as good as having the bitfield definition.


I might be out of my depth here, but I think conforming Decimal to ExpressibleByFloatLiteral perpetuates a poor design choice in the original type, which allows things like this to happen:

let foo: Decimal = 3.133
print(foo) // 3.132999999999999488

Instead, we should either completely bar Decimal from being expressible by literal, or have Swift implement a native ExpressibleByDecimalLiteral.

This has been extensively discussed in other threads, like:


No, conforming Decimal to ExpressibleByFloatLiteral makes good sense, and the thing you're worried about would not happen¹: any decimal literal with up to 34 digits and an exponent in -6143 ... 6144 is exactly representable. The mechanics of ExpressibleByFloatLiteral in the compiler need some improvement to make it work, but the semantic conformance to the protocol is sound.

It would make sense to separate out decimal floating-point literals from hex / binary floating-point literals, since you never want to specify a Decimal that way, but that's trickier from a source and ABI stability perspective.

¹ would not happen once the above-mentioned improvements to float literals were made, but I would expect that to be part of such a project.


Is this true of Decimal64 as well?

Decimal64 exactly represents 16 digits with an exponent range of -383 ... 384.

So in @davdroman's example, wouldn't

round to 3.132999999999999?

Yes, and that's fine.

We would also have initializers that require values be exactly representable, but this is very much the expected behavior for floating-point literals.

I don't think this is true: Numeric doesn't add any static properties of its own, and AdditiveArithmetic only requires zero. Numeric does refine ExpressibleByIntegerLiteral, so you can always create one from a literal, but there's no static property named one in any of those protocols.


That's true, but one is still there, it's just spelled 1. I'm not really convinced there's a lot of value in giving it another name.

The naming here is a bit confusing. Decimal64 obviously means a number that stores 64 decimal digits, right? Because it's obviously analogous to Int64, just a decimal version (as opposed to binary).

It'd be nice to decouple the numerical nature (integer vs. real, etc.) from the radix. The word "decimal" does not intrinsically mean real (let alone fixed- vs. floating-point, or other such axes).

This might seem moot in the context of working with the existing NSDecimal API, but I suggest that API should be abandoned anyway. NSDecimal was a bit awkward to use even in Objective-C, and the Swift interface to it is worse. If breaking compatibility is already on the table (per the discussion about whether to conform to IEEE 754 this time around), then I say break it properly and get rid of all the legacy warts.

The only value would be if it were added to a protocol below Numeric, but I don't see how you could add a default implementation for that, so it's a moot point.

I think having the N in Decimal{N} be the number of bits of storage is a term of art at this point. It's definitely named that way in IEEE 754, so it'd probably be more confusing to make the number be the maximum number of decimal digits.


Decimal64 "obviously" means the IEEE 754 decimal64 type (which is 64 bits wide and represents 16 decimal digits).

(And breaking compatibility is very much not on the table for any part of the Foundation rewrite, except in cases where the current behavior is very clearly a bug, and even then only with great care. Otherwise we'd have to maintain the old behavior on existing Apple platforms, and we'd be right back where we started with two separate source bases. Unifying the implementation has more value than fixing every wart.)


Can you elaborate? Foundation has sunset APIs in the past. Is that truly off the cards forever?


Sunsetting APIs is very different from changing the behavior of an existing API without replacing it.

If there are APIs whose behavior is fundamentally wrong, but widely depended on, we would generally introduce replacement API while maintaining the existing API's behavior for a prolonged deprecation period, rather than changing the behavior of the existing API.


As you suggested, I started working on decimal128 in a separate repository (not yet public). I looked at several implementations as references, including the Intel one you mentioned. I have to say I'm not so sure about using BID encoding. The BID cases described in the standard (p. 21) are simpler than those for DPD (also described on p. 21). However, it seems to me that the optimizations applicable to DPD (i.e., keeping the encoded 10-bit numbers in a key-value map) are more readable than those of BID.
Do you have any preference regarding the encoding of the mantissa?