Decimal32, Decimal64, and Decimal128 Design Approach

I've completed an initial design of the Decimal math module with Decimal32 being the first implementation. The design attempts to use generic algorithms so that additional types can be easily defined by specifying a few parameters. For example, the Decimal32 type is defined by the following (updated and simplified):

/// Definition of the data storage for the Decimal32 floating-point data type.
/// the `IntDecimal` protocol defines many supporting operations
/// including packing and unpacking of the Decimal32 sign, exponent, and
/// significand fields.  By specifying some key bit positions, it is possible
/// to completely define many of the Decimal32 operations.  The `data` word
/// holds all 32 bits of the Decimal32 data type.
struct IntDecimal32 : IntDecimal {
  typealias RawSignificand = UInt32
  typealias RawData = UInt32
  typealias RawBitPattern = UInt
  var data: RawData = 0
  init(_ word: RawData) { self.data = word }
  init(sign:Sign = .plus, expBitPattern:Int=0, sigBitPattern:RawBitPattern) {
    self.sign = sign
    self.set(exponent: expBitPattern, sigBitPattern: sigBitPattern)
  }
  // Define the fields and required parameters
  static var exponentBias:       Int {  101 }
  static var maxEncodedExponent: Int {  191 }
  static var maximumDigits:      Int {    7 }
  static var exponentBits:       Int {    8 }
  static var largestNumber: RawBitPattern { 9_999_999 }
}

These definitions parametrize a set of generic algorithms defined in IntDecimal to support the essential requirements of IEEE 754-2008 for Decimal numbers. Note: This is an internal data field within the Decimal32 type and as such is not exposed to the public. Sorry for any confusion.
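As a rough illustration of how a handful of static parameters can drive shared algorithms, here is a minimal sketch. The names `DecimalLayout`, `Layout32`, `Layout64`, `largestSignificand`, and `encode(exponent:)` are hypothetical and are not taken from the package; the Decimal64 values (bias 398, 16 digits) are the standard IEEE 754 parameters rather than anything quoted in this thread.

```swift
// Hypothetical protocol: each format supplies only a few constants.
protocol DecimalLayout {
  static var exponentBias: Int { get }
  static var maximumDigits: Int { get }
}

// Generic helpers written once and shared by every conforming layout.
extension DecimalLayout {
  /// Largest integer significand: `maximumDigits` nines.
  static var largestSignificand: UInt64 {
    var n: UInt64 = 1
    for _ in 0..<maximumDigits { n *= 10 }
    return n - 1
  }

  /// Biased (encoded) exponent for a given true exponent.
  static func encode(exponent: Int) -> Int { exponent + exponentBias }
}

enum Layout32: DecimalLayout {
  static var exponentBias: Int { 101 }
  static var maximumDigits: Int { 7 }
}

enum Layout64: DecimalLayout {
  static var exponentBias: Int { 398 }
  static var maximumDigits: Int { 16 }
}

print(Layout32.largestSignificand)     // 9999999
print(Layout64.largestSignificand)     // 9999999999999999
print(Layout32.encode(exponent: -1))   // 100
```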

Completed test cases exist for the Decimal32 type with a few failures to address. Some initial tests for the 64- and 128-bit Decimal numbers have confirmed this approach is feasible for string and bid/dpd conversions. The Decimal types also conform to a new DecimalFloatingPoint protocol.

Code is available here for reference: GitHub - mgriebling/DecimalNumbers: Proposed fixed-point decimal numbers package. PRELIMINARY NOT SUITABLE FOR PRODUCTION.

IEEE 754 (2008) or (2019). I'm not sure what 764 was, but whatever it was, it's lapsed. 754 is the standard for floating-point arithmetic, and was most recently renewed in 2019 (without any significant changes from 2008 that would affect what you have).

754 is very clear about the name for this being Significand. I have misgivings about both names ("mantissa" means something else already, "significand" collides with "sign" when abbreviated; I semi-jokingly propose "magnificand" from time to time), but that's the standard term.

These are not the maximum and minimum exponents; they're the maximum and minimum exponent encodings. They certainly should not be public API under these confusing names, and maybe shouldn't be public API at all.

By analogy with BFP's significandBitCount, this would be significandDigitCount.


public static var exponentBits:    Int {    8 }
public static var largeMantissaBits: IntRange { 0...22 }
public static var smallMantissaBits: IntRange { 0...20 }

These are a technical detail of the encoding and not relevant to users of the type. They should not be public API; these have no effect on the observable values of the type at all.


public var data: RawData = 0
public init(_ word: RawData) { self.data = word }

These would be var bitPattern and init(bitPattern:) by analogy with BFP.


Thanks for your comments, @scanon. Note: The IntegerDecimal32 is an internal data field within the Decimal32 type used to store the actual data bits and as such is not exposed to the public. Sorry for any confusion.

I will incorporate your comments. Many of your concerns are addressed by the IntDecimal (an internal protocol) which specifies that the exponents, for example, are the encoded exponents. In the actual Decimal32 type, there is a distinction between the raw exponents (biased or encoded) and the real exponent of a number. At any rate, I'll repost the actual Decimal32 code after incorporating your comments. I'll also make the visibility clearer.
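To make the encoded-versus-real exponent distinction concrete, here is a small sketch using only the Decimal32 parameters quoted in the first post. The variable names `maxIntegerExponent` and `maxScientificExponent` are made up for illustration.

```swift
// Decimal32 parameters quoted in the first post.
let exponentBias = 101
let maxEncodedExponent = 191
let maximumDigits = 7

// Exponent that scales the integer significand:
// value = significand * 10^(encoded - bias)
let maxIntegerExponent = maxEncodedExponent - exponentBias            // 90

// Exponent in scientific notation (d.dddddd * 10^e) when the
// significand uses all seven digits.
let maxScientificExponent = maxIntegerExponent + (maximumDigits - 1)  // 96

print(maxEncodedExponent, maxIntegerExponent, maxScientificExponent)
// Prints "191 90 96"
```

Three different numbers can all reasonably be called "the maximum exponent", which is why the encoded one benefits from an explicit name.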

The main reason for having IntDecimal fields visible is to allow these items to be defined when setting up the Decimal32 data type. Since Swift currently doesn't allow protocol parameters, that was my solution to define the bit fields, maximum exponents, etc.

BTW, I don't have access to the actual IEEE standard so much of my work is based on reverse-engineered code. If I am not fully IEEE 754 compliant, please bear with me.

The Wikipedia page for 754 is generally accurate. Reverse engineering should not be necessary.

you might know this already but you can internalize frozen types (and stored properties) with @usableFromInline internal, just like @inlinable internal for callable things. (which i see is already widespread in your code.)
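A minimal sketch of that pattern, assuming a hypothetical wrapper type named `Wrapper32` (not part of the package): the stored property stays internal to the module, yet remains usable from `@inlinable` code.

```swift
@frozen
public struct Wrapper32 {
  // Internal to the module, but visible to inlinable bodies below.
  @usableFromInline
  internal var storage: UInt32

  @inlinable
  public init(bitPattern: UInt32) { self.storage = bitPattern }

  @inlinable
  public var bitPattern: UInt32 { storage }
}

let w = Wrapper32(bitPattern: 0x3200_2713)
print(w.bitPattern == 0x3200_2713)   // true
```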

Thanks, that's what I've been using as a basis. The reverse engineering comes from trying to implement new algorithms based on the original Intel code for encoding/decoding, arithmetic operations, rounding algorithms, etc.

My goal is to try to make all the algorithms generic in the sense they could be used for any Decimal number. Intel's goal was to use copy & paste as much as possible. :wink: They also have lots of hardcoded numbers and obscure look-up tables, many of which have been replaced.

Here's the updated Decimal32 interface and top-level implementation. Names have been changed as suggested. I'm currently not showing the data encoding structure used to store the raw data to avoid confusion (please see the first post for more details). More comments have been added.

BTW, ExpressibleByFloatLiteral is sorely missed (even though it is not an exact representation). Without it, initializing Decimal numbers is definitely cumbersome.
We really need something like ExpressibleByDecimalLiteral, which would only need to be handled differently by the compiler.
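The inexactness mentioned here comes from float literals passing through a binary type first. The sketch below shows the effect with `Double`, followed by a toy string-literal type that keeps the decimal digits exactly. `TinyDecimal` is a hypothetical stand-in, not the package's Decimal32, and its parser handles only plain digits with an optional decimal point.

```swift
// Binary floating point cannot represent 0.1, 0.2, or 0.3 exactly.
print(0.1 + 0.2 == 0.3)   // false

// Toy type: value = significand * 10^exponent, parsed from text.
struct TinyDecimal: ExpressibleByStringLiteral, Equatable {
  var significand: Int
  var exponent: Int

  init(stringLiteral value: String) {
    let parts = value.split(separator: ".", omittingEmptySubsequences: false)
    let whole = parts.isEmpty ? "" : String(parts[0])
    let fraction = parts.count > 1 ? String(parts[1]) : ""
    // Malformed input falls back to zero in this sketch.
    significand = Int(whole + fraction) ?? 0
    exponent = -fraction.count
  }
}

let a: TinyDecimal = "0.1"
print(a.significand, a.exponent)   // 1 -1
let b: TinyDecimal = "1000.3"
print(b.significand, b.exponent)   // 10003 -1
```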

All the Decimal32 tests are running successfully. Some Decimal64 and Decimal128 tests have also been run -- primarily conversion to/from Strings and integer literals and each other.

/// Implementation of the 32-bit Decimal32 floating-point operations from
/// IEEE STD 754-2008 for Floating-Point Arithmetic.
/// The IEEE Standard 754-2008 for Floating-Point Arithmetic supports two
/// encoding formats: the decimal encoding format, and the binary encoding
/// format. The Intel(R) Decimal Floating-Point Math Library supports primarily
/// the binary encoding format for decimal floating-point values, but the
/// decimal encoding format is supported too in the library, by means of
/// conversion functions between the two encoding formats.
public struct Decimal32 : Codable, Hashable {
  typealias ID = IntDecimal32
  public typealias RawExponent = UInt
  public typealias RawSignificand = UInt32
  // Internal data store for the binary integer decimal encoded number.
  // The internal representation is always binary integer decimal.
  var bid =
  // Raw data initializer -- only for internal use.
  public init(bid: UInt32) { self.bid = ID(bid) }
  init(bid: ID)            { self.bid = bid }
  /// Creates a NaN ("not a number") value with the specified payload.
  /// NaN values compare not equal to every value, including themselves. Most
  /// operations with a NaN operand produce a NaN result. Don't use the
  /// equal-to operator (`==`) to test whether a value is NaN. Instead, use
  /// the value's `isNaN` property.
  ///     let x = Decimal32(nan: 0, signaling: false)
  ///     print(x == .nan)
  ///     // Prints "false"
  ///     print(x.isNaN)
  ///     // Prints "true"
  /// - Parameters:
  ///   - payload: The payload to use for the new NaN value.
  ///   - signaling: Pass `true` to create a signaling NaN or `false` to create
  ///     a quiet NaN.
  public init(nan payload: RawSignificand, signaling: Bool) {
    self.bid = ID.init(nan: payload, signaling: signaling)
  }
}

extension Decimal32 : AdditiveArithmetic {
  public static func - (lhs: Self, rhs: Self) -> Self {
    var addIn = rhs
    if !rhs.isNaN { addIn.negate() }
    return lhs + addIn
  }
  public mutating func negate() { ID.signBit) }
  public static func + (lhs: Self, rhs: Self) -> Self {
    Self(bid: ID.add(lhs.bid, rhs.bid, rounding: ID.rounding))
  }
  public static var zero: Self { Self(bid: }
}

extension Decimal32 : Equatable {
  public static func == (lhs: Self, rhs: Self) -> Bool {
    ID.equals(lhs: lhs.bid, rhs: rhs.bid)
  }
}

extension Decimal32 : Comparable {
  public static func < (lhs: Self, rhs: Self) -> Bool {
    ID.lessThan(lhs: lhs.bid, rhs: rhs.bid)
  }
  public static func >= (lhs: Self, rhs: Self) -> Bool {
    ID.greaterOrEqual(lhs: lhs.bid, rhs: rhs.bid)
  }
  public static func > (lhs: Self, rhs: Self) -> Bool {
    ID.greaterThan(lhs: lhs.bid, rhs: rhs.bid)
  }
}

extension Decimal32 : CustomStringConvertible {
  public var description: String {
    string(from: bid)
  }
}

extension Decimal32 : ExpressibleByIntegerLiteral {
  public init(integerLiteral value: IntegerLiteralType) {
    if IntegerLiteralType.isSigned {
      let x = Int(value).magnitude
      bid = UInt64(x), ID.rounding)
      if value.signum() < 0 { self.negate() }
    } else {
      bid = UInt64(value), ID.rounding)
    }
  }
}

extension Decimal32 : ExpressibleByStringLiteral {
  public init(stringLiteral value: StringLiteralType) {
    self.init(stringLiteral: value, round: .toNearestOrEven)
  }
  public init(stringLiteral value: StringLiteralType, round: Rounding) {
    bid = numberFromString(value, round: round) ??
  }
}

extension Decimal32 : Strideable {
  public func distance(to other: Self) -> Self { other - self }
  public func advanced(by n: Self) -> Self { self + n }
}

extension Decimal32 : FloatingPoint {
  // MARK: - Initializers for FloatingPoint
  public init(sign: Sign, exponent: Int, significand: Self) { = ID(sign: sign, expBitPattern: exponent+ID.exponentBias,
  public mutating func round(_ rule: Rounding) { self.bid = ID.round(self.bid, rule) }
  // MARK: - DecimalFloatingPoint properties and attributes
  public static var exponentBitCount: Int      { ID.exponentBits }
  public static var significandDigitCount: Int { ID.maximumDigits }
  public static var nan: Self                  { Self(nan:0, signaling:false) }
  public static var signalingNaN: Self         { Self(nan:0, signaling:true) }
  public static var infinity: Self             { Self(bid:ID.infinite()) }
  public static var greatestFiniteMagnitude: Self {
  public static var leastNormalMagnitude: Self {
  public static var leastNonzeroMagnitude: Self {
    Self(bid: ID(expBitPattern: ID.minEncodedExponent, sigBitPattern: 1))
  }
  public static var pi: Self {
    Self(bid: ID(expBitPattern: ID.exponentBias-ID.maximumDigits+1,
                 sigBitPattern: 3_141_593))
  }
  // MARK: - Instance properties and attributes
  public var ulp: Self            { nextUp - self }
  public var nextUp: Self         { Self(bid: ID.nextup(self.bid)) }
  public var sign: Sign           { bid.sign }
  public var isNormal: Bool       { bid.isNormal }
  public var isSubnormal: Bool    { bid.isSubnormal }
  public var isFinite: Bool       { bid.isFinite }
  public var isZero: Bool         { bid.isZero }
  public var isInfinite: Bool     { bid.isInfinite && !bid.isNaN }
  public var isNaN: Bool          { bid.isNaN }
  public var isSignalingNaN: Bool { bid.isSNaN }
  public var isCanonical: Bool    { bid.isCanonical }
  public var exponent: Int        { bid.expBitPattern - ID.exponentBias }
  public var significand: Self {
    let (_, _, man, valid) = bid.unpack()
    if !valid { return self }
    return Self(bid: ID(expBitPattern: Int(exponentBitPattern),
                        sigBitPattern: man))
  }

  // MARK: - Floating-point basic operations
  public static func * (lhs: Self, rhs: Self) -> Self {
    Self(bid: ID.mul(lhs.bid, rhs.bid, ID.rounding))
  }
  public static func *= (lhs: inout Self, rhs: Self) { lhs = lhs * rhs }
  public static func / (lhs: Self, rhs: Self) -> Self {
    Self(bid: ID.div(lhs.bid, rhs.bid, ID.rounding))
  }
  public static func /= (lhs: inout Self, rhs: Self) { lhs = lhs / rhs }
  public mutating func formRemainder(dividingBy other: Self) {
    bid = ID.rem(self.bid, other.bid)
  }
  public mutating func formTruncatingRemainder(dividingBy other: Self) {
    let q = (self/other).rounded(.towardZero)
    self -= q * other
  }
  public mutating func formSquareRoot() {
    bid = ID.sqrt(self.bid, ID.rounding)
  }
  public mutating func formSquareRoot(round: Rounding) {
    bid = ID.sqrt(self.bid, round)
  }
  public mutating func addProduct(_ lhs: Self, _ rhs: Self) {
    bid = ID.fma(,,, ID.rounding)
  }
  public func isEqual(to other: Self) -> Bool  { self == other }
  public func isLess(than other: Self) -> Bool { self < other }
  public func isLessThanOrEqualTo(_ other: Self) -> Bool {
    isEqual(to: other) || isLess(than: other)
  }
  public var magnitude: Self { Self(bid: ID.signBit)) }
}

extension Decimal32 : DecimalFloatingPoint {
  // MARK: - Initializers for DecimalFloatingPoint
  /// Creates a new instance from the specified sign and bit patterns.
  /// The values passed as `exponentBitPattern` and `significandBitPattern` are
  /// interpreted in the decimal interchange format defined by the [IEEE 754
  /// specification][spec].
  /// [spec]:
  /// The `significandBitPattern` are the big-endian, binary integer decimal
  /// digits of the number. For example, the integer number `314` represents a
  /// significand of `314`.
  /// - Parameters:
  ///   - sign: The sign of the new value.
  ///   - exponentBitPattern: The bit pattern to use for the exponent field of
  ///     the new value.
  ///   - significandBitPattern: Bit pattern to use for the significand field
  ///     of the new value.
  public init(sign: Sign, exponentBitPattern: RawExponent,
              significandBitPattern: RawSignificand) {
    bid = ID(sign: sign, expBitPattern: Int(exponentBitPattern),
             sigBitPattern: ID.RawBitPattern(significandBitPattern))
  }
  // MARK: - Instance properties and attributes
  /// The raw encoding of the value's significand field.
  public var significandBitPattern: UInt32 { UInt32(bid.sigBitPattern) }
  /// The raw encoding of the value's exponent field.
  /// This value is unadjusted by the type's exponent bias.
  public var exponentBitPattern: UInt { UInt(bid.expBitPattern) }
  //  Conversions to/from binary integer decimal encoding.  These are not part
  //  of the DecimalFloatingPoint protocol because there's no guarantee that
  //  an integer type of the same size actually exists (e.g. Decimal128).
  //  If we want them in a protocol at some future point, that protocol should
  //  be "InterchangeFloatingPoint" or "PortableFloatingPoint" or similar, and
  //  apply to IEEE 754 "interchange types".
  /// The bit pattern of the value's encoding. A `bid` prefix indicates a
  /// binary integer decimal encoding; while a `dpd` prefix indicates a
  /// densely packed decimal encoding.
  /// The bit patterns are extracted using the `bidBitPattern` and
  /// `dpdBitPattern` accessors. A new decimal floating point number is
  /// created by passing an appropriate bit pattern to the
  /// `init(bidBitPattern:)` and `init(dpdBitPattern:)` initializers.
  /// If incorrect bit encodings are used, there are no guarantees about
  /// the resultant decimal floating point number.
  /// The bit patterns match the decimal interchange format defined by the
  /// [IEEE 754 specification][spec].
  /// For example, a Decimal32 number has been created with the value "1000.3".
  /// Using the `bidBitPattern` accessor, a 32-bit unsigned integer encoded
  /// value of `0x32002713` is returned.  The `dpdBitPattern` returns the
  /// 32-bit unsigned integer encoded value of `0x22404003`. Passing these
  /// numbers to the appropriate initializer recreates the original value
  /// "1000.3".
  /// [spec]:
  public var bidBitPattern: RawSignificand { bid.data }
  public var dpdBitPattern: RawSignificand { bid.dpd }
  public init(bidBitPattern: RawSignificand) { bid.data = bidBitPattern }
  public init(dpdBitPattern: RawSignificand) {
    bid = ID(dpd: ID.RawData(dpdBitPattern))
  }
  public var significandDigitCount: Int {
    guard bid.isValid else { return -1 }
    return _digitsIn(bid.sigBitPattern)
  }
  /// The floating-point value with the same sign and exponent as this value,
  /// but with a significand of 1.0.
  /// A *decade* is a set of decimal floating-point values that all have the
  /// same sign and exponent. The `decade` property is a member of the same
  /// decade as this value, but with a unit significand.
  /// In this example, `x` has a value of `21.5`, which is stored as
  /// `2.15 * 10**1`, where `**` is exponentiation. Therefore, `x.decade` is
  /// equal to `1.0 * 10**1`, or `10.0`.
  /// let x = 21.5
  /// // x.significand == 2.15
  /// // x.exponent == 1
  /// let y = x.decade
  /// // y == 10.0
  /// // y.significand == 1.0
  /// // y.exponent == 1
  public var decade: Self {
    guard bid.isValid else { return self } // For infinity, NaN, sNaN
    return Self(bid: ID(expBitPattern: bid.expBitPattern, sigBitPattern: 1))
  }
}
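
The `0x32002713` value quoted in the `bidBitPattern` documentation can be reproduced by hand. The sketch below packs and unpacks only the "small significand" form of the 32-bit BID layout (1 sign bit, 8 exponent bits, 23 significand bits); the large-significand form, infinities, and NaNs are deliberately not handled. `packBID32` and `unpackBID32` are hypothetical helper names, not functions from the package.

```swift
/// Packs sign, encoded exponent, and significand into the small-form
/// BID32 layout. Returns nil when the inputs need the large form.
func packBID32(negative: Bool, encodedExponent: UInt32,
               significand: UInt32) -> UInt32? {
  guard encodedExponent <= 191, significand < (1 << 23) else { return nil }
  let signBit: UInt32 = negative ? (1 << 31) : 0
  return signBit | (encodedExponent << 23) | significand
}

/// Unpacks a small-form BID32 bit pattern. Returns nil when the two bits
/// after the sign are both set (large form, infinity, or NaN).
func unpackBID32(_ bits: UInt32)
  -> (negative: Bool, encodedExponent: UInt32, significand: UInt32)? {
  guard (bits >> 29) & 0b11 != 0b11 else { return nil }
  return (bits >> 31 == 1, (bits >> 23) & 0xFF, bits & 0x7F_FFFF)
}

// 1000.3 = 10003 * 10^-1, so the encoded exponent is 101 - 1 = 100.
if let bits = packBID32(negative: false, encodedExponent: 100,
                        significand: 10_003) {
  print(String(bits, radix: 16))   // 32002713
}
```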

Overall, the API surface seems quite reasonable with the incorporated modifications.

To align with standard library usage, these should be initializers on their respective destination types; ideally, they'd be implementable generically on BinaryInteger and BinaryFloatingPoint.

See comments in other thread :slight_smile:

Yes, agree these should be moved to the appropriate types.

Sorry, which thread was that?

Your thread about the protocol—looks like you found it :)

Having this be a static property of the type is deeply weird. What's going on here?

I agree that it's deeply weird, and I am open to alternatives regarding how to round the Decimals.
This is primarily used when calculations become inexact so that the answer will be rounded to be as accurate as possible. This can occur with overflow/underflow or when converting from/to some other type. The rounding code takes up about 50% of the total code, BTW; handling infinite/NaN/etc. takes another 30%. Most of the infinite/NaN handling is now generic, but the rounding needs work. Taking it out would be easiest.

The publicly-accessible rounding mode has been removed. There still exists an internal rounding mode that defaults to Rounding.toNearestOrEven. Most of my functions have a rounding parameter that uses this rounding mode.

The reason for the rounding was that it is used in some of the test cases, many of which specified certain rounding modes. As @xwu has pointed out, the FloatingPointRoundingRule, which was used here, was intended for integer rounding. However, given the lack of an alternative, and because many algorithms need to round when converting numbers in underflow or overflow, this rounding mode needs to stay.

When the BinaryFloatingPoint uses rounding, the DecimalFloatingPoint can be updated.

To be clear, my objection was not these functions rounding, it's making the rounding rule API on the type or values, rather than a parameter on arithmetic operations. Having it be a property of the type or values makes no sense. In the IEEE 754 model, types and values do not round, operations do.

This is a misapprehension. It was added to support the rounding API, but FloatingPointRoundingRule is an appropriate type to use for arithmetic rounding as well.

What I would expect to see in a type that supports rounding control would be something like:

// uses default (round to nearest, ties to even)
static func *(a: Self, b: Self) -> Self

// allows rounding to be specified
func multiplied(
  by other: Self,
  rounding rule: FloatingPointRoundingRule = .toNearestOrEven
) -> Self
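
As a runnable illustration of that shape, here is a sketch on a toy fixed-point type. `Hundredths` is hypothetical and stands in for a real decimal type; the point is only that the operator uses the default rule while the named method takes the rule as a parameter of the operation.

```swift
/// Toy value with two decimal places: the stored integer is value * 100.
struct Hundredths: Equatable {
  var raw: Int

  // Uses the default (round to nearest, ties to even).
  static func * (a: Self, b: Self) -> Self { a.multiplied(by: b) }

  // Allows the rounding rule to be specified per operation.
  func multiplied(
    by other: Self,
    rounding rule: FloatingPointRoundingRule = .toNearestOrEven
  ) -> Self {
    let product = raw * other.raw   // exact, scaled by 10_000
    var q = product / 100           // truncates toward zero
    let r = product % 100           // same sign as product
    guard r != 0 else { return Self(raw: q) }

    let twice = abs(r) * 2          // compare the remainder against one half
    let roundAway: Bool
    switch rule {
    case .towardZero:              roundAway = false
    case .awayFromZero:            roundAway = true
    case .up:                      roundAway = product > 0
    case .down:                    roundAway = product < 0
    case .toNearestOrAwayFromZero: roundAway = twice >= 100
    case .toNearestOrEven:         roundAway = twice > 100 || (twice == 100 && q % 2 != 0)
    @unknown default:              roundAway = false
    }
    if roundAway { q += product < 0 ? -1 : 1 }
    return Self(raw: q)
  }
}

let a = Hundredths(raw: 125)   // 1.25
let b = Hundredths(raw: 50)    // 0.50
print((a * b).raw)                                                   // 62
print(a.multiplied(by: b, rounding: .toNearestOrAwayFromZero).raw)   // 63
```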

Thanks for the clarification.

That is the intent: to have operations support rounding.

I stand corrected! I suppose it’s fine on concrete operations for concrete types as this doesn’t imply that all of the rounding rules need to be supported by all types.

In some cases there exist APIs for functions like addProduct(_:_:) without a rounding argument. This forces me to define two APIs to be protocol compliant, as follows:

  public mutating func addProduct(_ lhs: Self, _ rhs: Self) {
    self.addProduct(lhs, rhs, round: .toNearestOrEven)
  }
  /// Rounding method equivalent of the `addProduct`
  public mutating func addProduct(_ lhs: Self, _ rhs: Self, round: Rounding) {
    bid = ID.fma(,,, round)
  }

Is there any way to work around this so that the API just appears once with a default rounding argument as you've shown?

Protocol requirements cannot have default arguments. You can write an extension that forwards the no-specified-rounding case to the one with the extra parameter, but there's no way around having both of them appear. Happily, this has never been a real problem.
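
A minimal sketch of that forwarding arrangement follows. The protocol `RoundedFMA` and the type `Toy` are hypothetical, and `Toy` rounds to a whole number only so that the effect of the rule is visible; it is not a real fused multiply-add.

```swift
protocol RoundedFMA {
  // Both spellings are requirements; neither can declare a default argument.
  mutating func addProduct(_ lhs: Self, _ rhs: Self)
  mutating func addProduct(_ lhs: Self, _ rhs: Self,
                           rounding rule: FloatingPointRoundingRule)
}

extension RoundedFMA {
  // Written once: forwards the plain form to the one with a rounding rule.
  mutating func addProduct(_ lhs: Self, _ rhs: Self) {
    addProduct(lhs, rhs, rounding: .toNearestOrEven)
  }
}

// A conforming type implements only the version that takes a rule.
struct Toy: RoundedFMA {
  var value: Double
  mutating func addProduct(_ lhs: Toy, _ rhs: Toy,
                           rounding rule: FloatingPointRoundingRule) {
    value = (value + lhs.value * rhs.value).rounded(rule)
  }
}

var t = Toy(value: 0.5)
t.addProduct(Toy(value: 2), Toy(value: 1))                // 2.5 -> 2.0 (ties to even)
print(t.value)                                            // 2.0
var u = Toy(value: 0.5)
u.addProduct(Toy(value: 2), Toy(value: 1), rounding: .up) // 2.5 -> 3.0
print(u.value)                                            // 3.0
```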

Here are the proposed APIs for the operations with rounding:

  public func adding(to other: Self, rounding rule: Rounding) -> Self
  public func subtracting(_ other: Self, rounding rule: Rounding) -> Self
  public func multiplying(by other: Self, rounding rule: Rounding) -> Self 
  public func dividing(by other: Self, rounding rule: Rounding) -> Self
  public mutating func formSquareRoot(rounding rule: Rounding)
  public mutating func addProduct(_ lhs: Self, _ rhs: Self,  rounding rule: Rounding)
  public init(stringLiteral value: StringLiteralType, rounding rule: Rounding)

Please advise if anything appears to be missing. Note: I also had a float literal initializer but it was suggested to remove this. Not sure if there should be both mutating and non-mutating operations. The squareRoot and addProduct methods seem to have both.

There also seem to be inconsistencies in how the -ed suffix is used. There is a rounded() method that returns a value, while addingProduct(_:_:) also returns a value but uses an -ing suffix.
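
For reference, the standard library's `Double` already shows all three naming pairs: a `form` prefix when the operation is described by a noun, and `-ing` or `-ed` for the non-mutating form of a verb, depending on which reads grammatically. The sketch below demonstrates them, then shows the usual way to derive the non-mutating member from the mutating one; `Tally` is a hypothetical type used only to show that pattern.

```swift
var x = 2.25
x.formSquareRoot()                      // mutating: x is now 1.5
let y = 2.25.squareRoot()               // non-mutating: 1.5

var acc = 1.0
acc.addProduct(2.0, 3.0)                // mutating: acc is now 7.0
let sum = 1.0.addingProduct(2.0, 3.0)   // non-mutating: 7.0

var z = 2.5
z.round()                               // mutating: z is now 3.0
let w = 2.5.rounded()                   // non-mutating: 3.0

// Deriving the non-mutating member from the mutating one.
struct Tally {
  var count: Int
  mutating func formIncrement(by n: Int) { count += n }
  func incremented(by n: Int) -> Tally {
    var copy = self
    copy.formIncrement(by: n)
    return copy
  }
}

print(x, y, acc, sum, z, w)   // 1.5 1.5 7.0 7.0 3.0 3.0
```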