[Proposal draft] Enhanced floating-point protocols

Enhanced floating-point protocols

Proposal: SE-NNNN <https://github.com/apple/swift-evolution/blob/master/proposals/NNNN-name.md>
Author(s): Stephen Canon <https://github.com/stephentyrone>
Status: Awaiting review
Review manager: TBD
Introduction

The current FloatingPoint protocol is quite limited, and provides only a small subset of the features expected of an IEEE 754 conforming type. This proposal expands the protocol to cover most of the expected basic operations, and adds a second protocol, BinaryFloatingPoint, that provides a number of useful tools for generic programming with the most commonly used types.

Motivation

Besides the high-level motivation provided by the introduction, the proposed prototype schema addresses a number of issues and requests that we've received from programmers:

FloatingPoint should conform to Equatable, and Comparable
FloatingPoint should conform to FloatLiteralConvertible
Deprecate the % operator for floating-point types
Provide basic constants (analogues of C's DBL_MAX, etc.)
Make Float80 conform to FloatingPoint

It also puts FloatingPoint much more tightly in sync with the work that is being done on protocols for Integers, which will make it easier to provide a uniform interface for arithmetic scalar types.

Detailed design

A new protocol, Arithmetic, is introduced that provides the most basic operations (add, subtract, multiply and divide) as well as Equatable and IntegerLiteralConvertible, and is conformed to by both integer and floating-point types.

There has been some resistance to adding such a protocol, owing to differences in behavior between floating point and integer arithmetic. While these differences make it difficult to write correct generic code that operates on all "arithmetic" types, it is nonetheless convenient to provide a single protocol that guarantees the availability of these basic operations. It is intended that "number-like" types should provide these APIs.

/// Arithmetic protocol declares methods backing binary arithmetic operators,
/// such as `+`, `-` and `*`; and their mutating counterparts. These methods
/// operate on arguments of the same type.
///
/// Both mutating and non-mutating operations are declared in the protocol, but
/// only the mutating ones are required. Should a conforming type omit
/// non-mutating implementations, they will be provided by a protocol extension.
/// Implementation in that case will copy `self`, perform a mutating operation
/// on it and return the resulting value.
public protocol Arithmetic: Equatable, IntegerLiteralConvertible {
  /// Initialize to zero
  init()

  /// The sum of `self` and `rhs`.
  // Arithmetic provides a default implementation of this method in terms
  // of the mutating `add` operation.
  @warn_unused_result
  func adding(rhs: Self) -> Self

  /// Adds `rhs` to `self`.
  mutating func add(rhs: Self)

  /// The result of subtracting `rhs` from `self`.
  // Arithmetic provides a default implementation of this method in terms
  // of the mutating `subtract` operation.
  @warn_unused_result
  func subtracting(rhs: Self) -> Self

  /// Subtracts `rhs` from `self`.
  mutating func subtract(rhs: Self)

  /// The product of `self` and `rhs`.
  // Arithmetic provides a default implementation of this method in terms
  // of the mutating `multiply` operation.
  @warn_unused_result
  func multiplied(by rhs: Self) -> Self

  /// Multiplies `self` by `rhs`.
  mutating func multiply(by rhs: Self)

  /// The quotient of `self` divided by `rhs`.
  // Arithmetic provides a default implementation of this method in terms
  // of the mutating `divide` operation.
  @warn_unused_result
  func divided(by rhs: Self) -> Self

  /// Divides `self` by `rhs`.
  mutating func divide(by rhs: Self)
}

/// SignedArithmetic protocol will only be conformed to by signed numbers,
/// otherwise it would be possible to negate an unsigned value.
///
/// The only method of this protocol has the default implementation in an
/// extension, that uses a parameterless initializer and subtraction.
public protocol SignedArithmetic : Arithmetic {
  func negate() -> Self
}
The usual arithmetic operators are then defined in terms of the implementation hooks provided by Arithmetic and SignedArithmetic, so providing those operations is all that is necessary for a type to present a "number-like" interface.
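As a rough illustration of that layering, the sketch below builds operators on top of mutating hooks. It is written in current Swift syntax rather than the draft's, and `SketchArithmetic` and `Meters` are hypothetical names invented for this example; they are not part of the proposal.

```swift
// Hypothetical stand-in for the proposed Arithmetic protocol. Only add and
// subtract are shown; multiply and divide would follow the same pattern.
protocol SketchArithmetic: Equatable, ExpressibleByIntegerLiteral {
    init()                              // initialize to zero
    mutating func add(_ rhs: Self)
    mutating func subtract(_ rhs: Self)
}

extension SketchArithmetic {
    // Default non-mutating forms: copy `self`, mutate the copy, return it.
    func adding(_ rhs: Self) -> Self {
        var result = self
        result.add(rhs)
        return result
    }

    func subtracting(_ rhs: Self) -> Self {
        var result = self
        result.subtract(rhs)
        return result
    }

    // Operators simply forward to the hooks.
    static func + (lhs: Self, rhs: Self) -> Self { return lhs.adding(rhs) }
    static func - (lhs: Self, rhs: Self) -> Self { return lhs.subtracting(rhs) }
    static func += (lhs: inout Self, rhs: Self) { lhs.add(rhs) }

    // Negation built from the parameterless initializer and subtraction.
    static prefix func - (operand: Self) -> Self {
        return Self().subtracting(operand)
    }
}

// A toy conforming type (hypothetical) that supplies only the mutating hooks.
struct Meters: SketchArithmetic {
    var value: Int
    init() { value = 0 }
    init(integerLiteral value: Int) { self.value = value }
    mutating func add(_ rhs: Meters) { value += rhs.value }
    mutating func subtract(_ rhs: Meters) { value -= rhs.value }
}

let a: Meters = 3
let b: Meters = 4
print((a + b).value)   // 7
print((-a).value)      // -3
```

The conforming type writes two small mutating methods and gets the non-mutating forms and the operators from the extension.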

The FloatingPoint protocol is split into two parts: FloatingPoint and BinaryFloatingPoint, which refines FloatingPoint. If decimal types were added at some future point, they would conform to DecimalFloatingPoint.

FloatingPoint is expanded to contain most of the IEEE 754 basic operations, as well as conformance to SignedArithmetic and Comparable.

/// A floating-point type that provides most of the IEEE 754 basic (clause 5)
/// operations. The base, precision, and exponent range are not fixed in
/// any way by this protocol, but it enforces the basic requirements of
/// any IEEE 754 floating-point type.
///
/// The BinaryFloatingPoint protocol refines these requirements and provides
/// some additional useful operations as well.
public protocol FloatingPoint: SignedArithmetic, Comparable {

  /// An unsigned integer type that can represent the significand of any value.
  ///
  /// The significand (http://en.wikipedia.org/wiki/Significand) is frequently
  /// also called the "mantissa", but this terminology is slightly incorrect
  /// (see the "Use of 'mantissa'" section on the linked Wikipedia page for
  /// more details). "Significand" is the preferred terminology in IEEE 754.
  associatedtype RawSignificand: UnsignedInteger

  /// 2 for binary floating-point types, 10 for decimal.
  ///
  /// A conforming type may use any integer radix, but values other than
  /// 2 or 10 are extraordinarily rare in practice.
  static var radix: Int { get }

  /// Positive infinity. Compares greater than all finite numbers.
  static var infinity: Self { get }

  /// A quiet NaN (not-a-number). Compares not equal to every value,
  /// including itself.
  static var nan: Self { get }

  /// NaN with specified `payload`.
  ///
  /// Compares not equal to every value, including itself. Most operations
  /// with a NaN operand will produce a NaN result. Note that it is generally
  /// not the case that all possible significand values are valid
  /// NaN `payloads`. `FloatingPoint` types should either treat inadmissible
  /// payloads as zero, or mask them to create an admissible payload.
  @warn_unused_result
  static func nan(payload payload: RawSignificand, signaling: Bool) -> Self

  /// The greatest finite number.
  ///
  /// Compares greater than or equal to all finite numbers, but less than
  /// infinity. Corresponds to the C macros `FLT_MAX`, `DBL_MAX`, etc.
  /// The naming of those macros is slightly misleading, because infinity
  /// is greater than this value.
  static var greatestFiniteMagnitude: Self { get }

  // NOTE: Rationale for "ulp" instead of "epsilon":
  // We do not use that name because it is ambiguous at best and misleading
  // at worst:
  //
  // - Historically several definitions of "machine epsilon" have commonly
  // been used, which differ by up to a factor of two or so. By contrast
  // "ulp" is a term with a specific unambiguous definition.
  //
  // - Some languages have used "epsilon" to refer to wildly different values,
  // such as `leastMagnitude`.
  //
  // - Inexperienced users often believe that "epsilon" should be used as a
  // tolerance for floating-point comparisons, because of the name. It is
  // nearly always the wrong value to use for this purpose.

  /// The unit in the last place of 1.0.
  ///
  /// This is the weight of the least significant bit of the significand of 1.0,
  /// or the positive difference between 1.0 and the next greater representable
  /// number. Corresponds to the C macros `FLT_EPSILON`, `DBL_EPSILON`, etc.
  static var ulp: Self { get }

  /// The unit in the last place of `self`.
  ///
  /// This is the unit of the least significant digit in the significand of
  /// `self`. For most numbers `x`, this is the difference between `x` and
  /// the next greater (in magnitude) representable number. There are some
  /// edge cases to be aware of:
  ///
  /// - `greatestFiniteMagnitude.ulp` is a finite number, even though
  /// the next greater representable value is `infinity`.
  /// - `x.ulp` is `NaN` if `x` is not a finite number.
  /// - If `x` is very small in magnitude, then `x.ulp` may be a subnormal
  /// number. On targets that do not support subnormals, `x.ulp` may be
  /// flushed to zero.
  ///
  /// This quantity, or a related quantity, is sometimes called "epsilon" or
  /// "machine epsilon". We avoid that name because it has different meanings
  /// in different languages, which can lead to confusion, and because it
  /// suggests that it is a good tolerance to use for comparisons,
  /// which it almost never is.
  ///
  /// (See Machine epsilon - Wikipedia for more detail)
  var ulp: Self { get }

  /// The least positive normal number.
  ///
  /// Compares less than or equal to all positive normal numbers. There may
  /// be smaller positive numbers, but they are "subnormal", meaning that
  /// they are represented with less precision than normal numbers.
  /// Corresponds to the C macros `FLT_MIN`, `DBL_MIN`, etc. The naming of
  /// those macros is slightly misleading, because subnormals, zeros, and
  /// negative numbers are smaller than this value.
  static var leastNormalMagnitude: Self { get }

  /// The least positive number.
  ///
  /// Compares less than or equal to all positive numbers, but greater than
  /// zero. If the target supports subnormal values, this is smaller than
  /// `leastNormalMagnitude`; otherwise they are equal.
  static var leastMagnitude: Self { get }

  /// `true` iff the signbit of `self` is set. Implements the IEEE 754
  /// `signbit` operation.
  ///
  /// Note that this is not the same as `self < 0`. In particular, this
  /// property is true for `-0` and some NaNs, both of which compare not
  /// less than zero.
  // TODO: strictly speaking a bit and a bool are slightly different
  // concepts. Is another name more appropriate for this property?
  // `isNegative` is incorrect because of -0 and NaN. `isSignMinus` might
  // be acceptable, but isn't great. `signBit` is the IEEE 754 name.
  var signBit: Bool { get }

  /// The integer part of the base-r logarithm of the magnitude of `self`,
  /// where r is the radix (2 for binary, 10 for decimal). Implements the
  /// IEEE 754 `logB` operation.
  ///
  /// Edge cases:
  ///
  /// - If `x` is zero, then `x.exponent` is `Int.min`.
  /// - If `x` is +/-infinity or NaN, then `x.exponent` is `Int.max`.
  var exponent: Int { get }

  /// The significand satisfies:
  ///
  /// ~~~
  /// self = (signBit ? -1 : 1) * significand * radix**exponent
  /// ~~~
  ///
  /// If radix is 2 (the most common case), then for finite non-zero numbers
  /// `1 <= significand` and `significand < 2`. For other values of `x`,
  /// `x.significand` is defined as follows:
  ///
  /// - If `x` is zero, then `x.significand` is 0.0.
  /// - If `x` is infinity, then `x.significand` is 1.0.
  /// - If `x` is NaN, then `x.significand` is NaN.
  ///
  /// For all floating-point `x`, if we define y by:
  ///
  /// ~~~
  /// let y = Self(signBit: x.signBit, exponent: x.exponent,
  /// significand: x.significand)
  /// ~~~
  ///
  /// then `y` is equivalent to `x`, meaning that `y` is `x` canonicalized.
  var significand: Self { get }

  /// Initialize from signBit, exponent, and significand.
  ///
  /// The result is:
  ///
  /// ~~~
  /// (signBit ? -1 : 1) * significand * radix**exponent
  /// ~~~
  ///
  /// (where `**` is exponentiation) computed as if by a single correctly-
  /// rounded floating-point operation. If this value is outside the
  /// representable range of the type, overflow or underflow occurs, and zero,
  /// a subnormal value, or infinity may result, as with any basic operation.
  /// Other edge cases:
  ///
  /// - If `significand` is zero or infinite, the result is zero or infinite,
  /// regardless of the value of `exponent`.
  ///
  /// - If `significand` is NaN, the result is NaN.
  ///
  /// Note that for any floating-point `x` the result of
  ///
  /// `Self(signBit: x.signBit,
  /// exponent: x.exponent,
  /// significand: x.significand)`
  ///
  /// is "the same" as `x`; it is `x` canonicalized.
  ///
  /// Because of these properties, this initializer implements the IEEE 754
  /// `scaleB` operation.
  init(signBit: Bool, exponent: Int, significand: Self)

  /// A floating point value whose exponent and significand are taken from
  /// `magnitude` and whose signBit is taken from `signOf`. Implements the
  /// IEEE 754 `copysign` operation.
  // TODO: better argument names would be great.
  init(magnitudeOf magnitude: Self, signOf: Self)

  /// The least representable value that compares greater than `self`.
  ///
  /// - If `x` is `-infinity`, then `x.nextUp` is `-greatestFiniteMagnitude`.
  /// - If `x` is `-leastMagnitude`, then `x.nextUp` is `-0.0`.
  /// - If `x` is zero, then `x.nextUp` is `leastMagnitude`.
  /// - If `x` is `greatestFiniteMagnitude`, then `x.nextUp` is `infinity`.
  /// - If `x` is `infinity` or `NaN`, then `x.nextUp` is `x`.
  var nextUp: Self { get }

  /// The greatest representable value that compares less than `self`.
  ///
  /// `x.nextDown` is equivalent to `-(-x).nextUp`
  var nextDown: Self { get }

  /// Remainder of `self` divided by `other`.
  ///
  /// For finite `self` and `other`, the remainder `r` is defined by
  /// `r = self - other*n`, where `n` is the integer nearest to `self/other`.
  /// (Note that `n` is *not* `self/other` computed in floating-point
  /// arithmetic, and that `n` may not even be representable in any available
  /// integer type). If `self/other` is exactly halfway between two integers,
  /// `n` is chosen to be even.
  ///
  /// It follows that if `self` and `other` are finite numbers, the remainder
  /// `r` satisfies `-|other|/2 <= r` and `r <= |other|/2`.
  ///
  /// `formRemainder` is always exact, and therefore is not affected by
  /// rounding modes.
  mutating func formRemainder(dividingBy other: Self)

  /// Remainder of `self` divided by `other` using truncating division.
  ///
  /// If `self` and `other` are finite numbers, the truncating remainder
  /// `r` has the same sign as `self` and is strictly smaller in magnitude
  /// than `other`. It satisfies `r = self - other*n`, where `n` is the
  /// integral part of `self/other`.
  ///
  /// `formTruncatingRemainder` is always exact, and therefore is not
  /// affected by rounding modes.
  mutating func formTruncatingRemainder(dividingBy other: Self)

  /// Mutating form of square root.
  mutating func formSquareRoot( )

  /// Fused multiply-add, accumulating the product of `lhs` and `rhs` to `self`.
  mutating func addProduct(lhs: Self, _ rhs: Self)

  /// Remainder of `self` divided by `other`.
  @warn_unused_result
  func remainder(dividingBy other: Self) -> Self

  /// Remainder of `self` divided by `other` using truncating division.
  @warn_unused_result
  func truncatingRemainder(dividingBy other: Self) -> Self

  /// Square root of `self`.
  @warn_unused_result
  func squareRoot( ) -> Self

  /// `self + lhs*rhs` computed without intermediate rounding.
  @warn_unused_result
  func addingProduct(lhs: Self, _ rhs: Self) -> Self

  /// The minimum of `x` and `y`. Implements the IEEE 754 `minNum` operation.
  ///
  /// Returns `x` if `x <= y`, `y` if `y < x`, and whichever of `x` or `y`
  /// is a number if the other is NaN. The result is NaN only if both
  /// arguments are NaN.
  ///
  /// This function is an implementation hook to be used by the free function
  /// min(Self, Self) -> Self so that we get the IEEE 754 behavior with regard
  /// to NaNs.
  @warn_unused_result
  static func minimum(x: Self, _ y: Self) -> Self

  /// The maximum of `x` and `y`. Implements the IEEE 754 `maxNum` operation.
  ///
  /// Returns `x` if `x >= y`, `y` if `y > x`, and whichever of `x` or `y`
  /// is a number if the other is NaN. The result is NaN only if both
  /// arguments are NaN.
  ///
  /// This function is an implementation hook to be used by the free function
  /// max(Self, Self) -> Self so that we get the IEEE 754 behavior with regard
  /// to NaNs.
  @warn_unused_result
  static func maximum(x: Self, _ y: Self) -> Self

  /// Whichever of `x` or `y` has lesser magnitude. Implements the IEEE 754
  /// `minNumMag` operation.
  ///
  /// Returns `x` if abs(x) <= abs(y), `y` if abs(y) < abs(x), and whichever of
  /// `x` or `y` is a number if the other is NaN. The result is NaN
  /// only if both arguments are NaN.
  @warn_unused_result
  static func minimumMagnitude(x: Self, _ y: Self) -> Self

  /// Whichever of `x` or `y` has greater magnitude. Implements the IEEE 754
  /// `maxNumMag` operation.
  ///
  /// Returns `x` if abs(x) >= abs(y), `y` if abs(y) > abs(x), and whichever of
  /// `x` or `y` is a number if the other is NaN. The result is NaN
  /// only if both arguments are NaN.
  @warn_unused_result
  static func maximumMagnitude(x: Self, _ y: Self) -> Self

  /// IEEE 754 equality predicate.
  ///
  /// -0 compares equal to +0, and NaN compares not equal to anything,
  /// including itself.
  @warn_unused_result
  func isEqual(to other: Self) -> Bool

  /// IEEE 754 less-than predicate.
  ///
  /// NaN compares not less than anything. -infinity compares less than
  /// all values except for itself and NaN. Everything except for NaN and
  /// +infinity compares less than +infinity.
  @warn_unused_result
  func isLess(than other: Self) -> Bool

  /// IEEE 754 less-than-or-equal predicate.
  ///
  /// NaN compares not less than or equal to anything, including itself.
  /// -infinity compares less than or equal to everything except NaN.
  /// Everything except NaN compares less than or equal to +infinity.
  ///
  /// Because of the existence of NaN in FloatingPoint types, trichotomy does
  /// not hold, which means that `x < y` and `!(y <= x)` are not equivalent.
  /// This is why `isLessThanOrEqual(to:)` is a separate implementation hook
  /// in the protocol.
  ///
  /// Note that this predicate does not impose a total order. The `totalOrder`
  /// predicate provides a refinement satisfying that criterion.
  @warn_unused_result
  func isLessThanOrEqual(to other: Self) -> Bool

  /// IEEE 754 unordered predicate. True if either `self` or `other` is NaN,
  /// and false otherwise.
  @warn_unused_result
  func isUnordered(with other: Self) -> Bool

  /// True if and only if `self` is normal.
  ///
  /// A normal number uses the full precision available in the format. Zero
  /// is not a normal number.
  var isNormal: Bool { get }

  /// True if and only if `self` is finite.
  ///
  /// If `x.isFinite` is `true`, then one of `x.isZero`, `x.isSubnormal`, or
  /// `x.isNormal` is also `true`, and `x.isInfinite` and `x.isNan` are
  /// `false`.
  var isFinite: Bool { get }

  /// True iff `self` is zero. Equivalent to `self == 0`.
  var isZero: Bool { get }

  /// True if and only if `self` is subnormal.
  ///
  /// A subnormal number does not use the full precision available to normal
  /// numbers of the same format. Zero is not a subnormal number.
  var isSubnormal: Bool { get }

  /// True if and only if `self` is infinite.
  ///
  /// Note that `isFinite` and `isInfinite` do not form a dichotomy, because
  /// they are not total. If `x` is `NaN`, then both properties are `false`.
  var isInfinite: Bool { get }

  /// True if and only if `self` is NaN ("not a number").
  var isNan: Bool { get }

  /// True if and only if `self` is a signaling NaN.
  var isSignalingNan: Bool { get }

  /// The IEEE 754 "class" of this type.
  var floatingPointClass: FloatingPointClassification { get }

  /// True if and only if `self` is canonical.
  ///
  /// Every floating-point value of type Float or Double is canonical, but
  /// non-canonical values of type Float80 exist, and non-canonical values
  /// may exist for other types that conform to FloatingPoint.
  ///
  /// The non-canonical Float80 values are known as "pseudo-denormal",
  /// "unnormal", "pseudo-infinity", and "pseudo-NaN".
  /// (Extended precision - Wikipedia)
  var isCanonical: Bool { get }

  /// True if and only if `self` precedes `other` in the IEEE 754 total order
  /// relation.
  ///
  /// This relation is a refinement of `<=` that provides a total order on all
  /// values of type `Self`, including non-canonical encodings, signed zeros,
  /// and NaNs. Because it is used much less frequently than the usual
  /// comparisons, there is no operator form of this relation.
  @warn_unused_result
  func totalOrder(with other: Self) -> Bool

  /// True if and only if `abs(self)` precedes `abs(other)` in the IEEE 754
  /// total order relation.
  @warn_unused_result
  func totalOrderMagnitude(with other: Self) -> Bool

  /// The closest representable value to the argument.
  init<Source: Integer>(_ value: Source)

  /// Fails if the argument cannot be exactly represented.
  init?<Source: Integer>(exactly value: Source)
}
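
A few of the requirements above can be exercised on `Double`. This is only a sketch against the standard library as it later shipped, so some names are assumptions relative to this draft: `sign` (a `FloatingPointSign`) stands in for `signBit`, and the initializer label is `sign:` rather than `signBit:`.

```swift
let x = 6.5   // 1.625 * 2^2

// Decomposition into sign, exponent, and significand, then reassembly.
print(x.exponent)      // 2
print(x.significand)   // 1.625
let y = Double(sign: x.sign, exponent: x.exponent, significand: x.significand)
print(y == x)          // true

// Documented edge cases for `exponent`.
print((0.0).exponent == Int.min)             // true
print(Double.infinity.exponent == Int.max)   // true

// `ulp` is the spacing between `x` and the next value greater in magnitude.
print(x.nextUp - x == x.ulp)                                // true
print(Double.greatestFiniteMagnitude.nextUp == .infinity)   // true
print(Double.greatestFiniteMagnitude.ulp.isFinite)          // true

// minNum semantics: a NaN operand is ignored unless both operands are NaN.
print(Double.minimum(1.0, .nan))   // 1.0
```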
The BinaryFloatingPoint protocol provides a number of additional APIs that only make sense for types with fixed radix 2:

/// A radix-2 (binary) floating-point type that follows the IEEE 754 encoding
/// conventions.
public protocol BinaryFloatingPoint: FloatingPoint {

  /// The number of bits used to represent the exponent.
  ///
  /// Following IEEE 754 encoding convention, the exponent bias is:
  ///
  /// bias = 2**(exponentBitCount-1) - 1
  ///
  /// The least normal exponent is `1-bias` and the largest finite exponent
  /// is `bias`. The all-zeros exponent is reserved for subnormals and zeros,
  /// and the all-ones exponent is reserved for infinities and NaNs.
  static var exponentBitCount: Int { get }

  /// For fixed-width floating-point types, this is the number of fractional
  /// significand bits.
  ///
  /// For extensible floating-point types, `significandBitCount` should be
  /// the maximum allowed significand width (without counting any leading
  /// integral bit of the significand). If there is no upper limit, then
  /// `significandBitCount` should be `Int.max`.
  ///
  /// Note that `Float80.significandBitCount` is 63, even though 64 bits
  /// are used to store the significand in the memory representation of a
  /// `Float80` (unlike other floating-point types, `Float80` explicitly
  /// stores the leading integral significand bit, but the
  /// `BinaryFloatingPoint` APIs provide an abstraction so that users don't
  /// need to be aware of this detail).
  static var significandBitCount: Int { get }

  /// The raw encoding of the exponent field of the floating-point value.
  var exponentBitPattern: UInt { get }

  /// The raw encoding of the significand field of the floating-point value.
  ///
  /// `significandBitPattern` does *not* include the leading integral bit of
  /// the significand, even for types like `Float80` that store it explicitly.
  var significandBitPattern: RawSignificand { get }

  /// Combines `signBit`, `exponent` and `significand` bit patterns to produce
  /// a floating-point value.
  init(signBit: Bool,
       exponentBitPattern: UInt,
       significandBitPattern: RawSignificand)

  /// The least-magnitude member of the binade of `self`.
  ///
  /// If `x` is `+/-significand * 2**exponent`, then `x.binade` is
  /// `+/- 2**exponent`; i.e. the floating point number with the same sign
  /// and exponent, but with a significand of 1.0.
  var binade: Self { get }

  /// The number of bits required to represent significand.
  ///
  /// If `self` is not a finite non-zero number, `significandWidth` is
  /// `-1`. Otherwise, it is the number of bits required to represent the
  /// significand exactly (less `1` because common formats represent one bit
  /// implicitly).
  var significandWidth: Int { get }

  @warn_unused_result
  func isEqual<Other: BinaryFloatingPoint>(to other: Other) -> Bool

  @warn_unused_result
  func isLess<Other: BinaryFloatingPoint>(than other: Other) -> Bool

  @warn_unused_result
  func isLessThanOrEqual<Other: BinaryFloatingPoint>(to other: Other) -> Bool

  @warn_unused_result
  func isUnordered<Other: BinaryFloatingPoint>(with other: Other) -> Bool

  @warn_unused_result
  func totalOrder<Other: BinaryFloatingPoint>(with other: Other) -> Bool

  /// `value` rounded to the closest representable value.
  init<Source: BinaryFloatingPoint>(_ value: Source)

  /// Fails if `value` cannot be represented exactly as `Self`.
  init?<Source: BinaryFloatingPoint>(exactly value: Source)
}
Float, Double, Float80 and CGFloat will conform to all of these protocols.
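
To make the encoding-level APIs concrete, here is a small sketch using `Double`. It assumes the standard library as it later shipped, where the initializer takes `sign:` (a `FloatingPointSign`) instead of the draft's `signBit:`; the other names match the draft.

```swift
// IEEE 754 exponent bias derived from the exponent field width.
let bias = (1 << (Double.exponentBitCount - 1)) - 1
print(Double.exponentBitCount)      // 11
print(Double.significandBitCount)   // 52
print(bias)                         // 1023

let x = 6.5   // binary 1.101 * 2^2

// The exponent field stores the exponent plus the bias.
print(x.exponentBitPattern)         // 1025

// The fraction bits .101 sit at the top of the 52-bit significand field;
// the leading integral bit is not stored.
print(x.significandBitPattern == 0b101 << 49)   // true

// Reassembling the raw fields gives back the same value.
let rebuilt = Double(sign: x.sign,
                     exponentBitPattern: x.exponentBitPattern,
                     significandBitPattern: x.significandBitPattern)
print(rebuilt == x)                 // true

print(x.binade)                     // 4.0
print(x.significandWidth)           // 3
print(Double.nan.significandWidth)  // -1
```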

A small portion of the implementation of these APIs is dependent on new Integer protocols that will be proposed separately. Everything else is implemented in draft form on the branch floating-point-revision of my fork <https://github.com/stephentyrone/swift>.

Impact on existing code

The % operator is no longer available for FloatingPoint types. We don't believe that it was widely used correctly, and the operation is still available via the formTruncatingRemainder method for people who need it.
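
For readers migrating code, a short sketch of the two remainder flavors on `Double` may help. It assumes the method names used in this draft, which are also the names in the standard library as it later shipped.

```swift
let a = 8.0
let b = 3.0

// Truncating remainder: n = trunc(8/3) = 2, so r = 8 - 3*2.
print(a.truncatingRemainder(dividingBy: b))      // 2.0

// IEEE 754 remainder: n is the integer nearest to 8/3, which is 3,
// so r = 8 - 3*3.
print(a.remainder(dividingBy: b))                // -1.0

// The truncating remainder follows the sign of the dividend.
print((-a).truncatingRemainder(dividingBy: b))   // -2.0

// Ties pick the even n: 5/2 = 2.5 gives n = 2, so r = 5 - 2*2.
print((5.0).remainder(dividingBy: 2.0))          // 1.0

// Mutating form.
var r = 7.0
r.formTruncatingRemainder(dividingBy: 2.0)
print(r)                                         // 1.0
```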

To follow the naming guidelines, NaN and isNaN are replaced with nan and isNan.

The redundant property quietNaN is removed.

isSignaling is renamed isSignalingNan.

Alternatives considered

N/A.

* I do use % for floating point but not as much as I first thought before I started searching through my code after reading your e-mail. But when I do use it, it's nice to have a really familiar symbol rather than a big word. What were the ways that it was used incorrectly? Do you have some examples?

* I don't quite get how equatable is going to work. Do you mind explaining that in more detail?

-- E

On Apr 14, 2016, at 5:55 PM, Stephen Canon via swift-evolution <swift-evolution@swift.org> wrote:
Impact on existing code

The % operator is no longer available for FloatingPoint types. We don't believe that it was widely used correctly, and the operation is still available via the formTruncatingRemainder method for people who need it.

First of all: I do *not* do crazy things with floating-point numbers, so there's a good chance I'm missing the point with some of this. Consider anything having to do with NaNs, subnormals, or other such strange denizens of the FPU to be prefixed with "I may be totally missing the point, but…"

public protocol Arithmetic

Is there a rationale for the name "Arithmetic"? There's probably nothing wrong with it, but I would have guessed you'd use the name "Number".

: Equatable, IntegerLiteralConvertible

Are there potential conforming types which aren't Comparable?

func adding(rhs: Self) -> Self
mutating func add(rhs: Self)

Is there a reason we're introducing methods and calling them from the operators, rather than listing the operators themselves as requirements?

  func negate() -> Self

Should this be `negated()`? Should there be a mutating `negate()` variant, even if we won't have an operator for it?

(If a mutating `negate` would be an attractive nuisance, we can use `negative()`/`formNegative()` instead.)

  /// NaN `payloads`. `FloatingPoint` types should either treat inadmissible
  /// payloads as zero, or mask them to create an admissible payload.
  static func nan(payload payload: RawSignificand, signaling: Bool) -> Self

This seems unusually tolerant of bad inputs. Should this instead be a precondition, and have an (elidable in unchecked mode) trap if it's violated?

static var greatestFiniteMagnitude: Self { get }
static var leastNormalMagnitude: Self { get }
static var leastMagnitude: Self { get }

Reading these, I find the use of "least" a little bit misleading—it seems like they should be negative. I wonder if instead, we can use ClosedIntervals/ClosedRanges to group together related values:

  static var positiveNormals: ClosedRange<Self> { get }
  static var positiveSubnormals: ClosedRange<Self> { get }

  Double.positiveNormals.upperBound // DBL_MAX
  Double.positiveNormals.lowerBound // DBL_MIN
  Double.positiveSubnormals.upperBound // Self.positiveNormals.lowerBound.nextDown
  Double.positiveSubnormals.lowerBound // 0.nextUp

  // Alternatively, you could have `positives`, running from 0.nextUp to infinity

Technically, you could probably implement calls like e.g. isNormal in terms of the positiveNormals property, but I'm sure separate calls are much, much faster.

(It might also be helpful if you could negate signed ClosedIntervals, which would negate and swap the bounds.)
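
To make the range idea concrete, here is a rough sketch as an extension on `Double`. `positiveNormals` and `positiveSubnormals` are the hypothetical names from the suggestion above, not existing API, and the bounds use the constant names from the standard library as it later shipped (`leastNonzeroMagnitude` where the draft says `leastMagnitude`). It also assumes a target that supports subnormals; without them the second range would be empty and could not be formed.

```swift
extension Double {
    // Hypothetical: all positive normal values, DBL_MIN through DBL_MAX.
    static var positiveNormals: ClosedRange<Double> {
        return leastNormalMagnitude...greatestFiniteMagnitude
    }

    // Hypothetical: all positive subnormal values.
    // Assumes subnormals are supported on the target.
    static var positiveSubnormals: ClosedRange<Double> {
        return leastNonzeroMagnitude...leastNormalMagnitude.nextDown
    }
}

// `isNormal` expressed through the range, as speculated above.
let x = -1.5
print(Double.positiveNormals.contains(x.magnitude) == x.isNormal)   // true

let tiny = Double.leastNonzeroMagnitude
print(Double.positiveSubnormals.contains(tiny))                     // true
print(Double.positiveNormals.contains(tiny))                        // false
```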

public protocol FloatingPoint: SignedArithmetic, Comparable {
  func isLess(than other: Self) -> Bool
  func totalOrder(with other: Self) -> Bool

Swift 2's Comparable demands a strict total order. However, the documentation here seems to imply that totalOrder is *not* what you get from the < operator. Is something getting reshuffled here?
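
The gap between `<` and the total order shows up with NaN and signed zeros. A small sketch, assuming a toolchain where the total-order predicate is spelled `isTotallyOrdered(belowOrEqualTo:)` (the name it later shipped under) rather than the draft's `totalOrder(with:)`:

```swift
let nan = Double.nan

// Trichotomy fails: none of <, ==, > holds when one operand is NaN.
print(nan < 1.0, nan == 1.0, nan > 1.0)   // false false false

// So `x < y` and `!(y <= x)` disagree.
print(nan < 1.0, !(1.0 <= nan))           // false true

// The total order places a positive NaN above every number.
print((1.0).isTotallyOrdered(belowOrEqualTo: nan))   // true
print(nan.isTotallyOrdered(belowOrEqualTo: 1.0))     // false

// It also separates the two zeros, which `==` treats as equal.
let negZero = -0.0
print(negZero == 0.0)                                    // true
print(negZero.isTotallyOrdered(belowOrEqualTo: 0.0))     // true
print((0.0).isTotallyOrdered(belowOrEqualTo: negZero))   // false
```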

  init<Source: Integer>(_ value: Source)
  init?<Source: Integer>(exactly value: Source)

  init<Source: BinaryFloatingPoint>(_ value: Source)
  init?<Source: BinaryFloatingPoint>(exactly value: Source)

It's great to have both of these, but I wonder how they're going to be implemented—if Integer can be either signed or unsigned, I'm not sure how you get the top bit of an unsigned integer out.

Also, since `init(_:)` is lossy and `init(exactly:)` is not, shouldn't their names technically be switched? Or perhaps `init(_:)` should be exact and trapping, `init(exactly:)` should be failable, and `init(closest:)` should always return something or other?
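
For reference, the difference between the two initializer flavors can be seen on `Double` and `Float`. This sketch assumes the behavior described in the draft (round to the closest representable value versus fail if inexact) and the initializer spellings of the standard library as it later shipped.

```swift
let big = Int64.max   // 2^63 - 1, more significant bits than a Double holds

// The unlabeled initializer rounds to the closest representable value.
print(Double(big) == 9223372036854775808.0)   // true (rounded to 2^63)

// The `exactly:` initializer fails instead of rounding.
print(Double(exactly: big) == nil)            // true
print(Double(exactly: Int64(1) << 53) != nil) // true (2^53 is representable)

// Unsigned sources work as well, top bit included.
print(Double(UInt64.max) == 18446744073709551616.0)   // true (rounded to 2^64)

// The same pattern applies between floating-point types.
print(Float(exactly: 0.5 as Double) == 0.5)   // true
print(Float(exactly: 0.1 as Double) == nil)   // true
```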

--
Brent Royal-Gordon
Architechies

This proposal looks really really great, let me know when you want to start the review process (or just submit a PR for the -evolution repo) and I’ll happily review manage it for you.

Provide basic constants (analogues of C's DBL_MAX, etc.)

Nice, have you considered adding pi/e and other common constants? I’d really really like to see use of M_PI go away… :-)

/// SignedArithmetic protocol will only be conformed to by signed numbers,
/// otherwise it would be possible to negate an unsigned value.
///
/// The only method of this protocol has the default implementation in an
/// extension, that uses a parameterless initializer and subtraction.
public protocol SignedArithmetic : Arithmetic {
  func negate() -> Self

Should this be negated / negate? negate() seems like an in-place mutating version.

/// A floating-point type that provides most of the IEEE 754 basic (clause 5)

Dumb Q, but is it “IEEE 754” or “IEEE-754”?

/// operations. The base, precision, and exponent range are not fixed in
/// any way by this protocol, but it enforces the basic requirements of
/// any IEEE 754 floating-point type.
///
/// The BinaryFloatingPoint protocol refines these requirements and provides
/// some additional useful operations as well.
public protocol FloatingPoint: SignedArithmetic, Comparable {

  static var ulp: Self { get }
  var ulp: Self { get }

Swift supports instance and type members with the same names, but this is controversial, leads to confusion, and may go away in the future. It would be great to avoid this in your design.

  // TODO: strictly speaking a bit and a bool are slightly different
  // concepts. Is another name more appropriate for this property?
  // `isNegative` is incorrect because of -0 and NaN. `isSignMinus` might
  // be acceptable, but isn't great. `signBit` is the IEEE 754 name.
  var signBit: Bool { get }

I think you have this right, by calling it a bit and typing it as a Bool; using a Bool to represent a specific bit from Self seems right.

  /// The significand satisfies:
  ///
  /// ~~~
  /// self = (signBit ? -1 : 1) * significand * radix**exponent

** isn’t a defined Swift operator; it would be nice to change the comment to use something a Swift programmer would recognize.

  /// ~~~
  ///
  /// If radix is 2 (the most common case), then for finite non-zero numbers
  /// `1 <= significand` and `significand < 2`. For other values of `x`,
  /// `x.significand` is defined as follows:
  ///
  /// - If `x` is zero, then `x.significand` is 0.0.
  /// - If `x` is infinity, then `x.significand` is 1.0.
  /// - If `x` is NaN, then `x.significand` is NaN.

...

  var significand: Self { get }

I’m certainly not a floating point guru, but I would have expected significand to be of type RawSignificand, and thought that the significand of a NaN would return its payload. Does this approach make sense?

… later: I see that you have this on the binary FP type, so I assume there is a good reason for this :-)

  /// Because of these properties, this initializer implements the IEEE 754
  /// `scaleB` operation.
  init(signBit: Bool, exponent: Int, significand: Self)

Stylistic question, but why list the initializers after members?

  /// Mutating form of square root.
  mutating func formSquareRoot( )

extra space in the parens?

  /// Fused multiply-add, accumulating the product of `lhs` and `rhs` to `self`.
  mutating func addProduct(lhs: Self, _ rhs: Self)

Stylistic, but it is easier to read with the mutating next to the non-mutating pairs.

  /// True if and only if `self` is subnormal.
  ///
  /// A subnormal number does not use the full precision available to normal
  /// numbers of the same format. Zero is not a subnormal number.
  var isSubnormal: Bool { get }

I’m used to this being called a “denormal”, but I suspect that “subnormal” is actually the right name? Maybe it would be useful to mention “frequently known as denormal” in the comment, like you did with mantissa earlier.

Impact on existing code

The % operator is no longer available for FloatingPoint types. We don't believe that it was widely used correctly, and the operation is still available via the formTruncatingRemainder method for people who need it.

Also worth mentioning that this operator is not supported in popular languages like C either.

-Chris

···

On Apr 14, 2016, at 4:55 PM, Stephen Canon via swift-evolution <swift-evolution@swift.org> wrote:

In general, I think this is fantastic. In particular, I *really* like the notion that `BinaryFloatingPoint` conforms to `FloatingPoint`. I would do a few things differently, though:

public protocol FloatingPoint: SignedArithmetic, Comparable {
  ...
  /// The greatest finite number.
  ///
  /// Compares greater than or equal to all finite numbers, but less than
  /// infinity. Corresponds to the C macros `FLT_MAX`, `DBL_MAX`, etc.
  /// The naming of those macros is slightly misleading, because infinity
  /// is greater than this value.
  static var greatestFiniteMagnitude: Self { get }
  ...
}

Why put this in FloatingPoint? The concept is valid for any real type (IMHO, it’s valid for rectangular Complex/Quaternion/etc types as well... I’m not sure if it is for the various polar formats, though). I think a better place for it is either in `Arithmetic`, or another protocol to which `Arithmetic` conforms:
protocol HasMinAndMaxFiniteValue {
    static var maxFinite: Self {get} // I think “max”/“min” are clear enough (especially if the docs are explicit), but I understand the objection
    static var minFinite: Self {get} // 0 for unsigned types
}
This would unify the syntax for getting a numeric type’s min or max finite value across all the built-in numeric types (this means that `Int.max` would become `Int.maxFinite`). Similarly, IMHO infinity shouldn’t be tied to floating point types. While it’s true that the *native* integer types don’t support infinity, arbitrary precision integer types might.
protocol HasInfinity {
    static var infinity: Self {get}
}

/// Arithmetic protocol declares methods backing binary arithmetic operators,
/// such as `+`, `-` and `*`; and their mutating counterparts. These methods
/// operate on arguments of the same type.
...
public protocol Arithmetic: Equatable, IntegerLiteralConvertible {
  init()
  func adding(rhs: Self) -> Self
  mutating func add(rhs: Self)
  func subtracting(rhs: Self) -> Self
  mutating func subtract(rhs: Self)
  func multiplied(by rhs: Self) -> Self
  mutating func multiply(by rhs: Self)
  func divided(by rhs: Self) -> Self
  mutating func divide(by rhs: Self)
}

I’d restructure this a bit:
protocol Arithmetic { //AFAIK *all* numeric types should be able to do these
    init()
    func adding(rhs: Self) -> Self
    mutating func add(rhs: Self)
    func subtracting(rhs: Self) -> Self
    mutating func subtract(rhs: Self)
}
protocol ScalarArithmetic : Arithmetic { //These can be iffy for non-scalar types
    func multiplied(by rhs: Self) -> Self
    mutating func multiply(by rhs: Self)
    func divided(by rhs: Self) -> Self
    mutating func divide(by rhs: Self)
}

Multiplication isn't always defined for any two arbitrarily-dimensioned matrices (plus, there are so many reasonable matrix “subtypes” that there's no guarantee that the return type should always be the same), and I don’t think there’s a generally agreed-upon meaning for matrix division at all.

[Slight_Rabbit_Trail] For a while, I was trying to work around the issue in my own code by doing something like (this was before the change to “associatedtype”):
public protocol MathLibNumberType {
    typealias AddType
    typealias AddReturnType
    ...
    func +(_: Self, _: Self.AddType) -> Self.AddReturnType
    ...
}
But there was some problem when I got to this part:
public protocol ScalarType : MathLibNumberType {
    typealias AddType = Self
    typealias AddReturnType = Self
    ...
}
extension Int : ScalarType {}
I can’t remember what the exact problem was anymore. It’s been a while… I think maybe even pre-Swift 2. Hmm… maybe I should try it again...
[/Slight_Rabbit_Trail]

Anyway, those are my thoughts on the matter.

- Dave Sweeris

···

On Apr 14, 2016, at 6:55 PM, Stephen Canon via swift-evolution <swift-evolution@swift.org> wrote:

  /// The quotient of `self` divided by `rhs`.
  // Arithmetic provides a default implementation of this method in terms
  // of the mutating `divide` operation.
  @warn_unused_result
  func divided(by rhs: Self) -> Self

  /// Divides `self` by `rhs`.
  mutating func divide(by rhs: Self)

When dealing with integer arithmetic, I often find useful a `divmod` function which produces a (quotient, remainder) pair.
It could be argued that such a pair is the primary result of division on integers. It would be great to have such a function included in the design.
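
A minimal sketch of what such a function could look like for `Int`. The name `divmod` is hypothetical and not part of the proposal; the standard library that later shipped offers a similar `quotientAndRemainder(dividingBy:)` on integer types.

```swift
// Hypothetical helper; `divmod` is an illustrative name, not proposed API.
// It uses Swift's truncating integer division, so the remainder takes the
// sign of the dividend `a`.
func divmod(_ a: Int, _ b: Int) -> (quotient: Int, remainder: Int) {
    return (a / b, a % b)
}

let (q, r) = divmod(-7, 2)
```

With truncating division, `q` is `-3` and `r` is `-1`, and `q * 2 + r` reconstructs `-7`.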

/// SignedArithmetic protocol will only be conformed to by signed numbers,
/// otherwise it would be possible to negate an unsigned value.
///
/// The only method of this protocol has the default implementation in an
/// extension, that uses a parameterless initializer and subtraction.
public protocol SignedArithmetic : Arithmetic {
  func negate() -> Self
}

It might make sense to also have a

public protocol InvertibleArithmetic : Arithmetic {
  func inverted() -> Self
}

FloatingPoint would conform to this protocol, returning 1/x, while integer types would not.
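
As a rough sketch of how a conformance might look: the version below is standalone and drops the `Arithmetic` refinement, because that protocol exists only in this draft. Conforming `Double` directly is an assumption made for illustration.

```swift
// Standalone version of the suggested protocol (no `Arithmetic` parent here,
// since that protocol is only defined in the draft).
protocol InvertibleArithmetic {
    func inverted() -> Self
}

extension Double: InvertibleArithmetic {
    // Multiplicative inverse, 1/x. For 0.0 this yields +infinity rather
    // than trapping, following IEEE 754 division.
    func inverted() -> Double {
        return 1 / self
    }
}

let half = 2.0.inverted()
let inverseOfZero = 0.0.inverted()
```

Here `half` is `0.5` and `inverseOfZero` is `Double.infinity`.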

···

--
Nicola

+1; this is great!

I have nothing but good things to say about the proposal itself.

I have two smaller questions, however; I apologize if they are off-topic.

One is if there’s any ETA or similar for a glimpse at the “complete picture” of Swift’s revised numeric protocols; these floating-point protocols look really, really good, but this is also (I think) the first glimpse at the new `Arithmetic` protocol, and there’s also a new “Integer” protocol coming…and it’d be nice to get a sense of the complete vision here.

My other question is potentially subsumed by the above, but I want to raise it now: it’d be great if there was some standardized protocol/vocabulary to use when converting between various numeric representations that was:

- easy for custom numeric types to *adopt* correctly (e.g. if one were to write a fixed-point type, or a rational type, etc.)
- easy for non-experts to *use* correctly for non-expert purposes

…since such conversions from one representation to another are at least IMHO a dangerous area; if you know what you’re doing it’s not dangerous, but e.g. even if someone is only trying to go from Double -> Int:

- they probably aren’t an expert, doing expert numerical things
- they may not have a solid understanding of floating point (NaN, infinities, etc.)
- they thus may not know they may *need* to be careful here
- they may not know *how* to be careful, even if they know they *should* be
- they may not be able to be careful *correctly*, even if they attempt it

…and so it’d again be great if the revised numeric protocols allow as broad a range of such conversions as possible to be handled by generic code in the standard library.

It certainly looks like the `FloatingPoint` protocol itself provides enough information to allow an expert to write a generic version of most floating point -> integer conversion variants I can think of, but I’m not an expert…but it’d be great if e.g. there was some simpler protocol other custom numeric types could adopt to take advantage of expert-written generic conversions to other numeric types.

I can provide examples if this is unclear, and if it’s off-topic it can wait for another time.

This `FloatingPoint` revision itself looks really really good!

···

On Apr 14, 2016, at 6:55 PM, Stephen Canon via swift-evolution <swift-evolution@swift.org> wrote:

Enhanced floating-point protocols

The floating-point API design looks great. However, I'm concerned about providing a generic Arithmetic protocol:

···

On Apr 14, 2016, at 4:55 PM, Stephen Canon via swift-evolution <swift-evolution@swift.org> wrote:

A new protocol, Arithmetic, is introduced that provides the most basic operations (add, subtract, multiply and divide) as well as Equatable and IntegerLiteralConvertible, and is conformed to by both integer and floating-point types.

There has been some resistance to adding such a protocol, owing to differences in behavior between floating point and integer arithmetic. While these differences make it difficult to write correct generic code that operates on all "arithmetic" types, it is nonetheless convenient to provide a single protocol that guarantees the availability of these basic operations. It is intended that "number-like" types should provide these APIs.

There are many other things we could do because they're "convenient", but we don't because they're wrong or mislead users into design cul-de-sacs. For example, we could provide integer indexing into strings, which would certainly be convenient, but we don't do that because it would lead to misguided accidentally-quadratic algorithms all over the place. This feels like a similar accommodation—while convenient, it makes it too easy for users to write naive real-number-arithmetic code and apply it blindly to numeric representations with very different error and overflow behavior. By including "divides" in the protocol, you're also implying a common abstraction over two *completely different* operations—integer quotient and floating-point division don't share many properties other than unfortunately sharing an operator in C.
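
The concern about division can be illustrated with a short sketch. The `Divisible` protocol and the `halve` function are hypothetical stand-ins for the draft's `Arithmetic`, introduced only so that a single generic function can call `/` on both kinds of type.

```swift
// Hypothetical protocol standing in for the draft's `Arithmetic`.
protocol Divisible {
    static func / (lhs: Self, rhs: Self) -> Self
}

// Int and Double already provide a matching `/` operator.
extension Int: Divisible {}
extension Double: Divisible {}

// Naive "half of x", written once for every conforming type.
func halve<T: Divisible>(_ x: T, two: T) -> T {
    return x / two
}

// Same generic code, two different operations:
let intHalf: Int = halve(7, two: 2)              // truncating integer quotient
let doubleHalf: Double = halve(7.0, two: 2.0)    // rounded real division
```

`intHalf` is `3` while `doubleHalf` is `3.5`, so generic code that assumes `x / two * two == x` holds for one type and silently fails for the other.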

-Joe

+1 great addition.

Would suggest the naming could be more consistent, in particular:

   1. Anything returning Self could be named xxxed. In the current proposal
   this naming convention is sometimes used, e.g. divided, and sometimes not,
   e.g. subtracting. Suggest all unified with the xxxed convention.
   2. Anything returning Bool could be named isXxx. In some cases this is
   used, e.g. isUnordered, but not others, e.g. totalOrder.

  -- Howard.

···

On 15 April 2016 at 09:55, Stephen Canon via swift-evolution < swift-evolution@swift.org> wrote:

Enhanced floating-point protocols

   - Proposal: SE-NNNN
   <https://github.com/apple/swift-evolution/blob/master/proposals/NNNN-name.md>
   - Author(s): Stephen Canon <https://github.com/stephentyrone>
   - Status: *Awaiting review*
   - Review manager: TBD

<https://github.com/stephentyrone/swift-evolution/blob/master/NNNN-floating-point-protocols.md#introduction>
Introduction

The current FloatingPoint protocol is quite limited, and provides only a
small subset of the features expected of an IEEE 754 conforming type. This
proposal expands the protocol to cover most of the expected basic
operations, and adds a second protocol, BinaryFloatingPoint, that provides
a number of useful tools for generic programming with the most commonly
used types.

<https://github.com/stephentyrone/swift-evolution/blob/master/NNNN-floating-point-protocols.md#motivation>
Motivation

Besides the high-level motivation provided by the introduction, the
proposed prototype schema addresses a number of issues and requests that
we've received from programmers:

   - FloatingPoint should conform to Equatable, and Comparable
   - FloatingPoint should conform to FloatLiteralConvertible
   - Deprecate the % operator for floating-point types
   - Provide basic constants (analogues of C's DBL_MAX, etc.)
   - Make Float80 conform to FloatingPoint

It also puts FloatingPoint much more tightly in sync with the work that is
being done on protocols for Integers, which will make it easier to provide
a uniform interface for arithmetic scalar types.

<https://github.com/stephentyrone/swift-evolution/blob/master/NNNN-floating-point-protocols.md#detailed-design>
Detailed design

A new protocol, Arithmetic, is introduced that provides the most basic
operations (add, subtract, multiply and divide) as well as Equatable and
IntegerLiteralConvertible, and is conformed to by both integer and
floating-point types.

There has been some resistance to adding such a protocol, owing to
differences in behavior between floating point and integer arithmetic.
While these differences make it difficult to write correct generic code
that operates on all "arithmetic" types, it is nonetheless convenient to
provide a single protocol that guarantees the availability of these basic
operations. It is intended that "number-like" types should provide these
APIs.

/// Arithmetic protocol declares methods backing binary arithmetic operators,
/// such as `+`, `-` and `*`; and their mutating counterparts. These methods
/// operate on arguments of the same type.
///
/// Both mutating and non-mutating operations are declared in the protocol, but
/// only the mutating ones are required. Should conforming type omit
/// non-mutating implementations, they will be provided by a protocol extension.
/// Implementation in that case will copy `self`, perform a mutating operation
/// on it and return the resulting value.
public protocol Arithmetic: Equatable, IntegerLiteralConvertible {
  /// Initialize to zero
  init()

  /// The sum of `self` and `rhs`.
  // Arithmetic provides a default implementation of this method in terms
  // of the mutating `add` operation.
  @warn_unused_result
  func adding(rhs: Self) -> Self

  /// Adds `rhs` to `self`.
  mutating func add(rhs: Self)

  /// The result of subtracting `rhs` from `self`.
  // Arithmetic provides a default implementation of this method in terms
  // of the mutating `subtract` operation.
  @warn_unused_result
  func subtracting(rhs: Self) -> Self

  /// Subtracts `rhs` from `self`.
  mutating func subtract(rhs: Self)

  /// The product of `self` and `rhs`.
  // Arithmetic provides a default implementation of this method in terms
  // of the mutating `multiply` operation.
  @warn_unused_result
  func multiplied(by rhs: Self) -> Self

  /// Multiplies `self` by `rhs`.
  mutating func multiply(by rhs: Self)

  /// The quotient of `self` divided by `rhs`.
  // Arithmetic provides a default implementation of this method in terms
  // of the mutating `divide` operation.
  @warn_unused_result
  func divided(by rhs: Self) -> Self

  /// Divides `self` by `rhs`.
  mutating func divide(by rhs: Self)
}
/// SignedArithmetic protocol will only be conformed to by signed numbers,
/// otherwise it would be possible to negate an unsigned value.
///
/// The only method of this protocol has the default implementation in an
/// extension, that uses a parameterless initializer and subtraction.
public protocol SignedArithmetic : Arithmetic {
  func negate() -> Self
}

The usual arithmetic operators are then defined in terms of the
implementation hooks provided by Arithmetic and SignedArithmetic, so
providing those operations is all that is necessary for a type to present
a "number-like" interface.

The FloatingPoint protocol is split into two parts; FloatingPoint and
BinaryFloatingPoint, which conforms to FloatingPoint. If decimal types
were added at some future point, they would conform to
DecimalFloatingPoint.

FloatingPoint is expanded to contain most of the IEEE 754 basic
operations, as well as conformance to SignedArithmetic and Comparable.

/// A floating-point type that provides most of the IEEE 754 basic (clause 5)
/// operations. The base, precision, and exponent range are not fixed in
/// any way by this protocol, but it enforces the basic requirements of
/// any IEEE 754 floating-point type.
///
/// The BinaryFloatingPoint protocol refines these requirements and provides
/// some additional useful operations as well.
public protocol FloatingPoint: SignedArithmetic, Comparable {

  /// An unsigned integer type that can represent the significand of any value.
  ///
  /// The significand (http://en.wikipedia.org/wiki/Significand) is frequently
  /// also called the "mantissa", but this terminology is slightly incorrect
  /// (see the "Use of 'mantissa'" section on the linked Wikipedia page for
  /// more details). "Significand" is the preferred terminology in IEEE 754.
  associatedtype RawSignificand: UnsignedInteger

  /// 2 for binary floating-point types, 10 for decimal.
  ///
  /// A conforming type may use any integer radix, but values other than
  /// 2 or 10 are extraordinarily rare in practice.
  static var radix: Int { get }

  /// Positive infinity. Compares greater than all finite numbers.
  static var infinity: Self { get }

  /// A quiet NaN (not-a-number). Compares not equal to every value,
  /// including itself.
  static var nan: Self { get }

  /// NaN with specified `payload`.
  ///
  /// Compares not equal to every value, including itself. Most operations
  /// with a NaN operand will produce a NaN result. Note that it is generally
  /// not the case that all possible significand values are valid
  /// NaN `payloads`. `FloatingPoint` types should either treat inadmissible
  /// payloads as zero, or mask them to create an admissible payload.
  @warn_unused_result
  static func nan(payload payload: RawSignificand, signaling: Bool) -> Self

  /// The greatest finite number.
  ///
  /// Compares greater than or equal to all finite numbers, but less than
  /// infinity. Corresponds to the C macros `FLT_MAX`, `DBL_MAX`, etc.
  /// The naming of those macros is slightly misleading, because infinity
  /// is greater than this value.
  static var greatestFiniteMagnitude: Self { get }

  // NOTE: Rationale for "ulp" instead of "epsilon":
  // We do not use that name because it is ambiguous at best and misleading
  // at worst:
  //
  // - Historically several definitions of "machine epsilon" have commonly
  // been used, which differ by up to a factor of two or so. By contrast
  // "ulp" is a term with a specific unambiguous definition.
  //
  // - Some languages have used "epsilon" to refer to wildly different values,
  // such as `leastMagnitude`.
  //
  // - Inexperienced users often believe that "epsilon" should be used as a
  // tolerance for floating-point comparisons, because of the name. It is
  // nearly always the wrong value to use for this purpose.

  /// The unit in the last place of 1.0.
  ///
  /// This is the weight of the least significant bit of the significand of 1.0,
  /// or the positive difference between 1.0 and the next greater representable
  /// number. Corresponds to the C macros `FLT_EPSILON`, `DBL_EPSILON`, etc.
  static var ulp: Self { get }

  /// The unit in the last place of `self`.
  ///
  /// This is the unit of the least significant digit in the significand of
  /// `self`. For most numbers `x`, this is the difference between `x` and
  /// the next greater (in magnitude) representable number. There are some
  /// edge cases to be aware of:
  ///
  /// - `greatestFiniteMagnitude.ulp` is a finite number, even though
  /// the next greater representable value is `infinity`.
  /// - `x.ulp` is `NaN` if `x` is not a finite number.
  /// - If `x` is very small in magnitude, then `x.ulp` may be a subnormal
  /// number. On targets that do not support subnormals, `x.ulp` may be
  /// flushed to zero.
  ///
  /// This quantity, or a related quantity, is sometimes called "epsilon" or
  /// "machine epsilon". We avoid that name because it has different meanings
  /// in different languages, which can lead to confusion, and because it
  /// suggests that it is a good tolerance to use for comparisons,
  /// which it almost never is.
  ///
  /// (See Machine epsilon - Wikipedia for more detail)
  var ulp: Self { get }

  /// The least positive normal number.
  ///
  /// Compares less than or equal to all positive normal numbers. There may
  /// be smaller positive numbers, but they are "subnormal", meaning that
  /// they are represented with less precision than normal numbers.
  /// Corresponds to the C macros `FLT_MIN`, `DBL_MIN`, etc. The naming of
  /// those macros is slightly misleading, because subnormals, zeros, and
  /// negative numbers are smaller than this value.
  static var leastNormalMagnitude: Self { get }

  /// The least positive number.
  ///
  /// Compares less than or equal to all positive numbers, but greater than
  /// zero. If the target supports subnormal values, this is smaller than
  /// `leastNormalMagnitude`; otherwise they are equal.
  static var leastMagnitude: Self { get }

  /// `true` iff the signbit of `self` is set. Implements the IEEE 754
  /// `signbit` operation.
  ///
  /// Note that this is not the same as `self < 0`. In particular, this
  /// property is true for `-0` and some NaNs, both of which compare not
  /// less than zero.
  // TODO: strictly speaking a bit and a bool are slightly different
  // concepts. Is another name more appropriate for this property?
  // `isNegative` is incorrect because of -0 and NaN. `isSignMinus` might
  // be acceptable, but isn't great. `signBit` is the IEEE 754 name.
  var signBit: Bool { get }

  /// The integer part of the base-r logarithm of the magnitude of `self`,
  /// where r is the radix (2 for binary, 10 for decimal). Implements the
  /// IEEE 754 `logB` operation.
  ///
  /// Edge cases:
  ///
  /// - If `x` is zero, then `x.exponent` is `Int.min`.
  /// - If `x` is +/-infinity or NaN, then `x.exponent` is `Int.max`
  var exponent: Int { get }

  /// The significand satisfies:
  ///
  /// ~~~
  /// self = (signBit ? -1 : 1) * significand * radix**exponent
  /// ~~~
  ///
  /// If radix is 2 (the most common case), then for finite non-zero numbers
  /// `1 <= significand` and `significand < 2`. For other values of `x`,
  /// `x.significand` is defined as follows:
  ///
  /// - If `x` is zero, then `x.significand` is 0.0.
  /// - If `x` is infinity, then `x.significand` is 1.0.
  /// - If `x` is NaN, then `x.significand` is NaN.
  ///
  /// For all floating-point `x`, if we define y by:
  ///
  /// ~~~
  /// let y = Self(signBit: x.signBit, exponent: x.exponent,
  /// significand: x.significand)
  /// ~~~
  ///
  /// then `y` is equivalent to `x`, meaning that `y` is `x` canonicalized.
  var significand: Self { get }

  /// Initialize from signBit, exponent, and significand.
  ///
  /// The result is:
  ///
  /// ~~~
  /// (signBit ? -1 : 1) * significand * radix**exponent
  /// ~~~
  ///
  /// (where `**` is exponentiation) computed as if by a single correctly-
  /// rounded floating-point operation. If this value is outside the
  /// representable range of the type, overflow or underflow occurs, and zero,
  /// a subnormal value, or infinity may result, as with any basic operation.
  /// Other edge cases:
  ///
  /// - If `significand` is zero or infinite, the result is zero or infinite,
  /// regardless of the value of `exponent`.
  ///
  /// - If `significand` is NaN, the result is NaN.
  ///
  /// Note that for any floating-point `x` the result of
  ///
  /// `Self(signBit: x.signBit,
  /// exponent: x.exponent,
  /// significand: x.significand)`
  ///
  /// is "the same" as `x`; it is `x` canonicalized.
  ///
  /// Because of these properties, this initializer implements the IEEE 754
  /// `scaleB` operation.
  init(signBit: Bool, exponent: Int, significand: Self)

  /// A floating point value whose exponent and significand are taken from
  /// `magnitude` and whose signBit is taken from `signOf`. Implements the
  /// IEEE 754 `copysign` operation.
  // TODO: better argument names would be great.
  init(magnitudeOf magnitude: Self, signOf: Self)

  /// The least representable value that compares greater than `self`.
  ///
  /// - If `x` is `-infinity`, then `x.nextUp` is `-greatestFiniteMagnitude`.
  /// - If `x` is `-leastMagnitude`, then `x.nextUp` is `-0.0`.
  /// - If `x` is zero, then `x.nextUp` is `leastMagnitude`.
  /// - If `x` is `greatestFiniteMagnitude`, then `x.nextUp` is `infinity`.
  /// - If `x` is `infinity` or `NaN`, then `x.nextUp` is `x`.
  var nextUp: Self { get }

  /// The greatest representable value that compares less than `self`.
  ///
  /// `x.nextDown` is equivalent to `-(-x).nextUp`
  var nextDown: Self { get }

  /// Remainder of `self` divided by `other`.
  ///
  /// For finite `self` and `other`, the remainder `r` is defined by
  /// `r = self - other*n`, where `n` is the integer nearest to `self/other`.
  /// (Note that `n` is *not* `self/other` computed in floating-point
  /// arithmetic, and that `n` may not even be representable in any available
  /// integer type). If `self/other` is exactly halfway between two integers,
  /// `n` is chosen to be even.
  ///
  /// It follows that if `self` and `other` are finite numbers, the remainder
  /// `r` satisfies `-|other|/2 <= r` and `r <= |other|/2`.
  ///
  /// `formRemainder` is always exact, and therefore is not affected by
  /// rounding modes.
  mutating func formRemainder(dividingBy other: Self)

  /// Remainder of `self` divided by `other` using truncating division.
  ///
  /// If `self` and `other` are finite numbers, the truncating remainder
  /// `r` has the same sign as `self` and is strictly smaller in magnitude
  /// than `other`. It satisfies `r = self - other*n`, where `n` is the
  /// integral part of `self/other`.
  ///
  /// `formTruncatingRemainder` is always exact, and therefore is not
  /// affected by rounding modes.
  mutating func formTruncatingRemainder(dividingBy other: Self)

  /// Mutating form of square root.
  mutating func formSquareRoot( )

  /// Fused multiply-add, accumulating the product of `lhs` and `rhs` to `self`.
  mutating func addProduct(lhs: Self, _ rhs: Self)

  /// Remainder of `self` divided by `other`.
  @warn_unused_result
  func remainder(dividingBy other: Self) -> Self

  /// Remainder of `self` divided by `other` using truncating division.
  @warn_unused_result
  func truncatingRemainder(dividingBy other: Self) -> Self

  /// Square root of `self`.
  @warn_unused_result
  func squareRoot( ) -> Self

  /// `self + lhs*rhs` computed without intermediate rounding.
  @warn_unused_result
  func addingProduct(lhs: Self, _ rhs: Self) -> Self

  /// The minimum of `x` and `y`. Implements the IEEE 754 `minNum` operation.
  ///
  /// Returns `x` if `x <= y`, `y` if `y < x`, and whichever of `x` or `y`
  /// is a number if the other is NaN. The result is NaN only if both
  /// arguments are NaN.
  ///
  /// This function is an implementation hook to be used by the free function
  /// min(Self, Self) -> Self so that we get the IEEE 754 behavior with regard
  /// to NaNs.
  @warn_unused_result
  static func minimum(x: Self, _ y: Self) -> Self

  /// The maximum of `x` and `y`. Implements the IEEE 754 `maxNum` operation.
  ///
  /// Returns `x` if `x >= y`, `y` if `y > x`, and whichever of `x` or `y`
  /// is a number if the other is NaN. The result is NaN only if both
  /// arguments are NaN.
  ///
  /// This function is an implementation hook to be used by the free function
  /// max(Self, Self) -> Self so that we get the IEEE 754 behavior with regard
  /// to NaNs.
  @warn_unused_result
  static func maximum(x: Self, _ y: Self) -> Self

  /// Whichever of `x` or `y` has lesser magnitude. Implements the IEEE 754
  /// `minNumMag` operation.
  ///
  /// Returns `x` if abs(x) <= abs(y), `y` if abs(y) < abs(x), and whichever of
  /// `x` or `y` is a number if the other is NaN. The result is NaN
  /// only if both arguments are NaN.
  @warn_unused_result
  static func minimumMagnitude(x: Self, _ y: Self) -> Self

  /// Whichever of `x` or `y` has greater magnitude. Implements the IEEE 754
  /// `maxNumMag` operation.
  ///
  /// Returns `x` if abs(x) >= abs(y), `y` if abs(y) > abs(x), and whichever of
  /// `x` or `y` is a number if the other is NaN. The result is NaN
  /// only if both arguments are NaN.
  @warn_unused_result
  static func maximumMagnitude(x: Self, _ y: Self) -> Self

  /// IEEE 754 equality predicate.
  ///
  /// -0 compares equal to +0, and NaN compares not equal to anything,
  /// including itself.
  @warn_unused_result
  func isEqual(to other: Self) -> Bool

  /// IEEE 754 less-than predicate.
  ///
  /// NaN compares not less than anything. -infinity compares less than
  /// all values except for itself and NaN. Everything except for NaN and
  /// +infinity compares less than +infinity.
  @warn_unused_result
  func isLess(than other: Self) -> Bool

  /// IEEE 754 less-than-or-equal predicate.
  ///
  /// NaN compares not less than or equal to anything, including itself.
  /// -infinity compares less than or equal to everything except NaN.
  /// Everything except NaN compares less than or equal to +infinity.
  ///
  /// Because of the existence of NaN in FloatingPoint types, trichotomy does
  /// not hold, which means that `x < y` and `!(y <= x)` are not equivalent.
  /// This is why `isLessThanOrEqual(to:)` is a separate implementation hook
  /// in the protocol.
  ///
  /// Note that this predicate does not impose a total order. The `totalOrder`
  /// predicate provides a refinement satisfying that criterion.
  @warn_unused_result
  func isLessThanOrEqual(to other: Self) -> Bool

  /// IEEE 754 unordered predicate. True if either `self` or `other` is NaN,
  /// and false otherwise.
  @warn_unused_result
  func isUnordered(with other: Self) -> Bool

  /// True if and only if `self` is normal.
  ///
  /// A normal number uses the full precision available in the format. Zero
  /// is not a normal number.
  var isNormal: Bool { get }

  /// True if and only if `self` is finite.
  ///
  /// If `x.isFinite` is `true`, then one of `x.isZero`, `x.isSubnormal`, or
  /// `x.isNormal` is also `true`, and `x.isInfinite` and `x.isNan` are
  /// `false`.
  var isFinite: Bool { get }

  /// True iff `self` is zero. Equivalent to `self == 0`.
  var isZero: Bool { get }

  /// True if and only if `self` is subnormal.
  ///
  /// A subnormal number does not use the full precision available to normal
  /// numbers of the same format. Zero is not a subnormal number.
  var isSubnormal: Bool { get }

  /// True if and only if `self` is infinite.
  ///
  /// Note that `isFinite` and `isInfinite` do not form a dichotomy, because
  /// they are not total. If `x` is `NaN`, then both properties are `false`.
  var isInfinite: Bool { get }

  /// True if and only if `self` is NaN ("not a number").
  var isNan: Bool { get }

  /// True if and only if `self` is a signaling NaN.
  var isSignalingNan: Bool { get }

  /// The IEEE 754 "class" of this value.
  var floatingPointClass: FloatingPointClassification { get }

  /// True if and only if `self` is canonical.
  ///
  /// Every floating-point value of type Float or Double is canonical, but
  /// non-canonical values of type Float80 exist, and non-canonical values
  /// may exist for other types that conform to FloatingPoint.
  ///
  /// The non-canonical Float80 values are known as "pseudo-denormal",
  /// "unnormal", "pseudo-infinity", and "pseudo-NaN".
  /// (Extended precision - Wikipedia)
  var isCanonical: Bool { get }

  /// True if and only if `self` precedes `other` in the IEEE 754 total order
  /// relation.
  ///
  /// This relation is a refinement of `<=` that provides a total order on all
  /// values of type `Self`, including non-canonical encodings, signed zeros,
  /// and NaNs. Because it is used much less frequently than the usual
  /// comparisons, there is no operator form of this relation.
  @warn_unused_result
  func totalOrder(with other: Self) -> Bool

  /// True if and only if `abs(self)` precedes `abs(other)` in the IEEE 754
  /// total order relation.
  @warn_unused_result
  func totalOrderMagnitude(with other: Self) -> Bool

  /// The closest representable value to the argument.
  init<Source: Integer>(_ value: Source)

  /// Fails if the argument cannot be exactly represented.
  init?<Source: Integer>(exactly value: Source)
}
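
The remark above that `x < y` and `!(y <= x)` are not equivalent can be checked directly. The following is a minimal sketch rather than part of the proposal; it assumes the method spellings `isEqual(to:)`, `isLess(than:)` and `isLessThanOrEqual(to:)` as found in the shipped Swift standard library, which happen to match the draft.

```swift
// Sketch only: exercises the IEEE 754 comparison predicates on Double.
let nan = Double.nan
let one = 1.0
let negativeZero = -0.0

// IEEE 754 equality: signed zeros are equal, NaN is unequal to everything.
let zerosEqual = negativeZero.isEqual(to: 0.0)
let nanEqualsItself = nan.isEqual(to: nan)

// Trichotomy fails once NaN is involved: `nan < one` is false,
// yet `!(one <= nan)` is true.
let nanLessThanOne = nan.isLess(than: one)
let notOneLessOrEqualNan = !one.isLessThanOrEqual(to: nan)

print(zerosEqual, nanEqualsItself, nanLessThanOne, notOneLessOrEqualNan)
```

Because the two expressions disagree for NaN, a conforming type cannot derive one predicate from the other, which is the stated reason for keeping both as implementation hooks.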

The BinaryFloatingPoint protocol provides a number of additional APIs
that only make sense for types with fixed radix 2:

/// A radix-2 (binary) floating-point type that follows the IEEE 754 encoding
/// conventions.
public protocol BinaryFloatingPoint: FloatingPoint {

  /// The number of bits used to represent the exponent.
  ///
  /// Following IEEE 754 encoding convention, the exponent bias is:
  ///
  /// bias = 2**(exponentBitCount-1) - 1
  ///
  /// The least normal exponent is `1-bias` and the largest finite exponent
  /// is `bias`. The all-zeros exponent is reserved for subnormals and zeros,
  /// and the all-ones exponent is reserved for infinities and NaNs.
  static var exponentBitCount: Int { get }

  /// For fixed-width floating-point types, this is the number of fractional
  /// significand bits.
  ///
  /// For extensible floating-point types, `significandBitCount` should be
  /// the maximum allowed significand width (without counting any leading
  /// integral bit of the significand). If there is no upper limit, then
  /// `significandBitCount` should be `Int.max`.
  ///
  /// Note that `Float80.significandBitCount` is 63, even though 64 bits
  /// are used to store the significand in the memory representation of a
  /// `Float80` (unlike other floating-point types, `Float80` explicitly
  /// stores the leading integral significand bit, but the
  /// `BinaryFloatingPoint` APIs provide an abstraction so that users don't
  /// need to be aware of this detail).
  static var significandBitCount: Int { get }

  /// The raw encoding of the exponent field of the floating-point value.
  var exponentBitPattern: UInt { get }

  /// The raw encoding of the significand field of the floating-point value.
  ///
  /// `significandBitPattern` does *not* include the leading integral bit of
  /// the significand, even for types like `Float80` that store it explicitly.
  var significandBitPattern: RawSignificand { get }

  /// Combines `signBit`, `exponent` and `significand` bit patterns to produce
  /// a floating-point value.
  init(signBit: Bool,
       exponentBitPattern: UInt,
       significandBitPattern: RawSignificand)

  /// The least-magnitude member of the binade of `self`.
  ///
  /// If `x` is `+/-significand * 2**exponent`, then `x.binade` is
  /// `+/- 2**exponent`; i.e. the floating point number with the same sign
  /// and exponent, but with a significand of 1.0.
  var binade: Self { get }

  /// The number of bits required to represent the significand.
  ///
  /// If `self` is not a finite non-zero number, `significandWidth` is
  /// `-1`. Otherwise, it is the number of bits required to represent the
  /// significand exactly (less `1` because common formats represent one bit
  /// implicitly).
  var significandWidth: Int { get }

  @warn_unused_result
  func isEqual<Other: BinaryFloatingPoint>(to other: Other) -> Bool

  @warn_unused_result
  func isLess<Other: BinaryFloatingPoint>(than other: Other) -> Bool

  @warn_unused_result
  func isLessThanOrEqual<Other: BinaryFloatingPoint>(to other: Other) -> Bool

  @warn_unused_result
  func isUnordered<Other: BinaryFloatingPoint>(with other: Other) -> Bool

  @warn_unused_result
  func totalOrder<Other: BinaryFloatingPoint>(with other: Other) -> Bool

  /// `value` rounded to the closest representable value.
  init<Source: BinaryFloatingPoint>(_ value: Source)

  /// Fails if `value` cannot be represented exactly as `Self`.
  init?<Source: BinaryFloatingPoint>(exactly value: Source)
}
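
As an illustration of how the encoding-level properties fit together, here is a small sketch using `Double`. It assumes the spellings in the shipped Swift standard library, where the draft's `signBit: Bool` parameter appears as `sign: FloatingPointSign`; the variable names are illustrative only.

```swift
// Sketch only: inspects and reassembles the raw fields of a Double.
// Assumption: shipped standard-library spellings, not the draft's `signBit`.
let bias = (1 << (Double.exponentBitCount - 1)) - 1   // 2**(exponentBitCount-1) - 1

let x = 6.5                                           // 1.625 * 2**2
let encodedExponent = x.exponentBitPattern            // biased exponent field
let fraction = x.significandBitPattern                // fractional bits only

// Reassemble the value from its three raw fields.
let rebuilt = Double(sign: x.sign,
                     exponentBitPattern: encodedExponent,
                     significandBitPattern: fraction)

print(bias, encodedExponent, x.binade, x.significandWidth, rebuilt == x)
```

The significand of 6.5 is 1.101 in binary, so three fractional bits suffice to represent it, and the binade is the power of two with the same exponent.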

Float, Double, Float80 and CGFloat will conform to all of these protocols.

A small portion of the implementation of these APIs is dependent on new
Integer protocols that will be proposed separately. Everything else is
implemented in draft form on the branch floating-point-revision of my fork
<https://github.com/stephentyrone/swift>.

Impact on existing code

   1. The % operator is no longer available for FloatingPoint types. We
      don't believe that it was widely used correctly, and the operation is
      still available via the formTruncatingRemainder method for people who
      need it.
   2. To follow the naming guidelines, NaN and isNaN are replaced with nan
      and isNan.
   3. The redundant property quietNaN is removed.
   4. isSignaling is renamed isSignalingNan.
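
For code that relied on `%`, the replacement might look like the sketch below. It uses `truncatingRemainder(dividingBy:)`, `formTruncatingRemainder(dividingBy:)` and `remainder(dividingBy:)` on `Double`; the values are arbitrary and chosen to be exactly representable.

```swift
// Sketch only: the truncating remainder next to the IEEE 754 remainder.
let dividend = 7.5
let divisor = 2.0

// Truncating remainder: the quotient 3.75 is truncated to 3, so r = 7.5 - 2*3.
let truncating = dividend.truncatingRemainder(dividingBy: divisor)

// IEEE 754 remainder: the quotient is rounded to the nearest integer, 4,
// so r = 7.5 - 2*4, which is negative.
let ieee = dividend.remainder(dividingBy: divisor)

// Mutating form of the truncating remainder.
var value = dividend
value.formTruncatingRemainder(dividingBy: divisor)

print(truncating, ieee, value)
```

The two remainders differ whenever the quotient rounds up, which may be part of why the operator form was easy to misuse.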

Alternatives considered
N/A.


Oh, a couple more things I just thought of:

public protocol Arithmetic: Equatable, IntegerLiteralConvertible {

If your goals include supporting complex numbers, how is IntegerLiteralConvertible going to fit in there?

  /// Initialize to zero
  init()

0 is valuable as the additive identity. Should there also be a way to get 1, the multiplicative identity? If you need both, should these be static properties instead of initializers?


--
Brent Royal-Gordon
Architechies

Hi Stephen,

You write

  * FloatingPoint should conform to Equatable, and Comparable

but the documentation for Equatable and Comparable states that == and < must implement an equivalence relation and a strict total order, which is incompatible with the default IEEE-754 implementation of these operators when NaN values are involved. How do you resolve this conflict?

- Stephan

These do feel a bit awkward. Perhaps something over-engineered to handle the typical cases more readably?

  public protocol FloatingPoint: SignedArithmetic, Comparable {
    enum Sign {
      case Plus
      case Minus
      init(bit: Bool) { self = bit ? .Minus : .Plus }
      var bit: Bool { get { return self == .Minus } }
    }
  
    var sign: Sign { get }
    var signBit: Bool { get }
  
    init(sign: Sign, exponent: Int, significand: Self)
    init(signBit: Bool, exponent: Int, significand: Self)
  
    …
  }

…and perhaps each sign/signBit pair would provide one as a default implementation that calls the other.

Then we can often write something more readable than signBit:

  let x = Float(sign: .Plus, exponent: 2, significand: 2)
  if x.sign == .Plus { … }

Alternatively or additionally, perhaps signBit ought to be an Int because the people writing code using signBit would probably prefer to use literal 1 and 0 instead of true and false. (Hasn't the distinction between Bit and Bool come up before?)
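
For reference, a sketch of how an enum-based sign reads in practice. It assumes the `FloatingPointSign` enum and `init(sign:exponent:significand:)` found in later releases of the Swift standard library; neither spelling appears in the draft under discussion.

```swift
// Sketch only. Assumption: `FloatingPointSign` and
// `init(sign:exponent:significand:)` are the shipped standard-library
// spellings, not the draft's `signBit`.
let x = Double(sign: .plus, exponent: 2, significand: 1.5)   // 1.5 * 2**2
let negativeZero = -0.0

let xIsPlus = x.sign == .plus

// The sign is not the same test as `< 0`: negative zero carries a minus
// sign but does not compare less than zero.
let zeroHasMinusSign = negativeZero.sign == .minus
let zeroIsLessThanZero = negativeZero < 0

print(x, xIsPlus, zeroHasMinusSign, zeroIsLessThanZero)
```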


On Apr 14, 2016, at 4:55 PM, Stephen Canon via swift-evolution <swift-evolution@swift.org> wrote:

  /// `true` iff the signbit of `self` is set. Implements the IEEE 754
  /// `signbit` operation.
  ///
  /// Note that this is not the same as `self < 0`. In particular, this
  /// property is true for `-0` and some NaNs, both of which compare not
  /// less than zero.
  // TODO: strictly speaking a bit and a bool are slightly different
  // concepts. Is another name more appropriate for this property?
  // `isNegative` is incorrect because of -0 and NaN. `isSignMinus` might
  // be acceptable, but isn't great. `signBit` is the IEEE 754 name.
  var signBit: Bool { get }

  init(signBit: Bool, exponent: Int, significand: Self)

--
Greg Parker gparker@apple.com Runtime Wrangler

I'd like to have something like Summable, with 'add', 'adding' and 'zero', as a separate protocol, as well as something like Multiplicative, with 'multiply', 'multiplied' and 'one', as a separate protocol, because these are universally interesting for other cases; e.g. Summable would be useful for defining path lengths in a graph library.

Would you mind adding that to the proposal?

-Thorsten


Am 15.04.2016 um 01:55 schrieb Stephen Canon via swift-evolution <swift-evolution@swift.org>:

Enhanced floating-point protocols
Proposal: SE-NNNN
Author(s): Stephen Canon
Status: Awaiting review
Review manager: TBD
Introduction

The current FloatingPoint protocol is quite limited, and provides only a small subset of the features expected of an IEEE 754 conforming type. This proposal expands the protocol to cover most of the expected basic operations, and adds a second protocol, BinaryFloatingPoint, that provides a number of useful tools for generic programming with the most commonly used types.

Motivation

Besides the high-level motivation provided by the introduction, the proposed prototype schema addresses a number of issues and requests that we've received from programmers:

FloatingPoint should conform to Equatable, and Comparable
FloatingPoint should conform to FloatLiteralConvertible
Deprecate the % operator for floating-point types
Provide basic constants (analogues of C's DBL_MAX, etc.)
Make Float80 conform to FloatingPoint
It also puts FloatingPoint much more tightly in sync with the work that is being done on protocols for Integers, which will make it easier to provide a uniform interface for arithmetic scalar types.

Detailed design

A new protocol, Arithmetic, is introduced that provides the most basic operations (add, subtract, multiply and divide) as well as Equatable and IntegerLiteralConvertible, and is conformed to by both integer and floating-point types.

There has been some resistance to adding such a protocol, owing to differences in behavior between floating point and integer arithmetic. While these differences make it difficult to write correct generic code that operates on all "arithmetic" types, it is nonetheless convenient to provide a single protocol that guarantees the availability of these basic operations. It is intended that "number-like" types should provide these APIs.

/// Arithmetic protocol declares methods backing binary arithmetic operators,
/// such as `+`, `-` and `*`; and their mutating counterparts. These methods
/// operate on arguments of the same type.
///
/// Both mutating and non-mutating operations are declared in the protocol, but
/// only the mutating ones are required. Should conforming type omit
/// non-mutating implementations, they will be provided by a protocol extension.
/// Implementation in that case will copy `self`, perform a mutating operation
/// on it and return the resulting value.
public protocol Arithmetic: Equatable, IntegerLiteralConvertible {
  /// Initialize to zero
  init()

  /// The sum of `self` and `rhs`.
  // Arithmetic provides a default implementation of this method in terms
  // of the mutating `add` operation.
  @warn_unused_result
  func adding(rhs: Self) -> Self

  /// Adds `rhs` to `self`.
  mutating func add(rhs: Self)

  /// The result of subtracting `rhs` from `self`.
  // Arithmetic provides a default implementation of this method in terms
  // of the mutating `subtract` operation.
  @warn_unused_result
  func subtracting(rhs: Self) -> Self

  /// Subtracts `rhs` from `self`.
  mutating func subtract(rhs: Self)

  /// The product of `self` and `rhs`.
  // Arithmetic provides a default implementation of this method in terms
  // of the mutating `multiply` operation.
  @warn_unused_result
  func multiplied(by rhs: Self) -> Self

  /// Multiplies `self` by `rhs`.
  mutating func multiply(by rhs: Self)

  /// The quotient of `self` dividing by `rhs`.
  // Arithmetic provides a default implementation of this method in terms
  // of the mutating `divide` operation.
  @warn_unused_result
  func divided(by rhs: Self) -> Self

  /// Divides `self` by `rhs`.
  mutating func divide(by rhs: Self)
}
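
The doc comment's description of the defaulted non-mutating operations (copy `self`, mutate the copy, return it) can be sketched as follows. `SketchArithmetic` and `Meters` are hypothetical names invented for this illustration, the protocol is trimmed to addition only, and the argument labels follow current Swift conventions rather than the draft.

```swift
// Hypothetical, trimmed-down stand-in for the draft's Arithmetic protocol.
protocol SketchArithmetic {
    init()                               // additive identity (zero)
    mutating func add(_ rhs: Self)       // required
    func adding(_ rhs: Self) -> Self     // defaulted below
}

extension SketchArithmetic {
    // Default: copy `self`, apply the mutating operation, return the copy.
    func adding(_ rhs: Self) -> Self {
        var result = self
        result.add(rhs)
        return result
    }
}

// Hypothetical conforming type that supplies only the mutating operation.
struct Meters: SketchArithmetic {
    var value: Double
    init() { value = 0 }
    init(_ value: Double) { self.value = value }
    mutating func add(_ rhs: Meters) { value += rhs.value }
}

let total = Meters(1.5).adding(Meters(2.0))
print(total.value)
```

A conforming type can still provide its own `adding` when a more efficient non-mutating form exists.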

/// SignedArithmetic protocol will only be conformed to by signed numbers,
/// otherwise it would be possible to negate an unsigned value.
///
/// The only method of this protocol has the default implementation in an
/// extension, that uses a parameterless initializer and subtraction.
public protocol SignedArithmetic : Arithmetic {
  func negate() -> Self
}
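
A default `negate()` built from the parameterless initializer and subtraction, as the comment above describes, might look like this sketch. `SketchSigned` and `Offset` are hypothetical names used only for illustration.

```swift
// Hypothetical stand-in for the draft's SignedArithmetic protocol.
protocol SketchSigned {
    init()                                  // zero
    func subtracting(_ rhs: Self) -> Self
    func negate() -> Self
}

extension SketchSigned {
    // Default: zero minus self.
    func negate() -> Self {
        return Self().subtracting(self)
    }
}

// Hypothetical conforming type wrapping a Double.
struct Offset: SketchSigned {
    var value: Double
    init() { value = 0 }
    init(_ value: Double) { self.value = value }
    func subtracting(_ rhs: Offset) -> Offset {
        return Offset(value - rhs.value)
    }
}

let negated = Offset(2.5).negate()
let negatedZero = Offset(0).negate()
print(negated.value, negatedZero.value.sign == .plus)
```

One consequence worth noting: `0 - 0` is `+0` under the default rounding mode, so this default maps `+0` to `+0` rather than `-0`. A floating-point type that cares about signed zeros would presumably supply its own `negate()`.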
The usual arithmetic operators are then defined in terms of the implementation hooks provided by Arithmetic and SignedArithmetic, so providing those operations is all that is necessary for a type to present a "number-like" interface.

The FloatingPoint protocol is split into two parts: FloatingPoint and BinaryFloatingPoint, which conforms to FloatingPoint. If decimal types were added at some future point, they would conform to DecimalFloatingPoint.

FloatingPoint is expanded to contain most of the IEEE 754 basic operations, as well as conformance to SignedArithmetic and Comparable.
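
Before the full listing, a brief sketch of a few of these operations on `Double` may help. It assumes the spellings in the shipped Swift standard library, where the draft's static `ulp` is called `ulpOfOne`; the edge cases checked are the ones named in the documentation comments of the protocol.

```swift
// Sketch only. Assumption: shipped spellings; the draft's static `ulp`
// corresponds to `ulpOfOne` here.
let one = 1.0

// The ulp of 1.0 is the gap between 1.0 and the next representable value.
let gapAboveOne = one.nextUp - one
let ulpMatchesGap = gapAboveOne == Double.ulpOfOne && one.ulp == Double.ulpOfOne

// Edge cases from the documentation comments.
let largest = Double.greatestFiniteMagnitude
let steppedPastLargest = largest.nextUp
let ulpOfLargestIsFinite = largest.ulp.isFinite
let ulpOfInfinityIsNaN = Double.infinity.ulp.isNaN

// minNum semantics: a NaN operand is ignored when the other is a number.
let smaller = Double.minimum(1.0, .nan)

print(ulpMatchesGap, steppedPastLargest, ulpOfLargestIsFinite,
      ulpOfInfinityIsNaN, smaller)
```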

/// A floating-point type that provides most of the IEEE 754 basic (clause 5)
/// operations. The base, precision, and exponent range are not fixed in
/// any way by this protocol, but it enforces the basic requirements of
/// any IEEE 754 floating-point type.
///
/// The BinaryFloatingPoint protocol refines these requirements and provides
/// some additional useful operations as well.
public protocol FloatingPoint: SignedArithmetic, Comparable {

  /// An unsigned integer type that can represent the significand of any value.
  ///
  /// The significand (http://en.wikipedia.org/wiki/Significand) is frequently
  /// also called the "mantissa", but this terminology is slightly incorrect
  /// (see the "Use of 'mantissa'" section on the linked Wikipedia page for
  /// more details). "Significand" is the preferred terminology in IEEE 754.
  associatedtype RawSignificand: UnsignedInteger

  /// 2 for binary floating-point types, 10 for decimal.
  ///
  /// A conforming type may use any integer radix, but values other than
  /// 2 or 10 are extraordinarily rare in practice.
  static var radix: Int { get }

  /// Positive infinity. Compares greater than all finite numbers.
  static var infinity: Self { get }

  /// A quiet NaN (not-a-number). Compares not equal to every value,
  /// including itself.
  static var nan: Self { get }

  /// NaN with specified `payload`.
  ///
  /// Compares not equal to every value, including itself. Most operations
  /// with a NaN operand will produce a NaN result. Note that it is generally
  /// not the case that all possible significand values are valid
  /// NaN `payloads`. `FloatingPoint` types should either treat inadmissible
  /// payloads as zero, or mask them to create an admissible payload.
  @warn_unused_result
  static func nan(payload payload: RawSignificand, signaling: Bool) -> Self

  /// The greatest finite number.
  ///
  /// Compares greater than or equal to all finite numbers, but less than
  /// infinity. Corresponds to the C macros `FLT_MAX`, `DBL_MAX`, etc.
  /// The naming of those macros is slightly misleading, because infinity
  /// is greater than this value.
  static var greatestFiniteMagnitude: Self { get }

  // NOTE: Rationale for "ulp" instead of "epsilon":
  // We do not use that name because it is ambiguous at best and misleading
  // at worst:
  //
  // - Historically several definitions of "machine epsilon" have commonly
  // been used, which differ by up to a factor of two or so. By contrast
  // "ulp" is a term with a specific unambiguous definition.
  //
  // - Some languages have used "epsilon" to refer to wildly different values,
  // such as `leastMagnitude`.
  //
  // - Inexperienced users often believe that "epsilon" should be used as a
  // tolerance for floating-point comparisons, because of the name. It is
  // nearly always the wrong value to use for this purpose.

  /// The unit in the last place of 1.0.
  ///
  /// This is the weight of the least significant bit of the significand of 1.0,
  /// or the positive difference between 1.0 and the next greater representable
  /// number. Corresponds to the C macros `FLT_EPSILON`, `DBL_EPSILON`, etc.
  static var ulp: Self { get }

  /// The unit in the last place of `self`.
  ///
  /// This is the unit of the least significant digit in the significand of
  /// `self`. For most numbers `x`, this is the difference between `x` and
  /// the next greater (in magnitude) representable number. There are some
  /// edge cases to be aware of:
  ///
  /// - `greatestFiniteMagnitude.ulp` is a finite number, even though
  /// the next greater representable value is `infinity`.
  /// - `x.ulp` is `NaN` if `x` is not a finite number.
  /// - If `x` is very small in magnitude, then `x.ulp` may be a subnormal
  /// number. On targets that do not support subnormals, `x.ulp` may be
  /// flushed to zero.
  ///
  /// This quantity, or a related quantity is sometimes called "epsilon" or
  /// "machine epsilon". We avoid that name because it has different meanings
  /// in different languages, which can lead to confusion, and because it
  /// suggests that it is a good tolerance to use for comparisons,
  /// which it almost never is.
  ///
  /// (See Machine epsilon - Wikipedia for more detail)
  var ulp: Self { get }

  /// The least positive normal number.
  ///
  /// Compares less than or equal to all positive normal numbers. There may
  /// be smaller positive numbers, but they are "subnormal", meaning that
  /// they are represented with less precision than normal numbers.
  /// Corresponds to the C macros `FLT_MIN`, `DBL_MIN`, etc. The naming of
  /// those macros is slightly misleading, because subnormals, zeros, and
  /// negative numbers are smaller than this value.
  static var leastNormalMagnitude: Self { get }

  /// The least positive number.
  ///
  /// Compares less than or equal to all positive numbers, but greater than
  /// zero. If the target supports subnormal values, this is smaller than
  /// `leastNormalMagnitude`; otherwise they are equal.
  static var leastMagnitude: Self { get }

  /// `true` iff the signbit of `self` is set. Implements the IEEE 754
  /// `signbit` operation.
  ///
  /// Note that this is not the same as `self < 0`. In particular, this
  /// property is true for `-0` and some NaNs, both of which compare not
  /// less than zero.
  // TODO: strictly speaking a bit and a bool are slightly different
  // concepts. Is another name more appropriate for this property?
  // `isNegative` is incorrect because of -0 and NaN. `isSignMinus` might
  // be acceptable, but isn't great. `signBit` is the IEEE 754 name.
  var signBit: Bool { get }

  /// The integer part of the base-r logarithm of the magnitude of `self`,
  /// where r is the radix (2 for binary, 10 for decimal). Implements the
  /// IEEE 754 `logB` operation.
  ///
  /// Edge cases:
  ///
  /// - If `x` is zero, then `x.exponent` is `Int.min`.
  /// - If `x` is +/-infinity or NaN, then `x.exponent` is `Int.max`
  var exponent: Int { get }

  /// The significand satisfies:
  ///
  /// ~~~
  /// self = (signBit ? -1 : 1) * significand * radix**exponent
  /// ~~~
  ///
  /// If radix is 2 (the most common case), then for finite non-zero numbers
  /// `1 <= significand` and `significand < 2`. For other values of `x`,
  /// `x.significand` is defined as follows:
  ///
  /// - If `x` is zero, then `x.significand` is 0.0.
  /// - If `x` is infinity, then `x.significand` is 1.0.
  /// - If `x` is NaN, then `x.significand` is NaN.
  ///
  /// For all floating-point `x`, if we define y by:
  ///
  /// ~~~
  /// let y = Self(signBit: x.signBit, exponent: x.exponent,
  /// significand: x.significand)
  /// ~~~
  ///
  /// then `y` is equivalent to `x`, meaning that `y` is `x` canonicalized.
  var significand: Self { get }

  /// Initialize from signBit, exponent, and significand.
  ///
  /// The result is:
  ///
  /// ~~~
  /// (signBit ? -1 : 1) * significand * radix**exponent
  /// ~~~
  ///
  /// (where `**` is exponentiation) computed as if by a single correctly-
  /// rounded floating-point operation. If this value is outside the
  /// representable range of the type, overflow or underflow occurs, and zero,
  /// a subnormal value, or infinity may result, as with any basic operation.
  /// Other edge cases:
  ///
  /// - If `significand` is zero or infinite, the result is zero or infinite,
  /// regardless of the value of `exponent`.
  ///
  /// - If `significand` is NaN, the result is NaN.
  ///
  /// Note that for any floating-point `x` the result of
  ///
  /// `Self(signBit: x.signBit,
  /// exponent: x.exponent,
  /// significand: x.significand)`
  ///
  /// is "the same" as `x`; it is `x` canonicalized.
  ///
  /// Because of these properties, this initializer implements the IEEE 754
  /// `scaleB` operation.
  init(signBit: Bool, exponent: Int, significand: Self)

  /// A floating point value whose exponent and significand are taken from
  /// `magnitude` and whose signBit is taken from `signOf`. Implements the
  /// IEEE 754 `copysign` operation.
  // TODO: better argument names would be great.
  init(magnitudeOf magnitude: Self, signOf: Self)

  /// The least representable value that compares greater than `self`.
  ///
  /// - If `x` is `-infinity`, then `x.nextUp` is `-greatestFiniteMagnitude`.
  /// - If `x` is `-leastMagnitude`, then `x.nextUp` is `-0.0`.
  /// - If `x` is zero, then `x.nextUp` is `leastMagnitude`.
  /// - If `x` is `greatestFiniteMagnitude`, then `x.nextUp` is `infinity`.
  /// - If `x` is `infinity` or `NaN`, then `x.nextUp` is `x`.
  var nextUp: Self { get }

  /// The greatest representable value that compares less than `self`.
  ///
  /// `x.nextDown` is equivalent to `-(-x).nextUp`
  var nextDown: Self { get }

  /// Remainder of `self` divided by `other`.
  ///
  /// For finite `self` and `other`, the remainder `r` is defined by
  /// `r = self - other*n`, where `n` is the integer nearest to `self/other`.
  /// (Note that `n` is *not* `self/other` computed in floating-point
  /// arithmetic, and that `n` may not even be representable in any available
  /// integer type). If `self/other` is exactly halfway between two integers,
  /// `n` is chosen to be even.
  ///
  /// It follows that if `self` and `other` are finite numbers, the remainder
  /// `r` satisfies `-|other|/2 <= r` and `r <= |other|/2`.
  ///
  /// `formRemainder` is always exact, and therefore is not affected by
  /// rounding modes.
  mutating func formRemainder(dividingBy other: Self)

  /// Remainder of `self` divided by `other` using truncating division.
  ///
  /// If `self` and `other` are finite numbers, the truncating remainder
  /// `r` has the same sign as `self` and is strictly smaller in magnitude
  /// than `other`. It satisfies `r = self - other*n`, where `n` is the
  /// integral part of `self/other`.
  ///
  /// `formTruncatingRemainder` is always exact, and therefore is not
  /// affected by rounding modes.
  mutating func formTruncatingRemainder(dividingBy other: Self)

  /// Mutating form of square root.
  mutating func formSquareRoot()

  /// Fused multiply-add, accumulating the product of `lhs` and `rhs` to `self`.
  mutating func addProduct(lhs: Self, _ rhs: Self)

  /// Remainder of `self` divided by `other`.
  @warn_unused_result
  func remainder(dividingBy other: Self) -> Self

  /// Remainder of `self` divided by `other` using truncating division.
  @warn_unused_result
  func truncatingRemainder(dividingBy other: Self) -> Self

  /// Square root of `self`.
  @warn_unused_result
  func squareRoot() -> Self

  /// `self + lhs*rhs` computed without intermediate rounding.
  @warn_unused_result
  func addingProduct(lhs: Self, _ rhs: Self) -> Self

  /// The minimum of `x` and `y`. Implements the IEEE 754 `minNum` operation.
  ///
  /// Returns `x` if `x <= y`, `y` if `y < x`, and whichever of `x` or `y`
  /// is a number if the other is NaN. The result is NaN only if both
  /// arguments are NaN.
  ///
  /// This function is an implementation hook to be used by the free function
  /// min(Self, Self) -> Self so that we get the IEEE 754 behavior with regard
  /// to NaNs.
  @warn_unused_result
  static func minimum(x: Self, _ y: Self) -> Self

  /// The maximum of `x` and `y`. Implements the IEEE 754 `maxNum` operation.
  ///
  /// Returns `x` if `x >= y`, `y` if `y > x`, and whichever of `x` or `y`
  /// is a number if the other is NaN. The result is NaN only if both
  /// arguments are NaN.
  ///
  /// This function is an implementation hook to be used by the free function
  /// max(Self, Self) -> Self so that we get the IEEE 754 behavior with regard
  /// to NaNs.
  @warn_unused_result
  static func maximum(x: Self, _ y: Self) -> Self

  /// Whichever of `x` or `y` has lesser magnitude. Implements the IEEE 754
  /// `minNumMag` operation.
  ///
  /// Returns `x` if abs(x) <= abs(y), `y` if abs(y) < abs(x), and whichever of
  /// `x` or `y` is a number if the other is NaN. The result is NaN
  /// only if both arguments are NaN.
  @warn_unused_result
  static func minimumMagnitude(x: Self, _ y: Self) -> Self

  /// Whichever of `x` or `y` has greater magnitude. Implements the IEEE 754
  /// `maxNumMag` operation.
  ///
  /// Returns `x` if abs(x) >= abs(y), `y` if abs(y) > abs(x), and whichever of
  /// `x` or `y` is a number if the other is NaN. The result is NaN
  /// only if both arguments are NaN.
  @warn_unused_result
  static func maximumMagnitude(x: Self, _ y: Self) -> Self

  /// IEEE 754 equality predicate.
  ///
  /// -0 compares equal to +0, and NaN compares not equal to anything,
  /// including itself.
  @warn_unused_result
  func isEqual(to other: Self) -> Bool

  /// IEEE 754 less-than predicate.
  ///
  /// NaN compares not less than anything. -infinity compares less than
  /// all values except for itself and NaN. Everything except for NaN and
  /// +infinity compares less than +infinity.
  @warn_unused_result
  func isLess(than other: Self) -> Bool

  /// IEEE 754 less-than-or-equal predicate.
  ///
  /// NaN compares not less than or equal to anything, including itself.
  /// -infinity compares less than or equal to everything except NaN.
  /// Everything except NaN compares less than or equal to +infinity.
  ///
  /// Because of the existence of NaN in FloatingPoint types, trichotomy does
  /// not hold, which means that `x < y` and `!(y <= x)` are not equivalent.
  /// This is why `isLessThanOrEqual(to:)` is a separate implementation hook
  /// in the protocol.
  ///
  /// Note that this predicate does not impose a total order. The `totalOrder`
  /// predicate provides a refinement satisfying that criterion.
  @warn_unused_result
  func isLessThanOrEqual(to other: Self) -> Bool

  /// IEEE 754 unordered predicate. True if either `self` or `other` is NaN,
  /// and false otherwise.
  @warn_unused_result
  func isUnordered(with other: Self) -> Bool

  /// True if and only if `self` is normal.
  ///
  /// A normal number uses the full precision available in the format. Zero
  /// is not a normal number.
  var isNormal: Bool { get }

  /// True if and only if `self` is finite.
  ///
  /// If `x.isFinite` is `true`, then one of `x.isZero`, `x.isSubnormal`, or
  /// `x.isNormal` is also `true`, and `x.isInfinite` and `x.isNan` are
  /// `false`.
  var isFinite: Bool { get }

  /// True iff `self` is zero. Equivalent to `self == 0`.
  var isZero: Bool { get }

  /// True if and only if `self` is subnormal.
  ///
  /// A subnormal number does not use the full precision available to normal
  /// numbers of the same format. Zero is not a subnormal number.
  var isSubnormal: Bool { get }

  /// True if and only if `self` is infinite.
  ///
  /// Note that `isFinite` and `isInfinite` do not form a dichotomy, because
  /// they are not total. If `x` is `NaN`, then both properties are `false`.
  var isInfinite: Bool { get }

  /// True if and only if `self` is NaN ("not a number").
  var isNan: Bool { get }

  /// True if and only if `self` is a signaling NaN.
  var isSignalingNan: Bool { get }

  /// The IEEE 754 "class" of this value.
  var floatingPointClass: FloatingPointClassification { get }

  /// True if and only if `self` is canonical.
  ///
  /// Every floating-point value of type Float or Double is canonical, but
  /// non-canonical values of type Float80 exist, and non-canonical values
  /// may exist for other types that conform to FloatingPoint.
  ///
  /// The non-canonical Float80 values are known as "pseudo-denormal",
  /// "unnormal", "pseudo-infinity", and "pseudo-NaN".
  /// (Extended precision - Wikipedia)
  var isCanonical: Bool { get }

  /// True if and only if `self` precedes `other` in the IEEE 754 total order
  /// relation.
  ///
  /// This relation is a refinement of `<=` that provides a total order on all
  /// values of type `Self`, including non-canonical encodings, signed zeros,
  /// and NaNs. Because it is used much less frequently than the usual
  /// comparisons, there is no operator form of this relation.
  @warn_unused_result
  func totalOrder(with other: Self) -> Bool

  /// True if and only if `abs(self)` precedes `abs(other)` in the IEEE 754
  /// total order relation.
  @warn_unused_result
  func totalOrderMagnitude(with other: Self) -> Bool

  /// The closest representable value to the argument.
  init<Source: Integer>(_ value: Source)

  /// Fails if the argument cannot be exactly represented.
  init?<Source: Integer>(exactly value: Source)
}
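
The classification properties can be exercised with a short sketch. It assumes the shipped standard-library spellings `isNaN` and `leastNonzeroMagnitude` (the draft spells these `isNan` and `leastMagnitude`); `finiteKinds` is a hypothetical helper written for this example.

```swift
// Sketch only. Assumption: shipped spellings rather than the draft's.
let nan = Double.nan

// `isFinite` and `isInfinite` are both false for NaN.
let neitherFiniteNorInfinite = !nan.isFinite && !nan.isInfinite

// Hypothetical helper: counts how many of the three finite categories apply.
func finiteKinds(_ x: Double) -> Int {
    return [x.isZero, x.isSubnormal, x.isNormal].filter { $0 }.count
}

// A finite value is exactly one of zero, subnormal, or normal.
let samples: [Double] = [0.0, -0.0, 1.0,
                         Double.leastNormalMagnitude,
                         Double.leastNonzeroMagnitude,
                         Double.greatestFiniteMagnitude]
let eachHasOneKind = samples.allSatisfy { $0.isFinite && finiteKinds($0) == 1 }

print(neitherFiniteNorInfinite, eachHasOneKind,
      nan.floatingPointClass == .quietNaN)
```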
The BinaryFloatingPoint protocol provides a number of additional APIs that only make sense for types with fixed radix 2:

/// A radix-2 (binary) floating-point type that follows the IEEE 754 encoding
/// conventions.
public protocol BinaryFloatingPoint: FloatingPoint {

  /// The number of bits used to represent the exponent.
  ///
  /// Following IEEE 754 encoding convention, the exponent bias is:
  ///
  /// bias = 2**(exponentBitCount-1) - 1
  ///
  /// The least normal exponent is `1-bias` and the largest finite exponent
  /// is `bias`. The all-zeros exponent is reserved for subnormals and zeros,
  /// and the all-ones exponent is reserved for infinities and NaNs.
  static var exponentBitCount: Int { get }

  /// For fixed-width floating-point types, this is the number of fractional
  /// significand bits.
  ///
  /// For extensible floating-point types, `significandBitCount` should be
  /// the maximum allowed significand width (without counting any leading
  /// integral bit of the significand). If there is no upper limit, then
  /// `significandBitCount` should be `Int.max`.
  ///
  /// Note that `Float80.significandBitCount` is 63, even though 64 bits
  /// are used to store the significand in the memory representation of a
  /// `Float80` (unlike other floating-point types, `Float80` explicitly
  /// stores the leading integral significand bit, but the
  /// `BinaryFloatingPoint` APIs provide an abstraction so that users don't
  /// need to be aware of this detail).
  static var significandBitCount: Int { get }

  /// The raw encoding of the exponent field of the floating-point value.
  var exponentBitPattern: UInt { get }

  /// The raw encoding of the significand field of the floating-point value.
  ///
  /// `significandBitPattern` does *not* include the leading integral bit of
  /// the significand, even for types like `Float80` that store it explicitly.
  var significandBitPattern: RawSignificand { get }

  /// Combines `signBit`, `exponent` and `significand` bit patterns to produce
  /// a floating-point value.
  init(signBit: Bool,
       exponentBitPattern: UInt,
       significandBitPattern: RawSignificand)

  /// The least-magnitude member of the binade of `self`.
  ///
  /// If `x` is `+/-significand * 2**exponent`, then `x.binade` is
  /// `+/- 2**exponent`; i.e. the floating point number with the same sign
  /// and exponent, but with a significand of 1.0.
  var binade: Self { get }

  /// The number of bits required to represent the significand.
  ///
  /// If `self` is not a finite non-zero number, `significandWidth` is
  /// `-1`. Otherwise, it is the number of bits required to represent the
  /// significand exactly (less `1` because common formats represent one bit
  /// implicitly).
  var significandWidth: Int { get }

  @warn_unused_result
  func isEqual<Other: BinaryFloatingPoint>(to other: Other) -> Bool

  @warn_unused_result
  func isLess<Other: BinaryFloatingPoint>(than other: Other) -> Bool

  @warn_unused_result
  func isLessThanOrEqual<Other: BinaryFloatingPoint>(to other: Other) -> Bool

  @warn_unused_result
  func isUnordered<Other: BinaryFloatingPoint>(with other: Other) -> Bool

  @warn_unused_result
  func totalOrder<Other: BinaryFloatingPoint>(with other: Other) -> Bool

  /// `value` rounded to the closest representable value.
  init<Source: BinaryFloatingPoint>(_ value: Source)

  /// Fails if `value` cannot be represented exactly as `Self`.
  init?<Source: BinaryFloatingPoint>(exactly value: Source)
}
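
The field accessors above can be illustrated with `Double`. This is a sketch against the shipping standard library, which is an assumption relative to the draft: there the combining initializer takes `sign: FloatingPointSign` rather than `signBit: Bool`.

```swift
let value = 10.0   // 1.25 * 2**3

// bias = 2**(exponentBitCount-1) - 1, which is 1023 for Double.
let bias = (1 << (Double.exponentBitCount - 1)) - 1
let unbiasedExponent = Int(value.exponentBitPattern) - bias

// The fraction field encodes 0.25 (binary 0.01); the leading 1 is implicit.
let fraction = value.significandBitPattern

// Reassembling the fields reproduces the original value.
let rebuilt = Double(sign: .plus,
                     exponentBitPattern: value.exponentBitPattern,
                     significandBitPattern: fraction)

let lowestInBinade = value.binade             // same sign and exponent, significand 1.0
let bitsNeeded = value.significandWidth       // fractional bits needed for 1.25

print(bias, unbiasedExponent, fraction, rebuilt, lowestInBinade, bitsNeeded)
```
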
Float, Double, Float80 and CGFloat will conform to all of these protocols.

A small portion of the implementation of these APIs is dependent on new Integer protocols that will be proposed separately. Everything else is implemented in draft form on the branch floating-point-revision of my fork.

Impact on existing code

The % operator is no longer available for FloatingPoint types. We don't believe that it was widely used correctly, and the operation is still available via the formTruncatingRemainder method for people who need it.

To follow the naming guidelines, NaN and isNaN are replaced with nan and isNan.

The redundant property quietNaN is removed.

isSignaling is renamed isSignalingNan.

Alternatives considered

N/A.

Hi Erica, thanks for the feedback.

* I do use % for floating point but not as much as I first thought before I started searching through my code after reading your e-mail. But when I do use it, it's nice to have a really familiar symbol rather than a big word. What were the ways that it was used incorrectly? Do you have some examples?

As it happens, I have a rationale sitting around from an earlier (internal) discussion:

While C and C++ do not provide the “%” operator for floating-point types, many newer languages do (Java, C#, and Python, to name just a few). Superficially this seems reasonable, but there are severe gotchas when % is applied to floating-point data, and the results are often extremely surprising to unwary users. C and C++ omitted this operator for good reason. Even if you think you want this operator, it is probably doing the wrong thing in subtle ways that will cause trouble for you in the future.

The % operator on integer types satisfies the division algorithm axiom: If b is non-zero and q = a/b, r = a%b, then a = q*b + r. This property does not hold for floating-point types, because a/b does not produce an integral value. If it did produce an integral value, it would need to be a bignum type of some sort (the integral part of DBL_MAX / DBL_MIN, for example, has over 2000 bits or 600 decimal digits).
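
A small sketch of the division algorithm identity described above. The floating-point half uses `truncatingRemainder(dividingBy:)` from the shipping standard library as a stand-in for the floating-point `%` under discussion, which is an assumption about naming.

```swift
// Integers: q = a/b and r = a%b satisfy a == q*b + r.
let a = 7, b = 2
let q = a / b
let r = a % b
let integerIdentityHolds = (q * b + r == a)

// Floating point: x/y is not integral, so the same identity fails.
let x = 7.0, y = 2.0
let fq = x / y                                  // 3.5, not a whole number
let fr = x.truncatingRemainder(dividingBy: y)   // remainder under truncating division
let floatIdentityHolds = (fq * y + fr == x)

print(integerIdentityHolds, floatIdentityHolds)
```
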

Even if a bignum type were returned, or if we ignore the loss of the division algorithm axiom, % would still be deeply flawed. Whereas people are generally used to modest rounding errors in floating-point arithmetic, % is not continuous, so small errors are frequently magnified enormously, with catastrophic results:

  (swift) 10.0 % 0.1
    // r0 : Double = 0.0999999999999995 // What?!

[Explanation: 0.1 cannot be exactly represented in binary floating point; the actual value of “0.1” is 0.1000000000000000055511151231257827021181583404541015625. Other than that rounding, the entire computation is exact.]

Proposed Approach:
Remove the “%” operator for floating-point types. The operation is still available via the C standard library fmod( ) function (which should be mapped to a Swiftier name, but that’s a separate proposal).

Alternative Considered:
Instead of binding “%” to fmod( ), it could be bound to remainder( ), which implements the IEEE 754 remainder operation; this is just like fmod( ), except instead of returning the remainder under truncating division, it returns the remainder of round-to-nearest division, meaning that if a and b are positive, remainder(a,b) is in the range [-b/2, b/2] rather than [0, b). This still has a large discontinuity, but the discontinuity is moved away from zero, which makes it much less troublesome (that’s why IEEE 754 standardized this operation):

  (swift) remainder(1, 0.1)
    // r1 : Double = -0.000000000000000055511151231257827 // Looks like normal floating-point rounding

The downside to this alternative is that now % behaves totally differently for integer and floating-point data, and of course the division algorithm still doesn’t hold.
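
The two behaviors can be compared side by side. The sketch below assumes the method names found in the shipping standard library, `truncatingRemainder(dividingBy:)` for the fmod( ) behavior and `remainder(dividingBy:)` for the IEEE 754 remainder; neither spelling comes from this thread.

```swift
// fmod-style: remainder under truncating division. The tiny representation
// error in 0.1 turns into a result of almost a full 0.1.
let truncating = (10.0).truncatingRemainder(dividingBy: 0.1)

// IEEE 754 remainder: remainder under round-to-nearest division. The same
// representation error shows up as a tiny negative value near zero.
let nearest = (1.0).remainder(dividingBy: 0.1)

print(truncating)
print(nearest)
```
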

* I don't quite get how equatable is going to work. Do you mind explaining that in more detail?

I’m not totally sure what your question is. Are you asking how FloatingPoint will conform to Equatable, or how the Equatable protocol will work?

– Steve

···

On Apr 14, 2016, at 6:29 PM, Erica Sadun <erica@ericasadun.com> wrote:

Hi Howard, thanks for the feedback.

+1 great addition.

Would suggest the naming could be more consistent, in particular:
Anything returning Self could be named xxxed. In the current proposal this naming convention is sometimes used, e.g. divided, and sometimes not, e.g. subtracting. Suggest all unified with the xxxed convention.

The names in the Arithmetic protocol are Dave A’s creation, but I think they’re a reasonable compromise given the constraints placed on us. While consistency would be nice, I think that clarity at use site is more important, under the rationale that code is read more often than written. Also, keep in mind that in practice, “everyone” will use the operators for arithmetic. Dave may have more to say on the subject.

Anything returning Bool could be named isXxx. In some cases this is used, e.g. isUnordered, but not others, e.g. totalOrder.

That’s a reasonable point. isTotallyOrdered(with: ) is simple, but I’m not sure how I would handle totalOrderMagnitude( ) under this scheme. Thoughts?

Thanks,
– Steve

···

On Apr 14, 2016, at 6:05 PM, Howard Lovatt <howard.lovatt@gmail.com> wrote:

Thanks for the feedback, Chris.

This proposal looks really really great, let me know when you want to start the review process (or just submit a PR for the -evolution repo) and I’ll happily review manage it for you.

Provide basic constants (analogues of C's DBL_MAX, etc.)

Nice, have you considered adding pi/e and other common constants? I’d really really like to see use of M_PI go away… :-)

That’s a reasonable suggestion. I’m not sure if FloatingPoint is the right protocol to attach them to, but I’m not sure that it’s wrong either. I’d be interested to hear arguments from the community either way.

/// SignedArithmetic protocol will only be conformed to by signed numbers,
/// otherwise it would be possible to negate an unsigned value.
///
/// The only method of this protocol has the default implementation in an
/// extension, that uses a parameterless initializer and subtraction.
public protocol SignedArithmetic : Arithmetic {
  func negate() -> Self

Should this be negated / negate? negate() seems like an in-place mutating version.

Yup, that seems right.

/// A floating-point type that provides most of the IEEE 754 basic (clause 5)

Dumb Q, but is it “IEEE 754” or “IEEE-754”?

It’s commonly styled both ways, but I believe IEEE 754 is the “official” one.

/// operations. The base, precision, and exponent range are not fixed in
/// any way by this protocol, but it enforces the basic requirements of
/// any IEEE 754 floating-point type.
///
/// The BinaryFloatingPoint protocol refines these requirements and provides
/// some additional useful operations as well.
public protocol FloatingPoint: SignedArithmetic, Comparable {

  static var ulp: Self { get }
  var ulp: Self { get }

Swift supports instance and type members with the same names, but this is controversial, leads to confusion, and may go away in the future. It would be great to avoid this in your design.

Interesting. Both are definitely useful, but Type(1).ulp is sufficiently simple that only having the instance member may be good enough. Otherwise, ulpOfOne or similar could work.
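
For illustration, both spellings discussed here happen to exist in the shipping standard library: the instance property `ulp` and the static `ulpOfOne`. Their presence there is an assumption relative to this draft.

```swift
let ulpAtOne = Double(1).ulp        // spacing of Doubles just above 1.0
let ulpConstant = Double.ulpOfOne   // static spelling of the same quantity
let ulpAt1024 = Double(1024).ulp    // spacing grows with the exponent

print(ulpAtOne == ulpConstant, ulpAtOne, ulpAt1024)
```
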

<snip>

I’m certainly not a floating point guru, but I would have expected significand to be of type RawSignificand, and thought that the significand of a nan would return its payload. Does this approach make sense?

… later: I see that you have this on the binary FP type, so I assume there is a good reason for this :-)

Both are useful to have in practice. I have been attempting to keep the assumptions about representation to a minimum in the top-level FloatingPoint protocol.

<snip a few style notes that I’ll simply take as-is because they’re no-brainers>

  /// True if and only if `self` is subnormal.
  ///
  /// A subnormal number does not use the full precision available to normal
  /// numbers of the same format. Zero is not a subnormal number.
  var isSubnormal: Bool { get }

I’m used to this being called a “Denormal”, but I suspect that “subnormal” is actually the right name? Maybe it would be useful to mention the “frequently known as denormal” in the comment, like you did with mantissa earlier.

Yes, “subnormal” is the preferred IEEE 754 terminology, but I’ll add a note referencing “denormal” as well.
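
A short sketch of what `isSubnormal` reports, assuming a platform that supports subnormal (denormal) values for `Double`; on hardware that flushes them to zero the middle result may differ.

```swift
let smallestNormal = Double.leastNormalMagnitude
let justBelowNormal = smallestNormal.nextDown   // largest subnormal value

let normalIsSubnormal = smallestNormal.isSubnormal
let belowIsSubnormal = justBelowNormal.isSubnormal
let zeroIsSubnormal = (0.0).isSubnormal         // zero is not subnormal

print(normalIsSubnormal, belowIsSubnormal, zeroIsSubnormal)
```
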

Thanks,
– Steve

···

On Apr 14, 2016, at 9:06 PM, Chris Lattner <clattner@apple.com> wrote:
On Apr 14, 2016, at 4:55 PM, Stephen Canon via swift-evolution <swift-evolution@swift.org> wrote:

Hi Brent, thanks for the feedback.

First of all: I do *not* do crazy things with floating-point numbers, so there's a good chance I'm missing the point with some of this. Consider anything having to do with NaNs, subnormals, or other such strange denizens of the FPU to be prefixed with "I may be totally missing the point, but…”

Noted.

public protocol Arithmetic

Is there a rationale for the name "Arithmetic"? There's probably nothing wrong with it, but I would have guessed you'd use the name "Number”.

Dave A. came up with the name, though I think it’s a good one. Number isn’t bad either.

: Equatable, IntegerLiteralConvertible

Are there potential conforming types which aren't Comparable?

Not at present, but I expect there to be in the future. Modular integers and complex numbers come to mind as the most obvious examples.

func adding(rhs: Self) -> Self
mutating func add(rhs: Self)

Is there a reason we're introducing methods and calling them from the operators, rather than listing the operators themselves as requirements?

func negate() -> Self

Should this be `negated()`? Should there be a mutating `negate()` variant, even if we won't have an operator for it?

(If a mutating `negate` would be an attractive nuisance, we can use `negative()`/`formNegative()` instead.)

Chris noted this too, and I think you’re both right.
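
One way the `negated()` / `negate()` pairing could look, as a sketch only. `SketchSignedArithmetic` and `Offset` are hypothetical names invented for this example, and the default implementation follows the doc comment's description (a parameterless initializer and subtraction).

```swift
// Hypothetical protocol, not the proposal's actual declaration.
protocol SketchSignedArithmetic {
    init()
    static func - (lhs: Self, rhs: Self) -> Self
    func negated() -> Self      // non-mutating: returns a new value
    mutating func negate()      // mutating: changes `self` in place
}

extension SketchSignedArithmetic {
    // Default: zero minus self, using the parameterless initializer.
    func negated() -> Self { return Self() - self }
    mutating func negate() { self = negated() }
}

// Hypothetical conforming type for the demonstration.
struct Offset: SketchSignedArithmetic, Equatable {
    var amount: Int
    init() { amount = 0 }
    init(_ amount: Int) { self.amount = amount }
    static func - (lhs: Offset, rhs: Offset) -> Offset {
        return Offset(lhs.amount - rhs.amount)
    }
}

let original = Offset(5)
let flipped = original.negated()   // `original` is unchanged
var inPlace = Offset(5)
inPlace.negate()                   // `inPlace` itself is changed

print(original.amount, flipped.amount, inPlace.amount)
```
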

/// NaN `payloads`. `FloatingPoint` types should either treat inadmissible
/// payloads as zero, or mask them to create an admissible payload.
static func nan(payload payload: RawSignificand, signaling: Bool) -> Self

This seems unusually tolerant of bad inputs. Should this instead be a precondition, and have an (elidable in unchecked mode) trap if it's violated?

I don’t think that it’s a particularly useful error to detect, and different floating point types may differ greatly in what payloads they support (if any), because they might choose to reserve those encodings for other purposes.
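
For reference, a sketch of building a NaN with a payload. It assumes the initializer spelling `init(nan:signaling:)` from the shipping standard library rather than the draft's static `nan(payload:signaling:)`, and it only uses a small payload that `Double` can certainly encode.

```swift
// A quiet NaN carrying the payload 0x2A.
let tagged = Double(nan: 0x2A, signaling: false)

let taggedIsNaN = tagged.isNaN
let taggedIsSignaling = tagged.isSignalingNaN

// The payload occupies the low bits of the significand field; the high
// bit of that field marks the NaN as quiet, so mask it off.
let recoveredPayload = tagged.significandBitPattern & 0xFFFF

print(taggedIsNaN, taggedIsSignaling, recoveredPayload)
```
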

static var greatestFiniteMagnitude: Self { get }
static var leastNormalMagnitude: Self { get }
static var leastMagnitude: Self { get }

Reading these, I find the use of "least" a little bit misleading—it seems like they should be negative.

Magnitudes are strictly positive. In fairness, that may not be immediately obvious to all readers.
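
A quick check of the point about signs, using `Double`. The sketch assumes the shipping standard library, where the draft's `leastMagnitude` appears to correspond to `leastNonzeroMagnitude`.

```swift
let greatest = Double.greatestFiniteMagnitude     // analogue of DBL_MAX
let leastNormal = Double.leastNormalMagnitude     // analogue of DBL_MIN
let leastNonzero = Double.leastNonzeroMagnitude   // smallest positive value

// All three are positive; "least" refers to magnitude, not to sign.
let allPositive = leastNonzero > 0 && leastNormal > 0 && greatest > 0

// The most negative finite value is obtained by negating the greatest one.
let mostNegativeFinite = -greatest

print(allPositive, leastNonzero <= leastNormal, mostNegativeFinite < 0)
```
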

I wonder if instead, we can use ClosedIntervals/ClosedRanges to group together related values:

  static var positiveNormals: ClosedRange<Self> { get }
  static var positiveSubnormals: ClosedRange<Self> { get }

  Double.positiveNormals.upperBound // DBL_MAX
  Double.positiveNormals.lowerBound // DBL_MIN
  Double.positiveSubnormals.upperBound // Self.positiveNormals.lowerBound.nextDown
  Double.positiveSubnormals.lowerBound // 0.nextUp

  // Alternatively, you could have `positives`, running from 0.nextUp to infinity

Technically, you could probably implement calls like e.g. isNormal in terms of the positiveNormals property, but I'm sure separate calls are much, much faster.

(It might also be helpful if you could negate signed ClosedIntervals, which would negate and swap the bounds.)

This seems wildly over-engineered to me personally. Every language I surveyed provides (a subset of) these quantities as simple scalar values.

public protocol FloatingPoint: SignedArithmetic, Comparable {
func isLess(than other: Self) -> Bool
func totalOrder(with other: Self) -> Bool

Swift 2's Comparable demands a strict total order. However, the documentation here seems to imply that totalOrder is *not* what you get from the < operator. Is something getting reshuffled here?

The Swift 2 Comparable documentation is probably overly specific. The requirement should really be something like a strict total order on non-exceptional values.

init<Source: Integer>(_ value: Source)
init?<Source: Integer>(exactly value: Source)

init<Source: BinaryFloatingPoint>(_ value: Source)
init?<Source: BinaryFloatingPoint>(exactly value: Source)

It's great to have both of these, but I wonder how they're going to be implemented—if Integer can be either signed or unsigned, I'm not sure how you get the top bit of an unsigned integer out.

These are the bits that are dependent on the new Integer proposal, which should be forthcoming soonish. As you note, they aren’t really implementable without it (at least, not easily). However, I thought that it made the most sense to include them with this proposal.

Also, since `init(_:)` is lossy and `init(exactly:)` is not, shouldn't their names technically be switched? Or perhaps `init(_:)` should be exact and trapping, `init(exactly:)` should be failable, and `init(closest:)` should always return something or other?

That would be a large change in the existing behavior, since we only have the (potentially lossy) init(_:) today. It would also be something of a surprise to users of C family languages. That’s not to say that it’s necessarily wrong, but it would break a lot of people’s code and expectations, and I was trying to make this proposal fairly benign, with a focus on adding missing features.
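
The difference between the lossy and the exact initializers can be sketched as follows, assuming the conversions behave as they do in the shipping standard library.

```swift
// 2**53 + 1 is the first integer that Double cannot represent exactly.
let big: Int64 = (1 << 53) + 1

let rounded = Double(big)                   // rounds to the closest representable value
let exact = Double(exactly: big)            // fails, because rounding would be needed
let small = Double(exactly: 42 as Int64)    // succeeds, because 42 is representable

// The same pairing for conversions between floating-point types.
let narrowedExactly = Float(exactly: 0.1 as Double)   // 0.1 (as a Double) needs more bits
let half = Float(exactly: 0.5 as Double)              // 0.5 fits in a Float

print(rounded, exact as Any, small as Any, narrowedExactly as Any, half as Any)
```
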

– Steve

···

On Apr 14, 2016, at 8:34 PM, Brent Royal-Gordon <brent@architechies.com> wrote:

In general, I think this is fantastic. In particular, I *really* like the notion that `BinaryFloatingPoint` conforms to `FloatingPoint`. I would do a few things differently, though:

public protocol FloatingPoint: SignedArithmetic, Comparable {
  ...
  /// The greatest finite number.
  ///
  /// Compares greater than or equal to all finite numbers, but less than
  /// infinity. Corresponds to the C macros `FLT_MAX`, `DBL_MAX`, etc.
  /// The naming of those macros is slightly misleading, because infinity
  /// is greater than this value.
  static var greatestFiniteMagnitude: Self { get }
  ...
}

Why put this in FloatingPoint? The concept is valid for any real type (IMHO, it’s valid for rectangular Complex/Quaternion/etc types as well... I’m not sure if it is for the various polar formats, though). I think a better place for it is either in `Arithmetic`, or another protocol to which `Arithmetic` conforms:

protocol HasMinAndMaxFiniteValue {
    static var maxFinite: Self {get} // I think “max”/“min" are clear enough (especially if the docs are explicit), but I understand the objection
    static var minFinite: Self {get} // 0 for unsigned types
}

As mentioned in reply to Brent, there may very well be Arithmetic types without a real notion of magnitude (modular integers). There’s also a complication that the magnitude of a Complex<T> shouldn’t be complex, but rather real, and also isn’t usually representable as a T (the one exception is polar forms, where it *is* representable =).

This would unify the syntax for getting a numeric type’s min or max finite value across all the built-in numeric types (this means that `Int.max` would become `Int.maxFinite`). Similarly, IMHO infinity shouldn’t be tied to floating point types. While it’s true that the *native* integer types don’t support infinity, arbitrary precision integer types might.

protocol HasInfinity {
    static var infinity: Self {get}
}

I think that integer max and min behave sufficiently differently from the FloatingPoint quantities that it’s not particularly useful to unify their names. Most obviously, all integers between min and max are representable for integers, and this is very much not the case for floating point types.

I’d restructure this a bit:
protocol Arithmetic { //AFAIK *all* numeric types should be able to do these
    init()
    func adding(rhs: Self) -> Self
    mutating func add(rhs: Self)
    func subtracting(rhs: Self) -> Self
    mutating func subtract(rhs: Self)
}
protocol ScalarArithmetic : Arithmetic { //These can be iffy for non-scalar types
    func multiplied(by rhs: Self) -> Self
    mutating func multiply(by rhs: Self)
    func divided(by rhs: Self) -> Self
    mutating func divide(by rhs: Self)
}

Multiplication isn't always defined for any two arbitrarily-dimensioned matrices (plus, there are so many reasonable matrix “subtypes” that there's no guarantee that the return type should always be the same), and I don’t think there’s a generally agreed-upon meaning for matrix division at all.

There’s a natural tension of exactly what structure in the hierarchy of semigroups, groups, rings, etc is the baseline to be “numbery”. While it’s not totally obvious that division should be required, neither is it obvious that it should be excluded.

I should note that for non-scalar types, multiplication and division are frequently more reasonable than addition and subtraction. E.g. the orthogonal matrix groups and unitary quaternions are closed under multiplication but not addition.

Ultimately, we may want to have a finer-grained classification of numeric protocols, but “schoolbook arithmetic” is a pretty reasonable set of operations while we’re picking just one.
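
To show the kind of generic code a single "schoolbook arithmetic" protocol enables, here is a sketch written against the standard library's `Numeric` protocol. Using `Numeric` is an assumption: it is the closest shipping analogue, it is not the `Arithmetic` protocol as drafted here, and it does not require division. `sumOfSquares` is a hypothetical helper.

```swift
// Works for any type that provides literals, addition and multiplication.
func sumOfSquares<T: Numeric>(_ values: [T]) -> T {
    var total: T = 0
    for value in values {
        total += value * value
    }
    return total
}

let intResult = sumOfSquares([1, 2, 3])        // integers
let doubleResult = sumOfSquares([0.5, 1.5])    // floating point

print(intResult, doubleResult)
```
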

– Steve

···

On Apr 14, 2016, at 10:47 PM, davesweeris@mac.com wrote:

On Apr 14, 2016, at 6:55 PM, Stephen Canon via swift-evolution <swift-evolution@swift.org> wrote:

I don’t want to speak for the Apple standard library team, who are doing the Integer protocol work, but I’ve seen a reasonably complete draft, so I believe the answer is “soon”.

– Steve

···

On Apr 15, 2016, at 7:23 AM, plx via swift-evolution <swift-evolution@swift.org> wrote:

One is if there’s any ETA or similar for a glimpse at the “complete picture” of Swift’s revised numeric protocols; these floating-point protocols look really, really good, but this is also (I think) the first glimpse at the new `Arithmetic` protocol, and there’s also a new “Integer” protocol coming…and it’d be nice to get a sense of the complete vision here.

/// The quotient of `self` divided by `rhs`.
// Arithmetic provides a default implementation of this method in terms
// of the mutating `divide` operation.
@warn_unused_result
func divided(by rhs: Self) -> Self

/// Divides `self` by `rhs`.
mutating func divide(by rhs: Self)

When dealing with integer arithmetic, I often find useful a `divmod` function which produces a (quotient, remainder) pair.
It could be argued that such a pair is the primary result of division on integers. It would be great to have such a function included in the design.

I believe that it’s present in the latest draft of the Integer protocols.
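
For comparison, the shipping standard library offers this on integers as `quotientAndRemainder(dividingBy:)`. Whether that matches the draft Integer protocols mentioned above is an assumption.

```swift
let dividend = 17
let (quotient, rest) = dividend.quotientAndRemainder(dividingBy: 5)

// Division truncates toward zero, so the remainder takes the dividend's sign.
let negativeCase = (-17).quotientAndRemainder(dividingBy: 5)

print(quotient, rest, negativeCase.quotient, negativeCase.remainder)
```
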

/// SignedArithmetic protocol will only be conformed to by signed numbers,
/// otherwise it would be possible to negate an unsigned value.
///
/// The only method of this protocol has the default implementation in an
/// extension, that uses a parameterless initializer and subtraction.
public protocol SignedArithmetic : Arithmetic {
func negate() -> Self
}

It might make sense to also have a

public protocol InvertibleArithmetic : Arithmetic {
func inverted() -> Self
}

FloatingPoint would conform to this protocol, returning 1/x, while integer types would not.

That’s a very reasonable suggestion (and one that can easily be added separately if it doesn’t make it into this batch of changes).
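
A sketch of how the suggested protocol might look for `Double`. `SketchInvertibleArithmetic` is a hypothetical name, and it omits the `Arithmetic` parent from the suggestion because that protocol does not exist outside this draft.

```swift
// Hypothetical protocol mirroring the suggestion above.
protocol SketchInvertibleArithmetic {
    func inverted() -> Self
}

extension Double: SketchInvertibleArithmetic {
    // The multiplicative inverse, 1/x.
    func inverted() -> Double { return 1 / self }
}

let four: Double = 4
let quarter = four.inverted()

// Inverting zero follows IEEE 754 division: 1/0 is +infinity.
let zero: Double = 0
let inverseOfZero = zero.inverted()

print(quarter, inverseOfZero)
```
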

– Steve

···

On Apr 14, 2016, at 11:48 PM, Nicola Salmoria <nicola.salmoria@gmail.com> wrote: