# Integers and their protocols

(Xiaodi Wu) #1

Hi all,

I promised some time ago to provide some thoughts based on the experience
of using the newly revised integer protocols. I've finally synthesized
these thoughts into one document. It is somewhat long, and for that I
apologize. I hope that it is at least clear in terms of how it lays out the
issues.

Text:

Notes on the user experience of new integer protocols

Xiaodi Wu
June 17, 2017
Introduction

The design, re-design, and implementation of SE-0104
<https://github.com/apple/swift-evolution/blob/master/proposals/0104-improved-integers.md>,
a proposal to revise integer protocols in Swift, is now largely complete in
shipping previews of Swift 4. As an exercise, I have used the new APIs to
develop a set of additional numeric facilities
<https://github.com/xwu/NumericAnnex>. Here are some insights gained from
that experience--and from two unpublished exercises to implement a BigInt
type (one from scratch, one wrapping GMP)--as well as suggestions for
improvement.

1. Performing heterogeneous comparison with integer literals
2. Conforming to _ExpressibleByBuiltinIntegerLiteral
3. Overriding implementations of heterogeneous comparison and bit shift
operators
4. Defining the semantics of masking shifts for arbitrary-precision
integers
5. Using ArithmeticOverflow
6. Initializing integers from a floating-point source

Performing heterogeneous comparison with integer literals

SE-0104 added heterogeneous comparison and bit shift operators to the
language to improve the user experience (for example, you can now check if
an Int value is equal to a UInt value).1

1 Similar enhancements for operations such as addition are not yet possible
because a design is lacking for how to express *promotion*. When (if) Swift
allows integer constants in generic constraints, this may become a more
realistic prospect as promotion would then be expressible using generic
constraints (e.g. func + <T, U>(lhs: T, rhs: U) -> U where T :
FixedWidthInteger, U : FixedWidthInteger, T.BitWidth < U.BitWidth). This is
meant to be an aside and is certainly outside the scope of correcting or
incrementally improving upon SE-0104.

These operators behave as intended with concrete types, but comparisons in
generic algorithms behave differently. This was encountered during review
of the standard library's implementation of DoubleWidth, which in fact had
a bug as a consequence of the following behavior:

func f() -> Bool {
    return UInt.max == ~0
}
func g() -> Bool {
    return UInt.max == .max
}
func h<T : FixedWidthInteger>(_: T.Type) -> Bool {
    return T.max == ~0
}
func i<T : FixedWidthInteger>(_: T.Type) -> Bool {
    return T.max == .max
}

f()          // Returns `true`.
g()          // Returns `true`.
h(UInt.self) // Returns `false`.
i(UInt.self) // Returns `true`.

The reason that h(_:) gives a surprising result is explained as follows:


- Each concrete integer type implements its own overload of the
homogeneous comparison operator, whereas protocol extension methods
implement heterogeneous comparison operators. When an integer literal is
used on the right-hand side of the comparison, the compiler looks first to
the concrete type for a suitable implementation of the operator and finds
the homogeneous overload. Therefore, it does *not* traverse the protocol
hierarchy and instead infers the literal to be of the same type as the
left-hand side.

- In generic code, *even if* the most refined protocol implements its own
overload of the homogeneous comparison operator, the compiler will look for
all overloads of the operator by traversing the entire protocol hierarchy.
Since heterogeneous comparison operators are defined somewhere along the
hierarchy, the compiler will *always* find an overload that accepts the
"preferred" integer literal type (Swift.IntegerLiteralType, aka Int) and
infers the literal to be of type Int.

Therefore, in the invocation h(UInt.self), we are actually comparing
UInt.max to ~(0 as Int). This is a surprising result, as evidenced by the
fact that a bug nearly slipped into the standard library itself.

Suggestions for improvement

Based on the demonstration that the expression T.max == .max infers the
correct type in both concrete and generic code, I would suggest that type
inference for integer literals be changed based on the following (notional)
rule:

- Any use of an integer literal x should be equivalent to the use of an
implicit member expression .init(integerLiteral: x).

This generally preserves the current behavior that an integer literal will
preferentially be inferred to be of type IntegerLiteralType:

func j(_ x: Int) {
    print("Int", x)
}
func j(_ x: UInt) {
    print("UInt", x)
}

j(42)                        // Prints "Int 42".
j(.init(integerLiteral: 42)) // Prints "Int 42".

Intriguingly, T.init(integerLiteral: 0) is currently ambiguous where T :
FixedWidthInteger, suggesting that some re-design of the protocol would
also be necessary to have this rule give the appropriate behavior.

func k<T : FixedWidthInteger>(_: T.Type) -> Bool {
    return (0 as T) == T.init(integerLiteral: 0)
    // error: ambiguous reference to member 'init(integerLiteral:)'
}

Conforming to _ExpressibleByBuiltinIntegerLiteral

For a numeric type modeling rational numbers, it is natural to conform that
type to ExpressibleByIntegerLiteral. For instance, let x = 42 as Rational would
create an instance of Rational<Int> with numerator 42 and denominator 1. A
similar line of reasoning applies to a numeric type modeling complex
numbers (and likely other types as well).

To conform to the protocol ExpressibleByIntegerLiteral, the type must
implement an initializer of the form init(integerLiteral: U). However, U must
conform to an underscored protocol _ExpressibleByBuiltinIntegerLiteral.

For a type such as Rational<T>, where the numerator and denominator are of
type T, it is natural to have this initializer take an integer literal of
type T. Currently, therefore, this requires that T be constrained to an
underscored protocol. This is suboptimal for two reasons: (1)
_ExpressibleByBuiltinIntegerLiteral is an underscored protocol not meant
for public use; (2) it prevents an implementation of Rational<T> where T :
_ExpressibleByBuiltinIntegerLiteral from having, say, arbitrary-precision
integers as numerator and denominator (i.e., Rational<BigInt>) unless the
arbitrary-precision integer type itself conforms to
_ExpressibleByBuiltinIntegerLiteral.
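To make the constraint concrete, here is a minimal sketch (the Rational type is hypothetical). The only way to accept a literal without touching the underscored protocol is to pin the literal type to a concrete type such as Int, which then has to be converted to T rather than typed as T directly:

```swift
// Minimal sketch of a hypothetical Rational type. Pinning the literal
// type to Int avoids constraining T to the underscored protocol, but
// the literal is converted to T instead of being inferred as T.
struct Rational<T: BinaryInteger>: ExpressibleByIntegerLiteral {
    var numerator: T
    var denominator: T

    init(integerLiteral value: Int) {
        self.numerator = T(value)
        self.denominator = 1
    }
}

let x = 42 as Rational<Int>
// x.numerator == 42, x.denominator == 1
```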

Suggestions for improvement

From a user perspective, the semantics of ExpressibleByIntegerLiteral suggest
that a generic type that has exclusively stored properties of type T where
T : ExpressibleByIntegerLiteral should be conformable to
ExpressibleByIntegerLiteral by implementing an initializer that accepts a
literal value of type T--without being constrained to any underscored
protocol. I suspect, however, that this is not a trivial detail to
implement in the compiler.

Overriding implementations of heterogeneous comparison and bit shift operators

In Swift, there is a distinction between default implementations of
protocol requirements and protocol extension methods which are not
requirements. Both are defined in extensions to protocols, but default
implementations are dynamically dispatched and can be overridden by
conforming types, while extension methods can be shadowed but never
overridden. (For example, homogeneous == is a protocol requirement of
Equatable, but homogeneous != is a protocol extension method that cannot be
overridden.)
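The distinction can be demonstrated with a toy protocol (not from the standard library): a method defined only in a protocol extension is statically dispatched in generic code, so a conforming type's own version shadows it rather than overriding it.

```swift
// Toy illustration of shadowing vs. overriding.
protocol Describable {}

extension Describable {
    // Extension method, not a protocol requirement: statically dispatched.
    func describe() -> String { return "protocol extension" }
}

struct Concrete: Describable {
    // Shadows, but does not override, the extension method.
    func describe() -> String { return "concrete shadow" }
}

func genericDescribe<T: Describable>(_ value: T) -> String {
    return value.describe()
}

Concrete().describe()       // "concrete shadow"
genericDescribe(Concrete()) // "protocol extension"
```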

The paragraph above is meant to provide some background on existing
features of the language, but it is meant neither to question their
existence in the language nor to question the design of Equatable and
Comparable, which are far outside the scope of improving the revised
integer protocols.

The heterogeneous comparison and bit shift operators introduced in Swift 4
by SE-0104 are protocol extension methods.

Consider a custom BigInt type and the comparison (42 as BigInt) == (21 as
UInt). There is an efficient way to perform that comparison which does not
involve first converting the UInt value to a BigInt value. However, even if
BigInt manually implements this specialized comparison operator, a generic
algorithm implemented for all binary integers (say, for integer
exponentiation) will use the less efficient standard library implementation
of heterogeneous ==. By contrast, the same algorithm will use the most
specialized implementation of homogeneous ==, because that function is a
protocol requirement that is dynamically dispatched.

Suggestions for improvement

Heterogeneous <, <=, ==, >, >= should be requirements of the protocol on
which they are now extension methods, as should heterogeneous masking
shifts. That would permit conforming types to provide more specialized
implementations. This may, however, increase the work of the type checker
at compile time.

Defining the semantics of masking shifts for arbitrary-precision integers

SE-0104 introduced two different types of bit shift operations for
integers, called masking shifts and smart shifts.

Smart shifts, spelled << and >>, are protocol extension methods that always
behave in a well-defined way when overshifting or undershifting. For
example, x << -2 is now equivalent to x >> 2 in Swift 4, where previously
the behavior was undefined.
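For example, with a concrete unsigned type:

```swift
// Smart shifts handle negative and oversized shift amounts without trapping.
let x: UInt8 = 0b0001_0000  // 16

x << -2   // 4, same as x >> 2
x >> -2   // 64, same as x << 2
x << 100  // 0, overshift is well-defined
```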

However, because there is sometimes a performance cost for branches that
handle overshifting and undershifting that cannot be optimized away, Swift
now offers masking shifts, spelled &<< and &>>. As explained in SE-0104,
"[a] masking shift logically preprocesses the right[-]hand operand by
masking its bits to produce a value in the range 0...(x-1) where x is the
number of bits in the left[-]hand operand." These semantics for masking
shifts make sense when the number of bits in the left-hand operand is
fixed, as is the case for all fixed-width integers.
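For a fixed-width type, the mask is simply bitWidth - 1:

```swift
// Masking shifts reduce the right-hand operand modulo the bit width
// (for UInt8, the shift amount is masked with 0b111).
let y: UInt8 = 1

y &<< 3   // 8
y &<< 8   // 1, because 8 & 0b111 == 0
y &<< 11  // 8, because 11 & 0b111 == 3
```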

However, *all* binary integers must implement &<< and &>>. For an
arbitrary-width integer type (which, for various reasons, is best
represented as sign-and-magnitude rather than two's-complement), the
literal interpretation of those semantics is problematic. For example, most
users would expect that (1 as BigInt) << 1000 should not result in
overshift. However, the number of bits in the left-hand operand is 2, and
therefore a plausible interpretation of the semantics of &<< would have (1
as BigInt) &<< 1000 == 1. By extension, therefore, (1 as BigInt) << 1000 ==
0; this is a counterintuitive result.

Suggestions for improvement

The semantics of smart shift should be clarified to state that the
right-hand operand is preprocessed by masking its bits to produce a value
in the range 0..<x where x is the maximum number of bits in a value of the
same type as the left-hand operand. This is equivalent to viewing
arbitrary-width integers *for the purposes of bitwise operations* as
notionally infinite sequences of bits sign-extended from the
two's-complement representation of the integral value.

For the purposes of implementation, BinaryInteger may require a new static
property maxBitWidth, which would be equal to bitWidth for fixed-width
integers and could be defined as Int.max for arbitrary-precision integers.
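A sketch of what that might look like (maxBitWidth is hypothetical and not part of the standard library; the conforming types here are stand-ins, not real integer implementations):

```swift
// Hypothetical refinement; nothing below exists in the standard library.
protocol SketchBinaryInteger {
    static var maxBitWidth: Int { get }
}

struct SketchInt64: SketchBinaryInteger {
    // For fixed-width types, maxBitWidth would equal bitWidth.
    static var maxBitWidth: Int { return 64 }
}

struct SketchBigInt: SketchBinaryInteger {
    // For arbitrary-precision types, maxBitWidth could be Int.max.
    static var maxBitWidth: Int { return Int.max }
}
```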

Using ArithmeticOverflow

With SE-0104, the addWithOverflow family of arithmetic operations is no
longer static. They have been renamed addingReportingOverflow and the like,
and their return type has been changed from (T, Bool) to (partialValue: T,
overflow: ArithmeticOverflow). ArithmeticOverflow is an enum with two
cases, overflow and none. Initially, this may appear to be an improvement.

However, in actual use, ArithmeticOverflow is extremely lacking in
ergonomics. Where previously it was possible to destructure the tuple and
evaluate overflow by writing if overflow { ... }, it is now required to use
a much more verbose incantation: if overflow == .overflow { ... }. There is
little else one can do with an ArithmeticOverflow result besides converting
it into a value of type Bool. When working with these operations
repeatedly--and, chances are that if you need such an operation, you don't
need it just once--this quickly becomes very cumbersome.

Inside the standard library, the ArithmeticOverflow values returned from
*ReportingOverflow functions are created by calling an initializer that
takes a Bool argument. Therefore, with every call to a
*ReportingOverflow function,
a Bool is converted to an ArithmeticOverflow, returned, then converted by
the user back into a Bool. Worse, based on comments in the standard
library, it appears that the compiler cannot currently elide these
operations.

Suggestions for improvement

1. Change the return type of *ReportingOverflow functions to (partialValue:
T, overflow: Bool) (or, didOverflow) and eliminate the ArithmeticOverflow
type. In practice, this would require deprecating the current functions and
introducing replacements.

2. Since the presence of type inference may cause (1) to be source-breaking
even with proper deprecations, avoid creating a source-breaking change by
naming the revised functions something else. Specifically, take the
opportunity to improve the name of these functions, changing
addingReportingOverflow to addReportingOverflow (and *mutatis mutandis* for
the remaining functions). This avoids the "-ing -ing" clash and eliminates
the nonsensical interpretation of the phrase "adding, reporting overflow by
42."
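Under suggestion (1), overflow handling would collapse back to a plain Bool test. A sketch, assuming the Bool-returning (partialValue: T, overflow: Bool) signature:

```swift
// Sketch assuming the suggested (partialValue: T, overflow: Bool)
// return type for *ReportingOverflow functions.
let (partialValue, overflow) = UInt8.max.addingReportingOverflow(1)

if overflow {
    // Handle wraparound; partialValue holds the truncated result.
}
// partialValue == 0, overflow == true
```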

Initializing integers from a floating-point source

The new BinaryInteger protocol has the following requirements:

- init?<T : FloatingPoint>(exactly source: T)
- init<T : FloatingPoint>(_ source: T)

However, in my experience, it does not appear possible (or at least, it is
not apparent even after some careful thought) to implement these
requirements solely with the properties and methods required by
FloatingPoint. There do, however, exist straightforward ways to convert
BinaryFloatingPoint values to BinaryInteger values.
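That is because BinaryFloatingPoint exposes the radix-2 representation as integer values, which is exactly what an exact integer conversion needs. For example:

```swift
// BinaryFloatingPoint makes the pieces of the radix-2 representation
// available as integer values.
let d = 10.5
let e = d.exponent               // 3, since 10.5 == 1.3125 * 2^3
let f = d.significandBitPattern  // fraction bits of the significand (UInt64 for Double)
```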

In the standard library, concrete integer types implement conversions to
and from concrete built-in floating-point types. However, whether
coincidentally or not, the standard library itself has not yet implemented
these protocol requirements. At last check, the entire implementation of
the first requirement was:

public init?<T : FloatingPoint>(exactly source: T) {
    // FIXME(integers): implement
    fatalError()
}

Suggestions for improvement

Consider changing the protocol requirement so that T is constrained as
follows: T : BinaryFloatingPoint.

(Ben Rimmington) #2

There are also some commented-out requirements, which were accepted for SE-0067.

(One requirement is obsoleted by the Numeric protocol).

Can these be implemented yet, based on your experience?

FloatingPoint
-------------

/* TODO: Implement the following APIs once a revised integer protocol is
introduced that allows for them to be implemented. In particular, we
need to have an "index of most significant bit" operation and "get
absolute value as unsigned type" operation on the Integer protocol.

/// Creates the closest representable value to the given integer.
///
/// - Parameter value: The integer to represent as a floating-point value.
init<Source: Integer>(_ value: Source)

/// Creates a value that exactly represents the given integer.
///
/// If the given integer is outside the representable range of the type, the
/// result is `nil`.
///
/// - Parameter value: The integer to represent as a floating-point value.
init?<Source: Integer>(exactly value: Source)
*/

BinaryFloatingPoint
-------------------

/* TODO: Implement these once it becomes possible to do so (requires revised
Integer protocol).
/// Creates a new instance from the given value, rounded to the closest
/// possible representation.
///
/// - Parameter value: A floating-point value.
init<Source: BinaryFloatingPoint>(_ value: Source)

/// Creates a new instance equivalent to the exact given value.
///
/// If the value you pass as `value` can't be represented exactly, the result
/// of this initializer is `nil`.
///
/// - Parameter value: A floating-point value to represent.
init?<Source: BinaryFloatingPoint>(exactly value: Source)
*/

(Xiaodi Wu) #3

There are also some commented-out requirements, which were accepted for
SE-0067.

(One requirement is obsoleted by the Numeric protocol).

Can these be implemented yet, based on your experience?

Based on a cursory thinking through only:

The first one, yes (if we observe that Integer is now BinaryInteger), but
probably inefficiently as it would have to use `words`, I think. If the
requirement were on BinaryFloatingPoint and T were constrained to
BinaryInteger & FixedWidthInteger, I think it could be more efficient. Of
course, both could be added. We could not remove the concrete initializer
from a UInt source, though, as initializing from `words` would rely on that.

The second one, I think so, but I haven't tried to make sure.


On Sun, Jun 18, 2017 at 05:30 Ben Rimmington <me@benrimmington.com> wrote:

FloatingPoint
-------------

> /* TODO: Implement the following APIs once a revised integer protocol is
>    introduced that allows for them to be implemented. In particular, we
>    need to have an "index of most significant bit" operation and "get
>    absolute value as unsigned type" operation on the Integer protocol.
>
>    /// Creates the closest representable value to the given integer.
>    ///
>    /// - Parameter value: The integer to represent as a floating-point value.
>    init<Source: Integer>(_ value: Source)
>
>    /// Creates a value that exactly represents the given integer.
>    ///
>    /// If the given integer is outside the representable range of the type, the
>    /// result is `nil`.
>    ///
>    /// - Parameter value: The integer to represent as a floating-point value.
>    init?<Source: Integer>(exactly value: Source)
> */

BinaryFloatingPoint
-------------------

> /* TODO: Implement these once it becomes possible to do so (requires revised
>    Integer protocol).
>    /// Creates a new instance from the given value, rounded to the closest
>    /// possible representation.
>    ///
>    /// - Parameter value: A floating-point value.
>    init<Source: BinaryFloatingPoint>(_ value: Source)
>
>    /// Creates a new instance equivalent to the exact given value.
>    ///
>    /// If the value you pass as `value` can't be represented exactly, the result
>    /// of this initializer is `nil`.
>    ///
>    /// - Parameter value: A floating-point value to represent.
>    init?<Source: BinaryFloatingPoint>(exactly value: Source)
> */