[Idea] Large integer literals


(Brent Royal-Gordon) #1

So, `BinaryInteger` means we can now create integer types larger than 64 bits. But how do we write them in our code? Currently, you support integer literals by conforming to `IntegerLiteralConvertible`, filling in its `IntegerLiteralType` associated type with one of the standard library's integer types. But that means a third-party integer type can't be initialized from a literal that won't fit into an `Int64` or `UInt64`.

The current implementation is that `IntegerLiteralConvertible.IntegerLiteralType` is invisibly constrained to `_BuiltinIntegerLiteralConvertible`, a parallel version which takes (via a typealias) a `Builtin.Int2048`. (I believe in my searches, I saw it implied somewhere that 2048 bits is a temporary limitation and it's eventually intended to be arbitrary precision, but let's leave that aside for now.) Thus, the Swift compiler internally supports integer literals up to 2048 bits long, but the standard library limits us to 64 bits of that. It'd be nice to do better than that.

I see a few different options here. There are probably others; if you think of something I've missed, please pipe up.

1. Conform `String` to `_BuiltinIntegerLiteralConvertible`

`String` could be conformed to `_BuiltinIntegerLiteralConvertible`. If you choose to make your `IntegerLiteralConvertible` use a String, then you'll receive a string of digits, and you'll parse those digits into your internal representation. I think we're even providing a string-to-integer conversion, so it would be fairly convenient.

This is probably the easiest option to implement, and it lends itself to becoming variable-width later. However, there are large and obvious downsides to this: strings are much larger than necessary and the conversion would be relatively slow. In practice, I think this would be a hack.

2. Use a tuple/array of `Int`s/`UInts`

Like `String`, this lends itself to becoming variable-width in a later version; unlike `String`, it's relatively compact. But also unlike `String`, it would be hard to get this working in Swift 3: tuples cannot be conformed to protocols like `_BuiltinIntegerLiteralConvertible`, and there's no conditional conformances to make `[Int]`/`[UInt]` conform to it either.

3. Introduce an `Int2048` type

Just as we wrap `Builtin.Int64` in an `Int64` type, so we could wrap `Builtin.Int2048` in a full-featured `Int2048` type. This would have the works: arithmetic operators, conversions, bitshifts, everything that `FixedWidthInteger` has to offer.

Though this would certainly work, and even be rather elegant, I don't think it's a good idea. 2048 bits is 256 bytes. It's *huge* compared to other integer types, and *huge* compared to what people usually need. I suspect that it would be an attractive nuisance: People would be drawn to use `Int2048` instead of writing, or finding a library implementing, the `Int128` or `UInt256` they actually needed. They would end up wasting hundreds of bytes of RAM, bus capacity, and cache every time they accessed one, and hundreds of operations every time they performed arithmetic.

In other words, let's not do that.

4. Introduce a `LargeIntegerLiteral` type as an alternative `IntegerLiteralType`.

`LargeIntegerLiteral` would be a struct or class that contained a `Builtin.Int2048`. (A class might be a good idea so that we wouldn't copy around 256 bytes of struct directly.) Its only constructor would be the hidden `_BuiltinIntegerLiteralConvertible` one, so only the compiler could construct one, and it would only support a minimal subset of `BinaryInteger`'s interface. Ideally, the fact that it has a 2048-bit limit would be an implementation detail, so we could change it later.

Rough sketch:

  public final class LargeIntegerLiteral: _BuiltinIntegerLiteralConvertible {
    private var _value: Builtin.Int2048
    
    // hidden from normal code; only available to the compiler
    public init(_builtinIntegerLiteral value: Builtin.Int2048) {
      _value = value
    }
    
    public var bitWidth: Int { return 2048 }
    public func word(at index: Int) -> UInt { ... }
    public var minimumSignedRepresentationBitWidth: Int { … }
    public var sign: Sign { ... }
  }

(There might need to be a separate `LargeUIntegerLiteral`‚ÄĒI haven't thought deeply about this.)

Usage might be something like (I've made no attempt to test or optimize this):

  extension DoubleWidth {
    init(integerLiteral value: LargeIntegerLiteral) {
¬†¬†¬†¬†¬†¬†// This precondition isn't quite right‚ÄĒit isn't taking into account whether Self is signed.
      precondition(value.minimumSignedRepresentationBitWidth <= bitWidth, "Integer literal out of range")
      
      // We want the unsigned variant even if we're currently signed.
      var bits = Magnitude()
      if value.sign == .minus {
        bits = ~bits
      }
      
      for i in 0 ..< countRepresentedWords {
        let word = value.word(at: i)

        bits &<<= word.bitWidth
        bits |= Magnitude(truncatingOrExtending: word)
      }
      
      self.init(bitPattern: bits)
    }
  }

Actually, I suspect this could be put into an extension on `FixedWidthInteger` and many types would never have to write it themselves.

(P.S. I didn't notice the changes to bitshifts when we were reviewing SE-0104. They are *excellent*.)

5. Rework `IntegerLiteralConvertible` to only use `LargeIntegerLiteral`

Well, `LargeIntegerLiteral` would probably be named `IntegerLiteral` in this plan.

Basically, this would follow the previous approach, except it would also eliminate the `_BuiltinIntegerLiteralConvertible` conformance on all other types. Thus, *all* `IntegerLiteralConvertible` types, both standard library and third-party, would be initialized from an `IntegerLiteral` instance:

  public protocol IntegerLiteralConvertible {
    init(integerLiteral value: IntegerLiteral)
  }

The cool thing about this is that it *massively* simplifies the standard library‚ÄĒit actually eliminates an associated type!‚ÄĒand doesn't privilege the standard library over other types. (Well, except that stdlib can reach into the `IntegerLiteral` and work on the `value` within.) The less cool thing is that the optimizer might have to re-learn how to elide `init(integerLiteral:)` calls on built-in types.

Recommendation

My preference is an eventual transition to #5 using a version of #4 as a stopgap.

The transitional design would look like this: We underscore `IntegerLiteralConvertible.IntegerLiteralType` and then default it to `IntegerLiteral`; then we also underscore the current version of `init(integerLiteral:)` and introduce a replacement that always takes an `IntegerLiteral`. We implement the new `IntegerLiteral` based initializers on the various conforming types, but probably don't use them that heavily yet. The compiler would generate the same code, except it would now call the underscored initializer.

  public protocol IntegerLiteralConvertible {
    // This is the public face of the protocol…
    init(integerLiteral value: IntegerLiteral)
    
    // …but things actually go through here.
    associatedtype _BuiltinIntegerLiteralType: _BuiltinIntegerLiteralConvertible = IntegerLiteral
    init(_integerLiteral value: _BuiltinIntegerLiteralType)
  }
  
  extension IntegerLiteralConvertible where _BuiltinIntegerLiteralType == IntegerLiteral {
    public init(_integerLiteral value: _BuiltinIntegerLiteralType) {
      self.init(integerLiteral: value)
    }
  }

This would force third-party types to use `IntegerLiteral`, but allow standard library types (and the compiler) to continue using the existing, but slightly renamed, `_BuiltinIntegerLiteralConvertible` conformances instead. Eventually‚ÄĒpossibly after Swift 3‚ÄĒwe would eliminate the other `_BuiltinIntegerLiteralConvertible` conformances, at which point we can switch the compiler to using the public protocol instead, eliminate the underscored parts of `IntegerLiteralConvertible`, and get on with our lives.

···

--
Brent Royal-Gordon
Architechies


(Brent Royal-Gordon) #2

Tangentially, I now realize that this was totally backwards. Should be more like:

      for i in 0 ..< countRepresentedWords {
        let word = value.word(at: i)
        let shift = i * word.bitWidth
        bits |= Magnitude(truncatingOrExtending: word) &<< shift
      }

Always fun when your dumb mistakes get sent to large numbers of people.

···

On Jul 3, 2016, at 9:20 PM, Brent Royal-Gordon <brent@architechies.com> wrote:

      for i in 0 ..< countRepresentedWords {
        let word = value.word(at: i)

        bits &<<= word.bitWidth
        bits |= Magnitude(truncatingOrExtending: word)
      }

--
Brent Royal-Gordon
Architechies


(Dmitri Gribenko) #3

My recommended approach is to follow the example of DictionaryLiteral,
and define a "transport" data type that can accurately capture the
literal contents. The transport data type should not have any
operations beyond the bare minimum (not even the BinaryInteger
conformance), so that the users are not tempted to use it as a
big-integer type.

The IntegerLiteral type would probably just wrap an
UnsafeBufferPointer to the readonly data segment, where the compiler
would lay out the literal. (Very much like StaticString. Maybe it
should be renamed to StringLiteral?)

I would not recommend wrapping a Builtin.Int2048 by value, this leads
to very inefficient code. When it comes from the standard library,
this code is currently cleaned up by the optimizer because everything
that touches Builtin.Int2048 is marked @_transparent; but ordinary
frameworks can't do that.

A related issue is that the generic entry point that accepts
Builtin.Int2048 is still emitted by the compiler, and it is actually
called from unspecialized generic code when creating literals of
generic types.

func f<T : Integer>() -> T{
  let x: T = 0
  return x
}

It would be great if we introduced another entry point in
IntegerLiteralConvertible that accepted [U]Int64 (and/or [U]Int128),
which the compiler would prefer if the literal is small (and it
usually is).

Dmitri

···

On Sun, Jul 3, 2016 at 9:20 PM, Brent Royal-Gordon via swift-evolution <swift-evolution@swift.org> wrote:

2. Use a tuple/array of `Int`s/`UInts`

Like `String`, this lends itself to becoming variable-width in a later version; unlike `String`, it's relatively compact. But also unlike `String`, it would be hard to get this working in Swift 3: tuples cannot be conformed to protocols like `_BuiltinIntegerLiteralConvertible`, and there's no conditional conformances to make `[Int]`/`[UInt]` conform to it either.

--
main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
(j){printf("%d\n",i);}}} /*Dmitri Gribenko <gribozavr@gmail.com>*/