A novel binary integer abstraction

A novel binary integer abstraction

Hello, everybody! I'm working on a novel binary integer abstraction on GitHub.

I thought I'd share it in case some of you find it interesting. It unifies fixed-width and variable-width representations and brings recovery mechanisms to generic code. It's experimental, and I don't expect it to stabilize anytime soon, but it's still super neat in its current state. It's more or less at the point where I want to build something with it and iterate based on that experience.

I'm 5 months and (128,665 ++, 102,232 --) deep into it.

It features 8-bit to 64-bit integers, a generic un/signed double-width model, and a generic un/signed big-integer type. It also introduces a notion of infinity to ensure invariants like ~(~(x)) == x for all x.

The README.md is a work in progress, but it explains the new concepts.

Also, I have seen @wadetregaskis mention my predecessor project Numberick a few times, and I appreciate the positive commentary. Thanks! Having said that, I don't plan on maintaining the old project since the new abstraction is much more powerful.


Example: constant-foldable generic comparisons...
extension BinaryInteger {
    @inlinable public func compared<Other>(to other: Other) -> Signum where Other: BinaryInteger {
        if  let lhsSize = UX(size: Self.self), let rhsSize = UX(size: Other.self) {
            if  lhsSize < rhsSize {
                return Other(load: self).compared(to: other)
                
            }   else if lhsSize > rhsSize {
                return self.compared(to: Self(load: other))
                
            }   else if Self.isSigned, !Other.isSigned {
                return self .isNegative ? Signum.less : Other(load: self).compared(to: other)
                
            }   else if !Self.isSigned, Other.isSigned {
                return other.isNegative ? Signum.more : self.compared(to: Self(load:  other))
                
            }   else {
                return self.compared(to: Self(load: other))
            }
            
        }   else {
            //  note that the elements must be rebindable one way or the other
            if  Other.elementsCanBeRebound(to: Self.Element.Magnitude.self) {
                return self.withUnsafeBinaryIntegerElements { lhs in
                    (other).withUnsafeBinaryIntegerElements(as: Self.Element.Magnitude.self) { rhs in
                        DataInt.compare(
                            lhs: lhs, lhsIsSigned: Self .isSigned,
                            rhs: rhs, rhsIsSigned: Other.isSigned
                        )
                    }!
                }
                
            }   else {
                Swift.assert(Self.elementsCanBeRebound(to: Other.Element.Magnitude.self))
                return self.withUnsafeBinaryIntegerElements(as: Other.Element.Magnitude.self) { lhs in
                    (other).withUnsafeBinaryIntegerElements { rhs in
                        DataInt.compare(
                            lhs: lhs, lhsIsSigned: Self .isSigned,
                            rhs: rhs, rhsIsSigned: Other.isSigned
                        )
                    }
                }!
            }
        }
    }
}
Example: a fully generic and fully recoverable Fibonacci sequence...
@frozen public struct Fibonacci<Value> where Value: BinaryInteger {
    
    //=------------------------------------------------------------------------=
    // MARK: State
    //=------------------------------------------------------------------------=
    
    @usableFromInline var i: Value
    @usableFromInline var a: Value
    @usableFromInline var b: Value
    
    //=------------------------------------------------------------------------=
    // MARK: Initializers
    //=------------------------------------------------------------------------=
    
    /// Creates the first sequence pair.
    @inlinable public init() {
        self.i = 0
        self.a = 0
        self.b = 1
    }
    
    /// Creates the sequence pair at the given `index`.
    @inlinable public init(_ index: Value) throws {
        if  Bool(index.appendix) {
            throw Value.isSigned ? Failure.indexOutOfBounds : Failure.overflow
        }
        
        self.init()
        
        try index.withUnsafeBinaryIntegerElementsAsBytes {
            let x = $0.body.count(.nondescending(.zero))
            for i in (0 ..< x).reversed() {
                try self.double()
                
                if  $0.body[unchecked: i &>> 3] &>> U8(load: i) & 1 != 0 {
                    try self.increment()
                }
            }
        }
    }
    
    //=------------------------------------------------------------------------=
    // MARK: Accessors
    //=------------------------------------------------------------------------=
    
    /// The sequence `index`.
    @inlinable public var index: Value { 
        self.i
    }
    
    /// The sequence `element` at `index`.
    @inlinable public var element: Value {
        self.a
    }
    
    /// The sequence `element` at `index + 1`.
    @inlinable public var next: Value { 
        self.b
    }
    
    //=------------------------------------------------------------------------=
    // MARK: Transformations
    //=------------------------------------------------------------------------=
    
    /// Forms the sequence pair at `index + 1`.
    @inlinable public mutating func increment() throws {
        let ix = try i.plus(1).prune(Failure.overflow)
        let bx = try a.plus(b).prune(Failure.overflow)
        
        self.i = consume ix
        self.a = b
        self.b = consume bx
    }
    
    /// Forms the sequence pair at `index - 1`.
    @inlinable public mutating func decrement() throws {
        let ix = try i.minus(1).veto({ $0.isNegative }).prune(Failure.indexOutOfBounds)
        let ax = try b.minus(a).prune(Failure.overflow)
        
        self.i = consume ix
        self.b = a
        self.a = consume ax
    }
    
    //=------------------------------------------------------------------------=
    // MARK: Transformations
    //=------------------------------------------------------------------------=
    
    /// Forms the sequence pair at `index * 2`.
    @inlinable public mutating func double() throws {
        let ix = try i.times(2).prune(Failure.overflow)
        let ax = try b.times(2).minus(a).times (a).prune(Failure.overflow)
        let bx = try b.squared().plus(a.squared()).prune(Failure.overflow)

        self.i = consume ix
        self.a = consume ax
        self.b = consume bx
    }
    
    /// Forms the sequence pair at `index + x.index`.
    @inlinable public mutating func increment(by x: Self) throws {
        let ix = try i.plus (x.i).prune(Failure.overflow)
        let ax = try a.times(x.b).plus(b.minus(a).times(x.a)).prune(Failure.overflow)
        let bx = try b.times(x.b).plus(       (a).times(x.a)).prune(Failure.overflow)

        self.i = consume ix
        self.a = consume ax
        self.b = consume bx
    }
    
    /// Forms the sequence pair at `index - x.index`.
    @inlinable public mutating func decrement(by x: Self) throws {
        let ix = try i.minus(x.i).veto({ $0.isNegative }).prune(Failure.indexOutOfBounds)
        
        var a0 = b.times(x.a).value
        var a1 = a.times(x.b).value
        var b0 = b.plus(a).times(x.a).value
        var b1 = b.times(x.b).value
        
        if  Bool(x.i.leastSignificantBit) {
            Swift.swap(&a0, &a1)
            Swift.swap(&b0, &b1)
        }
        
        self.i = consume ix
        self.a = a1.minus(a0).value
        self.b = b1.minus(b0).value
    }
    
    //*========================================================================*
    // MARK: * Failure
    //*========================================================================*
    
    public enum Failure: Error {
        
        /// Tried to form a sequence pair that cannot be represented.
        case overflow
        
        /// Tried to form a sequence pair at an index less than zero.
        case indexOutOfBounds
    }
}
5 Likes

Looks nice! I created a PR with some typos fixed

1 Like

Neat! Apologies if I missed it, but I read through the readme and didn't see a direct performance comparison to the stdlib integers. Do you have benchmarks somewhere?

I understand many ops may constant fold, I'm curious how many do in practice. Additionally would a notion of const be helpful for ensuring constant folding anywhere in this library?

1 Like

Neat! Apologies if I missed it, but I read through the readme and didn't see a direct performance comparison to the stdlib integers. Do you have benchmarks somewhere?

I performed a lot of micro benchmarks in Numberick, but there was little payoff and nobody cared until quite recently when somebody noticed that this process had made the fib sequence quite speedy. Still, there has been no real payoff, so I've omitted benchmarking. Instead, I have focused on abstraction, complexity, and recovery in generic code. I would rather build something cool and measure the performance by proxy. I have compared the fib sequence, however, and it's a bit faster than it was in Numberick. So the transition from mutating methods to consuming methods seems to have gone well (± the occasional nonzero exit code).

//   MacBook Pro, 13-inch, M1, 2020, -O, code coverage disabled
try! Fibonacci<UXL>( 1_000_000) // 0.04s
try! Fibonacci<UXL>(10_000_000) // 1.65s from 2.30s in Numberick

I understand many ops may constant fold, I'm curious how many do in practice. Additionally would a notion of const be helpful for ensuring constant folding anywhere in this library?

In my experience, constants :tm: almost always fold if you follow the rules to the point that a function call can be specialized. At least, that's my experience looking at Swift.FixedWidthInteger assembly on Godbolt. Again, I don't know if they do in this project, because I have not looked at that level of detail. I suppose a compiler hint would be useful if it could guarantee that they are evaluated at compile time. I'm not sure how such a hint would work with the dynamic any stuff, which I rely on a lot for testing purposes, however.

2 Likes

Good performance is something people rarely comment on, even though it matters to them. And with relatively uncommon domains, like arbitrary-precision integers, people don't have a good intuition for what the performance should be, so they might just accept whatever they get.

I think Numberick may have also simply suffered from being ranked lower in searches than BigInt, in part down to simple things like the name not being as self-explanatory.

Also, your readme on Numberick does say that it's basically abandoned, so that's going to discourage people from using it. :slightly_smiling_face:

2 Likes

My point is that benchmarking is an unpaid time sink. I'd need food, not GitHub stars.

1 Like

Here's a cute measurement for those of you who like benchmarks:

MacBook Pro, 13-inch, M1, 2020, -O, code coverage disabled.
let fib1e6 = try! Fibonacci<UXL>(1_000_000)

let fib1e6r10 = fib1e6.element.description(as:     .decimal) // 0.924s (208988 digits)
let fib1e6r16 = fib1e6.element.description(as: .hexadecimal) // 0.002s (173561 digits)

try! UXL(fib1e6r10, as:     .decimal) // 0.040s (208988 digits)
try! UXL(fib1e6r16, as: .hexadecimal) // 0.002s (173561 digits)

This example is on the asymptotic side of things, and there might be better
algorithms for silly numbers, but silly examples are more fun. It also shows
that hexadecimal is king.

The conversion model uses the same non-generic and non-inlinable algorithms for all binary integers. So it's really optimized for size rather than performance, but it's still quite decent—as you can see.

The encoding strategy is more or less the algorithm I wrote for the Foundation project but generalized over the various radices. I would have proposed an arbitrary-precision decoding strategy as well if Swift.BinaryInteger did not make it so difficult. The problem is that it lacks recovery options (1) or a bitwise initializer (2). It is still possible to implement generic arbitrary-precision decoding, but it requires a Swift.BigInt model because of (1) and (2).

2 Likes