Adding isPowerOf2 to BinaryInteger

dingobye · May 4, 2019, 12:02am

Thank you for the feedback, Steve!

Agree that popcount can be better for bignums (with a large amount of bits representation). However, it may introduce the following problems:

[1] For some hardwares without the popcnt like instructions, it probably ends up counting the 1-bits naively. It would be an overkill, especially for bignums.

[2] The popcount intrinsic requires the number of bits determined at compile time. As a result, it would narrow down the scope of the functionality from BinaryInteger to FixedWidthInteger.

dingobye · May 4, 2019, 12:12am

Thanks for your inputs, Nevin~

If we use the short pass / long pass implementation as shown below, querying for a constant 2 can be as efficient as isPowerOf2, on condition that the type Self is obvious enough (e.g., Int, Uint) to compiler to eliminate the other == 2 check. It may be challenging when Self is some complicated types.

    @inlinable  
    public func isPower(of other: Self) -> Bool {
        if other == 2 {
           // short path
           return self > 0 && self & (self - 1) == 0
        }
        // long path
        ...
    }

idrougge · May 7, 2019, 12:58am

If you define isPowerOf2 instead of isPower(of:), you can define it as a computed property, since it has little cost and doesn't require any arguments. Defining it on non-integers may not have as much value, either, at least not for the use cases I envision.

dingobye · May 8, 2019, 9:01am

I have done some experiments to compare the performance of different strategies, including isPowerOf2 (the straight forward solution), isPowerOf2_ctpop (the popcount implementation), and isPower(of:) (the more general interface). The table below shows the execution time (in sec) for type Int under the -O optimization level across a range of devices.

.	iMac (3.8G Core i5)	Macbook Pro (2.9G Core i7)	iPhone X Max	iPhone X	iPhone 6S	iPad Air (A1474)
isPowerOf2	0.083	0.098	0.121	0.148	0.194	0.305
isPowerOf2_ctpop	0.135	0.154	0.121	0.148	0.222	0.343
isPower(of: 2)	0.082	0.097	0.121	0.148	0.193	0.305
isPower(of: 3)	1.330	1.472	1.442	1.603	3.102	3.775

It can be seen that:

The popcount implementation can be as efficient as isPowerOf2 on some latest devices (e.g., iPhone X and iPhone X Max), but overall is less performant.
In addition, isPower(of: 2) (i.e., the slow path of the general interface isPower(of:)) is as efficient as isPowerOf2.
Moreover, the complexity of isPower(of: non2) is O(log(self) with base non2), and it can be an order of magnitude slower than the other three O(1) scenarios. I think it is acceptable because checking for powers of non-2 bases is much less commonly used, and there isn't any known efficient algorithm.

The results for other built-in integer types have a similar pattern to this one, so they are not given in this table. The code of such experiment is shown below:

import Foundation
extension BinaryInteger {
    // Proposed solution
    @inlinable public var isPowerOf2: Bool {
        return self > 0 && self & (self - 1) == 0
    }

    // Alternative solution #1
    //@{
    @inlinable public func isPower(of other: Self) -> Bool {
        if other == 2 { return self.isPowerOf2 } // fast path
        return self._isPower(of: other)          // slow path
    }
    @usableFromInline internal func _isPower(of other: Self) -> Bool {
       // If self is 1, then it is always true.
       if self == 1 { return true }
       // Now self cannot be 1, so it is false when other is 1.
       if other == 1 { return false }
       // If other is 0, then self must be 0; if other is -1, then self must be -1.
       if (other == 0) || (other < 0 && other + 1 == 0) { return self == other }

       // At this point, we have abs(other) > 1.
       guard self % other == 0 else { return false }
       let bound = self / other
       var x: Self = 1
       while (x > 0 ? x : 0 - x) < (bound > 0 ? bound : 0 - bound) { x *= other }
       return x == bound
    }
    //@}
}


extension FixedWidthInteger {
    // Alternative solution #2
    @inlinable public var isPowerOf2_ctpop: Bool {
        return self > 0 && self.nonzeroBitCount == 1
    }
}


func timing(title: String, op: ()->()) {
    let t1 = CFAbsoluteTimeGetCurrent() // <------ Start timing
    op()
    let t2 = CFAbsoluteTimeGetCurrent() // <------ End timing
    let d = String(format: "%.3f", t2 - t1)
    print("\(title):  execution time is \(d) s.")
}


func measure<T: FixedWidthInteger>(_ :T.Type, _ repeating: Int = 100_000_000) {
    timing(title: "\(T.self).isPowerOf2      ", op: {
        var n: T = T(1) << (T.bitWidth - 2)
        for _ in 0..<repeating {
            if n.isPowerOf2 {
                n += 1
            } else {
                n -= 1
            }
        }
        guard n != 0 else { exit(-1) }
    })
    timing(title: "\(T.self).isPowerOf2_ctpop", op: {
        var n: T = T(1) << (T.bitWidth - 2)
        for _ in 0..<repeating {
            if n.isPowerOf2_ctpop {
                n += 1
            } else {
                n -= 1
            }
        }
        guard n != 0 else { exit(-1) }
    })
    timing(title: "\(T.self).isPower(of: 2)  ", op: {
        var n: T = T(1) << (T.bitWidth - 2)
        for _ in 0..<repeating {
            if n.isPower(of: 2) {
                n += 1
            } else {
                n -= 1
            }
        }
        guard n != 0 else { exit(-1) }
    })
    timing(title: "\(T.self).isPower(of: 3)  ", op: {
        var n: T = T(1) << (T.bitWidth - 2)
        for _ in 0..<repeating {
            if n.isPower(of: 3) {
                n += 1
            } else {
                n -= 1
            }
        }
        guard n != 0 else { exit(-1) }
    })
}


measure(UInt8.self)
measure(UInt16.self)
measure(UInt32.self)
measure(UInt64.self)
measure(Int8.self)
measure(Int16.self)
measure(Int32.self)
measure(Int64.self)

dingobye · May 8, 2019, 9:05am

Thanks for your feedback, Iggy~

Agree that when it comes to the isPowerOf2 form, we'd go with a computed property. I think isPower(of:) appears to have better consistency with the existing isMultiple(of:), though.

dingobye · May 8, 2019, 9:09am

Hi Steve, I did some experiments regarding the performance. The result shows popcount == 1 can be as efficient as n & (n - 1) == 0 at some latest iPhone devices, even for small fixed-width integers.

dingobye · May 8, 2019, 9:14am

Hi Nevin, according to the experiments, isPower(of: 2) is as performant as isPowerOf2 for those built-in integer types. I think it is because the compiler does a good job eliminating the other == 2 check in isPower(of:), and thus generates code identical to isPowerOf2.

Nevin · May 8, 2019, 11:26am

Great, thanks for testing it out!

I think it is clearer to write this as:

while x.magnitude < bound.magnitude { x *= other }

• • •

Somewhat off-topic, but on platforms where popCount is fast, remainders modulo certain integers (including most small integers) can be calculated quickly as well. For example:

extension UInt64 {
  func mod3() -> UInt64 {
    let mask: UInt64 = 0xAAAA_AAAA_AAAA_AAAA  // bits: 1010...
    let p1 = self.nonzeroBitCount
    let p2 = (self & mask).nonzeroBitCount
    let p = UInt64(p1 &+ p2)                  // max 96, then 7
    return (p < 3) ? p : (p &- 3).mod3()      // max 3 recursions
  }
}

dingobye · May 8, 2019, 1:36pm

Thanks for these tips, Nevin!

magnitude looks much better, and it boosts performance to some extent as well. As for the remainders modulo algorithm, it took me some time to figure it out how it works; and yes, it is brilliant to use the fact (4 * x + y) % 3 == (x + y) % 3 to quickly converge the recursion.

michelf · May 8, 2019, 3:13pm

If your base is itself a power of two—such as in isPower(of: 4) and isPower(of: 8)—I believe you can first check for isPowerOf2 and then bit-mask with a pattern of the form 0b...1010101010101 (for base 4) or 0b...1001001001001 (for base 8), etc. I'm not too sure how to create the pattern without a loop, but with a constant base the pattern could end up optimized to a compile-time constant.

Nevin · May 8, 2019, 3:23pm

Something like this?

if base.isPower(of: 2) {
  if !self.isPower(of: 2) { return false }
  let baseZeros = base.trailingZeroBitCount
  return trailingZeroBitCount.isMultiple(of: baseZeros)
}

scanon · May 8, 2019, 4:14pm

Huh? Where are you getting this from? The complexity of testing if an integer is a perfect power is something like O(n log^2 n), where n is the number of bits in self (i.e. essentially O(log(self))). I'm not doing the full analysis right now, but basically you can estimate the exponent it would have if it were a perfect power very quickly (count the bits in self and other, divide), then compute that power via repeated squaring (dominated by the cost of the final multiplication, so instead O(n^2 some_log_nonsense) if you use naive multiplication), and test for equality, which is O(n).

For small fixed-width integers, you can do better, but in those cases it's O(1) anyway.

Apologies for the handwavy analysis, but the complexity of this operation certainly is not exponential. There may be an even better option than the algorithm I sketched out above, but I would need to check in my copy of Computer Arithmetic.

erikstrottmann · May 8, 2019, 5:13pm

I have no comment about the implementation, but if we end up exposing a power of two computed property, the spelling isPowerOfTwo seems more consistent with existing names than isPowerOf2. Compare BinaryFloatingPoint property trailingZeroBitCount and FloatingPoint properties isZero and ulpOfOne, all of which spell out the number instead of using a numeral.

michelf · May 8, 2019, 8:22pm

Yes! That works and is easier to code and understand than my idea of a bit mask.

dingobye · May 8, 2019, 11:44pm

Sorry for the misleading, Steve! I meant log(self) with base non2, however failed to manage writing non2 as a subscript.

dingobye · May 9, 2019, 12:03am

Thanks for such further optimization, Michel!

I am thinking of if we should also have a dedicated optimization when base is 10, which may be a bit more commonly used. However, I am not aware of any efficient implementation in this case.

dingobye · May 9, 2019, 12:06am

Hi Erik, yes it fits the swift naming convention. Thank you so much for pointing this out!

michelf · May 9, 2019, 10:47am

Well, me neither, but let's explore a bit...

Looking at the bit pattern, I think there's one precheck you could use to fail early when checking for base 10:

var perhapsIsPowerOf10: Bool {
  let exponent = trailingZeroBitCount // see details below
  return (self >> exponent) & 0b1111 == ((exponent << 2) | 0b01) & 0b1111
}

Details

Some observations for the binary pattern for powers of 10 (separated into columns below):

Column 1 is some "gibberish"
Column 2 is always the last two bits of the exponent
Column 3 is always 01
Column 4 is the trailing zeros, always in equal number to the exponent value

10^0                                                 1
10^1                                              1_01_0
10^2                                           1_10_01_00
10^3                                         111_11_01_000
10^4                                      100111_00_01_0000
10^5                                    11000011_01_01_00000
10^6                                  1111010000_10_01_000000
10^7                               1001100010010_11_01_0000000
10^8                             101111101011110_00_01_00000000
10^9                           11101110011010110_01_01_000000000
10^10                       10010101000000101111_10_01_0000000000
10^11                     1011101001000011101101_11_01_00000000000
10^12                   111010001101010010100101_00_01_000000000000
10^13                100100011000010011100111001_01_01_0000000000000
10^14              10110101111001100010000011110_10_01_00000000000000
10^15            1110001101011111101010010011000_11_01_000000000000000
10^16         1000111000011011110010011011111100_00_01_0000000000000000
10^17       101100011010001010111100001011101100_01_01_00000000000000000
10^18     11011110000010110110101100111010011101_10_01_000000000000000000

So the algorithm is to look at the number of trailing zeros to get the exponent. Then we make sure column 2 matches the last two bits of the exponent and column 3 matches 0b01. Since column 1 is left unchecked, we don't know for sure we have a power of 10.

But do we have a fast way to compute a power of 10? Since we can easily find the exponent from the bit pattern we could just compute 10^exponent and that'd give us an exact result to check for. That'd be better than just knowing you may have a power of 10.

There aren't that many powers of 10 that can fit in 64 bits anyway. Let's make a lookup table with 20 entries that'll give us a very quick check for powers of 10 in the 64-bit range. Some other algorithm can be used above that.

Full implementation (fast path up to 64-bit)

var isPowerOf10: Bool {
	let exponent = trailingZeroBitCount
	switch exponent {
	case 0:  return self == 1 as UInt8
	case 1:  return self == 10 as UInt8
	case 2:  return self == 100 as UInt8
	case 3:  return self == 1000 as UInt16
	case 4:  return self == 10000 as UInt16
	case 5:  return self == 100000 as UInt32
	case 6:  return self == 1000000 as UInt32
	case 7:  return self == 10000000 as UInt32
	case 8:  return self == 100000000 as UInt32
	case 9:  return self == 1000000000 as UInt32
	case 10: return self == 10000000000 as UInt64
	case 11: return self == 100000000000 as UInt64
	case 12: return self == 1000000000000 as UInt64
	case 13: return self == 10000000000000 as UInt64
	case 14: return self == 100000000000000 as UInt64
	case 15: return self == 1000000000000000 as UInt64
	case 16: return self == 10000000000000000 as UInt64
	case 17: return self == 100000000000000000 as UInt64
	case 18: return self == 1000000000000000000 as UInt64
	case 19: return self == 10000000000000000000 as UInt64
	default:
		// if this is 64-bit or less we can't have a higher power of 10
		if self.bitWidth <= 64 { return false }
		// fail early it not matching the easy-to-check bit pattern
		if !self.perhapsIsPowerOf10 { return false }
		// now time for the slow path
		return self._isPower(of: 10)
	}
}

Note: Beyond 64 bits, the performance characteristics will be uneven.

Generalizing for all bases multiple of 10

You can also generalize the code above to other bases that are a multiple of 10 by adding a precheck of the base-10 exponent found in the bit pattern:

func isPower(ofMultipleOf10 base: Int) -> Bool {
	precondition(base.isMultiple(of: 10))
	let exponentOf10 = trailingZeroBitCount // exponent for base 10
	let exponentOfBase = base.trailingZeroBitCount
	if !exponentOf10.isMultiple(of: exponentOfBase) { return false }
	return self.isPowerOf10
}

jrose · May 9, 2019, 6:06pm

To generalize this further (for math fun), you're observing that 10^x = 2^x * 5^x, and that 5^x % 4 is always 1. So for any isPower(of:) argument, you should be able to do something like this:

func isPower(of base: Int) -> Bool {
  // some limitation on negative numbers I haven't thought through
  let trailingZeroMultiplier = base.trailingZeroBitCount
  guard trailingZeroMultiplier != 0 else {
    return self.isPowerSlow(of: base)
  }
  let exponent = self.trailingZeroBitCount / trailingZeroMultiplier
  return (self >> self.trailingZeroBitCount).isParticularPower(of: base, exponent: exponent)
}

dingobye · May 10, 2019, 1:15am

Interesting pyramid pattern!

For the switch statement, we test trailingZeroBitCount in the first place, and then check if self equals a particular 10^n (up to 64-bit). Can we get rid of trailingZeroBitCount to make it straight forward? It could be something like:

var isPowerOf10: Bool {
     if self == 1 as UInt8 ||
        self == 10 as UInt8 ||
        self == 100 as UInt8 ||
        ...
        self == 10000000000000000000 as UInt64 { return true }

	// if this is 64-bit or less we can't have a higher power of 10
	if self.bitWidth <= 64 { return false }
	// fail early it not matching the easy-to-check bit pattern
	if !self.perhapsIsPowerOf10 { return false }
	// now time for the slow path
	return self._isPower(of: 10)
}