Adding isPowerOf2 to BinaryInteger

Thank you for the math fun, Jordan~ Didn't think this topic would have such a deep dive with so much fun.

1 Like

You can. I suspect it'll be slower though.

The idea behind switching on trailingZeroBitCount is to make it straightforward for the optimizer to use a lookup table instead of evaluating each condition one by one.

1 Like

Is trailingZeroBitCount guaranteed to have any particular behavior when called on 0?

I notice that it returns bitWidth for the standard library fixed-width integer types, but can we count on that?

• • •

Also, as much fun as it is to discuss potential implementation optimizations, I think we’ve reached a general agreement on isPower(of:).

Is this ready to move forward as a proposal?

Yes, the computed property trailingZeroBitCount calls cttz intrinsic, which defines the behaviour on 0 input.

Agree that for now, it is more important to determine the public interface, and make sure it includes sufficient information to optimize the most common use case (i.e., is power of 2). Any specific optimizations for inputs other than 2 can be done later on. Good to see that the isPower(of:) form receives a general favour from the community.

Let's wait for a bit longer (say till next week). If there is no strong negative feedback, I'd be happy to start a formal proposal then.

That does not answer my question. I am asking about future, unknown, potentially third-party, types which conform to BinaryInteger. Are they required to have any particular behavior for trailingZeroBitCount of zero?

Might a BigNum type return 0, or UInt.bitWidth, or some other value in this case?

How about a type conforming to FixedWidthInteger? Must it return Self.bitWidth? Is this documented anywhere?

1 Like

I don't see any explicit mention of it, but I don't think it makes sense to return anything else than bitWidth.

Note how BinaryInteger defines bitWidth as an instance property. Even a BigNum will have a bitWidth, although a variable one. If trailingZeroBitCount == bitWidth, then your value is zero; this holds even if bitWidth somehow happens to be zero.

1 Like

It is common practice to return either bitWidth or undef for zero input in other programming languages. For example, it is the former case for cttz in rust and the latter for __builtin_ctz in gcc.

Swift should definitely standardize its behaviour.

You guys are way far off into the weeds of implementation details here, and wildly out of scope for a pitch thread. You're putting a ton of work into a discussion that is easy to lose track of and just sort of let fall into history.

Please either file a bug or put up a PR to document the semantics of trailingZeroBitCount for zero (it should return bitWidth, like @dingobye suggests). If you put up a PR, please tag me and @nnnnnnnn to review and we'll get it landed.

Regarding the implementation details of a hypothetical isPower(of:), I suggest that one of you create a branch with a draft implementation that you can collaboratively iterate on for a week or two before you start writing a formal proposal. (You'll need to have a working implementation before the review anyway, and PRs against that implementation branch are a much better place to discuss these technical details).

7 Likes

Thanks for report this issue in SR-10657, Nevin!

1 Like

I created a PR and put some performance results in the comments below. We can discuss the implementation details there.

2 Likes

Non-implementation comment: do we expect specific types to have more efficient implementations? If not, we shouldn't bother adding this as a protocol requirement, just an extension method.

I expect that we can get optimal code out of a generic implementation, but it may require a little bit of optimizer work in LLVM (and that's part of why I'm nudging the proposers to draft an implementation, so we can see what the landscape really is).

2 Likes

I was thinking of this as well... I find isPower(of:) has much similarity to isMultiple(of:), which is a protocol requirement, so I'd keep the consistency by either making them both protocol requirements or extension methods. What do you think?

Consistency is not really a motivation for making this decision. If a default implementation can always be correct and efficient, then it doesn't need to be a protocol requirement. Whether or not isMultiple(of: ) actually needs to be a protocol requirement is its own question, but shouldn't significantly effect what you do here.

3 Likes

Well, if we want different implementations for BinaryInteger and FixedWidthInteger, then it needs to be a requirement. Or, at least, there needs to be a requirement involved, though it could be an underscored helper, eg. _isPowerOfTwo:

For example:

protocol BinaryInteger {
  ⋮
  public var _isPowerOfTwo: Bool { get }
}

extension BinaryInteger {
  func isPower(of base: Self) -> Bool {
    if base._isPowerOfTwo {
      guard _isPowerOfTwo else { return false }
      return trailingZeroBitCount.isMultiple(of: base.trailingZeroBitCount)
    }
    
    // Handle other bases
  }
  
  var _isPowerOfTwo: Bool {
    return (self != 0) && (self == (1 as Self) << trailingZeroBitCount)
  }
}

extension FixedWidthInteger {
  var _isPowerOfTwo: Bool {
    return (self != 0) && (self & (self &- 1) == 0)
  }
}
1 Like

note: this makes Int.min._isPowerOfTwo be true. You want self > 0 && self & (self - 1) == 0 instead, since no negative number can be a power of two.

(Please add T.min to your set of test cases :grinning:)

2 Likes

I think it makes sense to be protocol requirements, because BinaryInteger is such a fundamental protocol. The integer arithmetic, bitwise and comparison operators have unpredictable performance depending on how the particular integer type is implemented. It would be good to reserve the future flexibility, and let the users have a chance to optimize when necessary.

Let's take BigInt (defined in swift/test/Prototypes/BigInt.swift) as an example. It doesn't have a native 2's complement representation. So every time it performs a bitwise operation, it converts the operands(result) into(from) 2's complement representations before(after) actually doing the job. In addition, checking self > 0 can be more expensive than !self.isZero && !self.isNegative. As long as we know how BigInt is implemented, we can work out a better solution for such particular type.

I have done some experiments to test the _powerOfTwo scenarios. We focus on Int64, and BigInt with 1024- and 32768-bit configurations. Note that the workloads of BigInt and Int64 are different in repeated amounts, so do not compare numbers between those columns directly.

As we can see, with the knowledge of BigInt's implementation, we write _isPowerOfTwo_BigInt to operate on the type's native representation (rather than transfer to/from 2's complement representation). The optimization is significant.

. Int64 (workload repeated 10^3) BigInt-1024 (workload repeated 10^8) BigInt-32768 (workload repeated 10^8)
_isPowerOfTwo 0.081594 0.003154 0.004853
_isPowerOfTwo_ctpop 0.133068 N/A N/A
_isPowerOfTwo_cttz 0.149579 0.001840 0.002958
_isPowerOfTwo_BigInt N/A 0.000240 0.000327
_slowIsPower(of: 2) 1.250993 0.930400 70.041004
extension BinaryInteger {
  @inlinable @inline(__always)
  internal var _isPowerOfTwo: Bool {
    return self > 0 && self & (self - 1) == 0
  }

  @inlinable @inline(__always)
  internal var _isPowerOfTwo_cttz: Bool {
    return (self > 0) && (self == (1 as Self) << self.trailingZeroBitCount)
  }
}
extension FixedWidthInteger {
  @inlinable @inline(__always)
  public var _isPowerOfTwo_ctpop: Bool {
      return self > 0 && self.nonzeroBitCount == 1
  }
}

extension _BigInt {
  internal var _isPowerOfTwo_BigInt : Bool {
    guard !self.isZero && !self.isNegative else { return false }
    for i in 0..<(_data.count - 1) {
      if _data[i] != 0 { return false }
    }
    precondition(!_data.isEmpty, "A nonzero BigInt must have any element in _data.")
    let w = _data.last!
    precondition(w != 0, "_data has no trailing zero elements")
    return w & (w - 1) == 0
  }
}
2 Likes

Just did some experiments. The cttz version performs better on BigInt, and the self > 0 && self & (self - 1) == 0 version is better on built-in integers.

I think for small-bitwidth integer types (which is directly supported by hardware's instruction set), the self > 0 && self & (self - 1) == 0 version works better. However, when the bitwidth is fatter than those supported by hardware's instruction set, then it is unknown which version is better. As for this specific BigInt example, the cttz version prevails.

1 Like

After thinking more a while, I still prefer to have the proposed API declared as protocol requirement. Because the custom integer types are unpredictable. Some may sacrifice the efficiency of bitwise operations to ensure performance of arithmetic operation (e.g., BigInt), while some may be the other way round, depending on their specific trade-off strategies. As a result, it is so hard to provide a universally efficient default implementation, and we'd better make it a protocol requirement to reserve the future flexibility.

For example, let's consider the optimization for input base 2. We have tested three types of implementation on three different groups of integer types (details can be found here). The result is that three integer types have three different winners. In this case, I would like to use the classic version to be the default implementation, since, as part of stdlib, it works well with built-in integers.

  // classic version, winner on Swift built-in integer
  var _isPowerOfTwo: Bool {
    return self > 0 && self & (self - 1) == 0
  }
  // tzcount version, winner on BigInt
  var _isPowerOfTwo_cttz: Bool {
    return (self > 0) && (self == (1 as Self) << self.trailingZeroBitCount)
  }
  // popcount version, winner on DoubleWidth
  var _isPowerOfTwo_ctpop: Bool {
      return self > 0 && self.nonzeroBitCount == 1
  }

FWIW, I believe that you can provide a default implementation that is efficient for all integer types via the words property; a number is a power of two if and only if exactly one word is non-zero and that word is a power of two. This reduces the problem to a space where you only need to worry about efficiently implementing the operation on a single machine word, which we know how to do.

5 Likes