# Best way to break a integer into powers of 2

For example, if I have an `Int8` value `0b00010010`, I would like to get an `[Int8]` of `0b10000` and `0b10`. For now, I can only come up with an iterative approach using `trailingZeroBitCount` and bit-shift operators, like this:

``````func twosPowers(in numberToBreakUp: Int32) -> [Int32] {
var partialResult: [Int32] = []

var unsignedCopy = UInt32(bitPattern: numberToBreakUp)
var currentTrailingZeroCount = unsignedCopy.trailingZeroBitCount
var accumulatedTrailingZeroCount = currentTrailingZeroCount

while accumulatedTrailingZeroCount < 32 {
partialResult.append(1 << accumulatedTrailingZeroCount)

unsignedCopy >>= (currentTrailingZeroCount + 1)
currentTrailingZeroCount = unsignedCopy.trailingZeroBitCount
accumulatedTrailingZeroCount += currentTrailingZeroCount + 1
}

return partialResult
}
``````

Is there a better way of doing it?

Also, my use case for this is breaking an `OptionSet` into individual options, is there a better way for it than calling the above function on its raw value:

``````let individualOptions = twosPowers(in someOptions.rawValue).map { SomeOptions(rawValue: \$0) }
``````

That depends on what you mean by â€śbetterâ€ťâ€”and what you actually want to achieve with this.

If you just want a shorter implementation:

``````func twosPowers<T: BinaryInteger>(_ n: T) -> [T] {
(0 ..< n.bitWidth).reversed().compactMap { i in
let x = n & (1 << i)
return (x == 0) ? nil : x
}
}
``````

In fact, if youâ€™re okay having the result start with the low bits instead of the high, you can remove the `.reversed()` call.

â€˘ â€˘ â€˘

If you think itâ€™ll be faster to skip over the zero-bits, an iterative approach works. But Iâ€™d suggest benchmarking to be sure, because the simpler version above might be fine:

``````func twosPowers<T: BinaryInteger>(_ n: T) -> [T] {
var n = n
var result: [T] = []

while n != 0 {
let b: T = 1 << n.trailingZeroBitCount
result.append(b)
n ^= b
}

return result.reversed()
}
``````

â€¦where again the `.reversed()` call may be removed if you like.

â€˘ â€˘ â€˘

On the other hand, if you donâ€™t actually need an `Array` per se, and just want to be able to iterate over the bits, then a custom collection will work. Itâ€™s rather more code than the above, but it avoids allocating an array:

``````struct BitSequence<T: FixedWidthInteger> {
var bits: T
init(_ bits: T) { self.bits = bits }
}

extension BitSequence: BidirectionalCollection {
var startIndex: Int { bits.trailingZeroBitCount }
var endIndex: Int { T.bitWidth }
var count: Int { bits.nonzeroBitCount }
var isEmpty: Bool { bits == 0 }

subscript(position: Int) -> T { 1 << position }

func index(after i: Int) -> Int {
let b: T = 1 << i
let highMask = ~b ^ (b &- 1)
}

func index(before i: Int) -> Int {
let b: T = 1 << i
let lowMask = b &- 1
return T.bitWidth - 1 - z
}
}
``````

This starts with the low bits, so its order is the reverse of the previous examples. Youâ€™re free to change that if you prefer.

4 Likes

Thanks for the 3 suggestions and sorry for the late response!

Right. I should've been more clear about it. What I was mainly looking for was something more "efficient", like a solution that does not iterate through the bits of an integer, although I feel that might only be possible via compiler intrinsics or assemblies. But at the same time, I was also looking for any improvement to my algorithm in general, and I think the 3 algorithms you suggested are all better than mine in some ways.

It's not highlighted in your post, but I want to point out that by not right-shifting the number, you were able to make the algorithm generic. I think it's a great improvement.

What I was mainly looking for was something more "efficient", like a solution that does not iterate through the bits of an integer, although I feel that might only be possible via compiler intrinsics or assemblies

You could optimize it by checking if the upper 50% or 75% of the bits are all zeroes (using a bit mask and conditional). A lot of 64-bit integers rarely exceed 4 billion, so you could double or quadruple performance. Other than that, it's fundamentally impossible to optimize it further on the ALU side without implementing @Nevin's advice. If there was a special instruction (there probably might be on x86), then the overhead of creating an algorithm for it might outweigh the benefits of a simpler algorithm. Also, it wouldn't work on ARM, which has a very limited instruction set.

The biggest performance concern with this is memory allocation-bound, not ALU-bound, because you're creating an array. I do have a few tips for optimizing that. First, call `reserveCapacity` on the array by expanding it to the value of `.nonZeroBitCount`. Second, you could use the `unsafeUninitializedCapacity` array initializer to eliminate an extra function call.

It's hard to accurately profile because the problem size is so small. The time resolution of `clock_gettime_nsec_np(CLOCK_UPTIME_RAW)` is roughly 1 microsecond, enough for around 10 function calls. In my opinion, it isn't worth optimizing this unless you're facing a serious bottleneck in a larger program. Or, if you're doing it just for the sake of doing it, then go ahead!

Shifting and counting are for chumps:

``````struct PowersOfTwo<T: FixedWidthInteger>: Sequence, IteratorProtocol {

internal var remaining: T

mutating func next() -> T? {
guard remaining != 0 else { return nil }
let lsb = remaining & ~(remaining &- 1)
remaining &= ~lsb
return lsb
}

init(summingTo value: T) { remaining = value }
}

for v in PowersOfTwo(summingTo: 0b1100_0101) {
print(v)
}
// 1
// 4
// 64
// 128
``````
20 Likes

if `.trailingZeroBitCount` is fast:

``````extension FixedWidthInteger {
func blah() -> [Int] {
var result = [Int]()
result.reserveCapacity(nonzeroBitCount)
var n = self
while n != 0 {
let p: Self = 1 << n.trailingZeroBitCount
result.append(Int(p))
n = n ^ p
}
return result
}
}
``````

I think result should be `Int`, not type of the original integer value.

The best way to make this efficient is to never create an array. Make a collection that just stores the `Int`. Which is (almost) what @scanon did

Is it possible to learn this power?

If I need to use the result for multiple times, would it be more efficient to create an array (or set or other containers) and store it instead of repeatedly calculating the result? I think I should benchmark it to know.

For my use case, I need the original type, because I'm splitting an `OptionSet` value into individual options.

Computing the next result is about as efficient as loading it from memory (and possibly more efficient, if bounds checks fail to be eliminated); it requires only two or three arithmetic instructions to get the result, and one more to update the sequence (or vice-versa, depending on what the compiler decides to do). Even if you need to iterate through the bits repeatedly, recalculating will be competitive in this case.

Not from the Swift forums (at least not quickly from the Swift forums). Hacker's Delight by Henry Warren is a good source for this type of trickery.

15 Likes

Bit Twiddling Hacks is another great resource.

I use one of their algorithms to check if a buffer contains a particular byte (e.g. to check if some UTF-8 buffer contains a particular ASCII character). It can process chunks of 8 bytes in just a handful of instructions:

``````        not     rbx
and     rbx, r9
xor     rcx, r10