I'm exploring Swift's performance in some basic situations involving operations on Collections and Sequences. As part of that, I've been benchmarking the difference between programming styles (using simple, contrived algorithms), e.g.:
Functional
testData.next
    .filter { 0 == $0 % 2 }
    .map { $0.byteSwapped }
    .filter { ($0 & 0xff00) >> 8 < $0 & 0xff }
    .map { $0.leadingZeroBitCount }
    .filter { Int.bitWidth - 8 >= $0 }
    .reduce(into: 0, &+=)
Imperative
var result = 0
for value in testData.next {
    if 0 == value % 2 {
        let value = value.byteSwapped
        if (value & 0xff00) >> 8 < value & 0xff {
            let value = value.leadingZeroBitCount
            if Int.bitWidth - 8 >= value {
                result &+= value
            }
        }
    }
}
Unsurprisingly (to me) the latter is many times faster than the former, because in a sense I've "manually unrolled" the former. The former - even compiled with -O or -Ounchecked - actually allocates each of the numerous intermediary collections. The Swift compiler currently seems to implement it in a very literal, prescriptive sense.
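To make "intermediary collections" concrete, here is a rough sketch of what the eager chain amounts to when each stage is spelled out: every filter and map returns a fresh Array. The input (`spelledOutInput`) and the function names are hypothetical stand-ins, since `testData.next` isn't shown here.

```swift
// Hypothetical input standing in for `testData.next` (assumes a 64-bit Int).
let spelledOutInput: [Int] = (0..<1_000).map { $0 &* 0x1234_5678_9ABC_DEF1 }

// The eager chain with each intermediate Array given a name.
func eagerSumSpelledOut(_ data: [Int]) -> Int {
    let evens: [Int] = data.filter { 0 == $0 % 2 }
    let swapped: [Int] = evens.map { $0.byteSwapped }
    let selected: [Int] = swapped.filter { ($0 & 0xff00) >> 8 < $0 & 0xff }
    let zeroCounts: [Int] = selected.map { $0.leadingZeroBitCount }
    let kept: [Int] = zeroCounts.filter { Int.bitWidth - 8 >= $0 }
    return kept.reduce(into: 0, &+=)
}

// The same pipeline written as one chain, as in the functional version.
func eagerSumChained(_ data: [Int]) -> Int {
    data
        .filter { 0 == $0 % 2 }
        .map { $0.byteSwapped }
        .filter { ($0 & 0xff00) >> 8 < $0 & 0xff }
        .map { $0.leadingZeroBitCount }
        .filter { Int.bitWidth - 8 >= $0 }
        .reduce(into: 0, &+=)
}
```

Both functions compute the same value; the first just names the five temporary arrays that the chained form creates implicitly.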
Now, if you make use of lazy sequences - just add .lazy after testData.next - then the compiler does in fact optimise it down to something very similar to the imperative version. Which frankly surprised me - I've never seen it do that before (perhaps all my prior cases were too complicated for the optimiser?). What surprises me even more is that the compiler's version performs better than my hand-unrolled version!
Amazing work compiler!
(although, why u no optimise my imperative version as good as your lazy functional version?!? )
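For reference, a minimal sketch of the lazy variant next to the imperative loop, so the two can be checked against each other. `lazyDemoInput`, `lazySum` and `imperativeSum` are hypothetical names, and the input is made up (it assumes a 64-bit Int); this says nothing about timings, only that the two styles agree on the result.

```swift
// Hypothetical input standing in for `testData.next` (assumes a 64-bit Int).
let lazyDemoInput: [Int] = (0..<1_000).map { $0 &* 0x1234_5678_9ABC_DEF1 }

// Functional style with `.lazy`: no stage materialises an Array.
func lazySum(_ data: [Int]) -> Int {
    data.lazy
        .filter { 0 == $0 % 2 }
        .map { $0.byteSwapped }
        .filter { ($0 & 0xff00) >> 8 < $0 & 0xff }
        .map { $0.leadingZeroBitCount }
        .filter { Int.bitWidth - 8 >= $0 }
        .reduce(into: 0, &+=)
}

// Imperative style, same logic as the loop in the post.
func imperativeSum(_ data: [Int]) -> Int {
    var result = 0
    for value in data {
        if 0 == value % 2 {
            let value = value.byteSwapped
            if (value & 0xff00) >> 8 < value & 0xff {
                let value = value.leadingZeroBitCount
                if Int.bitWidth - 8 >= value {
                    result &+= value
                }
            }
        }
    }
    return result
}
```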
The problem is, that innocuous lazy property has a lot of pitfalls (e.g. when it's not really lazy and in fact does duplicate work). So, is there some way to get the optimiser to do the same optimisations with the non-lazy version?
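One example of the duplicate-work pitfall, as a small sketch: a lazy map doesn't cache anything, so its closure runs again on every traversal, whereas the eager map runs it once per element up front. The function name and the call counters are made up for illustration.

```swift
// Counts how often the transform closure runs when the mapped
// result is traversed twice, for lazy vs. eager `map`.
func countTransformCalls() -> (lazy: Int, eager: Int) {
    let numbers = [1, 2, 3, 4]

    var lazyCalls = 0
    let lazyDoubled = numbers.lazy.map { value -> Int in
        lazyCalls += 1
        return value * 2
    }
    _ = lazyDoubled.reduce(0, +)  // first traversal runs the closure per element
    _ = lazyDoubled.reduce(0, +)  // second traversal runs it all over again

    var eagerCalls = 0
    let eagerDoubled = numbers.map { value -> Int in
        eagerCalls += 1
        return value * 2
    }
    _ = eagerDoubled.reduce(0, +)  // reads the stored Array
    _ = eagerDoubled.reduce(0, +)  // still no further closure calls

    return (lazyCalls, eagerCalls)
}
```

With four elements and two traversals, the lazy closure runs 8 times and the eager one 4 times.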
I'm currently using Swift 5.9.
When I look at the disassembly of the non-lazy version, the compiler has already inlined basically all the relevant Array methods, making it pretty close to locally reasonable (few if any hidden side effects). So one would think that the optimiser can then - in principle - "see through" all the intermediary cruft and optimise it away.