We just had a discussion on a Slack about some unexpected behavior with lazy collections, given this small snippet of code:
let foo = (1...30).lazy.compactMap { test($0) }.prefix(4)
The intention here is that test is an expensive operation that may sometimes return nil, and we want to collect the first 4 non-nil results while minimizing the number of tests run. That isn't what happens, though. Let's make this bigger and more complete to see the steps:
func test(_ i: Int) -> Int? {
    print("running \(i)")
    return i % 2 == 0 ? i : nil
}
print("Step 1")
let foo = (1...30).lazy
print("Step 2")
let foo2 = foo.compactMap(test)
print("Step 3")
let foo3 = foo2.prefix(4)
print("Done")
foo3.forEach { print("output \($0)") }
Which produces this confusing output:
Step 1
Step 2
Step 3
running 1
running 2
running 3
running 4
running 5
running 6
running 7
running 8
running 9
running 10
running 1
running 2
Done
running 2
output 2
running 3
running 4
output 4
running 5
running 6
output 6
running 7
running 8
output 8
running 9
Notice how the call to prefix seems to do a lot of extra work. Ideally we'd see each "running" line exactly once, for 1 through 8, and nothing more. Why does this output look the way it does?
Well, the standard library defines a LazyMapCollection, which is a LazyMapSequence whose base sequence conforms to Collection, and then makes LazyMapCollection also conform to Collection. Therefore, because what we start with is a Collection (the ClosedRange 1...30), the lazy wrappers built on top of it are Collections too: foo2 in particular is a LazyMapCollection, not just a lazy Sequence.
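A quick way to check which types are involved is to print them. This is only a sketch: the inline closure is a stand-in for test (same even/odd rule, no printing), and the exact type names printed will vary with the Swift version, so none are listed here.

```swift
// Same chain as in the post, with a stand-in closure instead of `test`.
let foo = (1...30).lazy
let foo2 = foo.compactMap { (i: Int) -> Int? in i % 2 == 0 ? i : nil }
let foo3 = foo2.prefix(4)

// Capture the static types as strings; none of this runs the closure.
let fooType = String(describing: type(of: foo))
let foo2Type = String(describing: type(of: foo2))
let foo3Type = String(describing: type(of: foo3))

print("foo:  \(fooType)")
print("foo2: \(foo2Type)")
print("foo3: \(foo3Type)")
```

If the Collection version of prefix is the one being chosen, the type printed for foo3 should be a Slice wrapped around the type of foo2.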
And because of that, Swift chooses the Collection version of prefix, which makes a slice of the original collection, rather than the Sequence version of prefix (what we would have expected), which would just have an iterator that counts up to the max length. The extra "running" lines are due to making a slice subsequence.
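Until something changes in the standard library, one possible workaround is to call prefix from a context that only knows about Sequence, so the Collection overload is never a candidate. The sketch below assumes a Swift 5 toolchain, where the Sequence version of prefix returns a PrefixSequence. The names sequencePrefix and firstFourEvens are hypothetical helpers made up for this example, and the test closure records its inputs in an array instead of printing.

```swift
// Hypothetical helper, not part of the standard library. Inside a function
// that is generic over Sequence only, `prefix` resolves to the Sequence version.
func sequencePrefix<S: Sequence>(_ base: S, _ maxLength: Int) -> PrefixSequence<S> {
    return base.prefix(maxLength)
}

// Runs the example from the post and reports which inputs were tested.
func firstFourEvens() -> (results: [Int], tested: [Int]) {
    var tested: [Int] = []   // every value handed to `test`, in order

    // Stand-in for the expensive `test` function from the post.
    let test: (Int) -> Int? = { i in
        tested.append(i)
        return i % 2 == 0 ? i : nil
    }

    let lazyResults = (1...30).lazy.compactMap(test)
    let results = Array(sequencePrefix(lazyResults, 4))
    return (results, tested)
}

let outcome = firstFourEvens()
print("results: \(outcome.results)")
print("tested:  \(outcome.tested)")
```

With this version I'd expect the results to be 2, 4, 6, 8 and test to be called once each for 1 through 8, which is the behavior the original one-liner was hoping for.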
I can't think of a case where that would be the desirable behavior for code like this.
So I'd like to pitch an addition to the standard library:
extension LazyMapCollection {
    func prefix(_ maxLength: Int) -> PrefixSequence<Self> {
        return PrefixSequence(self, maxLength: maxLength)
    }
}
This would make LazyMapCollections use the Sequence behavior for prefix calls, rather than the Collection behavior.
Thanks to Tal Atlas for the original problem statement, and Olivier Halligon for the expanded code that makes the problem clear.