LazyFilterCollection is not a Collection

(Continuing a discussion from
https://twitter.com/cocoaphony/status/729712476451971073.)

The current recommendation for converting from lazy back to strict is to use
the Array constructor:

    Array(xs.lazy.<chain>)

Because the Array constructor accesses count and then iterates over the
sequence, the entire chain is generally evaluated twice. This is a
surprising result. See the GitHub gist "Why is `double` called 5 times and
`filter` called 4 times? Shouldn't they both be called only twice for a
2-element array?" for an example.
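
A minimal sketch of the shape of that gist (the double function and the
counters here are illustrative, and the exact call counts depend on the
stdlib version):

    var mapCalls = 0
    var filterCalls = 0

    func double(x: Int) -> Int {
        mapCalls += 1
        return x * 2
    }

    let chain = [1, 2].lazy
        .map(double)
        .filter { _ in filterCalls += 1; return true }

    // Array(chain) walks the chain once to compute count and again to copy
    // the elements, so each closure runs more than once per source element.
    _ = Array(chain)
    print(mapCalls, filterCalls)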

While it is reasonable to argue that a map/filter chain should be pure, it
is much more surprising to require that it be cheap (a common use for lazy
is to avoid calling expensive operations that might not be needed). This
can be a particular issue if network fetches (even idempotent ones) are
included in the chain.

As Joe Groff notes, this is fixable with Array(AnySequence(xs.lazy....)),
but that is very undiscoverable and dependent on knowing the implementation
details of Array.init.
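
To illustrate the difference (the chain here is arbitrary), wrapping in
AnySequence hides the Collection conformance, so Array falls back to its
single-pass Sequence initializer:

    let xs = Array(1...100)
    let evens = xs.lazy.filter { $0 % 2 == 0 }

    // Collection path: Array computes count first and then copies, so the
    // predicate runs roughly twice per element of xs.
    let strict1 = Array(evens)

    // Sequence path: the predicate runs once per element, at the cost of
    // possible reallocation while appending.
    let strict2 = Array(AnySequence(evens))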

I believe the root cause of this is that LazyFilterCollection is not a
proper Collection. It violates the performance requirements.
CollectionType.count requires O(1) access if Index is a
RandomAccessIndexType. LazyFilterCollection.count is always at least O(N).
LFC also breaks several other O(1) performance requirements (endIndex for
instance). This breaks the performance promises of anything that operates
on a generic collection and receives an LFC. The problem here isn't
Array.init; Array.init is doing everything right. The problem is that LFC
doesn't uphold its side of the bargain.
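
A small illustration of the cost (the counter is mine, and the exact number
of predicate calls depends on the implementation): merely asking an LFC for
its count walks the entire base collection, running the predicate as it
goes.

    var predicateCalls = 0
    let filtered = Array(1...1000).lazy.filter { _ in
        predicateCalls += 1
        return true
    }

    _ = filtered.count   // produces no elements, but traverses the whole
                         // base and runs the predicate for each one
    print(predicateCalls)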

There are many ways and reasons that lazy filter chains are constructed. I
don't think there's any way for the stdlib to guess the best performance
trade-off for all of them, and optimizing for one breaks others. Given
that, I think it should stick to the documented performance promises. Since
LazyFilterCollection cannot meet the requirements of a Collection (even in
simple cases), a lazy filter must return a LazySequence. This breaks
pre-allocation optimizations, but it allows reasoning about performance in
ways that are not currently possible. We should keep our promises.

While not beautiful, it is possible to recover pre-allocation optimization
in a generic way by adding an initialCapacity option to
Array.init<SequenceType>. This would allow the caller to provide the best
initial allocation in cases where the size is externally known, or worth
computing with an explicit count.
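
A rough sketch of what that could look like (this initializer does not
exist in the stdlib; the name and spelling are purely illustrative):

    extension Array {
        /// Builds an Array from `sequence` in a single pass, using the
        /// caller-supplied capacity instead of asking the source for count.
        init<S: SequenceType where S.Generator.Element == Element>(
            _ sequence: S, initialCapacity: Int) {
            self.init()
            reserveCapacity(initialCapacity)
            appendContentsOf(sequence)
        }
    }

    // The caller supplies whatever estimate it has (or is willing to compute):
    let xs = Array(1...1000)
    let odds = Array(xs.lazy.filter { $0 % 2 == 1 }, initialCapacity: xs.count / 2)

Assuming appendContentsOf consults only underestimateCount for a plain
SequenceType, the chain is traversed once, and the reservation avoids
intermediate reallocations whenever the caller's estimate is close.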

-Rob

Hi Rob,

We don't have RandomAccessIndexType anymore.

Dmitri

···

On Mon, May 9, 2016 at 11:46 AM, Rob Napier <robnapier@gmail.com> wrote:

It violates the performance requirements.
CollectionType.count requires O(1) access if Index is a
RandomAccessIndexType.

--
main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
(j){printf("%d\n",i);}}} /*Dmitri Gribenko <gribozavr@gmail.com>*/

>> It violates the performance requirements.
>> CollectionType.count requires O(1) access if Index is a
>> RandomAccessIndexType.
>
> Hi Rob,
>
> We don't have RandomAccessIndexType anymore.

I don't understand how this addresses any of the original points.
LazyFilterIndex explicitly notes that it breaks the performance promises of
its protocol:

NOTE

The performance of advancing a LazyFilterIndex depends on how sparsely
the filtering predicate is satisfied, and may not offer the usual
performance given by models of ForwardIndexType.

And also LazyFilterCollection:

NOTE

The performance of advancing a LazyFilterIndex depends on how sparsely
the filtering predicate is satisfied, and may not offer the usual
performance given by models of ForwardIndexType. Be aware, therefore,
that general operations on LazyFilterCollection instances may not have
the documented complexity.

I believe there should be a very high performance bar before breaking a
protocol's promises. The fact that filter predicates must be both pure and
very fast is not obvious, and this can have a cascading effect of calling
filters an ever-increasing number of times. For example:

    var c1 = 0
    var c2 = 0
    var c3 = 0

    let xs = Array(1...1000).lazy
        .filter { _ in c1 += 1; return true }
        .filter { _ in c2 += 1; return true }
        .filter { _ in c3 += 1; return true }

    _ = Array(xs)
    print(c1, c2, c3, c1+c2+c3) // 7986 3996 2000 13982

I don't think the caller would expect that these filters would be called a
total of 14k times (rather than 3k).

This is rough, and I just tossed it into Xcode and hit "Test" (so it's very
possible that there's a mistake here due to oversimplifying), but I was
unable to find any combination of element count or number of filters
(including a single filter, and arrays of up to 1M elements) where the
current stdlib is faster than the sequence version:

    import XCTest

    extension LazySequenceType {
        @warn_unused_result
        public func seqFilter(
            predicate: (Elements.Generator.Element) -> Bool
            ) -> LazyFilterSequence<Elements> {
            // Always returns the sequence-based filter, even when the base
            // is a collection.
            return LazyFilterSequence(
                self.elements, whereElementsSatisfy: predicate)
        }
    }

    class testperfTests: XCTestCase {
        func testPerformanceSeq() {
            self.measureBlock {
                let xs = Array(1...1000).lazy
                    .seqFilter { _ in true }
                    .seqFilter { _ in true }
                    .seqFilter { _ in true }

                _ = Array(xs)
            }
        }

        func testPerformanceCollection() {
            self.measureBlock {
                let xs = Array(1...1000).lazy
                    .filter { _ in true }
                    .filter { _ in true }
                    .filter { _ in true }

                _ = Array(xs)
            }
        }
    }

seqFilter was reliably over an order of magnitude faster than filter. I
tested this with various filters (to demonstrate actually removing some
elements), on Mac and iOS, with between 1 and 3 filters, and with sizes
between 1000 and 1M. The non-collection was always faster in my tests.

I don't see this as a performance slam dunk that justifies breaking the
performance promises of CollectionType. Is there more to the story that I'm
missing?

-Rob

···

On Mon, May 9, 2016 at 4:16 PM, Dmitri Gribenko <gribozavr@gmail.com> wrote:

On Mon, May 9, 2016 at 11:46 AM, Rob Napier <robnapier@gmail.com> wrote:

Naively (and thinking about how this works on other platforms), I would expect Array(lazyFilterSequence) to iterate only once and take the hit of reallocation.

Russ
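
A minimal sketch of that single-pass approach (strictArray is an
illustrative name, not a stdlib function): iterate once, seed with
underestimateCount, and accept amortized reallocation as the array grows.

    func strictArray<S: SequenceType>(source: S) -> [S.Generator.Element] {
        var result: [S.Generator.Element] = []
        result.reserveCapacity(source.underestimateCount()) // free hint; typically 0 for lazy filters
        for element in source {                             // single traversal of the chain
            result.append(element)                          // amortized O(1), geometric growth
        }
        return result
    }

    let odds = strictArray(Array(1...1000).lazy.filter { $0 % 2 == 1 })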

···

On May 11, 2016, at 2:46 PM, Rob Napier via swift-dev <swift-dev@swift.org> wrote:

