[Pitch] RangeReplaceableCollection removeAll(where:) should always use O(1) Space

There are two versions of RangeReplaceableCollection.removeAll(where:). The first requires MutableCollection conformance and uses O(1) space. The second one does not require MutableCollection conformance and requires O(N) space—potentially copying the entire collection and always discarding the original—but this fact is not documented.

We should obviously fix the documentation if we keep the O(N) space version, but in my opinion the factors that create a major performance difference may be too subtle here, especially given that the implementation of the O(N) space version is simply:

self = self.filter {!predicate($0)}

If you have a really large collection that is RangeReplaceable but not Mutable, you should be aware that the call to removeAll(where:) can double your instantaneous memory pressure, and writing it out as above certainly makes that apparent.

What do others think?

3 Likes

Some range replaceable collections (most importantly String), which are not mutable, would still be able to implement it far more efficiently, in-place, even though they cannot take advantage of the default implementation's mutable speed benefit.

It's not a speed benefit that I'm referring to. Although that surely comes along with not having to copy all the data into freshly allocated storage, the speed is still O(N) in both cases. The problem is the O(N) space cost.

There's no problem at all with having an implementation of replaceAll(where:) for String or StringProtocol but I question whether making it available on all RangeReplaceableCollections is wise. The implication is that there are some other generic algorithms requiring RangeReplaceableCollection but not MutableCollection that would use this algorithm in their implementation, and that these algorithms should properly have (and document) O(N) or O(1) space cost depending on the types of their generic parameters. Do we have any examples of such things?

Yeah, you're right, I meant space benefit (though, like you say, it's speed as well). It's likely most range-replaceable-but-not-mutable collections will be able to customize their implementation of the method to do it in-place without additional space.

Note that this is just a mutating removeAll(where:), it isn't documented as being done in-place necessarily, and IMO even without optimization has a clear readability benefit over the self = self.filter { !pred($0) } form. There are a number of algorithms already in the standard library that may, or may not, use O(n) space, depending on what other conformances the type has. And we don't, as far as I know, document space complexity anywhere.

Yeah, I guess that's true if you take those to have the “sausage model.” In fact, all of them will be able to do that, right? So maybe my real complaint is that we have a default implementation for this case that is super-inefficient, and violates what would otherwise be a consistent and reasonable space bound. I would strongly prefer to keep an O(1) space bound and tell anyone implementing such a collection that they must implement the method. If we believe the vast majority of such collections will conform to StringProtocol, we can (if we choose, to preserve ease-of-conformance) put a default implementation there that takes advantage of lower-level StringProtocol API to maintain the space bound.

Note that this is just a mutating removeAll(where:), it isn’t documented as being done in-place necessarily, and IMO even without optimization has a clear readability benefit over the self = self.filter { !pred($0) } form.

Agreed.

There are a number of algorithms already in the standard library that may, or may not, use O(n) space, depending on what other conformances the type has.

Which ones? I can't think of any.

And we don’t, as far as I know, document space complexity anywhere.

If we ever failed to do that where the bound isn't clearly deducible from the other semantics (i.e. sorted() obviously takes at least O(N) space because it has to return a copy of the elements arbitrarily permuted), that's an oversight. attn: @nnnnnnnn

2 Likes