Implementing a Collection.removingIndices(_:)

You can do this algorithm in place. It turns out that the classic stable partition algorithm only visits each element once, before it is moved, so you can partition the elements based on their indices. That's probably true of the simpler half-stable partition as well, but I haven't checked.

Actually it looks like @vadian may have posted this answer above.

1 Like

Aren't these all O(N²) algorithms?

Yes indeed, in fact all of them have big trouble dealing with arrays with more than 1000 elements.
The only faster implementations are @nnnnnnnn version:

Iterative copying
extension Collection {
    func removingIndices<S: Sequence>(_ indicesToRemove: S) -> [Element]
        where S.Element == Index
    {
        var result: [Element] = []
        var low = startIndex
        for high in indicesToRemove {
            result.append(contentsOf: self[low..<high])
            low = index(after: high)
        }
        result.append(contentsOf: self[low...])
        return result
    }
}

and @vadian version:

It's all about partitions
extension RangeReplaceableCollection where Self: MutableCollection, Index == Int {

    mutating func remove(at indexes : IndexSet) {
        guard var i = indexes.first, i < count else { return }
        var j = index(after: i)
        var k = indexes.integerGreaterThan(i) ?? endIndex
        while j != endIndex {
            if k != j { swapAt(i, j); formIndex(after: &i) }
            else { k = indexes.integerGreaterThan(k) ?? endIndex }
            formIndex(after: &j)
        }
        removeSubrange(i...)
    }
}

@dabrahams the "Iterative copying" version is not O(n²), isn't it? If it is, then I don't understand how it is much faster than the other (bad) implementations.
Thank you for your amazing WWDC presentation :ok_hand:

Does the constraint ā€œMutableCollection & RangeReplaceableCollectionā€ provide sufficient guarantees of index stability for this?

Here's someone saying that the new method actually appears on the Standard Library documentation: RangeReplaceableCollection - #10 by mattrips

The specific addition that the documentation says to have been made is remove(atOffsets offsets: IndexSet). There also is a new method: applying(_ difference: CollectionDifference<Self.Element>) -> Self?. These changes were made in Catalina Beta 5.

The question raised on the other thread is whether these changes actually are changes to the Standard Library, as the documentation indicates, or whether they might be extensions made within the SwiftUI framework.

I can't find the implementation of the method in the stdlib code though (unless its yet to be committed publicly).

remove(atOffsets:) is implemented in SwiftUI as an extension. I don't know why the list page miscategorises it, but if you click through and look at the sidebar, it says 'Framework: SwiftUI'.

applying(_:) is correctly grouped in the standard library domain and was introduced in the collection diffing proposal.

I started a proposal for this a while ago but didn't have the time to take it forwards. If anyone want to take it on you are quite welcome: [Pre-pitch] removeAll(at:)

Actually, with a little more digging, it appears the remove(atOffsets:) method is implemented in the Foundation framework.

And, as you indicate, applying(_:) is anticipated to be included in the Standard Library, but has not yet been accepted.

I observe these bare facts, here. I'll post discussion in the other thread.

It was accepted.

When working with RRCs, you're constrained in what you can do — whenever you call a mutating RRC method you invalidate all existing indices (though some concrete collections, like array, won't have problems).

If you constrain to both MutableCollection and RRC, you can use swapAt (which doesn't invalidate indices) to rearrange as much as necessary, and then remove a single batch of elements at the end of your algorithm, once you don't need to access any other indices.

5 Likes

Right. Back in January it was accepted.

I've been referencing the most recent thread, which was a re-review on account of an amendment. The proposal itself is still listed as being in Active Review (June 19 - 25 2019).

Where is it guaranteed? I couldn't find it in the doc.

1 Like

That guarantee belongs to the protocol, but it doesn't appear to be documented explicitly. The closest we get is in the Conforming to the MutableCollection Protocol section of the protocol docs, but even that doesn't spell out that other indices must also be unaffected by changes to the value at a particular position.

1 Like

It could be heavily implied, but I think we should still be conservative on this.

We don't need to be conservative on this — it's core to the main function of the MutableCollection protocol. Without that guarantee, none of the in-place algorithms that MutableCollection is intended to enable (partition, reverse, sort, rotate, etc) would be possible. We just need improved docs!

7 Likes

I agree, that the discrepancy should be resolved. Either to remove the assumption that appeared else where, or include it into the doc.

The amendment was just to add an inverse. It doesn't affect the original proposal, which was accepted and implemented in 5.1.

Looking at this thread and the [Pre-pitch] removeAll(at:) thread, maybe we can generalize to:

protocol RangeReplaceableCollection {

    //...

    /// Removes all the elements whose indices satisfy the given predicate.
    ///
    /// Use this method to remove every element in a collection whose location
    /// meets a particular criteria. The order of the remaining elements is
    /// preserved. This example removes all the odd values from an
    /// array of numbers:
    ///
    ///     var numbers = [5, 6, 7, 8, 9, 10, 11]
    ///     numbers.removeAllAt(locations: { $0.isMultiple(of: 2) })
    ///     // numbers == [6, 8, 10]
    ///
    /// - Parameter shouldBeRemoved: A closure that takes an index of the
    ///   collection as its argument and returns a Boolean value indicating
    ///   whether the corresponding element should be removed from the
    ///   collection.
    ///
    /// - Complexity: O(*n*), where *n* is the length of the collection.
    mutating func removeAllAt(locations shouldBeRemoved: (Index) throws -> Bool) rethrows

}

Instead of using a Set, a hopefully sorted Sequence, or an arbitrary Sequence, we express the exclusion set as a predicate. Note that this is a customization point. There are two default implementations:

extension RangeReplaceableCollection {

    /// Removes all the elements whose indices satisfy the given predicate.
    ///
    /// Use this method to remove every element in a collection whose location
    /// meets a particular criteria. The order of the remaining elements is
    /// preserved. This example removes all the vowels from a string:
    ///
    ///     var phrase = "The rain in Spain stays mainly in the plain."
    ///
    ///     let vowels: Set<Character> = ["a", "e", "i", "o", "u"]
    ///     let vowelIndices = Set(phrase.indices.lazy.filter({ vowels.contains(phrase[$0]) }))
    ///     phrase.removeAllAt(locations: { vowelIndices.contains($0) })
    ///     // phrase == "Th rn n Spn stys mnly n th pln."
    ///
    /// - Parameter shouldBeRemoved: A closure that takes an index of the
    ///   collection as its argument and returns a Boolean value indicating
    ///   whether the corresponding element should be removed from the
    ///   collection.
    ///
    /// - Complexity: O(*n*), where *n* is the length of the collection.
    @inlinable public mutating func removeAllAt(locations shouldBeRemoved: (Self.Index) throws -> Bool) rethrows {
        var result = Self()
        result.reserveCapacity(count)
        for i in try indices.lazy.filter({ try !shouldBeRemoved($0) }) {
            result.append(self[i])
        }
        self = result
    }

}

extension RangeReplaceableCollection where Self: MutableCollection {

    /// Removes all the elements whose indices satisfy the given predicate.
    ///
    /// Use this method to remove every element in a collection whose location
    /// meets a particular criteria. The order of the remaining elements is
    /// preserved. This example removes all the odd values from an
    /// array of numbers:
    ///
    ///     var numbers = [5, 6, 7, 8, 9, 10, 11]
    ///     numbers.removeAllAt(locations: { $0.isMultiple(of: 2) })
    ///     // numbers == [6, 8, 10]
    ///
    /// - Parameter shouldBeRemoved: A closure that takes an index of the
    ///   collection as its argument and returns a Boolean value indicating
    ///   whether the corresponding element should be removed from the
    ///   collection.
    ///
    /// - Complexity: O(*n*), where *n* is the length of the collection.
    @inlinable public mutating func removeAllAt(locations shouldBeRemoved: (Self.Index) throws -> Bool) rethrows {
        var latest = startIndex
        for i in try indices.lazy.filter({ try !shouldBeRemoved($0) }) {
            self[latest] = self[i]
            formIndex(after: &latest)
        }
        removeSubrange(latest...)
    }

}

We need a customization point so collections that are not MutableCollection but (at least internally) don't necessarily invalidate indices upon (the first) removal, or can keep track of any shifts for at least one call, can do so in optimized implementations. This is helpful for collections that store elements as collections of collections, where removing elements in one segment doesn't disturb the internal indices of elements in other segments.

1 Like