Yet another index/enumeration replacement thread

CTMacUser · March 1, 2019, 3:35am

Continuing the discussion from Add forEachWithIndex to Array:

I read the previous thread from another thread linked in the previous message. Should a solution, like:

extension Collection {

    /**
     Calls the given closure on each element index in the collection in the
     same order as a `for`-`in` loop.

     The two loops in the following example produce the same output:

         let numberWords = ["one", "two", "three"]
         numberWords.forEach { word in
             print(word)
         }
         /*
          Prints """
              one
              two
              three
              """
          */

         numberWords.forEachIndex {
             print(numberWords($0))
         }
         // Same as above

     Using the `forEachIndex` method is distinct from a `for`-`in` loop upon
     `indices` in several important ways:

     1. You cannot use a `break` or `continue` statement to exit the current
        call of the `body` closure or skip subsequent calls.
     2. Using the `return` statement in the `body` closure will exit only from
        the current call to `body`, not from any outer scope, and won't skip
        subsequent calls.
     3. Since there is no `Indices` object possibly holding a reference to
        `self`, elements may be mutated within the closure.

     - Parameter body: A closure that takes an index of the collection as a
       parameter.
     */
    public func forEachIndex(_ body: (Index) throws -> Void) rethrows {
        var i = startIndex
        while i < endIndex {
            try body(i)
            formIndex(after: &i)
        }
    }

}

be added to the Standard Library? I did check that I can change mutable collections through a closure.

amnykon · March 1, 2019, 2:18pm

if you want to loop based on indices, just use:

let numberWords = ["one", "two", "three"]

for i in numberWords.indices {
    print(numberWords[i])
}
//or
numberWords.indices.forEach {
    print(numberWords[$0])
}

or just use map{}

Iterating over the full array has a O(x) regardless of the loop method, the cost of copy on first mutation has a cost of O(x). You don't save much by reducing the cost of copying on first mutation, however iterating over numberWords.indices should do the trick if you are really conserned.

Nevin · March 1, 2019, 2:33pm

If you are mutating the elements, indices can be a performance trap. Many collections use an Indices type which retains the collection itself (the default indices type does this), thus mutating an element while using indices will force a copy of the entire collection instead of mutating in-place.

This came up previously in the Idea: mutatingForEach for Collections thread among other places, and it is explained in the documentation for indices.

Karl · March 1, 2019, 6:17pm

That's not really correct code, though. Mutating a collection invalidates all subsequent indexes. It will happen to work for some simple collections (e.g. Array), but could break for something more complex, like String.

With value semantics, it would be weird if a previously-created value of indices changed its contents due to a later change to the collection it was originally created from. The copying isn't the issue - the code was incorrect to begin with.

Nevin · March 1, 2019, 6:29pm

I’m pretty sure it is correct for MutableCollection, where changing an element does not affect index validity.

The point is you should write code like this:

var i = startIndex
while i != endIndex {
  self[i] = …
  formIndex(after: &i)
}

And not like this:

for i in indices {
  self[i] = …
}

Karl · March 1, 2019, 6:33pm

MutableCollection also allows things like replacing slices of collections, and there is no requirement that they have the same length, so you could easily invalidate later indexes.

But yes, the overall effect is that you shouldn't use .indices when mutating.

Nevin · March 1, 2019, 6:35pm

I think you’re thinking of RangeReplaceableCollection.

Karl · March 1, 2019, 6:59pm

Syntactically it works, but yes looking at the documentation it's beyond what MutableCollection guarantees to work.

MC makes some pretty strong guarantees about indexes in general. I suspect most MCs would use a POD type like Range for their Indexes and wouldn't copy anyway. We can't express that in the language today so it can't be a formal requirement, but I think it's more-or-less implied by what is there.

Nevin · March 1, 2019, 7:23pm

Erm, what?

I mean, specifically, what works syntactically?

The only code I have posted in this thread, is to recommend against mutating while iterating over indices, due to the documented potential performance pitfall.

First of all, Dictionary.Values conforms to MutableCollection but its indices retains the entire object, so there is at least one example in the standard library where this performance trap is realized.

Second, the MutableCollection documentation specifies: “A value stored into a subscript of a MutableCollection instance must subsequently be accessible at that same position.”

I can’t find it now, but I’m quite certain that a member of the core team has stated on-list (perhaps before the forums existed) that assigning to the basic subscript of a MutableCollection does not invalidate any indices.

Third, there was a discussion about exactly this subject, mutating while iterating over indices, which covered what we’re talking about today a year ago.

CTMacUser · March 2, 2019, 8:37am

Going back to the original post: so you agree that something like forEachIndex() is useful? Because the most obvious solution, enumerated(), is a correctness trap, and the next obvious, indices, is sort-of a correctness trap because it's a performance trap in its most common case (iterating to mutate).

CTMacUser · March 2, 2019, 8:51am

I made a mistake in the documentation for forEachIndex(); I needed to explain separately how the method differs from for-in loops and forEach(). I was going to paste it here, but I moved the code to a gist.

I added demonstration methods: mutating each element of a collection through a closure. It was inspired by reading related threads from the parent thread and seeing a request for an in-place map. The methods are called MutableCollection.remap. The name just adjusts "map" in the simplest way. As far as whether to use a parameter-self-mutating closure or a classic transformation closure, I chose both and used the same name as a pair of overloads. The self-mutating version uses forEachIndex and the classic version uses its sibling method.

Although it doesn't use forEachIndex, I added RangeReplaceableCollection.remapOrRemove for completeness. It was originally called remapAndFilter, but I don't include a predicate; whether or not the closure returns .some or .none is the test.

Should a proposal be all four methods? Or just forEachIndex, leaving in-place mapping for others?