- Pitch: Mutating forEach

I think we should add a mutating forEach method to MutableCollections to make it easier to modify them.

extension MutableCollection {
  public mutating func forEach(_ body: (inout Element) throws -> Void) rethrows {
    for index in self.indices {
      try body(&self[index])
    }
  }
}

Has this been pitched before? Is there a reason it's not in the language?

1 Like

Previous pitch thread: Idea: mutatingForEach for Collections

1 Like

self.indices often captures self for some of the more complicated collections, which interferes with copy-on-write. But it’s not too bad to manually do what iteration over indices does. That’d even be a minor argument to put this in the stdlib, since it’s something where a naive implementation performs worse than ideal.

I do know the Ownership efforts are leading towards something like for inout element in collection, but I don’t know how far off that is. And we do have both for/in and forEach today, although the main reason forEach exists is so that people don’t reach for map instead and collect a bunch of results they don’t need. (At one point there was an idea about allowing the optimizer to implicitly reorder or parallelize arbitrary maps, but it was decided that would break too many things, especially in a language that encourages imperative thinking like Swift.)

I do think that naming this new operation forEach is asking for trouble with more complicated closures, having to infer which forEach is meant based on whether the closure modifies its argument. I’ve seen people refer to this operation as forEachMutating, forEachInout, and I think mutatingEach.

A closely related operation is a hypothetical “mapInPlace”:

mutating func mapInPlace(_ mutate: (Element) throws -> Element) rethrows

This is a slightly different but essentially equivalent signature…unless we ever add non-movable types to Swift. In that case inout won’t really be accurate anymore, but a “mutating borrow” would be strictly more powerful than an owned->owned transformation.

8 Likes

If this was pitched in 2018, why wasn't it ever added to the language? It seems pretty useful to me.

var numbers = [1, 2, 3]

// someone might do this
numbers = numbers.map { $0 * 2 }
// but this has to allocate another array before replacing the old one.
// and this isn't obvious.
var index = numbers.startIndex
while index != numbers.endIndex {
  numbers[index] *= 2
  numbers.formIndex(after: &index)
}
// but with mutating forEach you could write this, which is clear.
numbers.forEach { $0 *= 2 }

You could also mark it as @_disfavoredOverload so that it would by default use the non-inout one.
It seems like it wasn't pitched because of the potential for future syntax like this:

for inout number in numbers {
 number *= 2
}

or this

for &number in numbers {
  number *= 2
}

But it's been 5 years since the initial thread and we still don't have that syntax.

1 Like

I agree that the name forEach would not be ideal — not only because of possible compiler issues.
My preference would be withEach (wasn't there a thread about with lately?).

1 Like

Yes, I made that thread.
I wonder if withEach is sufficiently clear, but I do think it's the best-sounding name of the names that were proposed in that other thread.

But also, the with function proposed in that thread takes a value, makes a copy of it, allows for modification of it, and then returns the modified value. A collection method with those semantics would look more like this.

extension MutableCollection {
  public func withEach(_ transform: (inout Element) throws -> Void) rethrows -> Self {
    var copy = self
    var index = copy.startIndex
    while index != copy.endIndex {
      try transform(&copy[index])
      copy.formIndex(after: &index)
    }
    return copy
  }
}

This is similar to map but with slightly different semantics.

I prefer the mutating version, but you can always implement the non-mutating version on top of it.

extension MutableCollection {
    mutating func updateEach(with update: (inout Element) throws -> ()) rethrows {
        let end = endIndex
        var curr = startIndex
        while curr < end {
            try update(&self[curr])
            formIndex(after: &curr)
        }
    }
    func updatingEach(with update: (inout Element) throws -> ()) rethrows -> Self  {
        var result = self
        try result.updateEach(with: update)
        return result
    }
}
3 Likes

I think there was also something mentioned in this pitch which might be the more fundamental mutating foreach:

for inout y in &array {
  // The following updates the array elements "in place"
  y += 1
}
2 Likes

While the lack of this feature has been a long-standing pain point, and definitely needs addressing in some way, it has been long enough that we now seem pretty close to the alternative: a for inout loop that lets you mutate its elements.

It's my personal view that adding forEach to the standard library was a major blunder. Its inappropriate use is common in parts of the community. It was originally recommended in the docs only for use at the end of chains, and discouraged in favor of real for loops otherwise, because of the potential confusion with the behavior of return (which acts like continue). At some point though that recommendation was dropped, and besides, was not guidance always seen or followed. I frequently come across cases where I see a forEach instead of a for loop, several lines long, with a return in the middle, and I have no idea if the author intended to return from the outer function or not.

The mutating version would have exactly the same problem, and since it would have more legitimacy, would in turn further encourage use of forEach.

For this reason, I strongly prefer the route of adding a native support for mutating for loops instead. If it were to be another 5 years, that would be one thing, but I think we have a reasonable shot of getting it into the next couple of releases.

24 Likes

Interesting, I've never heard that reason. I think it's bogus reason though. We shouldn't have bad API as a lesser evil to offer to people blatantly misusing other APIs.

There is another reason why forEach used to be justified which is that it used to be a customization point. In theory that meant it could be faster than using an iterator, if some exotic data structure chose to provide a different implementation. In fact this theoretical speed benefit was even taken advantage of in the standard library in a couple of places. But it was removed as a customization point in SE-0232 so that no longer applies.

The other two justifications are what I thought was the main one (sticking it on the end of chains... which doesn't apply to a mutation variant) and the use with optionals, which ideally would have a better solution.

3 Likes

This issue applies to all closures, not just ones passed to forEach. It could be partially mitigated by deprecating the version of forEach that takes a Void-returning closure and introducing a new one that must return a signal to continue or abandon the loop. A simple two-case enum could make it quite readable:

enum IterationDisposition {
  case continue
  case break
}

extension Sequence {
  func forEach(_ body: (Element) -> IterationDisposition)
}

// Usage:
let xs = [1, 2, 3]
xs.forEach {
  if $0 == 2 {
    return .break
  } else {
    return .continue
  }
}

Ruby implements a more holistic solution, where return inside a do block really does mean to return from the caller, and one must use next or break to return from the block.

Swift privileges closures nearly as much as Ruby privileges blocks, so I think it’s frankly swimming upstream to discourage use of forEach. And since the problem you describe doesn’t apply solely to forEach, I think it’s worth considering how it might be fixed for all closure-taking APIs.

Copying Ruby’s solution exactly is inappropriate for Swift given the existing meaning of return in closures and the ability of Ruby’s behavior to result in non-local returns that the runtime must detect and turn into exceptions. But a different solution that prohibits return in a closure and forces use of a different keyword to more clearly communicate control flow would be feasible.

2 Likes

@Ben_Cohen I also prefer real for loops but one user case that forEach is currently a winner is when used together with property wrappers:


@State var tasks: [Task] = []

$tasks.forEach { $task in
    task.isDone.toggle()
}

It would be really great if it was possible to use a for $task in $tasks to move away from forEach on this case.

8 Likes

Very true. When looking at:

XXX
{
    ...
    return
}

it is not immediately obvious whether this is a closure we are looking at or part of a statement like if / for / etc unless we look at the "XXX" part. I believe a better design would have been using different brackets for these two quite different concepts, like so:

func foo() [
    if condition {
        for x in items {
            ...
            return
        }
        items.forEach [
            ...
            return
        ]
    } else {
        ...
    }
    let newItems = items.map [
        ...
    ]
]
On whether forEach is a bug or a feature: I believe it is a feature.

When looking at this fragment:

items.forEach {
    ...
}
print("here you are")

I know that at "here you are point" all items were iterated without having to look into "...", no ifs no buts.

this is not the case with "for in":

for item in items {
    ...
}
print("here you are")

In this case there could have been some early break, which I'd have to pay attention to.

When choosing between the two I ask myself: "do I need to iterate through all items or would I ever want to break early?" and use either "forEach" or "for in" based on the answer.

1 Like

I think there exist use cases for forEach, but it simply shouldn't be the "default": the default should be for-in and I would like if this was eventually addressed in the official docs. I myself strongly discourage it, but specifically in the cases where a for-in loop would be fine (and in some cases better and more flexible), or where the developer used forEach just because they "like it" more: in some other cases it (kind of) makes sense, so I would rather leverage convention here and see an update to the documentation that clarifies that for-in should be considered default.

Thanks for the example – I wasn't aware of this one. Yes, it would be good if this weren't necessary.

1 Like

BTW, I see no harm for forEach returning @discardableResult self, so we could continue the chain after it:

    items
        .forEach {
            print("will do something", $0)
        }
        .forEach {
            print("do something", $0)
        }
        .forEach {
            print("didDo something", $0)
        }

I must say that the forEach design in the stdlib seemed weird to me from the very start precisely because it doesn't return self, limiting its application as a chainable operator.

It would make me uncomfortable to see a purely side-effecting operator in the middle of a chain. What does someSequence.forEach(…).lazy do? The purpose of forEach is to iterate the entire sequence and do other side-effecting stuff; the purpose of lazy is to not iterate the entire sequence.

1 Like