In-place map for MutableCollection

anandabits · March 19, 2018, 6:31pm

This sounds really great! I can't wait to see the ownership model get fleshed out over the next couple versions.

Chris_Eidhof · March 19, 2018, 6:36pm

I'm not sure what you're asking... which items should be @discardable_result?

Chris_Eidhof · March 19, 2018, 6:40pm

There are two differences: it could avoid a copy of the item, and there's a different syntax. Sometimes it's more natural to write things in a mutating way.

Erica_Sadun · March 19, 2018, 6:43pm

extension MutableCollection {
    @discardable_result
    mutating func applyInPlace(_ x: (inout Element) -> Self) {
        for i in indices {
            x(&self[i])
        }
        return self
    }
}

This would allow you to do:

var collection = ...
collection.applyInPlace({ $0.capitalizedString }).forEach{ print($0) }

I think it may be an abomination to do this kind of hybrid functional/mutation but I'm mulling.

DevAndArtist · March 19, 2018, 6:55pm

The compiling example should probably look like this:

extension MutableCollection {
  @discardableResult
  mutating func applyInPlace(_ x: (inout Element) -> Void) -> Self {
    for i in indices {
      x(&self[i])
    }
    return self
  }
}

Fixed the attribute
The closure should return Void
The function should return Self
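
For illustration, here is a minimal sketch of how the corrected version could be called. The sample array and variable names are made up, and applyInPlace is the method pitched in this thread, not an existing standard library API.

```swift
extension MutableCollection {
  @discardableResult
  mutating func applyInPlace(_ x: (inout Element) -> Void) -> Self {
    for i in indices {
      x(&self[i])
    }
    return self
  }
}

// Hypothetical sample data.
var words = ["foo", "bar"]

// The result is ignored here; @discardableResult suppresses the warning.
words.applyInPlace { $0 = $0.uppercased() }

// The result can also be used, which is the hybrid style discussed above.
let snapshot = words.applyInPlace { $0 += "!" }
```

After both calls, words and snapshot hold the same elements, because the method mutates the receiver and then returns a copy of it.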

Chris_Eidhof · March 19, 2018, 7:02pm

Hm, that seems very much out of line with any of the other mutating methods in the standard library. It would fit very well for something that constructs a new value, but this doesn't. The only mutating methods that I'm aware of which return values are things like remove(at:), which return the element they just removed.
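
A small sketch of the existing pattern, using only standard library methods on a made-up array: the returned value is the element that was removed, not the collection itself.

```swift
// Hypothetical sample data.
var letters = ["a", "b", "c"]

// remove(at:) mutates the array and returns the removed element.
let removed = letters.remove(at: 1)

// popLast() mutates the array and returns the last element as an Optional.
let last = letters.popLast()
```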

hlovatt · March 19, 2018, 7:58pm

+1 for mutateAll.

zwaldowski · March 20, 2018, 7:13am

Both some kind of language-level handling of mutating and borrowing/ownership have come up on this thread as potentially obviating this addition before it's even implemented. I've wanted this myself at times, but I don't think the stdlib needs to mask away every 2-3 line incantation at this point, especially if it may be replaced later on.

That being said, I concur with others that mutate is not an appropriate method name. I think between Dictionary.updateValue(_:forKey:) and forEach(_:) lies something like updateEach(_:). I'm nonplussed about All; that's fine with me too, it just comes up less in the stdlib.

DevAndArtist · March 20, 2018, 8:31am

I wanted to throw another alternative name into the bucket. For me, as a non-native English speaker, the withEach name is literally the same thing as forEach, because it sounds like we're iterating over all the elements and doing something with them, in the sense that we're passing the elements to a different function or doing some computation.

The name that has not been mentioned before is onEach. The on prefix could be misleading if you're familiar with APIs like RxSwift, which use the onNext / onError conventions when a new value or error arrives (the full name is subscribe(onNext:onError:) anyway and not just onNext), but I always felt that this naming was somehow wrong and probably has its roots in a different programming language, where it was established a long time ago and is kept for historical reasons. To me, onEach sounds as if we're operating on the element itself.

gwendal.roue · March 20, 2018, 8:51am

Hello, I'd like to know if we're still talking about a closure that takes an inout element and returns void, or a closure that takes an element and returns another element.

The reason why I ask this question is because an inout argument makes the code look weird when there is no mutating method that performs the job on each element:

var a = [1, 2, 3]
a.mapInPlace { $0 = $0 * 2 } // double all elements

var b = ["foo", "bar"]
b.mapInPlace { $0 = $0.uppercased() } // uppercase all elements

Is there any reason why we wouldn't want first a plain closure that takes an element and returns another?

var a = [1, 2, 3]
a.mapInPlace { $0 * 2 }

var b = ["foo", "bar"]
b.mapInPlace { $0.uppercased() }

And from this base method, we would then derive a convenience inout variant, just like we had reduce(into:) as a (lovely) convenience derivation of reduce?

This may help the naming. Two examples below based on the "Replace" and "Update" verbs:

var a = [1, 2, 3]
a.replaceAll { $0 * 2 }

var foods: [Food] = [...]
foods.replaceEach { $0.name += " & Salad" }

var a = [1, 2, 3]
a.updateAll { $0 * 2 }

var foods: [Food] = [...]
foods.updateEach { $0.name += " & Salad" }

gwendal.roue · March 20, 2018, 9:14am

To better explain my previous post:

I know that there are pragmatic performance considerations. For better or for worse, the choice between mutable and immutable variants is currently important. A bad choice can turn an innocuous-looking piece of code into a quadratic beast, or worse.

Yet the reduce / reduce(into:) story has taught us something: the stdlib first had the general and optimistic reduce, and only afterwards did a well-motivated proposal bring us reduce(into:), because Swift and the stdlib are ruled by pragmatic people.
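
To make the analogy concrete, here is a small sketch of the two existing standard library methods side by side. The word list and variable names are made up for illustration.

```swift
// Hypothetical sample data.
let words = ["a", "b", "a"]

// reduce: each step returns a new accumulator value.
let counts1 = words.reduce([String: Int]()) { counts, word in
  var next = counts            // local copy of the accumulator
  next[word, default: 0] += 1
  return next
}

// reduce(into:): the accumulator is passed inout and mutated in place.
let counts2 = words.reduce(into: [String: Int]()) { counts, word in
  counts[word, default: 0] += 1
}
```

Both calls produce the same dictionary; the difference is whether the closure returns a value or mutates its argument.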

The pitched mutating func mapInPlace(_ x: (inout Element) -> ()) looks to me like it jumps over a more general and optimistic mutating func mapInPlace(_ x: (Element) -> Element).

I think that we'd miss the general variant if it weren't introduced. And that the inout-optimized variant could be well-motivated by pragmatic performance/memory considerations. This would turn this pitch into a proposal that introduces two "map in place" methods.

Chris_Eidhof · March 20, 2018, 3:56pm

I'm not sure if it's more general; you can always go from one to the other. As discussed earlier, one of the reasons for having an in-place map is to provide a mutating interface. The possible optimisation is a separate thing.
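
A rough sketch of what "going from one to the other" could look like. The names mutateEach, replaceEach, and mutateEachViaReplace are hypothetical and chosen only for this example; none of them exists in the standard library.

```swift
extension MutableCollection {
  // inout variant: the closure mutates each element in place.
  mutating func mutateEach(_ body: (inout Element) -> Void) {
    // Walk the indices manually instead of iterating `indices`,
    // which may hold a reference to the collection.
    var i = startIndex
    while i != endIndex {
      body(&self[i])
      formIndex(after: &i)
    }
  }

  // Value-returning variant, expressed in terms of the inout one.
  mutating func replaceEach(_ transform: (Element) -> Element) {
    mutateEach { $0 = transform($0) }
  }

  // The opposite direction: the inout variant expressed in terms
  // of the value-returning one, at the cost of a local copy.
  mutating func mutateEachViaReplace(_ body: (inout Element) -> Void) {
    replaceEach { element in
      var copy = element
      body(&copy)
      return copy
    }
  }
}

var numbers = [1, 2, 3]
numbers.replaceEach { $0 * 2 }           // numbers is now [2, 4, 6]
numbers.mutateEachViaReplace { $0 += 1 } // numbers is now [3, 5, 7]
```

Either variant can serve as the primitive; which one reads better at the call site is a separate question.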

gwendal.roue · March 20, 2018, 4:05pm

Yes, but there remains a question: should you go from mapInPlace { $0 = $0 * 2 } to mapInPlace { $0 * 2 }, or the opposite? The fact that you can go both ways does not mean they're equivalent, or that nobody should care about which variant has to be preferred in a given context.

Before you answer too fast, let me please try to lift a misunderstanding. I support your pitch. But you wrote:

There are two mutations in your pitch. The mutation of the collection, and the mutation of the elements. You pitch the mutation of the collection. I support it. I question the mutation of the elements.

In the sample code below, both collections are mutated:

var a = [1, 2, 3]
a.replaceAll { $0 * 2 }

var foods: [Food] = [...]
foods.replaceEach { $0.name += " & Salad" }

However, elements are only mutated in the second version. I think that it's not obvious that the second form, which mutates both collection and elements, is the only interesting method, and the only way to achieve your pitch. I think that the first form, which mutates the collection but not elements, is interesting, too, and should be at least considered.

Jens · March 20, 2018, 5:26pm

Isn't Swift's inout implemented in such a way that the two versions compile to equivalent code?

gwendal.roue · March 20, 2018, 5:29pm

Do you mean that

mutating func mapInPlace(_ x: (inout Element) -> ()) { ... }

could accept both:

var a = [1, 2, 3]
a.mapInPlace { $0 * 2 }

var foods: [Food] = [...]
foods.mapInPlace { $0.name += " & Salad" }

?

I'm not sure, because in the first case we have a closure that returns a value, and in the second case we have a closure without any result. Am I missing something?

Jens · March 20, 2018, 5:31pm

No, I meant that the two versions (though still called in two separate ways, of course) will be compiled into equivalent code, because of how inout is implemented in the compiler.

gwendal.roue · March 20, 2018, 5:38pm

I understand what you mean. And maybe all mutating/non-mutating variants will eventually be handled by the compiler so that we can freely use the one we prefer, driven only by aesthetic concerns. It will become a mere question of code legibility, not a question of performance.

And that's where I find that mapInPlace { $0 = $0 * 2 } looks like an overlooked use case, should we only get mapInPlace { $0.name += " & Salad" }. We'll miss someGoodName { $0 * 2 }.

Nevin · March 20, 2018, 5:46pm

One case where the difference matters is when you have a collection-of-collections. To avoid confusion, let’s say an Array of Data:

var x = [Data(repeating: 10, count: 2),
         Data(repeating: 11, count: 3)]

// Element-mutating version:
x.modifyAll{ data in
  data[0] = 15
}

// Element-replacing version
x.updateAll{ data in
  var newData = data
  newData[1] = 14
  return newData
}

print(x.map{Array($0)})

Even if the compiler is smart enough to produce equivalent code for the two implementations (can we get confirmation of that?) there still remains a massive readability difference at the call-site.
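
For anyone who wants to run the comparison, here is a self-contained sketch. The definitions of modifyAll and updateAll are assumptions made for this example (they follow the two closure shapes discussed in this thread) and are not standard library methods.

```swift
import Foundation

extension MutableCollection {
  // Assumed element-mutating variant.
  mutating func modifyAll(_ body: (inout Element) -> Void) {
    var i = startIndex
    while i != endIndex {
      body(&self[i])
      formIndex(after: &i)
    }
  }

  // Assumed element-replacing variant.
  mutating func updateAll(_ transform: (Element) -> Element) {
    var i = startIndex
    while i != endIndex {
      self[i] = transform(self[i])
      formIndex(after: &i)
    }
  }
}

var x = [Data(repeating: 10, count: 2),
         Data(repeating: 11, count: 3)]

// Element-mutating version: set the first byte of each Data.
x.modifyAll { data in
  data[0] = 15
}

// Element-replacing version: copy, change the second byte, return the copy.
x.updateAll { data in
  var newData = data
  newData[1] = 14
  return newData
}

let bytes = x.map { Array($0) }
print(bytes) // [[15, 14], [15, 14, 11]]
```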

Although…I just stumbled on a weird case where the compiler requires type annotation (and has an unhelpful diagnostic):

var z = [[2, 2], [3, 3, 3]]
z.modifyAll{ array in
  array.modifyAll{ n in     // error: passing value of type 'Int' to an inout parameter requires explicit '&'
    n += 10
  }
}

The actual fix is to replace “n in” with “(n: inout Int) in”, OR to replace “n += 10” with “n = n + 10”. And I don’t understand why.
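
A sketch of the first workaround with the parameter types spelled out at both levels. The modifyAll definition is an assumption made for this example, matching the inout closure shape used above.

```swift
extension MutableCollection {
  // Assumed element-mutating method, not part of the standard library.
  mutating func modifyAll(_ body: (inout Element) -> Void) {
    var i = startIndex
    while i != endIndex {
      body(&self[i])
      formIndex(after: &i)
    }
  }
}

var z = [[2, 2], [3, 3, 3]]

// Explicit inout annotations on both closures avoid relying on inference.
z.modifyAll { (array: inout [Int]) in
  array.modifyAll { (n: inout Int) in
    n += 10
  }
}

print(z) // [[12, 12], [13, 13, 13]]
```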

gwendal.roue · March 20, 2018, 5:50pm

Exactly. That's all I want to say. That we may need both variants. Because some types are better handled with one variant, and some types are better handled with the other variant. I don't understand why the pitch should favor one over the other, when it could acknowledge that both are needed.

A massive readability difference at the call-site. Couldn't say it better :-)

xwu · March 20, 2018, 11:02pm

I agree. If, within one or two versions of Swift, the performance gains that this pitch seeks to enable can be entirely subsumed by borrowing/ownership (and in a way that's largely transparent to the end user), then it calls into question whether making a permanent addition to the standard library now is consistent with the direction of Swift.