In-place map for MutableCollection

Hello, I'd like to know if we're still talking about a closure that takes an inout element and returns void, or a closure that takes an element and returns another element.

The reason why I ask this question is because an inout argument makes the code look weird when there is no mutating method that performs the job on each element:

var a = [1, 2, 3]
a.mapInPlace { $0 = $0 * 2 } // double all elements

var b = ["foo", "bar"]
a.mapInPlace { $0 = $0.uppercased() } // uppercase all elements

Is there any reason why we wouldn't want first a plain closure that takes an element and returns another?

var a = [1, 2, 3]
a.mapInPlace { $0 * 2 }

var b = ["foo", "bar"]
a.mapInPlace { $0.uppercased() }

And from this base method, we would then derive a convenience inout variant, just like we had reduce(into:) as a (lovely) convenience derivation of reduce?

This may help the naming. Two examples below based on the "Replace" and "Update" verbs:

var a = [1, 2, 3]
a.replaceAll { $0 * 2 }

var foods: [Food] = [...]
foods.replaceEach { $0.name += " & Salad" }
var a = [1, 2, 3]
a.updateAll { $0 * 2 }

var foods: [Food] = [...]
foods.updateEach { $0.name += " & Salad" }

To better explain my previous post:

I know that there are pragmatic performance considerations. For the better or for the worse, the choice between mutable and immutable variants is currently important. A bad choice can turn an innocuous-looking piece of code into a quadratic beast, or worse.

Yet, the reduce / reduce(into:) story had taught us something. It is that the stdlib first had the general and optimistic reduce. And that only after, a well-motivated proposal brought us reduce(into:) because Swift and the stdlib are ruled by pragmatic people.

The pitched mutating func mapInPlace(_ x: (inout Element) -> ()) looks to me like it jumps over a more general and optimistic mutating func mapInPlace(_ x: (Element) -> Element).

I think that we'd miss the general variant if it weren't introduced. And that the inout-optimized variant could be well-motivated by pragmatic performance/memory considerations. This would turn this pitch into a proposal that introduces two "map in place" methods.

2 Likes

I'm not sure if it's more general, you can always go from one to the other. As discussed earlier, of the reasons of having an in-place map is to provide a mutating interface. The possible optimisation is a separate thing.

1 Like

Yes, but there remains a question: should you go from mapInPlace { $0 = $0 * 2 } to mapInPlace { $0 * 2 }, or the opposite? The fact that you can go both ways does not mean they're equivalent, or that nobody should care about which variant has to be preferred in a given context.

Before you answer too fast, let me please try to lift a misunderstanding. I support your pitch. But you wrote:

There are two mutations in your pitch. The mutation of the collection, and the mutation of the elements. You pitch the mutation of the collection. I support it. I question the mutation of the elements.

In the sample code below, both collections are mutated:

var a = [1, 2, 3]
a.replaceAll { $0 * 2 }

var foods: [Food] = [...]
foods.replaceEach { $0.name += " & Salad" }

However, elements are only mutated in the second version. I think that it's not obvious that the second form, which mutates both collection and elements, is the only interesting method, and the only way to achieve your pitch. I think that the first form, which mutates the collection but not elements, is interesting, too, and should be at least considered.

Isn't Swift's inout implemented in such a way that the two versions compile to equivalent code?

Do you mean that

mutating func mapInPlace(_ x: (inout Element) -> ()) { ... }

could accept both:

var a = [1, 2, 3]
a.mapInPlace { $0 * 2 }

var foods: [Food] = [...]
foods.mapInPlace { $0.name += " & Salad" }

?

I'm not sure because in the first case we have a closure that returns a value, and in the second case we have a closure without any result. Do I miss something?

No, I meant that the two versions (though still called in two separate ways of course), will be compiled into equivalent byte code, because of how inout is implemented in the compiler.

I understand what you mean. And maybe all mutating/non-mutating variants will eventually been handled by the compiler so that we can freely use the one we prefer, only driven by aesthetics concerns. It will become a mere question of code legibility, not a question of performance.

And that's where I find that mapInPlace { $0 = $0 * 2 } looks like a overlooked use case, should we only get mapInPlace { $0.name += " & Salad" }. We'll miss someGoodName { $0 * 2 }.

One case where the difference matters is when you have a collection-of-collections. To avoid confusion, let’s say an Array of Data:

var x = [Data(repeating: 10, count: 2),
         Data(repeating: 11, count: 3)]

// Element-mutating version:
x.modifyAll{ data in
  data[0] = 15
}

// Element-replacing version
x.updateAll{ data in
  var newData = data
  newData[1] = 14
  return newData
}

print(x.map{Array($0)})

Even if the compiler is smart enough to produce equivalent code for the two implementations (can we get confirmation of that?) there still remains a massive readability difference at the call-site.

Although…I just stumbled on a weird case where the compiler requires type annotation (and has an unhelpful diagnostic):
var z = [[2, 2], [3, 3, 3]]
z.modifyAll{ array in
  array.modifyAll{ n in     // error: passing value of type 'Int' to an inout parameter requires explicit '&'
    n += 10
  }
}

The actual fix is to replace “n in” with “(n: inout Int) in”, OR to replace “n += 10” with “n = n + 10”. And I don’t understand why.

1 Like

Exactly. That's all I want to say. That we may need both variants. Because some types are better handled with one variant, and some types are better handled with the other variant. I don't understand why the pitch should favor one over the other, when it could acknowledge that both are needed.

A massive readability difference at the call-site. Couldn't say it better :-)

2 Likes

I agree. If, within one or two versions of Swift, the performance gains that this pitch seeks to enable can be entirely subsumed by borrowing/ownership (and in a way that's largely transparent to the end user), then it brings into question whether making a permanent addition to the standard library now is consistent with the direction of Swift.

In the adding toggle to bool thread, some good examples came up for why that is useful, and many of the same examples apply to this inout version as well. A big difference is that it allows you to not repeat your self. Here's a contrived example:

struct Favorite {
    var name: String
    var visible: Bool
}
var favorites: [String: [Favorite]] = [
    "john": [Favorite(name: "swift.org", visible: true), Favorite(name: "idris-lang.org", visible: false)],
    "edward": [Favorite(name: "haskell.org", visible: true)],
]

Let's say that we want to toggle all visible properties for the "john" key. Our new mapInPlace makes this quite easy.

favorites["john"]?.mapInPlace { $0.visible.toggle() }

The main difference in expressivity is that the inout allows you to both get and set a value. The way it composes is especially interesting: For example, in the closure body, $0 is inout, but because visible is a var, $0.visible is also inout. Likewise, the ? operator will also maintain the inout-ness of a value. (Technically, these are called l-values, not inouts).

This style allows you to reach "deep" into a data structure, and mutate it. In a functional style, mutating something that is slightly deeper into the data structure becomes very cumbersome, unless you use something like lenses. This deep mutation is something that comes up often in my code when dealing with data. We could also consider adding lenses to the Standard Library, but Swift already has strong support for inout throughout the language.

If you have a closure (A) -> A, you can always recover (inout A) -> () by prepending it with $0 = . If something like with would be in the Standard Library, the inout version could also be recovered from the functional version.

3 Likes

It's sometime hard to be understood. Of course your pitched method is useful. You don't need to be defensive, because no one is telling the opposite! Now please if you have five seconds, look at the other variant that does not mutate element, just in case you'd find it useful, too:

If you have a closure (A) -> A, you can always recover (inout A) -> () by prepending it with $0 =.

Meh. This sentence is short-sighted. It puts the focus on one side of the coin without telling why the other side of the coin does not deserve the slightest attention.

Now this is your pitch. I've tried hard enough to make it less focused on your use case. I won't repeat and repeat again :-)

I understand what you're saying. I'm sorry if I come off to you as defensive, I'm just trying to show examples of where I think inout is useful, and how it is different from (A) -> A.

Like I said, you can also recover the inout variant by using with. If we don't regard performance (something the compiler engineers should weigh in about), it means that it doesn't matter whether mapInPlace would use inout or not in the closure. I do have a preference for the inout version of the closure, but to me, that's not set in stone. I'm actually very happy to be convinced otherwise! Instead of calling each other defensive / not listening, let's focus on coming up with more examples of how the API differs at the site of usage.

Oh, and with regard to adding both variants: I think that's too much of an ask, whichever one we choose could easily be recovered from the other. And unless we give them distinct names, it might hurt type-inferencing.

I've been thinking about this a bit more. I might still misunderstand you. You're asking me to look at the other variant (with the (A) -> A) parameter. I did look at it (and thought about it a lot, even before the pitch). Do I understand correctly that you would like me to give more examples of when that is useful? Or do you want me to acknowledge that I have read what you wrote? Or do you want the proposal to include both variants? I'm not here to get into a fight, I'm honestly asking, and trying to understand your needs :slight_smile: .

Fair questions, Chris, thanks :-) All right:

I think that the eventual proposal derived from this pitch should:

  • Give at least one example of a.mapInPlace { $0 = ... }, very early in the proposal, in the motivation section.
  • Tell that a variant that takes a (A) -> A closure is not included in the proposal in order to 1. keep the proposal short (a single added method), and 2. focus on the method that provides the most benefits in terms of memory usage.

The reasons why I think it would improve your proposal is because { $0 = ... } is ergonomically sub-par, but the only available technique, in your pitch, for alterations that aren't supported by a mutating method (and there are many). I gave two trivial examples (integer multiplication, and string uppercasing), but there are tons of other alterations that are not supported by mutating methods. People should be prepared for { $0 = ... }.

Your current pitch shines because its examples use elements that have mutating methods. But there is a little dark corner, and I wouldn't like that it was hidden from the community sight during the review phase. Maybe the community would find a way to address this shortcoming. At the same time, I'm sure the community will not kill the proposal because of this shortcoming: you won't jeopardize the success of your excellent pitch.

Thank you, that clarifies a bit.

There is a third reason, besides keeping it short and efficiency, which I mentioned a few times: it allows you to write code differently. Look at the examples in the posts above to see. This is (for me) an even bigger reason to write it using inout.

This's very common scenario used in my apps. I'm two hands for this improvement.

updateAll is very similar to a C++ function called std::transform but it allows you to specify a different destination. To apply it in place, you can then use the destination as the source:

extension Collection {
    func transform<D: MutableCollection>(
        to destination: inout D,
        transform: (Element) -> D.Element
    ) {
        var i = startIndex
        var j = destination.startIndex
        
        while i != endIndex, j != destination.endIndex {
            destination[j] = transform(self[i])
            
            i = self.index(after: i)
            j = destination.index(after: j)
        }
    }
}

The source could be adapted to be a Sequence as well.