I've put together a draft proposal, shown below. Please let me know what you think.
In-place modification of MutableCollection elements
- Proposal: SE-NNNN
- Authors: Cory Benfield, Chris Eidhof
- Review Manager: TBD
- Status: Awaiting implementation
Introduction
This proposal would add a modifyEach method to MutableCollection.
This feature would enable modifying the elements contained within a single MutableCollection, without creating a new MutableCollection to contain them the way map does. In many cases this can provide performance improvements by taking advantage of the _modify accessor available on many MutableCollections.
Motivation
Many Swift programs use MutableCollections to store substantial amounts of program state. In many cases Swift programs will endeavour to store their state as objects with value semantics, as this provides substantial rewards in terms of ensuring program correctness and performance.
However, if the primary copy of the program state is stored in the MutableCollection, updating that state can be expensive. If the user writes code that brings the value into temporary storage, that value will need to be copied out of the storage (invoking the get accessor), mutated, and then copied back in to the storage (invoking the set accessor).
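The copy-out, mutate, copy-back sequence described above can be sketched as follows. The Task type shown here is an assumption made for illustration (the proposal never defines it), as are the sample task names.

```swift
// Hypothetical element type; the proposal does not define `Task`.
struct Task {
    var name: String
    var complete: Bool = false
}

var tasks = [Task(name: "Write proposal"), Task(name: "Pitch on forums")]

// Copy the element out of the collection (the `get` accessor).
var first = tasks[0]
// Mutate the temporary copy; the array itself is untouched at this point.
first.complete = true
// Copy the modified value back into the collection (the `set` accessor).
tasks[0] = first
```

For a small struct like this the copies are cheap, but the same shape of code applied to elements that own large buffers is where the cost becomes noticeable.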
As an example, consider a simple task tracking application. We may model its main program state something like this:
struct TaskTracker {
    var tasks: [Task]
}
Imagine we wanted to implement a function that would mark all of the tasks as completed. How might we write such a function?
One natural approach would be to use map:
mutating func markAllComplete() {
    self.tasks = self.tasks.map {
        var newElement = $0
        newElement.complete = true
        return newElement
    }
}
This approach creates a whole new array, heap-allocates it, copies all the data from the old array to the new one (modifying it along the way), and then frees the old one. That's not ideal.
We could instead attempt something more performant: modify the array in place. A naive approach might be:
mutating func markAllComplete() {
    for index in self.tasks.indices {
        self.tasks[index].complete = true
    }
}
While this implementation is safe for Array, it has performance problems when written as a generic operation on MutableCollection, as indices may hold a reference to self.tasks, which will cause this operation to trigger a copy-on-write (CoW) operation unnecessarily.
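For reference, the generic form of the indices-based loop might look like the sketch below. The method name modifyEachViaIndices is hypothetical and exists only for illustration; it is not part of this proposal or of the standard library. Whether a copy is actually triggered depends on the Indices type of the concrete collection.

```swift
extension MutableCollection {
    // Hypothetical helper, for illustration only.
    mutating func modifyEachViaIndices(_ body: (inout Element) -> Void) {
        // `indices` is evaluated once, before the loop starts. For some
        // collection types its value may hold a reference to the collection's
        // storage, which is what can force a copy on the first write below.
        for index in self.indices {
            body(&self[index])
        }
    }
}

var numbers = [1, 2, 3]
numbers.modifyEachViaIndices { $0 *= 2 }
```

The result is the same as the index-advancing loop; only the potential for an extra copy differs.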
The highest-performance implementation is this one:
mutating func markAllComplete() {
    var index = self.tasks.startIndex
    while index != self.tasks.endIndex {
        self.tasks[index].complete = true
        // formIndex(after:) advances the index in place and returns nothing.
        self.tasks.formIndex(after: &index)
    }
}
This code is good in Swift 4, but in Swift 5 this final form is particularly powerful due to the increasingly widespread use of the _modify accessor. In this case, this code will be able to compile down to something very close to the equivalent code in C, manipulating the values directly in the underlying Array storage.
Unfortunately, this is not a natural pattern for most Swift programmers to write. The average Swift programmer is unlikely to arrive at this implementation without a relatively good grasp of not only how accessors work in Swift in general, but also the use of newer accessors in Swift 5.
Proposed solution
This proposal seeks to add a new method, modifyEach, to MutableCollection. This method would provide a generic implementation of the above pattern that seeks to push users towards a performant pattern of modifying elements in mutable collections.
This change would modify the above function to:
mutating func markAllComplete() {
    self.tasks.modifyEach { $0.complete = true }
}
In addition to being a natural spelling of this kind of operation, it provides users with a common pattern for performing this kind of state mutating operation on all kinds of collections, including those that potentially have substantially more expensive access operations.
While the example above using Array is simple, in real-world usage the programs and state modification logic can be substantially more complex. Generally these more complex operations are described as mutating functions that need to be invoked on the objects stored in a MutableCollection. modifyEach provides easy hooks for performing arbitrary mutation of stored objects, while pushing users towards invoking functions either on inout references to state or as mutating member functions on the stored objects. In either case, this plays much more nicely with the _modify accessor than more naive mutation approaches.
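Both calling styles mentioned above can be sketched as follows. The modifyEach extension repeats the implementation given in this proposal's detailed design so that the example is self-contained; the Task type, its markComplete() method, and the free function addPrefix are hypothetical names introduced only for this illustration.

```swift
extension MutableCollection {
    // Repeated from the detailed design so this example stands alone.
    mutating func modifyEach(_ body: (inout Element) throws -> Void) rethrows {
        var index = self.startIndex
        while index != self.endIndex {
            try body(&self[index])
            self.formIndex(after: &index)
        }
    }
}

// Hypothetical element type with a mutating member function.
struct Task {
    var name: String
    var complete = false

    mutating func markComplete() {
        complete = true
    }
}

// Hypothetical free function operating on an inout reference to state.
func addPrefix(_ prefix: String, to task: inout Task) {
    task.name = prefix + task.name
}

var tasks = [Task(name: "Write proposal"), Task(name: "Pitch on forums")]

// Style 1: invoke a mutating member function on each stored element.
tasks.modifyEach { $0.markComplete() }

// Style 2: pass each stored element as an inout argument.
tasks.modifyEach { addPrefix("[done] ", to: &$0) }
```

In both cases the closure receives the element as an inout parameter, so no temporary copy is named in user code.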
This proposal does not propose to make modifyEach a customisation point. In practice there is only one reasonable implementation that will work across all MutableCollection objects with maximum performance, and there is no reason to allow MutableCollections to override that behaviour.
Detailed design
The new function is short and clear:
extension MutableCollection {
    /// Calls the given closure on each element in the collection in the same order as a `for-in` loop.
    ///
    /// The `modifyEach` method provides a mechanism for modifying all of the contained elements in a `MutableCollection`. It differs
    /// from `forEach` or `for-in` by providing the contained elements as `inout` parameters to the closure `body`. In some cases this
    /// will allow the parameters to be modified in-place in the collection, without needing to copy them or allocate a new collection.
    ///
    /// - parameters:
    ///    - body: A closure that takes each element of the sequence as an `inout` parameter
    @inlinable
    mutating func modifyEach(_ body: (inout Element) throws -> Void) rethrows {
        var index = self.startIndex
        while index != self.endIndex {
            try body(&self[index])
            self.formIndex(after: &index)
        }
    }
}
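As a usage sketch of the method above, the following shows it applied to a slice of an array and with a throwing closure. The extension is repeated (without the documentation comment and @inlinable) so the example is self-contained; the ValidationError type and the sample values are assumptions made for illustration.

```swift
extension MutableCollection {
    // Repeated from the design above so this example stands alone.
    mutating func modifyEach(_ body: (inout Element) throws -> Void) rethrows {
        var index = self.startIndex
        while index != self.endIndex {
            try body(&self[index])
            self.formIndex(after: &index)
        }
    }
}

// Works on any MutableCollection, for example a slice of an array.
var values = [1, 2, 3, 4, 5]
values[1...3].modifyEach { $0 *= 10 }

// Hypothetical error type used to show the rethrowing behaviour.
struct ValidationError: Error {}

var readings = [4.0, -1.0, 9.0]
do {
    try readings.modifyEach { value in
        guard value >= 0 else { throw ValidationError() }
        value = value.squareRoot()
    }
} catch {
    // Iteration stops at the first error. Elements visited before the
    // failure keep their modifications; later elements are left untouched.
}
```

Because elements are mutated in place, an error thrown part-way through leaves the collection partially modified. Callers that need all-or-nothing behaviour would have to work on a copy.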
Source compatibility
This change has no source compatibility impact. It requires no new syntax from the language and could easily have been implemented in Swift 4, though it provides better performance in Swift 5.
Effect on ABI stability
This change does not affect the stable ABI.
Effect on API resilience
This change adds to the declared API of MutableCollection with minimal ABI impact. In particular, it does not break the ABI of MutableCollection.
As the body of the method relies entirely on declared parts of the MutableCollection protocol, it is safe to make this method non-resilient, as it is valid in all current versions of Swift. Programs that do inline this implementation will continue to be correct as long as the MutableCollection protocol does not fundamentally change.
Alternatives considered
This change was originally proposed in March of 2018 by Chris Eidhof. This proposal is spiritually identical to Chris' original proposal; as such, he has been identified as a co-author of this proposal.
Indices
Chris' original proposal, as well as several suggestions in the new thread, proposed using indices instead of formIndex. As discussed above, indices can cause an unnecessary CoW operation in some cases, as it may hold a reference to the original MutableCollection that cannot be elided.
For this reason, it is preferable to use formIndex instead.
Inout for loops
As part of the same ownership manifesto that led to the addition of the _modify accessor, John McCall discussed the possibility of using Python-style generators for iteration in a future version of Swift.
The sample code from the manifesto bears a striking similarity to the code proposed in this document:
mutating generator iterateMutable() -> inout Element {
    var i = startIndex, e = endIndex
    while i != e {
        yield &self[i]
        self.formIndex(after: &i)
    }
}
As this generator construct could be used to provide iteration à la Python, a logical extension to the for-in syntax would be for inout x in y, allowing the mutation of the elements produced by the generator.
This language extension would be substantially nicer than the modifyEach function provided here. In particular, it integrates much better with features elsewhere in the language, avoids the potential performance pitfalls of the widespread use of closures, and looks altogether more "Swifty".
However, at this stage such a feature is unlikely to land in Swift 5, meaning that its public release will be at least a year away, even if it is implemented on the most aggressive of schedules. For this reason, and considering the low ABI, API, and source compatibility impact of adding modifyEach to MutableCollection, the authors consider it worthwhile to add modifyEach to provide a worse version of for inout until that language feature arrives.
At the time that this language feature arrives, the Swift community should consider whether modifyEach (and its non-modifying cousin, Sequence.forEach) should be formally deprecated in favour of the loop constructs. However, this should be part of a wider discussion about the use of generators for iteration, and is outside the scope of this proposal.