We eventually want Swift to be able to support move-only types, and we want the core iteration APIs to be ready to support data structures containing move-only values. Particularly, we want it to be possible for fundamental data structures like Array
, Dictionary
, and Set
to be generalized to allow
move-only types in a future version of Swift. This means, at minimum, that future move-only implementations of these data structures must be able to conform to the Sequence
and Collection
protocols we ship with Swift 5 for backward compatibility. We should also anticipate the language and
library changes that will be necessary to support move-only types and consider how they'll work when backward-deployed to code compiled with Swift 5. inevitably require additions to the standard library.
Limitations of Sequence
iteration with move-only types
As it currently stands, the basic iteration protocol in Swift is the Sequence
protocol:
protocol Sequence {
associatedtype Iterator: IteratorProtocol
func makeIterator() -> Iterator
}
protocol IteratorProtocol {
associatedtype Element
mutating func next() -> Element?
}
and a for
loop desugars into code that makes an iterator for the sequence and then pulls values out of the iterator using its next
method:
for x in xs {
<body>
}
// is equivalent to:
do {
var iterator = xs.makeIterator()
while let x = iterator.next() {
<body>
}
}
Sequence
is inadequate for a number of reasons that are already being discussed, but on top of all that ongoing discussion, it's also unusable in its current form with move-only types:
- The
Iterator
implementation in practice requires duplicating some or all of the information in the originalSequence
, which is not possible if theSequence
cannot be copied without consuming the original sequence. - The iteration method
next()
returns each element by value. This is impossible for sequences whose elements can't be copied, unless the original sequence is consumed by the iteration, since returning by value would require moving those values out of the holding sequence.
As written today, a move-only type cannot conform to the Sequence
protocol in most practical cases due to the need for the Iterator
to copy information from the Sequence
value. At minimum, we must make the makeIterator
requirement __consuming
in anticipation of move-only types so that a move-
only implementation can satisfy the requirement by consuming the sequence to produce the iterator.
With the current protocol design, a move-only Sequence
would only support single-pass consuming iteration, but future versions of Swift could add requirements to the protocol to support multi-pass iteration. At that time, existing copyable implementations of Sequence
would be able to automatically
conform to the new requirements with a default implementation written in terms of the existing makeIterator
interface, but new move-only implementations may have to implement both interfaces.
Collections and move-only types
The Collection
protocol is somewhat better prepared for move-only types, since it provides indexing and subscripting as traversal and access primitives. The index manipulation requirements on Collection
like index(after:)
are designed to all take the original collection as input in addition to the index being manipulated, and likewise subscripting takes both the collection and index as inputs, which means that most index implementations do not need to duplicate any information from their parent collection. We're working on adopting a coroutine-based model for property and subscript accesses, which
will mean that property and subscript reads can be done as borrows. For move-only collections, we could define an alternative expansion of for
loops for collections in terms of indexes and subscripts:
for x in xs {
<body>
}
// could be expanded to:
do {
var index = xs.startIndex
let end = xs.endIndex
while index != end {
// Pretend that `with` is a statement that extends an access to a property
// over multiple statements
with let x = xs[index] {
<body>
}
index = xs.index(after: index)
}
}
Since indexes do not need to copy the original collection, and subscripting will be able to yield borrowed values out of the collection without formally copying them, this would be a workable iteration model for move-only collections. Index-based iteration could also be used to extend for loops to support mutating elements on MutableCollection
s.
On the other hand, index-based iteration is suboptimal in a number of ways. In unspecialized code, it would require three virtual calls per iteration to test, subscript, and advance the index,
compared to one next()
call for the current iterator design. There's also more per-iteration validation overhead in subscripting than there would be in an ideal IteratorProtocol
implementation, since an iterator
can be written to assume in-bounds access, whereas an isolated subscript must validate the given index every time. We don't strictly need to do anything today for Collection
to accommodate future move-only implementations, but we may want to anticipate and mitigate the performance issues if we don't.
The ideal solution: Generators?
Generators are a popular feature in other programming languages that make it easy to define sequences by writing imperative code that yield
-s each element of the sequence in order. They would also be a natural fit for the proposed ownership and move-only type model, since a running generator can hold on to
a borrow of a parent sequence while yielding access to its elements without requiring copies. For instance, one could efficiently iterate through an Array of move-only values without requiring per-iteration bounds checks or intermediary copies of the array value:
func generateArray<T>(xs: [T]) yields T {
xs.withUnsafeBufferPointer {
var address = $0.startAddress
for i in 0..<$0.count {
yield address.unsafelyUnwrapped.pointee
address += 1
}
}
}
A future version of the Sequence
protocol could introduce generator requirements that supersede the makeIterator
requirement for consuming and borrowing iteration:
protocol Sequence: moveonly {
/* existing requirements */
consuming func generateConsuming() yields consumed Element
func generate() yields Element
}
For compatibility with existing copyable Sequence
s and Collection
s, we can provide default implementations of these generator requirements in terms of the existing requirements. We can also provide default implementations for move-only Collection
s, though move-only Sequence
s need explicit implementation of the new interface:
extension Sequence: moveonly where Element: moveonly {
// A default implementation of generateConsuming() can be given
// in terms of the makeIterator() model
consuming func generateConsuming() yields consumed Element {
var iterator = makeIterator()
while let element = iterator.next() {
yield element
}
}
}
extension Sequence: copyable where Element: copyable {
// A default implementation of generateConsuming() can be given
// in terms of the makeIterator() model, but only for copyable types.
// This is a bit inefficient due to the need to copy the
// sequence and the elements.
func generate() yields Element {
var iterator = makeIterator()
while let element = iterator.next() {
yield element
}
}
}
extension Collection: moveonly where Element: moveonly {
// For collections, we can provide a default implementation of
// generate() even for move-only implementations in terms of indexing
// and subscripting.
//
// This would be inefficient when the default implementation is used
// with binaries compiled for Swift 5, due to the need for virtual
// calls to manipulate indexes and access the subscript. Recompiling
// code ought to at least get a specialized implementation of
// `generate`, though.
func generate() yields Element {
var index = startIndex
let end = endIndex
while index != end {
yield self[index]
index = index(after: index)
}
}
}
Generators are not likely to happen in the Swift 5 timeframe, both because they're a large new language feature and because getting the efficiency and interactions we want with generators and the ownership model likely itself requires move-only types to model the continuations for suspended generators, but they're a good model for seeing how iteration could work in the future of Swift.
Preparing Collection
iteration to be efficient with move-only types
As noted above, we can extend the Collection
protocol we have today to be compatible with move-only collections in the future using generators, but the default implementation would be inefficient with binaries not recompiled to specialize the new requirements. We could perhaps mitigate this in the compiler
by favoring the Swift 5 interfaces to expand for
loops with types known to have been compiled for Swift 5. On the other hand, we can somewhat anticipate what a generator-based ABI would look like and prepare for it by putting hidden requirements in the Sequence
and Collection
protocols today:
protocol Sequence {
/* existing requirements... */
associatedtype _Generator: _GeneratorProtocol
func _generate() -> _Generator
subscript(_advancing generator: inout _Generator) -> Element
}
extension Sequence {
func _generate() -> _SequenceGenerator<Self> {
return _SequenceGenerator(iterator: makeIterator())
}
subscript(_advancing generator: inout _SequenceGenerator<Self>) -> Element {
yield generator.value!
generator.value = generator.iterator.next()
}
}
protocol _GeneratorProtocol {
var done: Bool { get }
}
struct _SequenceGenerator<S: Sequence>: _GeneratorProtocol {
var iterator: S.Iterator
var value: S.Element?
var done: Bool { return value == nil }
init(iterator: S.Iterator) {
self.iterator = iterator
self.value = self.iterator.next()
}
}
protocol Collection: Sequence
where _Generator == _CollectionGenerator<Self> {
/* existing requirements... */
func _generate() -> _CollectionGenerator<Self>
subscript(_advancing generator: inout _CollectionGenerator<Self>) -> Element
}
extension Collection {
func _generate() -> _CollectionGenerator<Self> {
return _CollectionGenerator(done: startIndex == endIndex,
start: startIndex,
end: endIndex)
}
subscript(_advancing generator: inout _CollectionGenerator<Self>) -> Element {
read {
precondition(!generator.done)
let cur = generator.start
yield self[cur]
generator.start = index(after: generator.start)
generator.done = generator.start == generator.end
}
modify {
precondition(!generator.done)
let cur = generator.start
yield &self[cur]
generator.start = index(after: generator.start)
generator.done = generator.start == generator.end
}
}
}
struct _CollectionGenerator<C: Collection> {
var done: Bool
var start, end: C.Index
}
By putting the per-iteration logic for testing, advancing, and subscripting indexes behind the Sequence
vtable, where it can be specialized for each implementation, we can reduce the collection-based for loop to one virtual call per iteration:
for x in xs {
<body>
}
// could be expanded to:
do {
var generator = xs._generate()
while !generator.done {
// Pretend that `with` is a statement that extends an access to a property
// over multiple statements
with let x = xs[_advancing: &generator] {
<body>
}
}
}
When we add move-only types and generators, a default generator implementation could then be added to Collection
on top of these existing entry points as well with less overhead.
Summary
The one thing we must do for future move-only data structures to be compatible with today's protocols is make the Sequence.makeIterator()
consuming, since most implementations need to carry state from the sequence value into the iterator value. With that done, we can introduce new requirements to future versions of the Sequence
and Collection
protocols to better support move-only types, though the default implementations of those requirements will be inefficient when applied to existing binaries. We may want to consider mitigating that future inefficiency by adding private requirements to Sequence
and Collection
today.