Add Various Zip-Related Types and Operations
- Proposal: SE-NNNN
- Authors: Dennis Vennink
- Review Manager: To be determined
- Implementation: dennisvennink/bdf3c52ae1d9a3750ec3507ab3262921
- Previous Discussions: 1, 2, 3
- Status: Awaiting implementation
Introduction
This proposal adds various zip-related types and operations to the standard library.
Motivation
In functional programming, zip and unzip are irreplaceable operations. zip maps an n-tuple of sequences to a sequence of n-tuples. It exhausts the input sequences up to the length of either the shortest or the longest sequence. In the case of shortest sequence exhaustion, all mapped elements have a value, but the rest of the elements of the longer sequences are not included. In the case of longest sequence exhaustion, the latter elements of the shorter sequences in the n-tuples have no value, but all elements are included. unzip is the inverse of zip; it maps a sequence of n-tuples to an n-tuple of sequences.
Currently, the standard library only contains the generic func
tion zip(_:_:)
that takes two Sequence
parameters and return
s a Sequence
of 2-tuples using shortest sequence exhaustion. An operation using longest sequence exhaustion is oft-requested on the mailing lists and forums. An unzip operation complements both and completes the zip-related API for 2-tuples.
Proposed Solution
This proposal adds: the generic namespace Zip
containing the types ShortestSequence
and LongestSequence
, the generic func
tion zipLongest(_:_:)
, the method extension unzip()
on Sequence
where Element
is equal to 2-tuples and the typealias
es Zip2Sequence
and Zip2Iterator
.
Detailed Design
A new generic namespace Zip
is introduced:
public struct Zip <Sequence1: Sequence, Sequence2: Sequence> { ... }
Zip
contains two optimised Sequence
types. The first is ShortestSequence
(which is almost identical to Zip2Sequence
):
public struct ShortestSequence: Sequence {
internal let sequence1: Sequence1
internal let sequence2: Sequence2
internal init (_ sequence1: Sequence1, _ sequence2: Sequence2) {
(self.sequence1, self.sequence2) = (sequence1, sequence2)
}
public struct Iterator: IteratorProtocol {
internal var iterator1: Sequence1.Iterator
internal var iterator2: Sequence2.Iterator
internal var reachedEnd = false
internal init (_ iterator1: Sequence1.Iterator, _ iterator2: Sequence2.Iterator) {
(self.iterator1, self.iterator2) = (iterator1, iterator2)
}
public mutating func next() -> (Sequence1.Iterator.Element, Sequence2.Iterator.Element)? {
if self.reachedEnd {
return nil
}
guard let element1 = self.iterator1.next(), let element2 = self.iterator2.next() else {
self.reachedEnd.toggle()
return nil
}
return (element1, element2)
}
}
public func makeIterator () -> Iterator {
return Iterator(self.sequence1.makeIterator(), self.sequence2.makeIterator())
}
}
And LongestSequence
, which is a new type capable of longest sequence exhaustion:
public struct LongestSequence: Sequence {
internal let sequence1: Sequence1
internal let sequence2: Sequence2
internal init (_ sequence1: Sequence1, _ sequence2: Sequence2) {
(self.sequence1, self.sequence2) = (sequence1, sequence2)
}
public struct Iterator: IteratorProtocol {
internal var iterator1: Sequence1.Iterator
internal var iterator2: Sequence2.Iterator
internal var reachedEnd = false
internal init (_ iterator1: Sequence1.Iterator, _ iterator2: Sequence2.Iterator) {
(self.iterator1, self.iterator2) = (iterator1, iterator2)
}
public mutating func next() -> (Sequence1.Iterator.Element?, Sequence2.Iterator.Element?)? {
if self.reachedEnd {
return nil
}
let element1 = self.iterator1.next()
let element2 = self.iterator2.next()
if element1 == nil && element2 == nil {
self.reachedEnd.toggle()
return nil
}
return (element1, element2)
}
}
public func makeIterator () -> Iterator {
return Iterator(self.sequence1.makeIterator(), self.sequence2.makeIterator())
}
}
zip(_:_:)
will now return
a ShortestSequence
:
public func zip <Sequence1: Sequence, Sequence2: Sequence> (_ sequence1: Sequence1, _ sequence2: Sequence2) -> Zip<Sequence1, Sequence2>.ShortestSequence {
return Zip.ShortestSequence(sequence1, sequence2)
}
zipLongest(_:_:)
is a new generic func
tion that return
s a LongestSequence
:
public func zipLongest <Sequence1: Sequence, Sequence2: Sequence> (_ sequence1: Sequence1, _ sequence2: Sequence2) -> Zip<Sequence1, Sequence2>.LongestSequence {
return Zip.LongestSequence(sequence1, sequence2)
}
Finally, unzip()
is an extension on Sequence
where Element
is equal to 2-tuples:
extension Sequence {
public func unzip <Type1, Type2> () -> ([Type1], [Type2]) where Self.Element == (Type1, Type2) {
return self.reduce(into: ([Type1](), [Type2]())) { (result, element) in
result.0.append(element.0)
result.1.append(element.1)
}
}
}
Source Compatibility
To prevent code breakage, typealias
es for Zip2Sequence
and Zip2Iterator
will be redirected to Zip.ShortestSequence
and Zip.ShortestSequence.Iterator
, respectively:
@available(*, deprecated, renamed: "Zip.ShortestSequence")
public typealias Zip2Sequence<Sequence1, Sequence2> = Zip<Sequence1, Sequence2>.ShortestSequence where Sequence1: Sequence, Sequence2: Sequence
@available(*, deprecated, renamed: "Zip.ShortestSequence.Iterator")
public typealias Zip2Iterator<Sequence1, Sequence2> = Zip<Sequence1, Sequence2>.ShortestSequence.Iterator where Sequence1: Sequence, Sequence2: Sequence
Effect on ABI Stability
Not applicable.
Effect on API Resilience
Not applicable.
Alternatives Considered
The main points of focus were the naming and return
type of zipLongest(_:_:)
; would it return Optional
s or not? If not, then variants with defaults or a closure would have to be introduced. Using default values meant that dummy placeholder values had to be used, which was not ideal. The variant with a closure makes using the func
tion harder. These were all quickly dismissed as they're all derivatives of the Optional
-return
ing variant. It turned out that .lazy.map(_:)
can be used to efficiently implement all other variants.
While tempted, implementations for arities higher than 2 have been deferred until variadic generics have been introduced.