- Proposal: SE-NNNN
- Author: Dennis Vennink
- Review Manager: To Be Determined
- Status: Awaiting Review
- Implementation: apple/swift#18769
- Bug: SR-3760
Introduction
This proposal addresses the limitations of the Standard Library's current zip
API by adding improvements in the form of enhancements and additions.
It's comprised of:
- the types
Zip
[2
–6
]Index
,Zip
[2
–6
]Sequence
andZipLongest
[2
–6
]Sequence
; - the global
func
tionszip(_:_:
[_:
[_:
[_:
[_:
]]]])
,zipLongest(_:_:
[_:
[_:
[_:
[_:
]]]])
; - and the
extension
methodunzip()
onSequence
andLazyCollectionProtocol
.
Discussions
It has been discussed in the following Evolution forum threads:
-
Improved
zip
API; -
Add Various
zip
-Related Types and Operations; - and Add a Padded Version of
zip
to Complement the Currentzip
.
Its development has been discussed in the following Development forum thread:
Motivation
The Standard Library's current zip
API consists solely of the global func
tion zip(_:_:)
, which creates a Sequence
of pairs from a pair of Sequence
s by zipping their respective elements until the shortest Sequence
is exhausted, truncating the latter elements of the longest Sequence
.
It's primarily used to iterate over two Sequence
s simultaneously:
func dotProduct <T: Numeric> (_ vector1: [T], _ vector2: [T]) -> T {
return zip(vector1, vector2).map(*).reduce(0, +)
}
print(dotProduct([-1.5, 2], [4, 3]))
// Prints "0.0"
And since it's lazily implemented we can also zip
over (pseudo-)infinite Sequence
s; which is a common functional pattern:
extension Sequence {
func enumerated () -> Zip2Sequence<PartialRangeFrom<Int>, Self> {
return zip(0..., self)
}
}
print(Array("abc".enumerated()))
// Prints "[(0, "a"), (1, "b"), (2, "c")]"
However, it's missing certain protocol
conformances, and features found in many other programming languages, like non-truncating and inverse zip
operations and support for higher arities.
Missing protocol
Conformances
Collection
, BidirectionalCollection
and RandomAccessCollection
Currently, Zip2Sequence
's only conformance is to Sequence
. Since SE-0234 removed SubSequence
as an associatedtype
from Sequence
, we can extend Zip2Sequence
to conditionally conform to protocol
s higher up the hierarchy without introducing a new type.
Adding these additional conformances has a number of benefits:
- It makes their properties and methods available for use.
- It makes existing type inferred code that uses operations with improved default implementations higher up the hierarchy more capable or performant.
- It guarantees that the instance can be non-destructively consumed by iteration when the underlying
Sequence
s conform toCollection
.
However, the granularity of Sequence
's protocol
hierarchy makes a correct implementation of these conformances non-trivial.
Each Collection
places additional conditional constraints on the underlying Sequence
s to comply with its semantics. These constraints hinder members lower in the protocol
hierarchy from using a Collection
's more capable or performant default implementations in a generic context. This means that these members will have to be reimplemented on an unconstrained extension
. But which members specifically can only be found out by manually tracking code paths in the Standard Library's source code as the compiler doesn't provide fixits.
LazySequenceProtocol
Because Zip2Sequence
doesn't conform to LazySequenceProtocol
when its underlying Sequence
s conform to LazySequenceProtocol
, some subsequent operations for which lazy variants exist are not used, even though Zip2Sequence
itself is implemented lazily:
func id <T> (_ x: T) -> T {
return x
}
print(type(of: zip([0, 1, 2].lazy, "abc".lazy).map(id)))
// Prints "Array<(Int, Character)>"
Here, we workaround this limitation by calling lazy
on zip(_:_:)
's result to make consequent operations lazy again:
print(type(of: zip([0, 1, 2].lazy, "abc".lazy).lazy.map(id)))
// Prints "LazyMapSequence<Zip2Sequence<LazySequence<Array<Int>>, LazySequence<String>>, (Int, Character)>"
While this doesn't have a negative impact on performance—as LazyMapSequence
will unwrap LazySequence
—it's not exactly elegant.
Non-Truncating zip
Operation
Some successive operations require all elements of the underlying Sequence
s, including the latter elements of the longest Sequence
. Such an operation is useful when the lengths of the underlying Sequence
s are indeterminable at compile time and helps to prevent potential correctness traps. It's also versatile in that it composes nicely with other operations.
Both eager and lazy concrete type-return
ing implementations of this operation can be created from existing operations and language constructs.
Here, we use state and a while
loop to create an eager imperative implementation:
func zipLongest <Sequence1: Sequence, Sequence2: Sequence> (_ sequence1: Sequence1, _ sequence2: Sequence2) -> [(Sequence1.Element?, Sequence2.Element?)] {
var (iterator1, iterator2) = (sequence1.makeIterator(), sequence2.makeIterator())
var result = [(Sequence1.Element?, Sequence2.Element?)]()
result.reserveCapacity(Swift.max(sequence1.underestimatedCount, sequence2.underestimatedCount))
while true {
let (element1, element2) = (iterator1.next(), iterator2.next())
if element1 == nil && element2 == nil {
break
} else {
result.append((element1, element2))
}
}
return result
}
print(zipLongest([0, 1, 2], [0]))
// Prints "[(Optional(0), Optional(0)), (Optional(1), nil), (Optional(2), nil)]"
And here, we use sequence(state:next:)
and lazy
to create a lazy functional implementation:
func zipLongest <Sequence1: Sequence, Sequence2: Sequence> (_ sequence1: Sequence1, _ sequence2: Sequence2) -> LazySequence<UnfoldSequence<(Sequence1.Element?, Sequence2.Element?), (Sequence1.Iterator, Sequence2.Iterator)>> {
return sequence(state: (sequence1.makeIterator(), sequence2.makeIterator())) {
let (element1, element2) = ($0.0.next(), $0.1.next())
return element1 == nil && element2 == nil ? nil : (element1, element2)
}.lazy
}
print(Array(zipLongest([0, 1, 2], [0])))
// Prints "[(Optional(0), Optional(0)), (Optional(1), nil), (Optional(2), nil)]"
However, these implementations won't scale along with the capabilities of the underlying Sequence
s. A type that does, similar to Zip2Sequence
, is tricky to implement for the aforementioned reasons.
Inverse zip
Operation
Most programming languages that include zip
also include some variant of its inverse operation, usually named unzip
, as an idiom. Swift should follow suit.
Fortunately, this operation doesn't require any new types. But implementing it is tricky as some common performance and correctness traps exist.
Here, reduce(into:_:)
can't reserve the capacity up-front:
func unzip <Base: Sequence, Element1, Element2> (_ sequence: Base) -> ([Element1], [Element2]) where Base.Element == (Element1, Element2) {
return sequence.reduce(into: ([Element1](), [Element2]())) { (result, element) in
result.0.append(element.0)
result.1.append(element.1)
}
}
And a lazy variant can't be implemented on a type that only conforms to LazySequenceProtocol
since the operation requires multiple passes.
Support for Higher Arities
zip
is currently limited to a pair of Sequence
s.
Here, we workaround it by nesting multiple calls to zip(_:_:)
and calling map(_:)
on its result to realign its elements:
import UIKit
let components = (
red: sequence(state: 0.0...1.0) { CGFloat.random(in: $0) },
green: sequence(state: 0.0...1.0) { CGFloat.random(in: $0) },
blue: sequence(state: 0.0...1.0) { CGFloat.random(in: $0) },
alpha: repeatElement(CGFloat(1.0), count: 3)
)
zip(components.red, zip(components.green, zip(components.blue, components.alpha)))
.map { UIColor(red: $0, green: $1.0, blue: $1.1.0, alpha: $1.1.1) }
.forEach { print($0) }
// Prints "UIExtendedSRGBColorSpace 0.357433 0.784562 0.903349 1"
// Prints "UIExtendedSRGBColorSpace 0.411219 0.622004 0.842976 1"
// Prints "UIExtendedSRGBColorSpace 0.615279 0.888622 0.651203 1"
However, this becomes less readable and increasingly more complicated for higher arities and is less performant.
For non-truncating zip
operations this becomes a readability issue as both the values and tuples will be wrapped in Optional
s.