I'm -1 on this proposal, because I don't feel that the use cases warrant the expansion of the Collection
protocol hierarchy.
The Collection
protocol hierarchy is a nontrivial protocol hierarchy that confronts users fairly early in their use of the Swift standard library. Each protocol in the hierarchy brings with it a wealth of algorithms and some easily-describable set of conforming types. Because it's so visible and already fairly complex, there should be a very high bar to extending it with new protocols. The use cases in this proposal aren't compelling enough to meet that high bar.
The primary example in the proposal shows how cumbersome it is to use withContiguousStorageIfAvailable
as motivation for {Mutable}ContiguousCollection
:
func dspAdd<A: Collection, B: Collection>(
_ a: A, _ b: B, _ result: inout [Float]
) where A.Element == Float, B.Element == Float {
let n = a.count
// try accessing contiguous underlying buffers:
let wasContiguous: ()?? =
a.withContiguousStorageIfAvailable { abuf in
b.withContiguousStorageIfAvailable { bbuf in
vDSP_vadd(abuf.baseAddress!, 1, bbuf.baseAddress!, 1, &result, 1, UInt(n))
}
}
// if they weren't contiguous, create two arrays try again
if wasContiguous == nil || wasContiguous! == nil {
dspAdd(Array(a), Array(b), &result)
}
}
However, we can do a lot better for the cases where we want to realize storage by adding a few simple extensions, e.g.,
extension Collection {
func withForcedContiguousStorage<R>(
_ body: (UnsafeBufferPointer<Element>) -> R
) throws -> R {
if let result = withContiguousStorageIfAvailable(body) {
return result
}
return Array(self).withContiguousStorageIfAvailable(body)!
}
}
extension MutableCollection {
mutating func withForcedContiguousMutableStorage<R>(
_ body: (inout UnsafeMutableBufferPointer<Element>) -> R
) throws -> R {
if let result = withContiguousMutableStorageIfAvailable(body) {
return result
}
var array = Array(self)
let result = array.withContiguousMutableStorageIfAvailable(body)!
for (myIdx, arrayIdx) in zip(self.indices, array.indices) {
self[myIdx] = array[arrayIdx]
}
return result
}
}
The dspAdd
example is basically the same as with the proposed solution, but without the type system there to enforce contiguousness:
func dspAdd<A: Collection, B: Collection, R: MutableCollection>(
_ a: A, _ b: B, _ result: inout R
) where A.Element == Float, B.Element == Float, R.Element == Float {
let n = a.count
a.withForcedContiguousStorage { abuf in
b.withForcedContiguousStorage { bbuf in
result.withForcedContiguousMutableStorage { rbuf in
vDSP_vadd(abuf.baseAddress!, 1, bbuf.baseAddress!, 1, rbuf.baseAddress!, 1, UInt(n))
}
}
}
}
Unlike in the proposal, this version will "work" when given a noncontiguous collection, in the sense that it will provide the correct semantics. However, performance will be poor due to the extra copies incurred. This is not that much different from when copy-on-write causes performance surprises because the collection being referenced is not unique. Neither situation is ideal, but on balance we benefit from having the simpler semantics, and performance-minded programmers can ensure that the passed-in arrays are contiguous if the problem manifests in production. A performance-critical API like vDSP_add
could choose to log (or even trap!) at runtime to guide programmers along the most-efficient path, without pushing the guarantee into the type system.
It is reasonable to point out that introducing the protocols makes the "contiguous" contract stronger (it does!), but that point is made weaker because array isn't always contiguous, and there are a number of other data structures (like String
and Data
) that are often-but-not-always contiguous. Designing an API to require ContiguousCollection
excludes such data structures, which is unfortunate given that developers are getting explicit control to make a String
instance contiguous.
Doug