Want speed? Go for a for loop with formUnion(_:). Want speed and FP? Go for reduce(into:) with formUnion(_:). The struck-through statement only applies in Debug mode; in Release mode, with high optimisation settings, the differences become less pronounced.
A small doubt with x2: reduce (still learning it) has an accumulating parameter (inout), so with union would a new set be created for every iteration, as opposed to x3, which would mutate the set every time?
reduce(into:)'s initialResult ($0 in the closure) refers to the "same" Set<Int> each time. (Technically it's a bit more complicated.) Using union(_:) here will probably be slower as you're creating a new Set<Int> for each iteration. You'll also have to rewrite the closure as { $0 = $0.union($1 ?? []) }.
In reduce(_:)'s closure you return a new Set<Int> every time. formUnion(_:) will not work here as $0 is immutable.
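To make the difference concrete, here is a minimal sketch of the approaches discussed above (the input array and the x2/x3/x5 names follow the thread's labels but are illustrative, not the original gist):

```swift
let a1: [Set<Int>?] = [[1, 2], [2, 3], nil]

// x2: reduce(_:_:) + union(_:) — $0 is immutable, so every
// iteration allocates and returns a brand-new Set<Int>.
let x2 = a1.reduce(Set<Int>()) { $0.union($1 ?? []) }

// x3: reduce(into:_:) + formUnion(_:) — $0 is inout, so the
// single accumulator is mutated in place on each iteration.
let x3 = a1.reduce(into: Set<Int>()) { $0.formUnion($1 ?? []) }

// A plain for loop with formUnion(_:), for comparison.
var x5 = Set<Int>()
for s in a1 { x5.formUnion(s ?? []) }

print(x2 == x3 && x3 == x5) // prints "true"; all three produce {1, 2, 3}
```

The only difference between x2 and x3 is where the accumulator lives: union(_:) returns a fresh set, formUnion(_:) edits the existing one.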
import XCTest
@testable import SetUnion

final class SetUnionTests: XCTestCase {

    let a1: [Set<Int>?] = [[1, 2], [2, 3], nil]
    let n = 100_000

    func testX1() {
        self.measure {
            for _ in 1...self.n {
                _ = Set(self.a1.flatMap { $0 ?? [] })
            }
        }
    }

    ...
Can we be sure that the optimizer (of a future Swift version) won't remove this for-in loop by dead code elimination?
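No guarantee exists, but a common mitigation (a sketch, not part of the original gist; the blackHole name is made up) is to pass each result to an opaque sink that the optimizer cannot prove unused:

```swift
// A "black hole" sink: @inline(never) keeps the call from being
// inlined, so the caller's optimizer must assume the value is used.
// (This reduces the risk of dead-code elimination; it is not a
// hard guarantee across compiler versions.)
@inline(never)
func blackHole<T>(_ value: T) {}

let a1: [Set<Int>?] = [[1, 2], [2, 3], nil]
for _ in 1...1_000 {
    blackHole(Set(a1.flatMap { $0 ?? [] }))
}
```

Routing every measured result through such a sink makes the benchmark less dependent on which eliminations a future optimizer performs.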
I ran your SetUnionTests.swift (the gist of the above link) on my MacBook Pro (Retina, 15-inch, Late 2013), 2 GHz Intel Core i7, in Xcode 9.4, default toolchain, with the Test scheme's Build Configuration set to Release.
Here are the test results I got compared to yours (along with times relative to x1):
// +------------------------------------+------------------------------------+
// | My Results | Your Results |
// +------------------------------------+------------------------------------+
// | x1: 0.097 s 0.097 / 0.097 = 1.00 | x1: 0.694 s 0.694 / 0.694 = 1.00 |
// | x2: 0.108 s 0.097 / 0.108 = 0.89 | x2: 0.360 s 0.694 / 0.360 = 1.93 |
// | x3: 0.075 s 0.097 / 0.075 = 1.29 | x3: 0.327 s 0.694 / 0.327 = 2.12 |
// | x4: 0.068 s 0.097 / 0.068 = 1.43 | x4: 0.296 s 0.694 / 0.296 = 2.34 |
// | x5: 0.073 s 0.097 / 0.073 = 1.33 | x5: 0.344 s 0.694 / 0.344 = 2.02 |
// +------------------------------------+------------------------------------+
// (x6: 0.247 s <-- I added Adrian Zubarev's snippet as testX6; apparently
// lazy is slow for this particular piece of code & context.)
//
// +------------------------------+
// | So, my test results are this |
// | many times faster than yours |
// +------------------------------+
// | x1: 0.694 / 0.097 = 7.2 |
// | x2: 0.360 / 0.108 = 3.3 |
// | x3: 0.327 / 0.075 = 4.4 |
// | x4: 0.296 / 0.068 = 4.4 |
// | x5: 0.344 / 0.073 = 4.7 |
// +------------------------------+
As I find it hard to believe that your machine and/or Xcode version compiles and/or runs these tests between 3.3 and 7.2 times slower than my MBP late 2013, 2 GHz i7 with Xcode 9.4, my guess is that, unless I'm missing something here, you ran the tests in Debug. : )
You're right! I ran the tests in Debug mode on a 17-inch Late 2011 MacBook Pro with a 2.4 GHz Intel Core i7, being held together by duct tape (don't ask), in Xcode Version 10.0 beta (10L176w).
Here are my results using Release (-Os):
x1: 0.131 s x2: 0.117 s x3: 0.120 s x4: 0.120 s x5: 0.121 s x6: 0.193 s
Thanks @DevAndArtist, looks like there are so many possibilities.
Just curious: since there is a dot operator and the functions are chained together, would that mean there would be one scan to remove the nils (compactMap) and one more scan to flatten it (flatMap)?
Note: My knowledge is very limited, so I could be completely wrong.
No. The .lazy means the method chain produces instances conforming to LazySequenceProtocol that wrap each other and fuse into a single pass over a1. In this case that single pass happens inside the Set initializer.
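A small sketch of what that fusion looks like from the call site (the exact chain is an assumption based on the description above, not the snippet from the thread):

```swift
let a1: [Set<Int>?] = [[1, 2], [2, 3], nil]

// .lazy wraps the array; compactMap and joined each wrap the previous
// stage instead of materializing intermediate arrays. The elements only
// flow through the whole chain once, inside Set.init.
let lazyChain = a1.lazy.compactMap { $0 }.joined()
let fused = Set(lazyChain)
print(fused.sorted()) // prints "[1, 2, 3]"
```

So there is no separate "compact pass" and "flatten pass"; each element is pulled through both stages in one traversal when the Set is built.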
Well, it is one of the three fastest ones. x3, x4 and x5 are about the same (around 0.07 s in a release build on my machine), and they are faster than x1, x2 and x6. But as demonstrated here, optimizability is highly context dependent, in ways that make it very hard, if not impossible, to build an intuition for performance in current Swift.