How to correctly use swift-collections-benchmark to add a benchmark (or - what mistakes did I just make?)

This is probably related to Guide on writing benchmarking tasks · Issue #7 · apple/swift-collections-benchmark · GitHub.

I was converting some code the other day and wanted to know its performance impact, so I reached for this codebase to try it out.

The code was replacing one locking system with another, so I wanted to look at the difference. I tried to use addSimple, matching the existing examples, but I'm fairly certain that I'm holding it wrong.

The sample code that I was poking at is available at NetNewsWire/main.swift at lock_benchmark · heckj/NetNewsWire · GitHub, and the resulting comparison chart:

I fumbled around enough to figure out that I needed to register a generator for my own input type (since I was working with a String input), but I hadn't traced through the code far enough to really know what was being captured and measured (or not) in the two comparisons.

The surprise to me wasn't that there was a tiny gap, but that the time decreased with size - I expected these to be constant-time scenarios, so I was expecting two more-or-less horizontal lines.

I'm clearly missing something - any advice?

It looks like you're ignoring the input, so the benchmark tool is dividing the time for a single lock/unlock pair by the number on the x axis. That is, your input generator produces something like "8000" as a String, which is then passed to the closure and ignored. You probably want your input generator to just produce an Int with the size, then lock/unlock that many times.
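A minimal sketch of that fix, assuming the task body receives an Int size from the input generator (swift-collections-benchmark registers Int inputs out of the box; the task title and the lockUnlock helper here are made up for illustration):

```swift
import Foundation

// Task body that actually scales with the x axis: lock/unlock `times`
// times, so the default per-element plot divides total time by the
// number of lock/unlock pairs performed.
func lockUnlock(times: Int) -> Int {
    let lock = NSLock()
    var counter = 0
    for _ in 0..<times {
        lock.lock()
        counter += 1  // trivial critical section so the loop isn't empty
        lock.unlock()
    }
    return counter
}

// In a benchmark target this would be registered roughly as:
//
//   benchmark.addSimple(title: "NSLock lock/unlock", input: Int.self) { size in
//     blackHole(lockUnlock(times: size))
//   }
```

With the input an Int rather than an ignored String, the measured work grows linearly with size, and a constant-cost lock should show up as a flat per-element line.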


@jawbroken is right -- swift-collections-benchmark is designed to visualize the behavior of algorithms that have a degree of freedom (e.g., the size of the array in Array.sort()). For straightforward performance comparisons, something simple like XCTest's performance measurements will work just as well -- swift-collections-benchmark will just waste time on needlessly rerunning the exact same code for various points on the size axis.

The curve is sloping down because, by default, the tool plots the average per-element processing cost, i.e. it divides the elapsed time by the size being measured. This makes it very easy to distinguish between O(n), O(n * log(n)) and O(n^2) algorithms -- they produce flat horizontal lines, humpy curves converging to horizontal, and lines sloping 45° up, respectively.

(In your case, the line slopes 45° downward because it's an O(1) algorithm, which turns into O(1/n) when we look at the average "per-element" processing time. The slope of the line on a log-log chart corresponds to the exponent, which is -1 in this case.)
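A tiny numeric check of that explanation: for a constant-time operation the per-element cost is t/n, so the slope between any two sizes on a log-log chart comes out to -1 (the constant 5.0 below is an arbitrary stand-in for the fixed cost of one lock/unlock pair):

```swift
import Foundation

// Slope between two points on a log-log plot of per-element cost.
func logLogSlope(sizes: (Double, Double), perElement: (Double) -> Double) -> Double {
    let (n1, n2) = sizes
    return (log(perElement(n2)) - log(perElement(n1))) / (log(n2) - log(n1))
}

let constantCost = 5.0
let o1PerElement = { (n: Double) in constantCost / n }  // O(1) op, averaged per element
let onPerElement = { (_: Double) in constantCost }      // O(n) op, averaged per element

let downSlope = logLogSlope(sizes: (1_000, 1_000_000), perElement: o1PerElement)
let flatSlope = logLogSlope(sizes: (1_000, 1_000_000), perElement: onPerElement)
// downSlope ≈ -1.0 (the 45° downward line); flatSlope ≈ 0.0 (horizontal)
```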

Very important note:

Beware of os_unfair_lock in Swift code! & is not an address-of operator in Swift -- it is not guaranteed to provide a stable pointer, so this is highly fragile code:
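(The snippet referenced here isn't reproduced in this excerpt; the following is a hedged reconstruction of the fragile pattern being warned about, not the poster's exact code. It is Darwin-only and deliberately shows what *not* to do.)

```swift
import os

var unfairLock = os_unfair_lock()

// `&unfairLock` is an inout-to-pointer conversion, not a C address-of:
// Swift is free to pass a pointer to a temporary copy, so the lock and
// unlock calls are not guaranteed to operate on the same memory.
os_unfair_lock_lock(&unfairLock)
os_unfair_lock_unlock(&unfairLock)
```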


I recommend sticking to NSLock for now in Swift.


Heh, yeah - I learned about the os_unfair_lock complexities, found it was being used incorrectly in NetNewsWire, and got that updated. That's what spawned the "what's the actual difference?" question, so I used it as an excuse to try to learn more about swift-collections-benchmark's tasks: add and the stuff beyond addSimple for making new benchmarks.

Now that I've read a bit more of the code and experimented, it's clear it isn't a good fit for the "timing a single interaction" kind of thing - it's entirely built for cases where that second dimension of scale is what you're interested in keeping an eye on. I figured out the per-element plotting, and I don't think there's a way to plot anything that isn't per-element, but given what the tool is for, I'm not sure that's really an issue - I was just misusing it. (If there is a way and I just missed it, please let me know.)

Thanks both!
