Benchmark package initial release

And 0.4.1/0.4.2 have been released - if you are measuring very short time periods and don’t measure malloc or OS metrics, you’ll want to update, as the overhead for measuring has been significantly reduced (this wouldn’t impact the quality of measurements, but does reduce the real-world time spent waiting for the benchmark run for such setups).

Also fixed a bug for absolute thresholds where the units were assumed to be the same as the measured metric (there are now concrete helpers for defining them instead, e.g. .mega(3) or .milliseconds(5)).
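As a sketch of how those helpers might be used when declaring absolute thresholds - only .mega(3) and .milliseconds(5) come from the release notes above; the surrounding type and parameter names are assumptions from memory and may differ between releases:

```swift
// Hypothetical declaration of absolute thresholds with the new unit helpers.
// BenchmarkMetric / threshold type spellings are assumed, not confirmed.
let thresholds: [BenchmarkMetric: BenchmarkResult.PercentileThresholds] = [
    .mallocCountTotal: .init(absolute: [.p90: .mega(3)]),     // at most ~3M mallocs at p90
    .wallClock: .init(absolute: [.p50: .milliseconds(5)])     // at most 5 ms median
]
```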

Release notes for both are at:


hi Joakim, i’m having trouble compiling the plugin on Amazon Linux 2.

there were no instructions for installing libjemalloc-dev on Amazon Linux, so i installed it with the following:

$ sudo amazon-linux-extras install -y epel
$ sudo yum install -y jemalloc-devel

which installed

  jemalloc-devel.x86_64 0:3.6.0-1.el7

Dependency Installed:
  jemalloc.x86_64 0:3.6.0-1.el7

when i try to compile the plugin i get:

.build/checkouts/package-benchmark/Sources/BenchmarkSupport/MallocStats/MallocStatsProducer+jemalloc.swift:109:59: error: cannot find 'MALLCTL_ARENAS_ALL' in scope
            let result = mallctlnametomib("stats.arenas.\(MALLCTL_ARENAS_ALL).small.nrequests",
.build/checkouts/package-benchmark/Sources/BenchmarkSupport/MallocStats/MallocStatsProducer+jemalloc.swift:119:59: error: cannot find 'MALLCTL_ARENAS_ALL' in scope
            let result = mallctlnametomib("stats.arenas.\(MALLCTL_ARENAS_ALL).large.nrequests",
[12/15] Emitting module BenchmarkSupport

Hmm, not at computer right now, but checked our CI and we seem to have a 5.x.x version of jemalloc, so would guess that symbol is missing in 3.x. Any way to get a newer version installed?

Haven’t tried with anything except Ubuntu unfortunately yet (and macOS of course).

Might also be an issue with not finding the header file for jemalloc, but then swiftpm should have given you a warning IIRC.

Quick google gave one hint that source install might be needed (with steps):


building from source worked for me, here is what i put in my dockerfile if it helps anyone:

RUN sudo yum -y install bzip2 make
RUN curl -L -o jemalloc-5.3.0.tar.bz2 https://github.com/jemalloc/jemalloc/releases/download/5.3.0/jemalloc-5.3.0.tar.bz2
RUN tar -xf jemalloc-5.3.0.tar.bz2
RUN cd jemalloc-5.3.0 && ./configure && make && sudo make install

after all the usual swift toolchain dependencies.

make install installs the libraries in /usr/local/lib, which the plugin can’t find, so you also have to do:

$ sudo ldconfig /usr/local/lib

okay i think i am using the plugin wrong, because i cannot seem to get it to run more than 11 iterations:

        desiredIterations: 1000) { benchmark in
    for _ in benchmark.throughputIterations {
        blackHole(encode(dates: dates))
    }
}

always seems to generate 11 samples no matter what i set desiredIterations to:

│ Metric                                   │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
│ Malloc (total) (K)                       │    1164 │    1164 │    1164 │    1164 │    1164 │    1164 │    1164 │      11 │
│ Memory (resident peak) (M)               │      71 │      76 │      88 │      91 │      93 │     103 │     103 │      11 │
│ Throughput (scaled / s)                  │      10 │      10 │      10 │      10 │       9 │       9 │       9 │      11 │
│ Time (total CPU) (ms)                    │      90 │     100 │     100 │     110 │     110 │     110 │     110 │      11 │
│ Time (wall clock) (ms)                   │      96 │      97 │      99 │     102 │     106 │     110 │     110 │      11 │

You might want to tweak desiredDuration too - it will run until the first of those two is reached, and the default is just one second IIRC.

Easiest is to set e.g. Benchmark.defaultDuration = .seconds(10)
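Putting the two knobs together, a configuration sketch - the property and parameter names are taken from this thread, the benchmark name is illustrative, and the exact API spelling may differ between releases:

```swift
// Run each benchmark until whichever limit is hit first:
// up to 10 s of wall clock, or 1000 iterations.
Benchmark.defaultDuration = .seconds(10)

Benchmark("Encode dates",                    // illustrative name
          desiredIterations: 1000) { benchmark in
    for _ in benchmark.throughputIterations {
        blackHole(encode(dates: dates))
    }
}
```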


Only just found this, this is great!

Had a few questions come up while taking a look through the plugin:

  • google/swift-benchmark had an option where it would try to get a statistically meaningful performance value if no number of iterations was provided; is that something that could be interesting?

  • Would adding a standard deviation be interesting? (I just mention it since google/swift-benchmark originally displayed that as well)

  • Could it be possible to configure benchmarks to live in folders other than Benchmarks as well? I.e. a per-target folder, for example. I have a large project where a lot of modules are grouped together with custom paths for most targets; it would be very convenient to also put benchmarks into these groupings.

I really like this! Thanks!


Hi, glad you find it useful!

I’ve never seen much use for ‘auto iterations’ in practice, as we’d usually tweak it with the combination of runtime / number of iterations - which would give more comparable test runs too - but maybe I’m missing something and would be happy to be convinced otherwise.

With regard to SD, I think it was an error to have it in Google benchmark in the first place, actually - see e.g. “React San Francisco 2014: Gil Tene - Understanding Latency” on YouTube (or many other nice talks from Gil Tene) around the 30-minute mark - performance measurements aren’t normally distributed in practice, so it’s not a good model to try to fit them to IMHO.

It’d make sense to support more flexibility for benchmark placement for more complex project layouts, maybe an optional prefix for the executable targets could be one way to do it (there’s no way to mark up targets with metadata as far as I know that we could use) - PR:s are welcome!


yes, i’ve found performance tends to be multi-modal (as they do mention in the video), and this defies easy summarization with a statistic like standard deviation. in my opinion you really have to view the histogram to read these sorts of measurements. like:
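As a plain-Swift illustration (no benchmark dependency, made-up data) of why percentiles describe a bimodal distribution better than a mean/SD summary, a small sketch using a nearest-rank percentile:

```swift
// 90 fast samples at 1 ms and 10 slow ones at 100 ms: clearly bimodal.
let samples = (Array(repeating: 1.0, count: 90)
             + Array(repeating: 100.0, count: 10)).sorted()

// Nearest-rank percentile on a pre-sorted array.
func percentile(_ p: Double, of sorted: [Double]) -> Double {
    let rank = Int((p / 100.0 * Double(sorted.count)).rounded(.up))
    return sorted[max(0, min(sorted.count - 1, rank - 1))]
}

let mean = samples.reduce(0, +) / Double(samples.count)
// mean is 10.9 ms, a value no sample ever had; p50 = 1 ms and
// p99 = 100 ms describe what actually happened.
print(mean, percentile(50, of: samples), percentile(99, of: samples))
```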


That's exactly the response I was hoping for, haha. Thanks. I don't have that much experience with more than basic performance testing, so that's good to know - thanks for sharing.
I'll try to give a better example of a setup when I know how I'd ideally incorporate it with our project structure. Then we might be able to get to a PR at some point :)!
Thanks for the response.

@Joakim_Hassila1 Have you used this to track performance over time by any chance?

We haven’t, although the intention was to make it possible by pulling out the data from the JSON format to some external system like Grafana if you want to plot performance over time.

Our primary use cases are a) to validate PR performance vs main to avoid merging in regressions, and b) to provide a convenient workflow for engineers for improving key metrics such as mallocs/memory footprint/context switches etc. vs baselines when actively working on performance cases.

You can see a trivial sample of a) here:


Is there a convenient way to run a single benchmark? Would be cool if we could somehow get the same way of running benchmarks as tests (buttons to run individual ones in Xcode). Not sure if that's possible however

--filter regexp filtering is on the laundry list, as it was waiting for the new regex support - definitely something that would be nice to add.

Having Xcode integration would be fantastic - but there’s no apis as far as I know for that.

Best right now is probably to split out a separate benchmark suite where you can have the single one you want to test.

Wouldn't moving to testTargets and using XCTest allow us to do that?

It doesn’t work, as XCTest crashes with jemalloc, which is used for the malloc counters. That is only a problem with the proprietary XCTest on macOS - the open-source one on Linux works fine. I have a feedback open with Apple that was closed, as they thought it was a problem with jemalloc - but jemalloc engineers debugged it, and it seems XCTest on macOS passes jemalloc a pointer that was not allocated with jemalloc - so that would need to be fixed first.

It also doesn’t build optimized for xctest targets as far as I understand? But maybe that is possible somehow.

There are probably some other issues with regards to integration with e.g. Swift Argument Parser etc. (we need to be able to run from the command line on Linux too), but that might be possible to split out, perhaps.

Hmm good point, it would not be a perfect solution IF it could work. Thanks!