Add a blackHole function


I've recently been using the excellent swift-benchmark package (google/swift-benchmark: a Swift library to benchmark code snippets), which I hope Apple will adopt one day - it needs to be fleshed out a bit, but it's really much better than XCTest, especially when it comes to Linux support.

One thing that I've found a bit awkward is that it doesn't come with a "black hole" function to prevent the compiler from optimising values away. This kind of function appears to be used extensively by the standard library's and compiler's own test suites, but implementing your own seems to be non-trivial and relies on insider knowledge about what the compiler can and can't do at this particular point in time.
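To illustrate the problem (a minimal sketch, with assumed names - this is not the standard library's implementation): without a black hole, the optimiser is free to delete work whose result is never used, turning the measured loop into a no-op.

```swift
// A naive black hole: an empty, never-inlined generic function.
// Whether @inline(never) alone is actually sufficient is exactly
// the open question in this thread.
@inline(never)
public func blackHole<T>(_ x: T) {}

func measureNaive(_ values: [Int]) {
    for _ in 0..<1_000 {
        let sum = values.reduce(0, +)   // result unused - may be optimised away
        _ = sum
    }
}

func measureWithBlackHole(_ values: [Int]) {
    for _ in 0..<1_000 {
        blackHole(values.reduce(0, +))  // result is "consumed", so the work survives
    }
}
```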

Is this something the standard library could/should provide?

GitHub issue, and CC @dabrahams who knows more about it than I do. I just need it, and I think that benchmarking should not be considered a niche use-case. Having developers try to trick the compiler doesn't feel like a stable solution.


Although there are a number of different definitions in the standard library, several are of the form:

func _blackHole(_ x: Int) {}

...which presumably suffices, given how extensively it's used. I do wonder why the other implementations are more elaborate than that.

Also interesting to me is the _opaqueIdentity function.
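As I understand it (a simplified sketch, not the actual stdlib code), _opaqueIdentity is an identity function whose result the optimiser must treat as unknown, so values passed through it can't be constant-folded:

```swift
// Hypothetical stand-in for _opaqueIdentity: returns its argument,
// but the never-inlined call hides the value from the optimiser.
@inline(never)
func opaqueIdentity<T>(_ x: T) -> T {
    return x
}

let n = opaqueIdentity(42)  // the compiler cannot assume n == 42 at the call site
```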

In general these feel appropriate for XCTest perhaps (or, as you say, swift-benchmark). Not sure how I feel about having these in the standard library itself given their specific relevance for testing when Swift ships with a core library for testing.


Agreed; there's nothing that requires these to be in the standard library, and they're not of general use, so I don't think that they belong there. Note also that these already exist in the benchmarks in the main Swift source tree, which can either be built with Swift or function as a standalone package.


I'm not sure how the existence of XCTest is relevant - surely we also support alternative benchmarking libraries? And those libraries should not be forced to depend on XCTest for fundamental functionality, IMO, nor should they have to second-guess the compiler. This feels like it should be some kind of compiler built-in; I think it should be exposed directly.

I wouldn't mind if this was part of some kind of supplementary library rather than the standard library itself, but it should have nothing to do with XCTest.

There's also the problem that, like black holes themselves, nobody knows what the evolution process for XCTest is.


XCTest is a core library, like Foundation; it’s supposed to provide fundamental functionality. Supporting alternative benchmarking libraries and depending on XCTest are not mutually exclusive.

Can you explain a bit more why the “obvious” implementation of _blackHole that relies on no compiler magic is insufficient or otherwise feels like it requires compiler built-ins?

What is the difference between “some kind of supplementary library” for testing and benchmarking utilities that ships on all Swift platforms (i.e., another core library) and XCTest, which is already a core library, other than that the latter is named XCTest?

Sure, but that’s not a problem best addressed by stuffing new code elsewhere. Conway’s law is a descriptive observation, not an aspirational goal.

Which implementation would that be?

The version in StdlibUnittest seems to do something funky which I can't quite make out:

  1. _blackHole calls _blackHolePtr
  2. _blackHolePtr calls _getPointer
  3. _getPointer is @_silgen_name-d to getPointer. Presumably that's the getPointer function at the end of the file.
  4. getPointer calls _opaqueIdentity
  5. _opaqueIdentity calls _getPointer... i.e. go back to step 3. How does this not infinitely recurse? :man_shrugging:

The only other implementation I can think of is the one which calls into an opaque C function... assuming that neither the Swift compiler nor LLVM can optimise across the language barrier. Will that always be the case? Who knows! Rust was motivated to work on cross-language LTO because of Firefox - and as Swift becomes a bigger part of Apple's OS, one would think that they have at least as strong a motivation to investigate it.

We cannot rely on the module boundary either, since early support for cross-module optimisation already exists (and it's something you would certainly want to build with, so that your benchmarks accurately reflect the performance observed by clients of a library).

Again, all of this second-guessing and attempting to trick the compiler feels like a waste of time and effort.

This supplementary library would only provide access to what are effectively compiler built-ins. XCTest does far, far more than that.

It's not just stuffing it somewhere else. AFAIK, XCTest is supposed to be its own unit-testing and benchmarking framework: it was not designed as a library to facilitate building other frameworks (which may want to use a radically different design). I am suggesting that, for this entirely different use-case, it makes sense to create a new library rather than stuffing it in XCTest.


The _getPointer function uses


This one:

...and this one:

Interesting, and interesting that the compiler/stdlib have so many varying implementations.

I’m not able to find any official documentation for @inline(never). I know it doesn’t affect generic specialisation, but I’m not sure if it follows that the compiler is not allowed to omit the call, or that it must be called with a valid argument.


Right, that's what I'm not sure about. Perhaps what's called for here is better information about that, and if it turns out to be inadequate, consideration whether some other annotation in the same family would be necessary.

I don't see why StdlibUnittest needs to have such a fancy _blackHole these days -- I suspect it may have been implemented before we had @inline(never). (In any case, it's not particularly important, and I wouldn't be surprised if over the years someone has figured out a way to write a test that relies on its current behavior. :see_no_evil: Still, feel free to submit a PR that fixes it!)

The definitions in the standard benchmark suite are far more sensible:

// Just consume the argument.
// It's important that this function is in another module than the tests
// which are using it.
@inline(never)
public func blackHole<T>(_ x: T) {}

// Return the passed argument without letting the optimizer know that.
@inline(never)
public func identity<T>(_ x: T) -> T {
  return x
}

Feel free to emulate these in your own benchmarking code. I agree with Steve and Xiaodi above -- I think these are both way too trivial and way too specialized for inclusion in the stdlib.
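Emulating them might look like this (a sketch: the helpers are redeclared inline so the example is self-contained, but as the comment above notes, in real use they must live in a separate module from the benchmarks; the benchmark itself is hypothetical):

```swift
@inline(never) public func blackHole<T>(_ x: T) {}
@inline(never) public func identity<T>(_ x: T) -> T { x }

func runBenchmark() -> Int {
    // identity() hides the input from constant propagation...
    let data = identity(Array(1...100))
    var total = 0
    for _ in 0..<10 {
        total &+= data.reduce(0, &+)
    }
    // ...and blackHole() keeps the result from being thrown away as dead code.
    blackHole(total)
    return total
}
```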


In the presence of cross-module optimization, @inline(never) would not be sufficient for these, but neither would anything else except for a dedicated compiler builtin or a hypothetical @optimize(never) attribute. Building benchmark drivers with CMO enabled is fundamentally weird, however, and not something I'm particularly seeing a need to support.


Why is it weird? I've been building my benchmarks with CMO ever since I learned it exists. I'm using google/swift-benchmark, so it's all statically linked AFAIK. There's no way to tell the compiler to enable cross-module optimisations for certain libraries but to exclude others (right now).

Benchmarking with CMO enabled allows me to focus on the performance of the underlying algorithms, rather than worrying about fragile inlining heuristics. I'll get to tuning those things eventually, but for now it would be a distraction.

Unless you write your benchmarking loops in assembly (which is what I have always done in the past if it really matters), enabling CMO will break your benchmarks repeatedly.

... in the absence of a true "black hole" function, or for other reasons?

(not questioning it, just curious to learn more)


I would expect fairly arbitrary breakage of benchmarks under CMO. Hoisting things that you expect to be in the measurement out of the measurement, throwing away computations that can be optimized out (this one is the "absent a true black hole operation" part), constant propagation and compile-time evaluation, etc. I think it's possible to get useful benchmark results under CMO, but it requires constant vigilance, and you cannot really expect to maintain any sort of long-term baselines.

When working with small operations that I expect to be inlined, I usually prefer to benchmark a larger computation that they can be inlined into, with some form of boundary that prevents optimization across the calls into that wrapping computation.
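A sketch of that pattern (the helper names are hypothetical, modelled on the benchmark suite's utilities): rather than timing a tiny operation like min directly, time a larger loop it inlines into, with identity() as the optimisation boundary on the way in and blackHole() on the way out.

```swift
@inline(never) func identity<T>(_ x: T) -> T { x }
@inline(never) func blackHole<T>(_ x: T) {}

func benchmarkMin(_ input: [Int]) -> Int {
    let data = identity(input)      // boundary: input looks unknown to the optimiser
    var smallest = Int.max
    for value in data {             // `min` is free to inline into this loop
        smallest = min(smallest, value)
    }
    return smallest                 // pass to blackHole() at the call site
}
```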
