From pushback by @Erik_Eckstein on some suggested clean-ups to the Swift Benchmark Suite, I got the impression that what I see in the GitHub repo isn’t everything that relies on how Benchmark_O works, and that there are additional Apple-internal concerns to be considered. For that reason I have tailored my refactoring to always preserve existing behavior 100%, and all the new capabilities for robust measurement are strictly additive, hidden behind flags and optional arguments. Public scripts (Benchmark_Driver, run_smoke_bench) that build on it have leveraged these features, but I got the message that more radical changes are problematic because they would create an additional Apple-internal maintenance burden which isn’t budgeted for. Is there such a thing? Maybe I just got the wrong impression…
Mainly the lack of any publicly visible scripts that use this parameter for anything. There are no doubt some personal workflows that rely on tags?
I've been focusing on analyzing the SBS as a whole and building tooling on top of Benchmark_O and Benchmark_Driver, so I have had no use for tags yet. My workflow during the cleanup is to first run Benchmark_Driver -f PATTERN with the tests that were flagged by Benchmark_Driver check and then, when I need to drill down into the details of an individual benchmark, invoke Benchmark_O using numbers to select tests (revealed in previous results). While I'm tuning the legacyFactor, I'm usually looking at output from runs with a rather long argument list like:
--num-iters=1 --num-samples=6 --quantile=4 --verbose
All benchmarks are tagged, because it’s a mandatory parameter; it just left me with the impression of a pro forma exercise rather than a vital organizing tool.
Here are stats from all 747 benchmarks in the public SBS:
bin$ ./Benchmark_O --list --skip-tags= | cut -d ',' -f 3,4,5,6,7,8 \
| tr -d '[ ]' | tr ',' '[\n*]' | sort | uniq -c | sort -bnr
568 validation
559 api
215 String
144 skip
114 Array
64 bridging
56 algorithm
54 Data
51 Set
42 Dictionary
19 runtime
11 cpubench
10 abstraction
6 unstable
5 regression
5 refcount
3 miniapplication
3 exceptions
1 metadata
Almost everything is tagged api and validation, to the point that it seems like noise to me. The skip tag is all in the Existential and StringWalk families (and I've suggested we keep that as a Bool argument). The benchmarks tagged Array, Data, Dictionary and Set almost always include the type in their names as well, so the performance tests for these currency types would be organizationally well covered by the naming convention alone. The algorithm tag is quite interesting, but it could still be replaced by grouping all of these into one Algorithm. benchmark family in the naming hierarchy. The runtime, cpubench and remaining tags are hardly used (under 3% of benchmarks each).
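To put numbers on that, here is a small Python sketch that turns the counts from the listing above into shares of the 747 benchmarks. The counts are copied from that output; the function and variable names are only illustrative.

```python
# Tag counts copied from the listing above; 747 is the benchmark total
# stated in this post.
TOTAL_BENCHMARKS = 747
TAG_COUNTS = {
    "validation": 568, "api": 559, "String": 215, "skip": 144, "Array": 114,
    "bridging": 64, "algorithm": 56, "Data": 54, "Set": 51, "Dictionary": 42,
    "runtime": 19, "cpubench": 11, "abstraction": 10, "unstable": 6,
    "regression": 5, "refcount": 5, "miniapplication": 3, "exceptions": 3,
    "metadata": 1,
}


def tag_shares(counts, total):
    """Return {tag: percentage of all benchmarks carrying that tag}."""
    return {tag: 100.0 * n / total for tag, n in counts.items()}


SHARES = tag_shares(TAG_COUNTS, TOTAL_BENCHMARKS)

if __name__ == "__main__":
    # Print tags from most to least used.
    for tag, share in sorted(SHARES.items(), key=lambda kv: -kv[1]):
        print("{:>16} {:5.1f}%".format(tag, share))
```

By this calculation validation and api each cover roughly three quarters of the suite, String a bit under 29%, runtime about 2.5%, and every tag from cpubench down stays below 2%.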
Only the String tag seems like a solid argument for tagging, because of extensive coverage. @Michael_Ilseman do you rely on running Benchmark_O --tags=String? Could you share how you use the benchmarks?
That may be the original vision for introducing BenchmarkCategory, but is that also the practice after 1½ years? Probably I'm just unfamiliar with your workflows... Could you share some of these from your practice, to help me better understand the requirements and how the tags play into it all?
Isn't it possible to have an equally well organized Swift Benchmark Suite without tags by a thorough application of the naming convention? It seems like you could hand someone a command line using the Benchmark_Driver with multiple -filters to get the same effect as using tags. I realize we currently have a functionality gap between Benchmark_Driver and Benchmark_O, since Swift doesn't ship with regexes yet. Maybe we could add simple substring match +/- filtering?
Benchmark_O +String +Breadcrumbs +Char -Indexing
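A rough sketch of what I mean by +/- substring filtering, written in Python only to show the matching rule (Benchmark_O itself would need this in Swift). The semantics are my assumption: a benchmark is selected if its name contains at least one + substring and none of the - substrings, case-sensitively. The function names and the benchmark names in the example are hypothetical.

```python
def parse_filter_args(args):
    """Split '+foo' / '-bar' arguments into (include, exclude) substring lists."""
    include, exclude = [], []
    for arg in args:
        if arg.startswith("+") and len(arg) > 1:
            include.append(arg[1:])
        elif arg.startswith("-") and len(arg) > 1:
            exclude.append(arg[1:])
        else:
            raise ValueError("expected +SUBSTRING or -SUBSTRING, got %r" % arg)
    return include, exclude


def select_benchmarks(names, include, exclude):
    """Keep names containing any include substring and no exclude substring.

    Assumption: an empty include list selects every name.
    """
    selected = []
    for name in names:
        wanted = not include or any(s in name for s in include)
        rejected = any(s in name for s in exclude)
        if wanted and not rejected:
            selected.append(name)
    return selected


# Hypothetical benchmark names, for illustration only.
NAMES = [
    "String.Breadcrumbs.Indexing",
    "String.Walk.Chars",
    "Char.IsUppercase",
    "Array.Append",
]
include, exclude = parse_filter_args(
    ["+String", "+Breadcrumbs", "+Char", "-Indexing"])
SELECTED = select_benchmarks(NAMES, include, exclude)
print(SELECTED)
```

With these made-up names the filter keeps String.Walk.Chars and Char.IsUppercase and drops the Indexing and Array entries. A real implementation would also have to tell -Substring exclusions apart from ordinary option flags, which this sketch ignores.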
To be clear, I'm not hell-bent on eliminating the tags, only taking the BenchmarkInfo refactoring as an opportunity to simplify the implementation if possible… you know, convention over configuration and all that.
Apropos, @Michael_Gottesman, any thoughts on keeping or dropping unsupportedPlatforms: BenchmarkPlatformSet? It is only used in DictionaryKeysContains.