[Proposal] SMT-0001: task-local metrics factory proposal

Hi all,

The proposal [SMT-0001] task-local metrics factory for swift-metrics is now up and In Review.

The review period will run until March 11th - please feel free to post your feedback either as a reply to this thread, or on the pull request.

Thanks!

1 Like

I'm looking forward to this landing, as the old way didn't work when running in a parallel environment.

I also find the chosen name a bit strange, but @FranzBusch already commented on that in the PR. :+1:

1 Like

I think this is a good step forward, and I am +1 on this proposal. While I am generally wary of using task locals for dependency injection, I think the MetricsFactory (and Logger and Tracer) are an exception to this. We expect developers to use those types to emit observability events, and they must be able to emit those events in contexts where they can't use other forms of dependency injection, such as initializer or parameter-based dependency injection. This also significantly increases the testability of any code that emits metrics.

1 Like

I'm not opposed in general, and testing in scopes is definitely useful and something we want to support. However, I do think this needs far more investigation than is presented here.

Just as the necessary performance analysis was done for the task-local logger pitch, the same analysis must be performed here IMHO.

This probably also needs to be discussed in general... while metrics instance creation isn't common, the same pattern is likely to get "pushed into" all the other libs, so we need to discuss this holistically, don't we?

For example, one may want to discuss whether this "check a task local" step needs to be possible to disable, or whether it should be handled by "enabling" it or bootstrapping in some way...

So I'd like to see a bit more discussion and analysis here since this is a sweeping change and not just a minor one.

performance analysis

The additional logic checking the task local does add overhead that is noticeable in "pure" tests of metrics creation, or, in the broader picture, in logging with a task-local logger. However, once you look at practical usage with a backend that actually does something, the task-local overhead becomes negligible. @ktoso, in this particular situation of a task-local metrics factory, do you have a special use case to test in addition to just "what is the performance difference with and without the task-local factory check"?

"check a task local" needs to be possible to disable or not

This is an interesting idea. Is performance the reason for such a config, or do you have something else in mind? My thinking is that if performance is actually critical, task locals should not be used, and the necessary objects (metrics factory, dimensions, logger, etc.) should be passed explicitly. A task local is a convenience feature for punching through API boundaries.
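To make the "punching through API boundaries" point concrete, here's a minimal stdlib-only sketch of how a task local behaves (illustrative names only; this is not the proposed swift-metrics API):

```swift
// A task local set at the top of a scope is visible arbitrarily deep in the
// call stack without being threaded through as a parameter, and the previous
// value is restored when the scope exits.
enum ObservabilityContext {
    @TaskLocal static var factoryName: String = "global-default"
}

func deepInTheCallStack() -> String {
    // No parameter threading needed; the value flows implicitly.
    ObservabilityContext.factoryName
}

let inside = ObservabilityContext.$factoryName.withValue("test-factory") {
    deepInTheCallStack()
}
let outside = deepInTheCallStack()
print(inside, outside) // prints "test-factory global-default"
```

Child tasks inherit the value, which is what makes scoped, parallel-test-safe injection possible in the first place.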

I would like the analysis to be made, since I've had people say "task locals are slow", and when I went to investigate, the issue wasn't task locals at all: it was retain/release traffic on the LogHandler in swift-log specifically that was problematic in this pattern. It'd be good to understand and double-check whether there are any unintended consequences like that here.

Well, kind of... but if there's no way to opt out, you'll always pay the cost. Yes, it's a performance-driven question.

For example, one could imagine that in logging you have to bootstrap with "task-local checking" enabled, so I was wondering if we need the same thing here. Or whether it should be a flag one can flip on (but perhaps not disable once it's on...?)

It'd be good to at least explore the options here before we just make something always on with no way to avoid it.

Metrics are, however, in a better position, because you usually wouldn't create a Counter "per request": technically you should be using some low-cardinality label and not something that must be created per request. But what if someone does, e.g. a Counter for each Locale we're supporting? So I can see a world where we might want to know what overhead we're adding here for the purpose of testing, overhead that would be there even in release mode and production runs where we don't use the task-local factory.

Again, I'm not opposed to this, but I'd request we make sure we understand the impact as we propose such changes.

1 Like

FWIW I've implemented observability tests in a number of projects that use parallel testing without much issue, by using a wrapper type that just forwards calls through to a TLV. You set the TLV in a test or a trait and it all works fine.
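A self-contained sketch of that wrapper approach (the protocol and type names below are stand-ins, not the real swift-metrics types): the wrapper forwards every factory call to a task-local value, so parallel tests can each install their own recording backend without touching global state.

```swift
// Stand-in for the MetricsFactory protocol, reduced to one method.
protocol FactoryStandIn {
    func makeCounter(label: String) -> String
}

struct RecordingFactory: FactoryStandIn {
    let name: String
    func makeCounter(label: String) -> String { "\(name)/\(label)" }
}

// The wrapper: forwards to the TLV when set, else to a global default.
struct TaskLocalForwardingFactory: FactoryStandIn {
    @TaskLocal static var current: (any FactoryStandIn)?

    func makeCounter(label: String) -> String {
        (Self.current ?? RecordingFactory(name: "global")).makeCounter(label: label)
    }
}

let forwarding = TaskLocalForwardingFactory()
// Each test sets the TLV to its own factory; no shared mutable state.
let inTest = TaskLocalForwardingFactory.$current.withValue(RecordingFactory(name: "testA")) {
    forwarding.makeCounter(label: "users.created")
}
print(inTest) // prints "testA/users.created"
```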

1 Like

OK, so I've constructed a couple of benchmarks to explore the implications of the task-local object check. Running naively on my machine, a few cases:

task-local-init

  • Initialize task-local metrics factory
  • Create 3 metric objects
metrics_benchmark_task-local-init
โ•’โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ••
โ”‚ Metric               โ”‚        p0 โ”‚       p25 โ”‚       p50 โ”‚       p75 โ”‚       p90 โ”‚       p99 โ”‚      p100 โ”‚   Samples โ”‚
โ•žโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ก
โ”‚ Instructions (K) *   โ”‚        11 โ”‚        12 โ”‚        12 โ”‚        12 โ”‚        12 โ”‚        14 โ”‚        36 โ”‚    183302 โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ Object allocs *      โ”‚         3 โ”‚         3 โ”‚         3 โ”‚         3 โ”‚         3 โ”‚         3 โ”‚         3 โ”‚    183302 โ”‚
โ•˜โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•›

explicit-init

  • Create 3 metric objects with explicitly provided factory object created earlier
metrics_benchmark_explicit-init
โ•’โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ••
โ”‚ Metric               โ”‚        p0 โ”‚       p25 โ”‚       p50 โ”‚       p75 โ”‚       p90 โ”‚       p99 โ”‚      p100 โ”‚   Samples โ”‚
โ•žโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ก
โ”‚ Instructions *       โ”‚      8607 โ”‚      8623 โ”‚      8623 โ”‚      8623 โ”‚      8623 โ”‚     11071 โ”‚     32510 โ”‚    185765 โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ Object allocs *      โ”‚         1 โ”‚         1 โ”‚         1 โ”‚         1 โ”‚         1 โ”‚         1 โ”‚         1 โ”‚    185765 โ”‚
โ•˜โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•›

explicit-init-with-current

  • Initialize task-local metrics factory
  • Create 3 metric objects with explicitly provided MetricsSystem.currentFactory (which returns initialized task-local value)
metrics_benchmark_explicit-init-with-current
โ•’โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ••
โ”‚ Metric               โ”‚        p0 โ”‚       p25 โ”‚       p50 โ”‚       p75 โ”‚       p90 โ”‚       p99 โ”‚      p100 โ”‚   Samples โ”‚
โ•žโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ก
โ”‚ Instructions *       โ”‚      9291 โ”‚      9303 โ”‚      9303 โ”‚      9303 โ”‚      9303 โ”‚     11863 โ”‚     33922 โ”‚    185610 โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ Object allocs *      โ”‚         1 โ”‚         1 โ”‚         1 โ”‚         1 โ”‚         1 โ”‚         1 โ”‚         1 โ”‚    185610 โ”‚
โ•˜โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•›

I specifically created explicit-init and explicit-init-with-current to show the difference for the same function; task-local-init calls a different initializer. To estimate the effect of the task-local factory in that initializer, I changed currentFactory to always return the globally bootstrapped metrics factory, which gives the actual impact of task-local access:

explicit-init-with-current without any task-local

metrics_benchmark_explicit-init-with-current
โ•’โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ••
โ”‚ Metric               โ”‚        p0 โ”‚       p25 โ”‚       p50 โ”‚       p75 โ”‚       p90 โ”‚       p99 โ”‚      p100 โ”‚   Samples โ”‚
โ•žโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ก
โ”‚ Instructions *       โ”‚      9248 โ”‚      9255 โ”‚      9255 โ”‚      9255 โ”‚      9255 โ”‚     11807 โ”‚     33910 โ”‚    188555 โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ Object allocs *      โ”‚         1 โ”‚         1 โ”‚         1 โ”‚         1 โ”‚         1 โ”‚         1 โ”‚         1 โ”‚    188555 โ”‚
โ•˜โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•›

This shows that the impact of the task-local metrics factory is negligible.

3 Likes

Thank you for the discussion, both in the PR and in this thread. Based on the comments about the naming of the .withXXX family of methods, I've made several changes to the proposal:

  • Renamed .withCurrent to .withMetricsFactory.
  • Moved the withMetricsFactory methods to the global namespace instead of MetricsSystem, to match the pattern in swift-distributed-tracing.
  • Removed public typealias Metrics = MetricsSystem as it is no longer relevant.

I would appreciate any comments on the updated proposal.

3 Likes

I agree that improving the testing story for swift-metrics is valuable. However, I have significant concerns about the patterns this proposal encourages and the signal it sends to users of the library.

1. Metric creation belongs at startup, not at request time

The motivating examples in the proposal show metrics being created in request-handling code:

// Example that uses the globally hooked bootstrap
struct UserService {
    func createUser(name: String) async throws -> User {
        let counter = Counter(label: "users.created") // ❌ This creates a Counter at request invocation
        let user = User()
        counter.increment()
        return user
    }
}

// Example that uses a provided MetricsFactory
struct UserService {
    let factory: MetricsFactory  // ❌ Required for dependency injection

    func createUser(name: String) async throws -> User {
        let counter = Counter(label: "users.created", factory: factory) // ❌ This creates a Counter at request invocation
        let user = User()
        counter.increment()
        return user
    }
}

Every call to Counter(label:dimensions:) calls through to the factory's makeCounter(label:dimensions:). In real-world backends (Prometheus, StatsD, etc.), this is a dictionary lookup behind a lock - the factory acts as a deduplication cache. When you create counters at request time rather than at startup, every single request pays the cost of this locked lookup.
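To illustrate, here is a simplified model of what makeCounter typically does in such a backend (the types below are illustrative stand-ins, not any real backend's code):

```swift
import Foundation

final class HandlerStandIn {
    var value = 0
    func increment() { value += 1 }   // the cheap hot-path operation
}

final class BackendStandIn {
    private let lock = NSLock()
    private var counters: [String: HandlerStandIn] = [:]

    // Every Counter(label:) created at request time funnels through this.
    func makeCounter(label: String) -> HandlerStandIn {
        lock.lock()                                // contended on every call
        defer { lock.unlock() }
        if let existing = counters[label] {        // deduplication cache hit
            return existing
        }
        let handler = HandlerStandIn()
        counters[label] = handler                  // retained for the process lifetime
        return handler
    }
}

let backend = BackendStandIn()
// Creating "the same" counter twice returns one shared handler:
print(backend.makeCounter(label: "users.created") === backend.makeCounter(label: "users.created"))
// prints "true"
```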

The correct pattern for metrics is to create them once at startup and then only call the cheap operations (increment(), record()) on the hot path:

struct UserService {
    let counter: Counter
    
    init(metricsFactory: any MetricsFactory) {
        self.counter = Counter(label: "users.created", factory: metricsFactory) // ✅ counter created once at startup
    }

    func createUser(name: String) async throws -> User {
        let user = User()
        self.counter.increment() // 🚀 cheap: no lock, no lookup
        return user
    }
}

Interestingly, the proposal switches to creating the metrics at the correct time when the new task-local approach is used:

struct UserService {
    let counter = Counter(label: "users.created")  // ✅ Gets factory from the task-local storage
    
    func createUser(name: String) async throws -> User {
        let user = User()
        counter.increment()
        return user
    }
}
    
@Test
func testUserCreation() async throws {
    let testMetrics1 = TestMetrics()
    
    try await withMetricsFactory(changingFactory: testMetrics1) {
        let service = UserService()
        return try await service.createUser(name: "Alice")
    }

    #expect(try testMetrics1.expectCounter("users.created").values == [1])
}

However, the metrics factory in the task-local is also present at the request invocation time. It is likely that we would see this pattern in user code:

struct UserService {
    func createUser(name: String) async throws -> User {
        let user = User()
        // โŒ Gets factory from the task-local storage
        //    Needs to go through global lock to get counter out of the backend
        Counter(label: "users.created").increment() 
        return user
    }
}

If we want to make the task-local pattern safe, developers need to take care that their code looks like this:

let service = try await withMetricsFactory(changingFactory: testMetrics1) {
    UserService() // ✅ Only provide metrics factory to service during init
}

2. Dynamic labels and unbounded cardinality: a security concern

Metrics are fundamentally different from logging and tracing in one critical way: metrics must retain allocated state across flushes. A logger can flush its buffer and free memory. A tracing system can complete a span and release it. But a counter or gauge, once created with a given label+dimensions combination, must persist - the backend needs to continue reporting it to the monitoring system.

This makes unbounded metric cardinality a denial-of-service vector. If an attacker can cause metric creation with arbitrary dimension values (e.g., request paths, user IDs, query parameters), they can exhaust server memory.
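A self-contained illustration of that failure mode (a dictionary stands in for the backend's time-series store; names are illustrative):

```swift
// Each unique label+dimensions combination pins a permanent entry in the
// backend's store; it is never freed, because the backend must keep
// reporting it to the monitoring system.
var timeSeriesStore: [String: Int] = [:]

func recordRequest(path: String) {
    // ❌ unbounded: the key space is whatever the attacker sends
    timeSeriesStore["http.requests{path=\(path)}", default: 0] += 1
}

for i in 0..<10_000 {
    recordRequest(path: "/probe/\(i)")  // 10k probes -> 10k permanent series
}
print(timeSeriesStore.count) // prints "10000"

// ✅ bounded: normalize unknown paths to a static label before using them
// as a dimension value (the approach Vapor's fix took).
func normalizedRoute(_ path: String, knownRoutes: Set<String>) -> String {
    knownRoutes.contains(path) ? path : "vapor_route_undefined"
}
```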

This is not theoretical. Vapor had CVE-2021-21328 for exactly this issue: an attacker could send requests to arbitrary undefined URL paths, each creating new counters and timers with unique labels, eventually draining system resources and impacting downstream monitoring services. The fix was to rewrite undefined route paths to a static vapor_route_undefined label.

Prometheus documents this explicitly:

CAUTION: Remember that every unique combination of key-value label pairs represents a new time series, which can dramatically increase the amount of data stored. Do not use labels to store dimensions with high cardinality (many different label values), such as user IDs, email addresses, or other unbounded sets of values.

3. The task-local pattern sends the wrong signal

The task-local factory pattern mirrors how swift-log and swift-distributed-tracing work, where withSpan and logger metadata are scoped to tasks. But this analogy is misleading:

  • Logging: Each log line is emitted and can be flushed. Memory cost is transient.
  • Tracing: Each span has a defined lifecycle - it starts and ends. Memory is reclaimed after export.
  • Metrics: Each unique counter/gauge/timer persists for the lifetime of the process. Memory cost is permanent.

By presenting withMetricsFactory as analogous to withSpan, the proposal suggests that metrics creation is similarly lightweight and safe to do dynamically. Users who are familiar with logging and tracing patterns will naturally assume they can create metrics with dynamic dimension values inside request handlers. This signal can be fatal for production systems.

4. The only responsible way to use metrics is with a restricted set of values

If all dimension values are from a restricted, known set, then all metric combinations can be enumerated and created once at startup. This is the pattern that should be encouraged:

// Known set of HTTP methods - safe
enum HTTPMethod: String, CaseIterable { case GET, POST, PUT, DELETE }

// Create all combinations at startup
let requestCounters: [HTTPMethod: Counter] = Dictionary(
    uniqueKeysWithValues: HTTPMethod.allCases.map { method in
        (method, Counter(label: "http.requests", dimensions: [("method", method.rawValue)]))
    }
)

// On the hot path: just increment
func handleRequest(method: HTTPMethod) {
    requestCounters[method]?.increment() // no lock, no lookup, no allocation
}

If metrics are created at startup with a known set of dimensions, there is no need for injecting the MetricsFactory through task-locals at request time.

5. Practical concern: will users follow the discipline?

One could argue that developers could use withMetricsFactory responsibly - only during startup, to set up a test factory before creating metrics. But in practice, the API's design naturally leads to creating metrics inside the scope, at request time. The similarity to withSpan from swift-distributed-tracing and metadata propagation in swift-log reinforces this: developers familiar with those patterns will assume the same scoping is safe for metrics. The examples in the proposal itself demonstrate this pattern. If the proposal's own examples show the anti-pattern, we should expect users to follow suit.

What I'd suggest instead

The testing problem is real, and I think we should solve it. But I believe we should:

  1. Improve documentation: We need better guidance in swift-metrics about the correct pattern for metric creation - create once at startup, use on the hot path. This is a gap today regardless of this proposal.

  2. Solve testing without encouraging request-scoped creation: The explicit factory parameter (Counter(label:dimensions:factory:)) already exists. Passing a MetricsFactory to services at init time - so they can create their counters, timers, and gauges up front - may not be as burdensome as the proposal suggests. It naturally enforces the "create once at startup" pattern and solves the testing story implicitly: in tests you pass a TestMetrics factory, in production you pass the real one. No task-local magic needed. This also solves the parallel test execution problem that motivates much of the proposal - each test simply creates its own service with its own factory:

    @Test
    func testUserCreation() async throws {
        let testMetrics = TestMetrics()
        let service = UserService(metricsFactory: testMetrics) // isolated factory, no global state
        _ = try await service.createUser(name: "Alice")
        #expect(try testMetrics.expectCounter("users.created").values == [1])
    }
    
    // Runs in parallel with no conflicts - each test has its own factory
    @Test
    func testAnotherUser() async throws {
        let testMetrics = TestMetrics()
        let service = UserService(metricsFactory: testMetrics)
        _ = try await service.createUser(name: "Bob")
        #expect(try testMetrics.expectCounter("users.created").values == [1])
    }
    
  3. If we do add task-local factory support, the documentation and examples must prominently warn against creating metrics on the hot path, and the examples should show the factory being used only during initialization, not per-request.

6 Likes

I'm +1 on the task local approach, for the same reasons we want it in swift-log and swift-distributed-tracing. Not only applications, but also libraries want to be able to, for example, run multiple child tasks and accumulate metrics inside those child tasks separately (not using the process-wide bootstrapped metrics factory). The fact that this fixes parallel testing is a nice benefit.
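For the child-task use case, usage could look roughly like this under the proposed API (a sketch only: it assumes the proposal's withMetricsFactory(changingFactory:) signature and the TestMetrics factory from the thread's examples, both still under review):

```swift
import Metrics

// Each child task accumulates metrics into its own factory instead of the
// process-wide bootstrapped one.
await withTaskGroup(of: Void.self) { group in
    for shard in 0..<4 {
        group.addTask {
            let shardMetrics = TestMetrics()
            await withMetricsFactory(changingFactory: shardMetrics) {
                // Counters created in this scope resolve against shardMetrics.
                Counter(label: "shard.items").increment()
            }
            // shardMetrics can now be inspected or exported independently.
        }
    }
}
```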

Explicit passing and process-wide bootstrap are two methods to propagate factories, and they're not going anywhere. I do agree with us embracing Structured Concurrency here and allowing users to opt into propagating their factories through the task hierarchy as well, for the benefits already outlined in the other proposals.

We also want to make sure all of logging, metrics, and tracing fit well together and have a consistent (doesn't mean identical) approach. When a new developer joins our ecosystem next year and they look at how to propagate telemetry through their libraries and apps with the help of task trees, they should get a good, easy-to-explain experience.