Swift Concurrency/general hooks for introspection and observability

Hi,

I think richer observability and introspection tools for runtime statistics would be exceedingly useful, if possible. In production environments, especially on the server side, it would be a great help to be able to ask certain questions that in development would be investigated with tools like Instruments, which is not an option in production.

To be specific, we'd get a lot of value from very basic statistics from both the Concurrency runtime and Swift in general, so that at production sites we can take statistics snapshots of, e.g.:

  • How many tasks have been created since startup
  • How many tasks are currently waiting to be scheduled
  • How many tasks have been finished since startup
  • Which actors and classes have instances in the application, and how many of each have been allocated/destroyed

I understand this would incur some (optional) bookkeeping overhead, but having such fundamental questions answered would allow for a much better understanding of complex application behaviour.

Has there been any discussion of adding such support? Any thoughts?

I'd love to be able to add support for such information to the Benchmark package.

I too would benefit from having this!

Also worth tracking: which running tasks are marked as cancelled.

That can be good to know in order to tune how often you check for cancellation.

Right, very much so! I think being able to expose such metrics is a basic requirement for a runtime.

We made the first step towards this last year, but we haven't done the second one yet: hooking up swift-metrics to the hooks we've prepared.

Basically, we have the following "hooks" prepared in the runtime already: https://github.com/apple/swift/blob/main/stdlib/public/Concurrency/Tracing.h

Here are two interesting examples:

void task_wait(AsyncTask *task, AsyncTask *waitingOn, uintptr_t status);
void task_resume(AsyncTask *task);

By providing implementations of those hooks at runtime, we should be able to bridge such tracepoints into metrics that can be emitted into the globally bootstrapped swift-metrics system. Some of them are potentially useful for tracing as well, so that spans can be emitted between the start, suspension, and end of a task.
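To make the bridge idea a bit more concrete, here is a rough sketch of turning wait/resume events into a wait-time metric. It assumes that a task_wait-style event marks the start of a wait and a task_resume-style event marks its end, which would need checking against the runtime. `WaitRecorder`, `onWait`, `onResume`, and the `wait_sketch` namespace are hypothetical names, and `AsyncTask` is only forward-declared as a stand-in for the runtime type.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <unordered_map>

// Stand-in for the runtime's task type; only its address is used as a key.
struct AsyncTask;

// Hypothetical namespace and class, not part of the Swift runtime.
namespace wait_sketch {

using Clock = std::chrono::steady_clock;

class WaitRecorder {
public:
  // Call from a task_wait-style hook: remember when the task began waiting.
  void onWait(const AsyncTask *task, Clock::time_point now = Clock::now()) {
    std::lock_guard<std::mutex> lock(mutex_);
    waitingSince_[task] = now;
  }

  // Call from a task_resume-style hook. Returns false if no matching wait
  // was recorded (for example, a resume that did not follow a wait).
  bool onResume(const AsyncTask *task, Clock::time_point now = Clock::now()) {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = waitingSince_.find(task);
    if (it == waitingSince_.end()) return false;
    totalWait_ += now - it->second;
    ++completedWaits_;
    waitingSince_.erase(it);
    return true;
  }

  // Number of tasks that have started waiting and not yet resumed.
  std::size_t currentlyWaiting() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return waitingSince_.size();
  }

  // Sum of all completed wait durations.
  Clock::duration totalWait() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return totalWait_;
  }

  // Number of wait/resume pairs observed so far.
  std::uint64_t completedWaits() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return completedWaits_;
  }

private:
  mutable std::mutex mutex_;
  std::unordered_map<const AsyncTask *, Clock::time_point> waitingSince_;
  Clock::duration totalWait_ = Clock::duration::zero();
  std::uint64_t completedWaits_ = 0;
};

} // namespace wait_sketch
```

The totals could then be reported to whichever metrics backend is bootstrapped, and the same wait/resume pairing could delimit a tracing span. The mutex keeps the sketch simple; on a hot path it would likely need to be replaced with something cheaper.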

Either way, for emitting the counts you mentioned we'd hook:

void task_create(AsyncTask *task, AsyncTask *parent, TaskGroup *group,
                 AsyncLet *asyncLet, uint8_t jobPriority, bool isChildTask,
                 bool isFuture, bool isGroupChildTask, bool isAsyncLetTask);
void task_destroy(AsyncTask *task);
void actor_create(HeapObject *actor);
void actor_destroy(HeapObject *actor);

I reached out to some folks a while ago about how we could do this, but we haven't dug deep just yet. Generally I think we'd want to surface either a library that implements the hooks and offers a protocol users can implement to handle them, or a simple pre-defined swift-metrics integration for things like task and actor counts.

Basically it boils down to providing implementations of those C++ functions, so perhaps just a Swift package which does that might be good enough! 🤔
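As a minimal sketch of what such implementations might look like for the counts asked about above: the functions below only mirror the names of the hooks and do not link against the real runtime. The `counts_sketch` namespace, `Counters`, `Snapshot`, and `snapshot()` are hypothetical, the signatures are simplified (the real `task_create` takes more parameters), and treating "destroyed" as a proxy for "finished" is an assumption.

```cpp
#include <atomic>
#include <cstdint>

// Stand-ins for the runtime's opaque types; the sketch never dereferences them.
struct AsyncTask;
struct HeapObject;

// Hypothetical namespace, not the runtime's own.
namespace counts_sketch {

struct Counters {
  std::atomic<std::uint64_t> tasksCreated{0};
  std::atomic<std::uint64_t> tasksDestroyed{0};
  std::atomic<std::uint64_t> actorsCreated{0};
  std::atomic<std::uint64_t> actorsDestroyed{0};
};

// Process-wide counters, created on first use.
inline Counters &counters() {
  static Counters instance;
  return instance;
}

// Simplified hook bodies: each one is a single relaxed atomic increment.
inline void task_create(AsyncTask *) {
  counters().tasksCreated.fetch_add(1, std::memory_order_relaxed);
}
inline void task_destroy(AsyncTask *) {
  counters().tasksDestroyed.fetch_add(1, std::memory_order_relaxed);
}
inline void actor_create(HeapObject *) {
  counters().actorsCreated.fetch_add(1, std::memory_order_relaxed);
}
inline void actor_destroy(HeapObject *) {
  counters().actorsDestroyed.fetch_add(1, std::memory_order_relaxed);
}

struct Snapshot {
  std::uint64_t tasksCreated;
  std::uint64_t tasksDestroyed;
  std::uint64_t tasksAlive;
  std::uint64_t actorsCreated;
  std::uint64_t actorsDestroyed;
  std::uint64_t actorsAlive;
};

// The counters are read one by one, so under concurrent updates the
// "alive" values are approximate rather than an exact point-in-time view.
inline Snapshot snapshot() {
  Counters &c = counters();
  // Read "destroyed" before "created" to reduce the chance of underflow.
  std::uint64_t td = c.tasksDestroyed.load(std::memory_order_relaxed);
  std::uint64_t tc = c.tasksCreated.load(std::memory_order_relaxed);
  std::uint64_t ad = c.actorsDestroyed.load(std::memory_order_relaxed);
  std::uint64_t ac = c.actorsCreated.load(std::memory_order_relaxed);
  return Snapshot{tc, td, tc - td, ac, ad, ac - ad};
}

} // namespace counts_sketch
```

A metrics bridge could call `snapshot()` periodically and publish the values as counters and gauges. Counting tasks that are waiting to be scheduled would need additional hooks beyond the four used here.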

// cc @harjas @Mike_Ash @tomerd

Wow, thanks @ktoso, that looks like an amazing start! And there are a lot more interesting hooks there that would also be fantastic, which I didn't dare enumerate! ;-)

The only thing I mentioned that is missing there is not related to concurrency: getting class/instance statistics in a similar way (if one could dream). Then we'd have both actor and class stats, which can be extremely useful for many simple performance analysis cases, especially when working with third-party libraries.

I think a Swift library that exposes them would be the best way. One could then implement a swift-metrics 'bridge' on top of such a library, exposing simple pre-defined metrics for whatever subset is needed, while still allowing other use cases to access the library directly (e.g. integrating support for such stats into the Benchmark package).

It would be great to only pay the tracing overhead for the trace points actually required, though, as they seem to be quite extensive (not sure how that would be done with a Swift wrapper, it's just a thought). It would also be nice to be able to run with such a package linked in, so that the switch can be flipped for troubleshooting when needed.

Great news that the foundation is already in place!
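One way this kind of selective overhead could work is a runtime-switchable bitmask that each hook checks before doing any bookkeeping. The sketch below is only an illustration of that idea, not how the runtime or any existing package does it; `Tracepoint`, `GatedCounters`, and the `gate_sketch` namespace are hypothetical names.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical namespace and types, not part of the Swift runtime.
namespace gate_sketch {

// One bit per trace point that can be switched on or off at runtime.
enum Tracepoint : std::uint32_t {
  TaskCreate = 1u << 0,
  TaskDestroy = 1u << 1,
  ActorCreate = 1u << 2,
  ActorDestroy = 1u << 3,
};

class GatedCounters {
public:
  // Turn on one or more trace points (bits may be OR-ed together).
  void enable(std::uint32_t points) {
    mask_.fetch_or(points, std::memory_order_relaxed);
  }

  // Turn off one or more trace points.
  void disable(std::uint32_t points) {
    mask_.fetch_and(~points, std::memory_order_relaxed);
  }

  bool isEnabled(Tracepoint point) const {
    return (mask_.load(std::memory_order_relaxed) & point) != 0;
  }

  // Hook bodies: when the trace point is disabled, the cost is one
  // relaxed load and a branch; the counter is not touched.
  void onTaskCreate() {
    if (isEnabled(TaskCreate))
      taskCreates_.fetch_add(1, std::memory_order_relaxed);
  }
  void onActorCreate() {
    if (isEnabled(ActorCreate))
      actorCreates_.fetch_add(1, std::memory_order_relaxed);
  }

  std::uint64_t taskCreates() const {
    return taskCreates_.load(std::memory_order_relaxed);
  }
  std::uint64_t actorCreates() const {
    return actorCreates_.load(std::memory_order_relaxed);
  }

private:
  std::atomic<std::uint32_t> mask_{0};
  std::atomic<std::uint64_t> taskCreates_{0};
  std::atomic<std::uint64_t> actorCreates_{0};
};

} // namespace gate_sketch
```

With everything disabled by default, the package could stay linked in production and individual trace points could be enabled only while troubleshooting, e.g. `counters.enable(gate_sketch::TaskCreate | gate_sketch::ActorCreate)`.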

Oh, forgot to comment: That would also be very helpful indeed.