The Context Passing Problem

I want to touch on something tangentially related to the Generic Connection Pool post I made recently. It's been discussed a lot amongst the SSWG and Vapor community and I wanted to put my thoughts down and explain Vapor's approach for the record. I'm not sure if it has a better name, but I'm calling it "the context passing problem".

This problem exists in a lot of situations, but for the purpose of this post I want to highlight connection pooling. When grabbing a connection from a pool, you will probably want the connection to log somewhere other than its default. For example, in Vapor, each request has its own logger with a unique request identifier. When a connection is pulled from the pool, it is vital that this logger be used so that the logs can be correctly identified later.

Explicit context passing

The most straightforward way to implement this is to have all methods on the connection accept an "override" logger. This is the method AsyncHTTPClient employs. Let's call this explicit context passing. For example:

// This connection will use whatever logger it
// was created with.
let connection = try pool.requestConnection().wait()
defer { pool.releaseConnection(connection) }

// However, we want it to log to this special logger.
// Using AHC-style, we would just pass it in.
try connection.doThing(logger: customLogger).wait()

This method works great, but the obvious downside is that the developer must remember to pass the logger in. This problem becomes worse if you need to pass additional context, like BaggageContext for tracing. Imagine this API in a high-level Vapor app:

app.get("users") { req in 
    User.query(
        on: req.db, 
        logger: req.logger, 
        baggageContext: req.baggageContext
    ).all()
}


Protocol-based context passing

To avoid this, Vapor implements protocols similar to the following. Let's call this protocol-based context passing.

protocol Database {
    var eventLoop: EventLoop { get }
    var logger: Logger { get }
    var baggageContext: BaggageContext { get }
}

extension Database {
    func delegating(to eventLoop: EventLoop) -> Database
    func logging(to logger: Logger) -> Database
    func tracing(to baggageContext: BaggageContext) -> Database
}

A real example of this can be seen in PostgresNIO's PostgresDatabase protocol.

The protocol extensions simply wrap the type in a struct that overrides the given property while forwarding everything else. Since nothing is mutated, this is thread-safe, but it does add indirection.
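A minimal, self-contained sketch of what such a wrapper might look like (the names here, like `LoggingOverrideDatabase`, are illustrative rather than Vapor's actual implementation, and stub types stand in for swift-log's `Logger` and NIO's `EventLoop`):

```swift
// Stand-ins for swift-log's Logger and NIO's EventLoop,
// so this sketch is self-contained and runnable.
struct Logger { var label: String }
protocol EventLoop {}
final class EmbeddedLoop: EventLoop {}

protocol Database {
    var eventLoop: any EventLoop { get }
    var logger: Logger { get }
}

// The wrapper: a value type that overrides one property and
// forwards the rest to the database it wraps.
struct LoggingOverrideDatabase: Database {
    let base: any Database
    let logger: Logger                       // the override
    var eventLoop: any EventLoop { base.eventLoop }
}

extension Database {
    // Nothing is mutated -- a new value is returned -- so this
    // is safe to call from any thread.
    func logging(to logger: Logger) -> any Database {
        LoggingOverrideDatabase(base: self, logger: logger)
    }
}
```

The original database keeps its own logger; only the returned value sees the override, which is what makes the pattern thread-safe.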

Vapor uses these protocols and their extensions to implement req.db in a way that all of the required context is always available, without the developer needing to pass things manually. So the high level API looks more like this:

app.get("users") { req in 
    User.query(on: req.db).all()
}


Impact on high-level frameworks

Being able to hide context passing like this is critical to create nice, high-level APIs. Having this built into libraries like PostgresNIO natively is very important for Vapor. When libraries do not offer these types of protocols, Vapor must wrap them. AsyncHTTPClient for example uses the explicit-passing method and thus Vapor wraps its interface behind a Client protocol.

Wrapping AsyncHTTPClient is by no means the end of the world; in fact, sometimes wrapping is the better option anyway, since it makes APIs feel more comfortable in the framework. However, there are some situations where this required wrapping can create a lot of unnecessary code duplication.

The main example that has come up recently is RediStack. This package currently offers a protocol similar to the Database protocol declared above: RedisClient. However, instead of implementing a logging(to:) method, it uses setLogging(to:) which is not thread-safe. Because of this, one of the recommendations has been to move to AHC-style "explicit passing": [Thread Safety] - The setLogging(to:) method is not thread safe (#79) · Issues · Swift on Server Community / RediStack · GitLab

This change would by no means prevent Vapor from using RediStack, but it would mean that we would need to wrap it in a protocol similar to how we use AHC. This is rather unfortunate since, unlike AHC, RediStack offers a large number of wonderful convenience methods: Sources/RediStack/Commands · master · Swift on Server Community / RediStack · GitLab

By wrapping RedisClient in a new protocol, Vapor hides access to its methods and must re-implement them in its public API. Not only does this create duplicated work and more opportunities for bugs, it also means that as new methods are added to RediStack, they will not be immediately available to users of Vapor.

By using the protocol-based context passing approach instead, RediStack would allow its RedisClient protocol to be used directly in high-level frameworks like Vapor.

I'm sure this issue will come up in many more places as new server-side Swift packages are created.

I'm very interested to hear everyone's thoughts on this. Thanks!


In AHC's case the logger parameter is kind of temporary; it should be replaced with context, see:

Pinging @ktoso, as he is quite knowledgeable in this topic.

Whether logger should be replaced by context or not is related to, but slightly different from, the point I'm trying to make here. As far as AHC is concerned, I think given its current design, it would make more sense for context: BaggageContext to be a separate parameter. In other words, like:

let client: HTTPClient = ...
client.get("https://vapor.codes", logger: myLogger, context: myBaggageContext())

However, if we're rethinking its design (major break), then I do think we should at least consider a protocol-based context passing design.

I think some code examples specific to AHC will help clarify what I mean by that:

// Instead of being a concrete type, HTTPClient is a protocol.
protocol HTTPClient {
    var eventLoopPreference: EventLoopPreference { get }
    var logger: Logger { get }
    var baggage: BaggageContext { get }

    // Just the core essential method for executing a request
    // goes on the protocol. (Protocol requirements can't have
    // default arguments, so the `deadline = nil` convenience
    // lives on an extension overload instead.)
    func execute(
        request: Request,
        deadline: NIODeadline?
    ) -> EventLoopFuture<Response>
}

// Extensions add in all the nice helpers.
// This means high-level frameworks get these for free!
extension HTTPClient {
    func get(url: String, deadline: NIODeadline? = nil) -> EventLoopFuture<Response> { ... }
    func post(url: String, deadline: NIODeadline? = nil) -> EventLoopFuture<Response> { ... }
    ...
}

// Extensions offer ways of getting new clients that use a different
// event loop, log somewhere else, or trace to different context.
//
// These methods create _copies_ of the client and do not attempt to
// mutate the client they are called on (thread-safe).
extension HTTPClient {
    func delegating(to eventLoop: EventLoopPreference) -> HTTPClient
    func logging(to logger: Logger) -> HTTPClient
    func tracing(to baggage: BaggageContext) -> HTTPClient
}

// One or more concrete implementations of HTTPClient are provided.
// The single connection / pooled variants could be different, or just one.
//
// Users can also create their own conformers to HTTPClient that do special logic.
class HTTPConnection: HTTPClient { ... }
class HTTPConnectionPool: HTTPClient { ... }

// Example usage:
let eventLoopGroup = MultiThreadedEventLoopGroup(...)
let client: HTTPClient = HTTPConnectionPool(eventLoopGroup, ...)

// On main thread, use any EL / logger / tracing, don't care. 
try client.get("https://vapor.codes").wait()

// On a specific EL, with logger and tracing I need to use.
// We'll use Vapor for example.
app.get("test") { req in 
    client
        .delegating(to: req.eventLoop)
        .logging(to: req.logger)
        .tracing(to: req.baggage)
        .get("https://vapor.codes")
}

// This allows the high-level framework to expose an HTTPClient that 
// already has everything the user needs.
extension Request {
    var client: HTTPClient {
        self.application.client
            .delegating(to: self.eventLoop)
            .logging(to: self.logger)
            .tracing(to: self.baggage)
    }
}

// Example using this convenience API with all the hard parts handled.
app.get("test") { req in 
    req.client.get("https://vapor.codes")
}

For AHC in particular, I'm not as strongly opinionated here. I think explicit context passing is honestly fine given this is a fairly low-level library. Vapor would likely wrap AHC even if it did offer these APIs so that it works better with our high-level content negotiation APIs.

However, for libraries like RediStack, APNSwift, etc., needing to duplicate all the convenience methods is a real problem. I could also see other frameworks besides Vapor that don't have high-level content negotiation APIs wanting to use AHC's convenience methods directly without needing to reimplement them.

I pinged on chat but in case you missed it: sorry, but I ran out of time in the day/week to dive deep here. I want to give this discussion the proper deep dive it deserves before replying. I just took some days off, so I'm not sure I'll be able to get to this over the next few days, but I'll try!

Short version is though: completely hiding context parameters is not viable, making them as painless as possible is a goal we should pursue though. I merged some work toward this in baggage context last week...

I promise to get back here with a full analysis; thank you very much for all the specific links — I want to give all those a proper look before concluding anything. Thanks for your patience, and sorry for the timing of my time off. I'll get to this as soon as I can!


I've just been implementing Logger support in AWS SDK Swift and have been coming across similar issues. So far my implementation is very similar to AHC in that every AWS command takes a Logger struct. With this method every command inherits the logger metadata from the request that made the call. But as you mention @tanner0101 this only works if the user remembers to supply a Logger for each function call.

I do quite like logging(to:) as an alternative to passing the Logger in every function. I might consider changing to that. It means we can add tracing at a later date without creating breaking changes to all our APIs. I guess it would change

s3.listObjects(.init(bucketName: "myBucket"), logger: myLogger)

to

s3.logging(to: myLogger).listObjects(.init(bucketName: "myBucket"))

It's not a solution to ensuring the user adds a logger to a call, but it does mean additional context can be added without breaking changes.


I was waiting on tackling re-implementing logging in RediStack before replying so I had some design and evidence-based opinions to share.

The refactor can be seen here: Refactor Logging implementation (!130) · Merge requests · Nathan Harris / RediStack · GitLab


Foremost, I think this is an important topic, and having some guidelines established will help the ecosystem, so we can recognize the design patterns and have clear ways of interoperating between them.


I agree with Tanner that the way AHC is passing around the logger makes sense for itself, as there are a lot of details going on with a given request - and the library is lower level than RediStack is.

However, adopting that style is a non-starter for RediStack: the sheer number of commands supported by Redis, plus the number of Swift APIs that I have written (and will continue to develop) on top of them, makes it a maintenance nightmare to add even a single new parameter.

Smoke & the AWS services' libraries will be in a similar situation.


As part of my refactor, I went through my entire codebase to see not only what levels I produce logs at, but also where and why.

I found that the vast majority of cases I log at trace or debug level, which is good! No one likes unsolicited noisy logs from a subsystem!

I also found that the majority of the cases where I log at more severe levels, like error or critical, are more "configuration", contextless situations, such as when triggering a precondition failure.

With this in mind, I took the approach that RediStack delineates between "user-space" and "system-space" logs, with both being configurable for additional context, albeit in different ways.

For "system-space" logs, every Connection[Pool] uses a "logger prototype" intended for establishing a standard labeled Logger instance, but even that can be overridden by passing a custom Logger instance in both the Connection & ConnectionPool initializers.

That logger will be used liberally for 90% of logs generated by the library.

For the other 10% that will be the hot path of the majority of user code, I take the protocol-based approach Tanner described of using an internal implementation that is masked behind an existential:

protocol RedisClient {
  func send(command: String, with arguments: [RESPValue]) -> EventLoopFuture<RESPValue>
  func logging(to: Logger) -> RedisClient // having the `some` keyword here would be nice!
}

internal protocol RedisClientWithCustomLogging: RedisClient {
  func send(/* same as above */, loggingTo: Logger?) -> EventLoopFuture<RESPValue>
}
internal struct CustomLoggingRedisClient<Client: RedisClientWithCustomLogging>: RedisClient {
  init(client: Client, logger: Logger)
  func send(/* RedisClient requirement */)
  // send invokes the send(command:with:loggingTo:) method
}

This made it much clearer how and when to attach RediStack library metadata to the loggers, while staying thread-safe and without much refactoring of the rest of the public API.

And if I were to add additional context parameters, the surface to do so is much smaller.

In both situations, it's left up to the developer at the end of the day to make sure they're passing the context they want - but at least this way it's not at every single method, since my use cases are likely to be heavily chained.


I created a small proof of concept repo to showcase how the protocol-oriented context passing pattern can work in practice. It covers EventLoop preference, logging, tracing, and connection pooling. It includes tests to demonstrate usage and prove it works.

There are quite a few different ways that the same effect could be achieved here. I just went for what was simplest to convey the concept.

The key requirements are:

  • 1: When you specify a custom logger (or EL preference, or tracing) it is passed through to all layers. This test proves that.

  • 2: The convenience APIs are not cluttered with context passing. As you can see here and here. This makes the Client protocol acceptable for usage in a high-level framework.

Things I think could be improved:

  • 1: CustomClient indirection. If you call Connection.logging(to:), you no longer have a Connection. You have a private type that needs to forward your calls. This prevents casting back to Connection and has overhead. Maybe this could be fixed by making the override methods (i.e., logging(to:)) protocol requirements as mentioned here?

  • 2: Connection and ConnectionPool are exposing baggage, logger, and eventLoopPreference properties that make varying levels of sense. Maybe Connection / ConnectionPool should not conform to Client directly, but be able to create a Client?


Thanks for the small prototype to work with that isolates many of the details we're discussing.

One of the problems with these types of reductions is that they miss many small details in a larger-scale application - namely, edge cases for specific things.

  1. How do you handle deinit warnings to give more details to what's going on in the system?
    • it can be argued that this is a "superfluous" log and the assertion is good enough
  2. How do you provide a custom logger such as in the send method when you establish something outside of that event lifecycle - such as during init?
    • it can also be argued that this situation could also be handled, rather than a static configuration callback

In your sample the following code will require every function that uses client to go through three levels of indirection, which isn't ideal.

        let client = conn
            .logging(to: clientLogs.logger)
            .eventLoop(prefer: .delegate(on: clientLoop))
            .tracing(with: clientBaggage)

If you add logging(to:), eventLoop(prefer:), and tracing(with:) to the Client protocol, you could implement them on CustomClient to reduce this down to one level of indirection. The CustomClient version, instead of creating a new layer of indirection, just creates a new copy of the CustomClient with all the previous changes applied.

struct CustomClient {
    var loggerOverride: Logger?

    func logging(to logger: Logger) -> Client {
        // Copy self rather than wrapping, so no extra layer is added.
        var newClient = self
        newClient.loggerOverride = logger
        return newClient
    }
}

As far as it goes for AWSSDKSwift I'm thinking of going down a slightly different path. First I'll add a little context. Each AWS service struct is structured as follows

struct AWSService {
    /// client used to process AWS requests
    let client: AWSClient
    /// service configuration
    let serviceConfig: AWSServiceConfig
}

The service then has an individual method for each REST operation from that service. A service can have hundreds of operations. Each of these methods follows this format.

    public func operation(_ input: OperationRequest, logger: Logger) -> EventLoopFuture<OperationResponse> {
        return client.execute(
            operation: "MyOperation", 
            path: "/", 
            httpMethod: .POST,  
            input: input,
            serviceConfig: serviceConfig, 
            logger: logger)
    }

We are passing the ServiceConfig to every execute call. Given this, my plan is to hang the Logger, BaggageContext, etc. off the ServiceConfig. This would mean I don't need the additional logger parameter in operation or client.execute.

First I would add the following to AWSServiceConfig. I can use CoW to avoid copying all the other ServiceConfig members in the logging method.

struct AWSServiceConfig {
    ...
    var logger: Logger

    func logging(to logger: Logger) -> AWSServiceConfig {
        var config = self
        config.logger = logger
        return config
    }
}

And then I'd add a similar function to the service struct AWSService.

extension AWSService {
    func logging(to logger: Logger) -> AWSService {
        return AWSService(client: self.client, serviceConfig: self.serviceConfig.logging(to: logger))
    }
}

Now the API works in a similar way to how @tanner0101 describes above

let service = AWSService()
let serviceWithLogging = service.logging(to: logger)
let result = serviceWithLogging.operation(input) 

Thoughts? There are so many ways to do this. I could instead go down the indirection method Tanner mentions above, but I don't see any real advantage to that over holding a Logger and BaggageContext in the ServiceConfig all the time and just creating a new ServiceConfig for each context call.
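The CoW idea mentioned above could be sketched roughly like this (illustrative names; the real AWSServiceConfig has many more members, and strictly speaking this is shared immutable storage rather than full copy-on-write, which is all the logging(to:) use case needs):

```swift
// The bulky, immutable configuration lives in a shared reference-type
// storage, while the cheap-to-copy context (here, just a logger label)
// sits directly in the struct.
final class ConfigStorage {
    let region: String
    let endpoint: String
    init(region: String, endpoint: String) {
        self.region = region
        self.endpoint = endpoint
    }
}

struct ServiceConfig {
    private let storage: ConfigStorage   // shared, never copied
    var loggerLabel: String              // per-context override

    init(region: String, endpoint: String, loggerLabel: String) {
        self.storage = ConfigStorage(region: region, endpoint: endpoint)
        self.loggerLabel = loggerLabel
    }

    var region: String { storage.region }

    // Copying the struct only copies the storage *reference*, so
    // overriding the logger is cheap no matter how big the config is.
    func logging(to label: String) -> ServiceConfig {
        var copy = self
        copy.loggerLabel = label
        return copy
    }
}
```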

I'll add an example of deinit error / warning :+1:. I think the best thing we can do here is use the logger we were initialized with or have the init accept two loggers like:

init(
    // Details about initializing this connection logged here.
    connectLogger: Logger,
    // Things that happen to this connection unrelated to any
    // method calls that get a logger will log here.
    generalLogger: Logger
)

Similar answer to the previous question, I think. You can see how the logger in the connect method is used to send the "connection.init" log. This would be equivalent to the connectLogger from my previous snippet.

Connection pool passes the contextual logger there automatically: https://github.com/tanner0101/context-passing/blob/main/Sources/ContextPassing/ConnectionPool.swift#L63

Yup, this is true. I'll mess around with the custom override methods as protocol requirements and see what I can come up with.

That looks great. The one comment I would have is that Logger / EventLoopPreference / BaggageContext don't make as much sense to me on a Configuration struct as they do on a Client. Would it be possible to instead add that context to AWSClient? You'd get a very similar effect:

extension AWSService {
    func logging(to logger: Logger) -> AWSService {
+        return AWSService(client: self.client.logging(to: logger), serviceConfig: self.serviceConfig)
-        return AWSService(client: self.client, serviceConfig: self.serviceConfig.logging(to: logger))
    }
}

Either that or a separate property like context that contains Logger, Baggage, etc.

struct AWSService {
    /// client used to process AWS requests
    let client: AWSClient
    /// service configuration
    let serviceConfig: AWSServiceConfig
    /// logger, baggage, etc
    let context: AWSContext
}

That looks great. The one comment I would have is that Logger / EventLoopPreference / BaggageContext don't make as much sense to me on a Configuration struct as they do on a Client. Would it be possible to instead add that context to AWSClient? You'd get a very similar effect:

Copying the AWSClient is not so easy. It manages the AWS credential acquisition, and this can be an active process; I'm not sure what would happen if I copied it while that was in progress. If I were to use the AWSClient, I would most likely go for your pattern above.

I know what you mean about adding it to the Config not making sense. In my actual implementation I have created a ServiceContext struct holding the Logger, BaggageContext and ServiceConfig.

Either that or a separate property like context that contains Logger, Baggage, etc.

I had thought of a separate context property as an option. Might give that a go, see how I feel about that.

Hi all,

One more use case for consideration:

I was working on an X-Ray tracer recently.
I am hoping to be able to propagate the trace context in the BaggageContext.
I think it will work well only if the context has "operation" scope, NOT "service" scope like in the example above.

Below is how it could work (code for illustration purposes):

// the (context) baggage including but not limited to trace context
var baggage: BaggageContext

let segment = tracer.beginSegment(name: "Segment 1", baggage: baggage)

// note that segment.baggage passes whatever was in the baggage
// AND updates the X-Ray trace context (keeps the trace id but updates parent (segment) id)
// so that segments (aka spans) created "inside" have segment "Segment 1" as their parent
// s3 client could create new subsegments and/or pass along context to instrumented HTTP client
s3.listObjects(.init(bucketName: "myBucket"), baggage: segment.baggage)

// other S3 operations could have different context
let segment2 = tracer.beginSegment(name: "Segment 2", baggage: baggage)
s3.listObjects(.init(bucketName: "myBucket2"), baggage: segment2.baggage)

Sounds good. I've also updated my example to use protocol requirements on Client instead of just extensions. This cuts down on the indirection and also seems cleaner / easier to understand (at least to me): https://github.com/tanner0101/context-passing/blob/main/Sources/ContextPassing/Client.swift#L10-L12

The way I implemented ConnectionContext (a struct wrapping a ref type) might be helpful for AWSClient: https://github.com/tanner0101/context-passing/blob/main/Sources/ContextPassing/Connection.swift#L113

You would be able to do "operation" specific baggage like so:

var baggage: BaggageContext

// segment 1
let segment = tracer.beginSegment(name: "Segment 1", baggage: baggage)
s3.tracing(with: segment.baggage)
    .listObjects(.init(bucketName: "myBucket"))

// segment 2
let segment2 = tracer.beginSegment(name: "Segment 2", baggage: baggage)
s3.tracing(with: segment2.baggage)
    .listObjects(.init(bucketName: "myBucket2"))

If I am understanding this correctly, you are not only proposing that the context being passed be a Protocol/Interface but that all its requirements should also be Protocols/Interfaces.

protocol Database {
    var eventLoop: SomeEventLoopProtocol { get }
    var logger: SomeLoggerProtocol  { get }
    var baggageContext: SomeKeyValueContextProtocol { get }
}

If that is the case, then it's too bad that Swift's current tooling is not able to help us make this contract/convention more explicit.

Nope, just that context be passed via a protocol instead of explicitly to each method call. The actual types on that protocol are the real types (Logger, BaggageContext, etc). You can see an example of a protocol like this in the demo I made here: context-passing/Sources/ContextPassing/Client.swift at main · tanner0101/context-passing · GitHub

Ok, so that's pretty much what I've done in my refactor for RediStack.

The only difference, from both the comment on the MR and here, is that you suggest propagating the runtime user-space logger all the way down into the pool when trying to provide a connection - which I currently treat as "system-space" logs.

I do think that requesting a connection from the pool should log to the most contextual logger. Assume you are trying to debug a slow request in your application that is being caused by an exhausted pool. Without the request identifier attached to the pool logs, you can't track which pool operations were related to the request in question. There are some things that really have no context, like Connection.deinit, or perhaps a pool that uses scheduled tasks to do cleanup. But IMO, all methods called as a direct result of a method with special context should use that context.
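A toy sketch of this idea, with a stub Logger standing in for swift-log's: the pool logs its internal events to the *caller's* contextual logger rather than a pool-wide one, so a slow request can be traced to the pool operations it caused.

```swift
// Stub logger so the sketch is self-contained; `logged` lets a
// caller observe the emitted lines.
struct Logger {
    var label: String
    var logged: (String) -> Void = { _ in }
    func debug(_ message: String) { logged("[\(label)] \(message)") }
}

final class Connection {
    var inUse = false
}

// A toy pool: every method takes the caller's contextual logger and
// uses it for the pool-internal events that the call triggers.
final class ConnectionPool {
    private var connections: [Connection] = [Connection(), Connection()]

    func requestConnection(logger: Logger) -> Connection? {
        guard let conn = connections.first(where: { !$0.inUse }) else {
            logger.debug("pool exhausted, request must wait")
            return nil
        }
        conn.inUse = true
        logger.debug("leased connection from pool")
        return conn
    }

    func releaseConnection(_ conn: Connection, logger: Logger) {
        conn.inUse = false
        logger.debug("returned connection to pool")
    }
}
```

With a per-request logger label, the pool's "leased connection" and "pool exhausted" lines carry the request identifier, which is exactly what makes an exhausted-pool slowdown debuggable.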

This reminds me a little of SwiftUI's Environment and Dispatch's QoS, both of which propagate information down into callees, potentially through intermediate layers that don't understand the context.

Personally I'd be interested in a language-level solution so that everyone isn't making up their own incompatible tool. Just spitballing here, but vaguely like

//Framework.swift
@threadscoped var activeLogger = DefaultLogger()

public func connectToDatabase() {
    activeLogger.log("Connecting to database...")
}
//main.swift
import Framework
//will log with the default logger
connectToDatabase()
with activeLogger = CustomLogger() {
    connectToDatabase() //will log with custom logger
}

The advantage of thread storage is you can trampoline the context between a high-level caller and any number of intermediate function calls that may not know/care about participating in a context-passing bucket brigade.

The advantage of a language feature is that packages that desire to coordinate on a context can do so. For example, if activeLogger is defined somewhere convenient (stdlib or a popular package), everyone can log into that, and at top level a specific logger might be chosen. For a web framework this might be a new logger per-request, for an app it might separate by subsystem, etc.
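For the record, Swift 5.5 later shipped task-local values, which provide roughly this capability at the language level: a value bound for a scope is implicitly visible to everything called within that scope. A minimal sketch using the real @TaskLocal API in place of the hypothetical @threadscoped:

```swift
// Task-local values implement essentially this pattern: a value bound
// for a scope is visible to everything called (directly or
// transitively) within that scope, without any parameter passing.
enum Context {
    @TaskLocal static var loggerLabel: String = "default"
}

func connectToDatabase() -> String {
    // An intermediate layer that knows nothing about loggers still
    // sees whatever binding is in effect for the current scope.
    "connecting with logger: \(Context.loggerLabel)"
}
```

Bindings are scoped via `Context.$loggerLabel.withValue("custom") { ... }` and automatically revert when the scope exits, which avoids the thread-safety pitfalls of mutable thread storage.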

Quite a lot going on here, good -- it's an important topic we need to find solutions for as a group and share some general style about them so the various libraries compose well :slight_smile:

Sorry for the delayed response, had some days off to recharge a little bit, now then, let's dive in :fish:


Protocol-based context passing / "FrameworkContext" types / protocols "carrying" (baggage) context

I agree that we'll want to allow and encourage protocols that carry around the "common stuff" :+1: There are some details we need to iron out though...

I've been calling these "framework contexts" that carry baggage context, so it seems we're thinking about the same general shape :slight_smile:

It's somewhat of a mix between the Go Context passing style and a truly Swift-unique style, which we can do so well thanks to protocols. I hope we can figure out a pattern that'll work for all use-cases.

I think it'd be very useful to review and keep in mind the "Holding baggage in framework protocols" (#9) protocols in baggage context itself as we discuss things going forward. If those are not enough (and I'm more than happy to discuss naming etc.), we need to figure out what would be. All this context passing is relatively "simple" while there is one library and one concurrency model involved, but we don't have that luxury! We need to compose across various libraries which may or may not depend on each other and may or may not have the same concurrency patterns. (More on that later.)

Before diving into the types and relationships, let's highlight some of the pitfalls we can fall into and were brought up already:

Pitfall: passing logger separately from baggage context

The timing of https://github.com/swift-server/async-http-client/pull/227 is a bit unfortunate, but in my countless discussions with Johannes on that, we've been aware that the current API is not final -- we expected to change this API. Of course it would be better without needing to... but here we are.

We should not just add the context parameter though, IMHO. We can do better than that, by following a protocol context style somewhat similar to what Tanner proposes, however using types that the BaggageContext project has contained for a while now: the ...Carrier protocol types.

First, to highlight why the client.get("...", logger: myLogger, context: myBaggageContext()) style may be slightly suboptimal.

The first annoying thing about it is, of course, the doubling of noise. Go and such languages avoid this by NOT passing the logger around, but instead storing the logger or creating it ad hoc, and always passing the context to the logger (and through all operations/functions). It also means that there is no obvious relation between the logger and the baggage context -- that's bad, because we need this to be predictable and boringly well known, rather than having to look up the relationship.

Specifically, I think that an API that uses context will want to trace those identifiers as well, thus the logger must be made aware of the baggage context. It must be done so efficiently, only when log statements are made, etc.

This means the client always has to do this dance:

func get(..., logger: Logger, context: BaggageContext) { 
    logger.with(context).info() // ok
    logger.warning()  // whoops! lost trace IDs from those logs...
    somewhere(logger: logger) // whoops! lost trace IDs from those logs...
}

Note that logger.with(context:) already exists, and is backed by an efficient baggage-aware LogHandler implemented in https://github.com/slashmo/gsoc-swift-baggage-context/pull/9, available via import BaggageLogging

Which is a bit repetitive. Also, it is my understanding that many developers want to avoid "passing both the logger and the context" (please confirm or deny :wink:); thus, I agree we should do better than that and use "context protocols".

Now, since HTTPClient cannot depend on Vapor (yet Vapor depends on HTTPClient), it must accept some context carrier protocol type; thankfully we have such a type in BaggageContext:

/// A `LoggingBaggageContextCarrier`'s purpose is to be adopted by frameworks which already provide a "FrameworkContext",
/// allowing such frameworks to pass their context as a `BaggageContextCarrier`.
public protocol LoggingBaggageContextCarrier: BaggageContextCarrier {
     /// The logger associated with this carrier context.
     ///
     /// It should automatically populate the loggers metadata based on the `BaggageContext` associated with this context object.
     ///
     /// ### Implementation note
     ///
     /// Libraries and/or frameworks which conform to this protocol with their "Framework Context" types,
     /// SHOULD implement this logger by wrapping the "raw" logger associated with this context with the `logger.with(BaggageContext:)` function,
     /// which efficiently handles the bridging of baggage to logging metadata values.
     ///
     /// Writes to the `logger` metadata SHOULD NOT be reflected in the `baggage`,
     /// however writes to the underlying `baggage` SHOULD be reflected in the `logger`.
     var logger: Logger { get set }
 }

See Holding baggage in framework protocols by ktoso · Pull Request #9 · slashmo/gsoc-swift-baggage-context · GitHub

:warning: Not in love with the names we found so far... they're a bit long, although descriptive... Open to ideas, please chime in here or on PR 9 via the baggage-context issues.

so with those in place, in HTTPClient we can:

func get(_ url: URI, ..., context: LoggingBaggageContextCarrier) -> EventLoopFuture<Response> { 
  context.logger.info("...") // definitely also logs the trace ids etc.
  // It also allows the framework context to hand-tune the logger,
  // e.g. "I only want to log the trace id, but not the other baggage information",
  // so control over loggers remains intact, yet it's harder to get wrong and easier to pass around.
}

So users can pass any context there. We could also conform Logger to the carrier, I suppose, allowing context: logger? A bit weird, but it would allow the simplest use case ("just log") to still work.

And Database can:

protocol Database: LoggingBaggageContextCarrier { ... }

and thus this will work, as you wished for in your initial example @tanner0101:

app.get("users") { req in 
    User.query(on: req.db).all() // yay
}

It also composes with other libraries, in case they accept a raw BaggageContext or, better, any baggage carrier (e.g. they are not logging, but need to pass the context around for some reason), like so:

extension Request: LoggingBaggageContextCarrier {} 
extension Database: LoggingBaggageContextCarrier {} 

func first(context: BaggageContextCarrier) {}
func second(context: LoggingBaggageContextCarrier) {}

app.get("users") { request in 
  someLibrary(context: request)
}
Since Vapor is a strongly opinionated framework and wants to cut down on parameter passing a lot, I think it's okay if it conforms all of its Request and similar types to context carrying. "Normal", less opinionated libraries I would strongly suggest follow the general context passing guidelines:

// my library
func take(thing: Thing, context: BaggageContextCarrier) 
func take(thing: Thing, context: BaggageContextCarrier, callback: () -> ()) 

etc.

As for the Client wrapper... wouldn't Vapor be okay with not wrapping it then, if it could (optionally) pass the request context to requests? I assume you would like to move away from the wrapper anyway, so that is a breaking change either way; the shape would become <???application???>.client.get(url, context: request), which would free Vapor from having to maintain those API wrappers. WDYT @tanner0101?


Let's visit another API from this thread:

First, I do realize that yes, it is annoying to pass context around, no-one loves it, but in reality that's the only way we can get proper end-to-end tracing in Swift today. What API guarantees are you worried about upholding, such that adding an optional context parameter would be such a nightmare?

(I'll comment below again why thread-locals are a no-go, as that was mentioned as well).

So let's look at:

    @inlinable
    public func zlexcount<Value: CustomStringConvertible>(
        of key: RedisKey,
        withValuesBetween range: (min: RedisZLexBound<Value>, max: RedisZLexBound<Value>)
    ) -> EventLoopFuture<Int> {
        let args: [RESPValue] = [
            .init(from: key),
            .init(bulk: range.min.description),
            .init(bulk: range.max.description)
        ]
        return self.send(command: "ZLEXCOUNT", with: args)
            .tryConverting()
    }

in order for it to be traced, we absolutely need a context available here, so it'd become:

    @inlinable
    public func zlexcount<Value: CustomStringConvertible>(
        of key: RedisKey,
        withValuesBetween range: (min: RedisZLexBound<Value>, max: RedisZLexBound<Value>),
        context: BaggageContext? // << added
    ) -> EventLoopFuture<Int> {
        let args: [RESPValue] = [
            .init(from: key),
            .init(bulk: range.min.description),
            .init(bulk: range.max.description)
        ]
        return self.send(command: "ZLEXCOUNT", with: args, context: context)
            .tryConverting()
    }

In other words, as @pokryfka mentions:

(Baggage) Context MUST be per operation. Why is that so? Because in a trace view it is the operations we need to see, and in order to attach them to the right parents in a trace we need that context available, because it contains all the trace parent identifiers etc.

As I'm not sure everyone is on the same page about what these traces look like, I quickly put together a Zipkin tracer over the Swift Tracing APIs, and here's how those end up looking:

Now, we'd like Swift to give a best-in-class user experience when operating systems in the cloud, and a best-in-class tracing experience is a prerequisite for that. And tracing is not only about HTTP calls but also about databases like Redis; for example, here's what we'll gain if all client libraries participate in this context passing chain:

(Screenshots from https://medium.com/opentracing/take-opentracing-for-a-hotrod-ride-f6e3141f7941 -- great article / example app)

So right away we'll be able to know "aha, during my /dispatch handling I'm hitting Redis n times", see the "waterfall", and figure out how to improve my response latency, etc. It is tremendously powerful, and I hope the vision of all SSWG projects participating in this context passing dance and tracing instrumentation is one you can be excited about as well :slight_smile:

But yes, it does mean we need this context parameter in all operations. I believe it is a small price to pay for the benefits and for a coherent ecosystem where our users will come to expect "of course it just works!".

Side note: there is also a difference between APIs making network calls (database drivers, HTTP clients) and "just futures" in terms of "weight" and what needs to be instrumented by default, and what not.

For example, we would not want all NIO futures involved in a request to show up in a trace; that'd be way too much information, and mostly useless information at that! For futures we would instead want to trace specific ones, so eventLoop.traced(name, context).makePromise(...) is an excellent case for such "traced(...)" APIs. For network APIs, however, context propagation should be the default.

In any case, looping back to the APIs... similar to what @tanner0101 and @cory in the Redis PR linked are saying, the following:

public protocol RedisClient {
    var eventLoop: EventLoop { get }
    var logger: Logger { get }

    func send(
        command: String, with arguments: [RESPValue]
    ) -> EventLoopFuture<RESPValue>
    
    func setLogging(to logger: Logger) // not thread safe AFAIR?
}

changing into

public protocol RedisClient {
    var eventLoop: EventLoop { get }
    var logger: Logger { get }
    func logging(to logger: Logger)

    func send(
        command: String, with arguments: [RESPValue], 
        context: BaggageContextCarrier
    ) -> EventLoopFuture<RESPValue>
}

sounds good to me? It allows the baggage to be passed through the operations, and it allows the customization of the logger ad hoc...


:warning: Crazy future looking side-note / idea section :warning:

Now, there still remains a lot to be done and Swift's concurrency story is not fleshed out yet, but we know a part of it will be asynchronous functions. We also know that we have property wrappers, and there have been ideas floating around for function wrappers as well (Prepitch: function wrappers, not an official proposal, but a great idea).

So... if we have all libraries adopt context passing in a similar manner, we're in a better position to pitch a use-case for such wrappers: a @Traced wrapper, which could look like this:

@Traced
func hello(context: BaggageContext?) async -> String { // allow optional? :<
  // magically does: 
  //   let span = tracer.startSpan("#function", context: context)
  //   defer { span.end() } 
  return await produceThatGreeting()
}

which would automatically trace the function and attach the span to the right context. But we need this context parameter around (or one of the ...Carrier types).

If/when a concurrency story lands in Swift... we could further discuss: okay, we have this context that is "local" to this "task"; is this a common thing in concurrent systems (I think it is), and could we put some sweet Swifty sugar on top of it?

This cannot be solved well using thread locals (see the in-thread proposal about them, and also the discussion in Server Distributed Tracing - #28 by tachyonics) due to how diverse and reactive the async landscape in Swift is (Combine, Dispatch, NIO, plain pthreads, various OSS Future impls, ...), but...

Swift will have a concurrency story someday; then we will be able to formulate and express the need for an "async local" (terrible working name) abstraction, and then we'll be able to replace our explicit context passing with async-local context passing.

// ⚠️ crazy speculation, no idea if we'd ever get async context, but!
// at least it is a very specific, existing, well defined thing we
// can _point at_ and solve, rather than wait.
@Traced
func hello() async -> String {
  // magically does:                      // AsyncLocal<BaggageContext>.value
  //   let span = tracer.startSpan("#function", context: .current) 
  //   defer { span.end() } 
  return await produceThatGreeting()
}

So async functions could implicitly carry around a "context", and we'd use it to attach the baggage context there. Obviously, no idea if or when such a thing would manifest in Swift, but the general idea is sound, and we would then be but one of its specific and widely deployed uses. When arguing for such compiler features it is very useful to have an existing, working ecosystem and to point at a concrete pain point (the explicit context passing) the feature would address. My hope with this work is to do just that: prove the utility of tracing and justify the work on such async-local capabilities some day.

Perhaps we should reconsider optional context parameters? This would make future migrations painless...

@Traced
func hello(context: BaggageContext? = nil) { 
  // ... tracer.startSpan(context ?? AsyncLocal context)
}

after all...
:warning: End of crazy future looking side-note / idea section :warning:


This looks slightly wrong, or rather, it looks like a "mix" of things:

  • (a) either a library is not instrumented, so we need to start spans per operation on its behalf,
  • or (b) a library is instrumented so I pass it a context and it starts/ends spans.

I think it's important to put the full snippets here, because what seems like a simple small difference is actually pretty huge in terms of noise in user code:

So it is either:

// (a) the library does not start spans for the operation:
func handleSomething(bucketName: String) -> ??? {
  let span = tracer.startSpan("Segment 2", context: context)
  let x = s3 /* .tracing(with: span.context) -- what would that do? */
      .listObjects(.init(bucketName: bucketName))
  x.whenComplete { result in
      switch result {
      case .failure(let error): 
          span.addError(error) // and more "status" etc.
          span.end()
      case .success:     
          span.end()
      }
  } 
  return x 
}

or

// (b) the library DOES start spans when necessary for the operation:
func handleSomething(bucketName: String, context: LoggingBaggageContextCarrier) -> ??? {  
  s3.listObjects(.init(bucketName: "myBucket2"), context: context) // emits spans
}

I'm also worried about looking at such snippets in isolation; such calls are never made in isolation but in real applications, so more like:

Lambda.run { hello, context, complete in 
  let result = handleSomething(bucketName: "bucket", context: context)
  complete(result)
}

Which looks pretty natural.

Now, yes, we have to pass the context around... but we were most likely going to anyway, be it for logging, tracing, or "framework reasons". For the baggage context specifically, I do have hopes that whatever concurrency story we end up with, we can move from explicit to implicit passing of it (see the crazy idea section above). But in today's Swift we do not have that luxury.

Yes this sounds right btw! :slight_smile:


Complication of tracing(...) vs. (context: ...)

So overall this pattern of logging(to:).thing() is quite nice :+1:

I think it's great for configuration, so things like the event loop preference, and I suppose the logger as well, but it really does not fit context propagation: context should be passed to operations "along" the way the operations go, not via some side channel;

s3.tracing(with: span.context) // super worrying (!)
    .listObjects(.init(bucketName: "myBucket"))

I understand wanting to tone down the noise in the APIs, but (baggage / trace) context is really special and really must follow the execution paths, so that it's easy to mentally map spans to the actual operations made. I think it would be much better if we loosened up a bit and allowed an optional context:, if that is the main problem these patterns are trying to address. An optional context can still be a cause of issues, but it is much, much easier to spot, and indeed it's less of a pain for those who don't need tracing... (Though I'd argue you "need" tracing exactly when it's "too late" and you don't have time to suddenly go around and fix your whole codebase.)

Is the concern that we do not always have or want to pass context? Or is it about the library maintainers having to accept one?

If so, I think we can warm up to the optional context parameter... It has its own problems and risks, but it's a tradeoff. If people really do want to call APIs that make distributed calls without context, they can... It will break traces and cause issues, but if it gets us one step in the door with distributed tracing in the majority of APIs, that's already a good win. And we can hope to make async context a reality in the future, at which point the explicit context could be removed.


Why the big deal about "always" passing context, anyway?

Tracing can be expensive: it costs computation and slows down your responses, but it's also very useful.

So in reality, systems are never always traced; rather, they are always passing context, and sometimes that context is recording a trace.

This works via two flags, as described in the OpenTelemetry SDK guidelines:

  • The IsRecording field of a Span. If true, the current Span records tracing events (attributes, events, status, etc.); otherwise all tracing events are dropped. Users can use this property to determine if expensive trace events can be avoided. Span Processors will receive all spans with this flag set. However, the Span Exporter will not receive them unless the Sampled flag was set.
  • The Sampled flag in TraceFlags on SpanContext. This flag is propagated via the SpanContext to child Spans. For more details see the W3C Trace Context specification. This flag indicates that the Span has been sampled and will be exported. The Span Processor and Span Exporter will receive spans with the Sampled flag set for processing.

The same is true for swift-tracing.
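To make the IsRecording semantics concrete, here is a toy sketch (illustrative names only, not the actual swift-tracing types): when the sampler decided not to record, attribute writes become no-ops, yet the span still exists so the context keeps flowing.

```swift
// Toy sketch of the IsRecording behavior; illustrative names only.
struct ToySpan {
    let name: String
    let isRecording: Bool          // sampler decision for this trace
    private(set) var attributes: [String: String] = [:]

    mutating func setAttribute(_ value: String, forKey key: String) {
        guard isRecording else { return } // not sampled: writes are no-ops
        attributes[key] = value
    }

    func end() { /* exported only if sampled */ }
}

var dropped = ToySpan(name: "a", isRecording: false)
dropped.setAttribute("GET /users", forKey: "http.target") // silently dropped
dropped.end()

var recorded = ToySpan(name: "a", isRecording: true)
recorded.setAttribute("GET /users", forKey: "http.target") // kept
recorded.end()
```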

Now imagine this use-case:

// framework:
//   let span = startSpan(.empty) // Sampler decision: do NOT trace
//   // span isRecording: false, context.sampled: false
//   defer { span.end() } 
// then invokes:
handle { httpRequest, context in
  let span = tracer.startSpan("a", context: context)
  span.attributes[...] = request // set a lot of attributes
  // all those operations are no-ops, because we're not tracing (!)
  span.end()

  something() // whoops, forgot to pass the context
}

func something(context: BaggageContext? = nil) { 
  let span = tracer.startSpan("something", context: context)
  // Sampler decision was TRACE (!), 
  // so suddenly we're tracing from the middle of a request :-(
  // This isn't good: we uselessly consume capacity, and the traces are 
  // "partial" and weird, not what we want to see (!)
  ... 
}

This generally depends on how a tracing backend is implemented... but you see the problem: passing context even when not tracing a specific request is important.
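That failure mode can be sketched with a toy model (simplified: a "span" here is reduced to just its context): a child started with the parent context inherits the trace id and sampling decision, while an orphan started without context mints a brand new trace id, producing the partial trace described above.

```swift
import Foundation // for UUID

// Toy sketch of why dropping the context mid-request breaks a trace.
struct ToyContext {
    var traceID: String
    var sampled: Bool
}

func startSpan(_ name: String, context: ToyContext?) -> ToyContext {
    if let parent = context {
        // child span: inherits the trace id and the sampling decision
        return parent
    }
    // no parent context: new trace id, fresh (possibly different!) sampling decision
    return ToyContext(traceID: UUID().uuidString, sampled: Bool.random())
}

let root = ToyContext(traceID: "trace-1", sampled: false)
let child = startSpan("a", context: root)         // same trace, still not sampled
let orphan = startSpan("something", context: nil) // new trace from mid-request!
```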


Maybe the common ground is an optional context:?

I'll be the first to admit, though: yes, passing context objects around when you're not even using tracing at all is annoying.

Thus perhaps we can hit a middle ground with:

  • the context parameter SHOULD be the last parameter (before any defaulted or trailing-function parameters) of any operation function that is either intended to be traced, or may contain (!) operations that will be
    • context MAY be optional and defaulted to nil, although this is discouraged in general
    • (users may always easily pass .empty, after all)
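As a toy illustration of that guideline shape (the function and type names here are hypothetical, and the Bool return is only there to make the "chain broken" case observable):

```swift
// Toy stand-in for the real BaggageContext type.
struct ToyBaggageContext {
    static let empty = ToyBaggageContext()
}

// preferred: context is the last parameter, required
func listObjects(bucketName: String, context: ToyBaggageContext) -> Bool {
    true // a context is always present here
}

// allowed, but discouraged: defaulted optional context
func zlexcount(of key: String, context: ToyBaggageContext? = nil) -> Bool {
    context != nil // false means the trace chain is broken at this call
}

// callers who don't care can always pass .empty explicitly:
_ = listObjects(bucketName: "myBucket", context: .empty)
_ = zlexcount(of: "key") // compiles, but silently drops the trace
```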

Then, note that because this rule is so mechanical...

  • we could offer a linter check for people who do use tracing: simply find all functions where a context parameter is present, and if a value is not passed to it explicitly, warn about it. This will help people who are serious about using tracing :+1: We would not be able to write such a linter, or fix-its for Xcode, if the pattern were anything fancier, like "tracing(to:)" or similar.

Keeping the parameter as "MAY be optional" leaves the door open for APIs which do not want to be explicit about it (although, again, that comes at a cost...).

(Please comment on whether these sound reasonable!)


I've gone on far too long in this post, so I'll leave it at this; let's continue the discussion until we're all reasonably happy with the recommendations, guidelines, and types involved :slight_smile:

Sorry again that the post grew so big; due to my time away a bunch of topics accumulated here, and I wanted to make sure not to miss the important bits. Thanks, and I hope it's bearable to read!
