[Discussion] Server Logging API

sure, but with CustomStringConvertible the common case:

var index = 2
[(name: "index", value: index)]

is a bit nicer than

[(name: "index", value: "\(index)")]

and you can still use string interpolation or String(description:) on those rare occasions when you have something which is not CustomStringConvertible.

I really like having "file" and "function", but I would prefer not to include "line". File + Function + Message by themselves allow efficient navigation to the source of the message in the code so the "line" parameter just generates noise which will change frequently either making writing tests a bit more burdensome or generating a lot of test failures which need to be fixed. Granted, the line# could be omitted by a testing LogHandler.

I think here we have a very fundamental difference of opinion, and without addressing any other points of style or design, this needs to be addressed first:

I understand that, from your perspective, it must not be possible to use the logging API without choosing a logger.

I disagree. I proceed from the basis that, for this logging API to be learnable, teachable, and usable for beginners (and beginners we all were at one point, and a general API for logging must accommodate them), it absolutely must be possible to log something somewhere without knowing what a Logger, Logging, or LogHandle is. The barrier to logging something should be as low, if not lower, than printing.

I think all the remaining API questions would unfurl very differently on the basis of the settling this point.

8 Likes

You will commonly want a session ID id as well in order to trace the users actions before the specific request. In many deployments you'll also need an instance ID for the application and/or an application ID if you have different applications (sharing common libraries) feeding into the same logging system.

The Logger manages filtering and adding context, the handler takes care of formatting and posting, right?

If we could get info about the calling module (even adding to Swift a new #function-style parameter, #moduleName / #moduleIdentifier) I think it would be great to add a global logger.

I don't see the need to have Logging / LoggingSystem and Logger separated. Logging.make() should simply be Logger.init(identifier:) and Logging.bootstrap() should be something like Logger.addHandler().

2 Likes

The single global variable per module might be easy enough, if it were not for this:

There is a global default logger(-ish) already available. It's called print(...), and I know that I at least (and I suspect many others) would probably resort to just using that "just for now" and then never get around to changing it to something proper.

I think it is crucial this API is as frictionless to adopt as possible, and to do that it needs to be a drop-in replacement for print. If we could get that to carry module information next to file, function and line (structured SourceLocation :pray:) that would be great, but even with just the source location info we have, that would bee good enough, imo.

1 Like

Thanks @johannesweiss and @tomerd for putting this together.

A couple of initial things-

TL;DR

  1. I would recommend splitting Logging.make into-
    1. an API the application can call to create a LogHandler implementation to explicitly pass around after having bootstrapped the logging system (what the API is currently).
    2. an API to create a proxy for libraries to create a static logger that resolves the LogHandler implementation at log time to avoid instantiation/bootstrap ordering issues.
    3. an API to create a logger from a logger for libraries to create a Logger to explicitly pass around; again the library doesn't need to understand if the logging system has been bootstrapped - the existing logger has either been passed from the application who ensured bootstrapping occurred or is a proxy created by itself or another library.
  2. There is a potential performance impact of this for logging from libraries but potentially is worth it if the framework is more robust against incorrect bootstrapping.
  3. The ability for an application to "lock" the mutation from Logging.bootstrap to prevent foreign libraries messing with the logging system during application execution.

More detailed thoughts

Logging.bootstrap should have the ability to "lock" to the current _handler to prevent bad actor or poorly implemented foreign libraries suddenly changing the logging implementation on an application.

Tracing through OneGlobalLoggerExample.swift, everything - unless I have setup the code incorrectly - is actually logged to the StdoutLogger instance that was the result of the current _factory at the time the static logger was instantiated.

IMO, for the static logger case Logging.make shouldn't actually be creating implementations from the current _factory, which particularly for the case of static loggers which would require paying close attention to call and instantiation order. Easy to get wrong.

I feel it would be preferable that Logging.make simply create a proxy that resolves the actual LogHandler implementation at log time. Something like-

public protocol LogHandler {
    static func handler(for: String)
    ....
}

For the explicit passing case, this doesn't make sense however and perhaps there isn't a great way to unify these two use cases. Maybe Logging.make needs to be spilt into two-

  1. for the use case where the caller is responsible for bootstrapping the logging system (ie. is the application), a call to directly create a LogHandler implementation to be passed around.
  2. for the use case where the caller isn't responsible for bootstrapping the logging system (ie. is a library), they can create a proxy to specify who they are but they don't know what that actually means in terms of the underlying LogHandler implementation.

There is a distinction here between these two roles which the framework could make clearer-

  • if the caller is not responsible for bootstrapping the logging system (ie. a library) they shouldn't be relying on a global function to create LogHandler implementations as by definition they can't know if the logging system has been bootstrapped.
    • for the static logger case there should be an API to create a proxy that defers the determination of the LogHandler implementation to log time
    • for the explicit passing case, there should be an API to create a logger from a logger - the choice for a library is they create a static logger which is a proxy and pass that around or create further proxies from it or receive an explicit logger from someone who has bootstrapped the logging system and pass that around or create further direct implementations
  • if the caller is responsible for bootstrapping the logging system (ie. is the application), they have the knowledge to be able to safely create LogHandler implementations from a global function

Logger delegates everything to the underlying handler, so basically a "dumb" proxy into the concrete logger implementation. the only logic it has is basic log level filtering (L34-36) which together with the fact that message is an @autoclosure gives us a performance edge: we dont need to evaluate the message when the log level is insufficient

note that the log level itself is managed by the underlying handler, which enables implementating config based logging libraries which were mentioned earlier in this thread. this is demonstrated in GlobalLoggerTest::test3 and TestLogger in the unit tests. TestLogging uses in-memory config for simplicity, but a real implementation of a config based logging library is likely to use config files external to the application. the idea is to allow "hot" changes of log levels for specific namespaces via external resources like config files when debugging an issue in prod or iow without needing to restart or redeploy the application. this is a very common pattern in large scale systems.

to confirm, are you worried about racy initialization when using static loggers?

you are right the lack of such protection opens the door to mistakes. one idea we were playing with in earlier versions was to only ever allow a single bootstrap call. we eventually removed it from the proposal to be less opinionated / more flexible

I think that's a good suggestion. I filed the question and after we reach consensus on the core APIs, we can add (through fresh Pitches/Proposals) features before we release 1.0.0 for the logging API.

Very happy to discuss this but I think that's additive and if you don't mind, let's discuss it after the core issues. (filed though).

So that's fundamentally at odds with what others say (who feel adding one extra line let logger = Logging.make("program") is already too complicated.

I think we should provide a default so people don't need to know what a LogHandler is and how the log system works. The reason why I think it's okay without fully understanding it is that at any point in time you make add one line (Logging.bootstrap(MyBetterThanDefaultLogHandler.self)) to your program and the LogHandler is changed throughout the whole application, including any libraries.

Does that make sense?

I'm inclined to also punt this to 'after the core issues', I filed this issue too so we won't forget.

:+1:

Agreed!

Okay, I see what you mean: Even though the LogHandler is responsible for formatting the meta-data dictionary, there'll be some pre-formatting going on to put it into a [String: String] shape which might be a problem sometimes... Given the amount of great feedback however, I'd say let's decide that in a separate thread once we can decide on the core issues (issue for metadata types here).

It's just offered to the LogHandler but I would also assume that many if not most would just ignore the #line. I don't think there's any harm caused offering it so I'd leave it in there but I'm not married to it either.

1 Like

I don't have much to add to this (very promising!) proposal, but I think I would find non-mutating variants of the metadata setters useful (as an addition, not as a replacement, I'm sure there are use cases where mutation is the most practical solution). I.e. something like

public struct Logger {
    ...
    public func setting(metadataKey: String, value: String) -> Logger
    // or, maybe even:
    public func setting<A>(metadataKey: String, value: String, f: (Logger) -> A) -> A {
        var logger = self
        logger[medataKey: metadataKey] = value
        return f(logger)
    }
}

The reason being that it might be useful to have ways of guaranteeing that we don't clutter up the logger metadata inadvertently or reuse some metadata between requests because we forget to overwrite it.

Not quite, we provide a default LogHandler so if you don't want to customise it, you ignore it. But you will need to make a logger. The easiest way to get there is to just put a let logger = Logging.make("my-program") just under import Logging.

Let's have a look at what is needed to go from nothing to a log message:

  1. create a package: swift package init --type=executable
  2. add .package(url: "https://github.com/weissi/swift-server-logging-api-proposal", .branch("master")), to the Package.swift in dependencies: [ ... ]
  3. (assuming Xcode) swift package generate-xcodeproj
  4. xed . or open MyApp.xcodeproj or open the project in Xcode
  5. import Logging
  6. let logger = Logging.make("MyApp")
  7. logger.info("Hello World")

Is the only step we could avoid is step (6), is that one line really the make or break issue here? Especially that the documentation will need to tell you the module name (so you can import Logging), we can just add to the documentation that if you don't want to learn anything more you just copy & paste

import Logging
private let logger = Logging.make("MyApp")

?

Stepping back, I don't think if it's two or three lines of code is the most interesting point of discussion because we're talking about a logging API for a whole server application. Even ignoring deploying this to the cloud or something you'll use some (probably web) framework like Vapor, Kitura, Perfect, smoke or so because otherwise your barrier to entry is likely parsing/serialising HTTP. The authors of those frameworks know their target audience best and if they feel logging is too hard, then they can always offer a very straightforward logging functionality, be it a free function, a static class or whatever they feel is the best. They can offer a global logger that logs under com.company.framework.user-code or something.

The door is also open for another package which could be called JustLog which nothing but a global public func log(level: LogLevel = .info, message: String, logger: Logger = defaultGlobalLogger) { ... }. Again, it would be a compatible solution that I wouldn't recommend but also have nothing against.

On the flip side however (which I believe is more likely): If we did offer some global default logger and the framework wants you to obtain the logger from some request context, then the users of the framework are in a dilemma of which one to use. The framework one will be more appropriate as it might attache request ID, session ID, etc but the other one still always exists.

You don't need to know what a LogHandler is, the only thing you need to do is to paste let logger = Logging.make("MyApp") and from then on you can use logger.info/warn/error/....

Sorry but I can't follow here. print comes with the standard library and is always available. There's no way you can design a logging API has a lower barrier than print. Even the same barrier is only achievable if we shipped this with the standard library which isn't realistic because server logging APIs tend to be need more features than a client-side library. That is fundamentally because an iOS app usually handles one user at a time who does one thing. Server applications on the other hand might be handling thousands of concurrent requests and what people usually want is a log stream for one particular request.

Also: print will not go away, if print is good enough for your application: Go ahead and print. If however you're creating a library then you probably shouldn't use print and I agree the bar should be low. IMHO pasting one simple line of code is low enough. There are quite a few other things you'll need to learn when you want to publish a SwiftPM package (SwiftPM itself, git tags, SemVer, ...).

And a JustLog package which simplifies the API proposed here for some very straightforward use-cases wouldn't be able to cover what you're after?

6 Likes

First, thanks @johannesweiss for your work on this proposal!

As an alternative to @yxckjhasdkjh's suggestion to limit logger mutation, I think that passing metadata directly in the log function could be a good idea, eg:

public struct Logger {
    ...
    public func trace(_ message: @autoclosure () -> String, _ metadata: @autoclosure () -> [String: String] = [:], file: String = #file, function: String = #function, line: UInt = #line)
    ...
}

That way we can guarantee that logger itself is not mutated.

Unfortunately, that won't quite work. There will be multiple distinct LogHandlers and when creating a Logger you want to have a Logger that is configured with the 'correct' LogHandler. Now, a library can't and shouldn't know what the correct LogHandler implementation is because that should be chosen by the application.

The one and only job of Logging.make is to create a Logger with the correct LogHandler as selected by the application (using Logging.bootstrap).

But let me explain if we didn't have that. Then Logger's constructor would probably look like: Logger(identifier: String, _ logHandler: LogHandler) and for the whole thing to work together, each and every Logger instantiation would need to chose the exact same LogHandler. Otherwise, some logs would end up in a database, some would be saved to a file and yet others would end up on stdout.

This proposal is purely about providing the API to the logging system, it deliberately doesn't touch on LogHandler implementations but tries to support a wide range of different approaches to logging. Said that, the API package would ship say StdoutLogger which is a very simple implementation which just prints to stdout.

Thanks, I think this makes sense and I'd be totally open to also allow adding 'temporary meta-data' to each of the logging calls. Would that work for your use-cases @yxckjhasdkjh?

Hey Johannes, Tom, everyone,
great work overall here! I think this is overall very good and not much changes are needed. Good minimal baseline of APIs that I hope we'll be able to agree on and ship :-)

I have spent a significant today (and some time ad hoc before that) on making sure that potential use-cases are well served by this form of the APIs and over all I think we're good here. I specifically focused my attention on the metadata aspects and initializing/obtaining loggers, and how it would work with libraries – I think that's the more tricky parts as well.

So overall I think the proposal is quite solid, wanted to structure the feedback some more though, so here we go:

Overall types:

The separation between Logging, Logger and LogHandler is nice and clear, I like it. Implementing proxy of fan-out LogHandlers is possible and should work just fine.

Loglevels

Since this seems to have been brought up a few times in the thread wanted to share my perspective as well. I think we're good as is, and having the levels available as an enum is very useful for uniformity of logging across libraries.

Very good to include trace, as it's a very useful level for library authors that want to include a lot of logging however it would be "too much" for debug level.

Custom log levels: I do not see a need for custom log levels "as custom log levels", however the use case can be solved by metadata or configuration on log handlers, where one would add "component: database-stuff" and have a LogHandler that inspects metadata and e.g. does not log the database-stuff; No need to complicate the log levels. This is also exemplified by the GlobalLoggerTest::test3 Tom mentions.

Metadata

Getting the bikeshed out of the room: love the use of Metadata rather than MDC, and I'm coming from an MDC-heavy ecosystem :wink:

I have verified that common Metadata / MDC patterns are fine to implement using this API, e.g. one typical pattern being that some metadata is bound to the logger's "context" or lifetime, however some metadata is added per each invocation of some user handler adding (and potentially overriding) the life-long data. After such invocation the Metadata is restored to the previous state. I see this pattern as very very common, and I think we're good here thanks to exposing the metadata setter, example:

    let perContextMetadata: LoggingMetadata = context.log.metadata ?? [:]

    let data = Data() // we got it from somewhere
    
    // naively merging things
    let moreMetadata = extractMeta(from: data, withBase: perContextMetadata)
    var allMetadata: LoggingMetadata = perContextMetadata
    allMetadata.merge(moreMetadata, uniquingKeysWith: { (l, r) in r })

    context.log.metadata = allMetadata
    defer { context.log.metadata = perContextMetadata }
    userCallback(&context, data)

So the above snippet gives all the right semantics, a "long lasting" metadata container can be overriden by a more specific one, and at the same time we handle the "reset to previous state" quite well – thanks to allowing the set on the metadata itself.

Metadata: temporary metadata

The just mentioned by @artemredkin addition for per log statement metadata may be fine, however it may be worth adding this as functionality instead as a scope: .withMetadata(@autoclosure ...) { within this block that metadata is added } instead.

This allows for nested calls within such block to work properly – in addition to just "the calls I know about". This shows up again in tracing examples where sub calls should expose the same "request id" or similar as as the encapsulating call.

Metadata: signature discussion: var metadata: LoggingMetadata?

It is not stated explicitly, but after stating at the API for a while I realized that the reason to allow nil is to support the metadata = nil as a way of clearing the underlying data. I "get it", but it took me a (longer) moment to realize this.

At the same time we end up paying all the "is it nil or is it just empty" checks.

I would propose skipping the Optional wrapper on this type signature, and ask implementations which do not handle metadata to use = LoggingMetadata() (which is exactly [:]) as the "nil object" instead. This allows us to avoid the creeping up of such code patterns: self._metadata![metadataKey] = newValue (multiple ! and ? found in the examples), as well as makes general interactions simpler I believe, no need to differentiate the "nil" from "no metadata" IMHO.

Metadata: type signature, do not force String values

This has already have had some input in the thread ( and has a ticket decide type for metadata value · Issue #5 · weissi/swift-server-logging-api-proposal · GitHub ), but as it is the one "biggie" I have about the current proposal I wanted to shine some more light onto it with a specific use case, so here we go:

I also made a PR to showcase the specific type of situation I have in mind when I talk about this change, see here: [DO NOT MERGE] Exploring slight changes to APIs by ktoso · Pull Request #6 · weissi/swift-server-logging-api-proposal · GitHub

More details are in the PR, however the main point is the following:

I really would want to avoid forcing the metadata values to be String, because "rendering" the string representation of some metadata may be "wasteful", especially if the metadata is some binary structure, that in order to be printed has to be formatted a little bit, e.g. like a traceId. (Please note that tracing and logging are quite separate things, however the one place where they do overlap is wanting to log a traceId in all log statements within such trace's execution. I've seen this in action in a number of systems, though here's a nice discussion as well with Nike describing their use of this pattern with Zipkin: Question: Can zipkin be used to replace logging? · Issue #1453 · openzipkin/zipkin · GitHub ).

Consider the following situation, which we'd like to avoid:

  • we have some library code setting up metadata in the logger before invoking the user code
  • user code MAY log things, so we need to offer it the trace ID as well,
    • if we force values to be String, then we cannot offer our TraceContext as a value for "trace" metadata key, and we are forced to render it before invoking user core, we don't know if it will be used or not.
    • consider this rendering to be "expensive enough that I care about not doing it if I don't have to"
  • user code is invoked, does not log anything – thus our rendering of the metadata was in vain.

In contrast, allowing some other type as value that is "lazy in rendering the String representation" (I don't really care all that much which type we pick for this... I would be fine with Any actually, since the purpose is only for "print it"), the execution is as follows:

  • library code sets "trace": trace, trace being the "optimal native" representation of the trace (a bunch of integers)
  • user code runs, does not log
  • we never invoke rendering of any metadata objects as they were never used. Yay, we did not waste cycles.

The reason this is important is that we DO want to have many debug logging statements around for the time when we might need them. However we most likely don't have them print / turned on all the time. We do so because of the a) noise they generate and b) because it takes time to print/render things, which impacts latencies. However if we do not delay rendering of metadata... then regardless of what loglevel we set, we would always incurr the pentalty for rendering the not used metadata :scream:

TL;DR; The metadata API should allow either Any as value (which in quick silly benchmarking I did not find to be a performance issue hmm...), or some other type that "delays the rendering", be it CustomStringConvertible, some LoggingStringConvertible or anything else the community decides upon.

Minor naming bikesheds...

Minor things, feel free to ignore, or ponder about:

  • Logging.make("hello.world") – Not in love with the "make" in the name... reads very weird. I don't "make a logger", neither do I "make the hello.world". I understand the need for this rather than a plain init since the dance with the lock, but perhaps we can work around it somehow... Logging("...") or Logging.get if we need the method would read nicer. Anyway, minor thing, does not affect usefulness of the API.
  • freezing and/or intercepting bootstrap() calls is somewhat very tempting to myself and my use cases. But I like going forward with a minimal subset and only later on figuring out the additions here. (This is a thing that could be added later on on Logging if we decide it would help). I think we're fine to start with "recommendations" in documentations and if we see there is a lot of problems here then we can figure something out. Only instance where I can imagine issues is where 2 "runtimes" / "frameworks" collide by wanting to take over the Logging.make capabilities.

Having that said... the overall API looks great, and I'd like to polish the Metadata part a little bit and it would, in-my-eyes, seem to be ready to go. We don't aim to solve all things, but get a "common ground" with this API after all :slight_smile:

Hope this helps!

7 Likes

First of all, thanks for doing all the validation work, that is super useful!

This would require us to store the information in a thread-local variable and I'm not sure if we should offer this with the base API. You could build that as library functionality but I see how it might be nicer but my personal opinion is still to start without it and to maybe add it at a later stage. What do you think?

Actually, it being nil was more meant for LogHandlers that don't want to support logging meta-data at all (they would just always return nil from the getter and throw the value away in the setter). But maybe we just won't support that at all.
As you point out, removing everything is just logger.metadata = [:] or logger.metadata.removeAll()

I hear you and I agree with most things. The only reason I'm not fully convinced yet is that I don't think it's right to abuse the logger as a general meta-data storage. I know this is useful and needed for at least logging and tracing but I feel we should do the 'minimally viable metadata-storage' on the logger for now and then come up with a more generally useful and strongly typed meta-data storage type later. That could be called Context and could be able to give you access to the logger, its metadata, the tracing metadata and other contextual information. But my opinion is that we should keep this separate from this proposal but I do understand if people feel differently. So if it's a matter of making the meta-data [String: Any] to get everybody happy I can probably be convinced :wink:

1 Like