[Discussion] Server Logging API

Needless to say that as soon as this is available to us the server logging API should incorporate all of the exciting performance wins that will become available using custom string interpolation and compile-time interpretation of that.

Our API will likely always be wider than osLog because of the extra requirements but we will still be able to leverage performance gains by the new compiler/language support :raised_hands:.

2 Likes

Questions for clarification:

  • Are "Log", "Logging", and "Logger" distinct types or typos? Why do API names jump between using these three words seemingly randomly?
  • Does "Log", "Logging", and/or "Logger" conform to "LogHandler". If not, why not?

Overall feedback on the API:

This design seems to deviate pervasively from sometimes basic Swift design precedents, and it does not appear to justify those deviations. For example:

  • Why is initializing a logger spelled as a static function make and not as an initializer?

  • Why are subsidiary types such as LogLevel and LoggingMetadata (which, as per above, why aren't these both Logging or Log?) not nested types as per the example of Int.Words, Mirror.DisplayStyle, String.Index, Unicode.Scalar, and others?

  • Why does the protocol-with-global-default pattern here deviate from the precedent of RandomNumberGenerator, with the protocol named Log and the default named DefaultLog?

  • Why does using a specific logger not follow the pattern of print(_:to:), with the global logger as the default argument for to:, instead requiring users to find some way to spell the global logger? Critically, why is the "currency" (i.e., go-to) function not called log?


From the user perspective, and with due respect for principles of progressive disclosure, I would propose that the most basic API should look like this:

// Log a message to the default logger, with the default level
log("Hello, world!")

// Log a warning or error to the default logger
log(.warning, "Hello, warning!")
log(.error, "Hello, error!")

// Log a message to the default logger, with the default level,
// with custom source location
log("Hello, world!", file: "hello.swift", function: "", line: 2)

// Log a message to a custom logger, with the default level
var customLogger = CustomLogger("com.example")
log(to: &customLogger, "Hello, custom world!")

// Log a warning or error to a custom logger
log(.warning, to: &customLogger, "Hello, custom warning!")
log(.error, to: &customLogger, "Hello, custom error!")

// Append some metadata
customLogger.metadata["world"] = "hello"

// (Incidentally, why does the proposed API have a distinct subscript
// to access metadata keys and values *and* a publicly accessible
// member `metadata` that's a dictionary?)

Once we have a consensus on how use sites should look, the rest of the design can be derived on that basis.

10 Likes

First, thanks for moving this API forward.

I have some concerns with the proposed API:

  • There should be an easy & quick way to access a shared Logger instance without needing to make one. The API should include one and should be the default. Maybe something like Logger.default.info("Starting process...") or even better, using static methods Logger.info("Starting process...").
  • If I have understood correctly, it seems that with Logging.bootstrap() you can only attach one LogHandler. As with the Python logging API, I think we should make very easy to attach or detach any number of handlers to the logging system. Something like Logger.attach(handler:) and Logger.detach(handlerIdentifier:).
  • Each handler and logger should include a unique identifier. It would be great if using the default Logger, implies sending the messages with the identifier or name of the module, but I don't know if that's currently possible.
  • The existence of Logging and Logger is a bit confusing, I rather prefer to only expose a Logger struct and a Logger.Handler protocol.
  • Logger.warn() should be Logger.warning().
  • The LogHandlers should receive the identifier of the Logger being used, so we could filter messages emitted from some given subsystem or module.
1 Like

Thanks for the feedback, let me address them one-by-one

No, they are distinct and I'm obviously totally open to change the names or add clarification:

  • Log: I can't find this anywhere
  • Logging is the logging system itself, the one where you get Loggers from and the one where you register your LogHandler
  • Logger is the type that you invoke info, warning, error, etc on. It's the type that users would interface with

You would ask the logging system (Logging) to make you a logger (Logger) emitting its messages to a logging backend LogHandler. In most cases, the main function will choose a LogHandler, install this into the Logging system. Libraries and other parts of the program will never interface with a LogHandler, they will just ask the Logging system for a Logger if they don't get handed one from their execution context.

  • what is Log?
  • no, neither Logging, nor Logger conforms to LogHandler. Logger holds a LogHandler and Logging creates Loggers.

That's a fair enough suggestion, happy to take it if others feel similarly.

There is no global default. There is a very barebones LogHandler that is installed if you forget to choose one (StdoutLogHandler). A server application will likely have opinions about the logging system they want to use. They will need to create their own LogHandler and make the format match their Splunk/Kibana/... configuration so it can later on be parsed by those systems.

I'd say most serious server applications will have metadata attached to the logger itself. If you don't attach a request ID for example in many systems you might just as well not log at all. So I think

logger.info("hello")

just looks and feels better than

log(.info, "hello", &logger)

but obviously that's subjective. Also we would not hog a global free function (log).

Is there such a thing as a 'default log level'? And again, we likely need some metadata so people should re-use loggers if possible. It should always be easier to use a logger that may already be available than some default logger otherwise attaching the correct meta-data is very hard.

Even if people do the moral equivalent of what you write above which would be

Logging.make("com.myapp.MyComponent").info("Hello World")

then at the very least we have some component. Remember that in server software you often want to configure a different log level per request (say you do every 10,000th request at .trace level) or per subsystem (if something went wrong with the com.myapp.MyComponent component you want to log that component at say the .debug level and everything else at .warning). But obviously many requests and components are active in the same process concurrently.

3 Likes

Thanks very much for your thorough feedback!

I think I disagree here. If there's a default logger available, everybody will use it because we're lazy. But I do understand that in certain places this just what people want. I thought that we cater enough towards that crowd by supporting putting a single global variable

internal let logger = Logging.make("com.cool-library.TheBestLibrary")

in the TheBestLibrary. Then everywhere in TheBestLibrary you can just write

logger.info("hello")

But at the very least you need to reveal 'who you are'. Is this one extra line per module really such a big deal?

You did understand that correctly. If you really want multiple, why wouldn't you install a MultiLogHandler that would allow you to attach and detach handlers? I deliberately didn't want to design for 'all eventualities' but have an API expressive enough that you can get there if you need to.

Agreed, every logger should have an identifier but I'd like not to have a default logger if possible. AFAIK we can currently only get #function, #file, #line, #column and #dsohandle. #dsohandle in most SwiftPM applications will be the same across the whole executable (including all libraries). But if getting the current "scope's" name (ie. current type if any/module/...) were possible I'd also be in favour of using it.

Yes, I should have made a better job explaining that. Logger is the logger you call your log methods like info, warning and error on. Logging is the logging system where you obtain loggers from. Very happy to rename if anyone has good suggestions. Previously I had it down as LoggingAPI but that's not too nice, LoggingSystem would maybe be more clear but it a bit long. On the other hand, apart from the initialisation in the main function, it's only ever used to create new loggers.

Cool, happy with that, filed.

Oh sorry, yes I forgot to add that to the protocol, filed.

3 Likes

I do not want to hijack this thread, nor do I intend to ruin its success. I've seen quite a few logging approaches in Swift over the past few years and I also stumbled into few corners where I wished we had more flexibility on how we can obtain logs.

By the end of the day logging is just another way of printing a message to some sort of destination stream. That makes me wonder of we potentially could improve the Streaming API in the stdlib and either generalize print for logging, or build a standalone API on top of the improvements.

Here is one thread which I wish we could pursue as part of this discussion (if that's a reasonable thing to do):

2 Likes

Really good start. That said, I really encourage us to stay open minded and to challenge our core assumptions on what the ergonomics of a best-in-class logging API would look like for Swift. In other words, we have the ability and the responsibility to break away from the standards defined by other languages that lack expressiveness (like Java) where it makes sense to do so.

1 Like

I think that's exactly the right spirit. I'm not claiming this being superior to anything else, it's just the best we could come up with so far given the constraints.

I like the overall design very much. I also think though that it would be beneficial to have some kind of MultiLogHandler as you called it shipping by default. I guess that many developers want to use multiple logging platforms, which could each provide their own handlers.
Of course this could be done by any developer after the fact, but having one 1st party solution might be beneficial.

Another design I could imagine could work like NIO's channel handlers. Provide a pipeline for log handlers, where you can attach and detach as many and in whatever order you like. (I would actually prefer this to the one MultiLogHandler design)

3 Likes

I know it's sometimes considered an anti-pattern to use nested declarations as a sort of namespace, but perhaps we can have something like this?

/// Renaming of `LogHandler`
public protocol LogMessageHandler { ... }

public enum LoggingSystem {
    public struct Logger { ... }

    public struct Configuration {
        let handler: LogMessageHandler
        let metadata: ?? ?? // Bikeshed?
        let minLogLevel: ?? // Bikeshed?

        // pseudo until discussion on metadata & log level API
        public init(metadata = [], minLogLevel = .trace, handler = StdoutLogHandler)
    }

    public static bootstrap(with config: Configuration)
}

// usage
let configuration = LoggingSystem.Configuration(handler: MyCustomHandler())
LoggingSystem.bootstrap(with: configuration)
let logger = LoggingSystem.Logger("com.myapp.MyComponent")
logger.info("Running startup...")

I definitely would like to see this provided by the library - but it can wait until the underlying design is done.

EDIT: As I think we should keep the discussion on the core foundational API, and additional feature requests / discussions happen in the original pitch thread or we create new ones.

This is an amazing start! Some thoughts:

  • I've found the ability to attach multiple LogHandlers (or the equivalent) to be of considerable use in other logging systems, especially when hooked into whatever configuration system is in use.
  • I wish we didn't need to attach a bare-bones handler by default. I agree that making the failure to do so fatal creates an excessively high barrier to entry, but it's something very easy to just leave in place and never bother with later. Could we possibly print a warning when that behavior happens so users get nagged about it? (And, of course, permit explicitly specifying the StdoutLogHandler to say "yes I really just want this" and silence the warning.)
  • I like the subscripting for metadata, but I think they would be better CustomStringConvertible than straight-up String, or maybe a new CustomLoggerMetadataConvertible protocol if the former proves difficult for whatever reasons. The flip side of this is it potentially hits performance issues - I don't know what unwrapping existentials at runtime costs these days versus static dispatch.
  • I think an MDC API is overkill at this layer, it's certainly easy enough to add such a layer if one wants it. Maybe something that could be built as an optional add-on.
  • In general more control over how the metadata gets formatted might be worthwhile. You linked to the OSLog proposal already; that has some very interesting ideas for possibly making the metadata keys strings after all, such that the interpolation would be used to generate already-well-formatted strings.
  • I do like the idea of namespacing things - Logger.Level etc. strikes me as more "Swift-y" than LogLevel.
1 Like

I didn't include that in my snippet proposal above, but I definitely agree with this.

Personally, this is something I'd like the library to have an opinion on and provide me - but we can do it as another abstraction provided by the library, rather than baked into the lowest implementation.

I get the feeling that we should let LogMessageHandler/LogHandler should be responsible for the formatting and just receive passed information metadata, identifier, level, etc. - perhaps some sort of Context type?

that is exactly the design. LogHandler (ie the specific logging implementation) determines formatting

This post has several different not-interrelated bits, so I've broken it into sections:

Formatting

On reflection, I find myself oddly ambivalent about how I think formatting - especially for metadata - should be handled. Consider these examples:

protocol CustomLoggerMetadataConvertible {
    var descriptionForLogging: String { get }
}
extension CustomLoggerMetadataConvertible {
    /// Default to simple description for all objects
    var descriptionForLogging: String { return String(describing: self) }
}

struct MyComplexContextuallyImportantThing { /* ... */ }
extension MyComplexContextuallyImportantThing: CustomLoggerMetadataConvertible {
    var descriptionForLogging: String { return "\(/* machine-parseable logging format */)" }
}
extension MyComplexContextuallyImportantThing: CustomStringConvertible {
    var description: String { return "\(/* human-readable format */)" }
}

//...
let context = MyComplexContextuallyImportantThing(/*blah*/)
myLogger[metadata: context]
// OR, remove the `CustomLoggerMetadataConvertible` conformance and:
myLogger[metadata: "\(context./*logging format*/)"]

The two different ways of setting the context as metadata have some interesting tradeoffs:

  • The protocol conformance allows the same magic that CustomStringConvertible does for String.init(describing:): Each type knows how best to represent itself for logging and can do so without that knowledge leaking into the logger setup.
  • On the other hand, the protocol conformance represents (a very constrained form of) spooky action at a distance - the code setting the metadata has no control over how a given object may represent itself and changes in that representation may break things.
  • Using a new protocol instead of CustomStringConvertible allows an object to represent itself differently inside a logging context than outside, and the default implementation avoids needing anything other than the conformance declared fro objects that don't care.
  • You still end up needing the conformance, unless the subscript takes Any and does as? checks for conformance at runtime.

This is ultimately the same problem as shows up in any use of conformances - the decision about where the control should be is complicated. I see the metadata subscript as having several possible signatures:

// Strings are represented directly
public subscript(metadataKey metadataKey: String) -> String? { get set }

// `String(describing:)` is used
public subscript(metadataKey metadataKey: Any) -> Any? { get set }

// Same as previous, but also checking for a `CustomLoggerMetadataConvertible` conformance, and using it if present
public subscript(metadataKey metadataKey: Any) -> Any? { get set }

// Only conforming types may be used - good thing `CustomLoggerMetadataConvertible` doesn't have Self or associated type requirements!
public subscript(metadataKey metadataKey: CustomLoggerMetadataConvertible) -> CustomLoggerMetadataConvertible? { get set }

// Same as above, but using `CustomStringConvertible` instead of a separate protocol
public subscript(metadataKey metadataKey: CustomStringConvertible) -> CustomStringConvertible? { get set }

I'm not sure where I personally land on this one. I like CustomLoggerMetadataConvertible (name open to debate of course :slight_smile:), but I don't like having to conform all metadata I might use to it and I also don't like the idea of the accessor having to take and return Any. Maybe have two overloads, where the second one takes String or CustomStringConvertible, though that obviously complicates LoggingMetadata's internal representation.

LogHandler API

Log handlers need func openLog(file: String, function: String, line: UInt) (maybe without the parameters, I'm not sure how useful they are in this context) so logs that care about/need initialization after init() but before the first log message aren't forced to maintain a "did I do my extra setup yet" flag.

There's also potential room for func closeLog(/* etc */), with the caveat that a log handler can never rely on the close function being called since it wouldn't be in the case of sudden, abnormal, or crashing termination. ... In fact, come to think of it, that caveat is so significant and so easy to accidentally ignore (there's certainly no way for the compiler to enforce it!) that it's probably better not to provide a close method at all. I'm only leaving the thought in this post for the sake of transparency of thought process (in other words I don't want anyone to accuse me of not thinking of it :laughing:).

Custom log levels?

An additional issue I find I run into in real-world usage is that the predefined log levels provided by a given logging library are almost never perfectly suited (often not even well-suited) to the needs of my project. For example, in Vapor-based services I tend to use LogLevel.custom("database") to enable separating (often very noisy) database query logging from other logging. Obviously this is difficult to make cleanly extensible in Swift. I find Vapor's .custom(String) API undesirable due to being stringly-typed - I always end up wrapping it in a non-trivial amount of extra logic to avoid accidental misuse, and that still doesn't address the problem possible memory and performance issues from adding a case with a payload to the level enumeration.

The problem is I don't know what an alternative to the stringly-typed case would be. Even if pure Swift supported "open" enums (which it doesn't), conflicts between custom cases would be immediately apparent between modules (which is part of why open enums aren't a thing, if I recall!). This is also a problem with the stringly-typed method. Namespacing of log levels would tend to be painful to use, and reading the results as much or more so. This might be addressable by making LogLevel a struct (in the style of OptionSet) instead of an enum and working out a means by which conflicts would be minimized.

The real question is, at that point, is it worth the complexity? I find that it is, but I don't know that it would be for the majority of projects.

Logging configuration

This brings me to the other feature I find myself missing consistently in Vapor's support that this proposal doesn't yet address: Configurability of log levels. This covers:

  • Do I log messages at this level at all?
  • Are log levels strongly ordered (so that a currentLogLevel of warn covers both warn and error but nothing else) or individually controlled (such that the current level would be represented by something like Set<LogLevel>)? If the latter, is the "level" more accurately described as a "type"?
  • Is there value in an additional field such as "subsystem" and/or "category" (as found in os_log()) for providing (even) more control?
  • Should this decision be left up to LogHandlers or be encapsulated by the Logger?
  • Is "current level" global, per-Logger, per-LogHandler, or something else?

I'm not sure whether or not this is even in scope for this proposal, but I thought it worth mentioning. If it is, I do have some thoughts on API for it, but it's not sensible to take the time for that unless it's in scope and until some of the questions are discussed.

And so on

I could keep going with more ideas, there's so many things a really "robust" logging API might do, but I think I'm already on the knife edge of the complexity threshold - one of the explicit design goals of having this in SSWG is to make it both simple and robust, after all.

Apologies to anyone whose thoughts I've repeated without giving due credit - I didn't read the entire thread thus far exhaustively; I've only skimmed.

6 Likes

to add a bit of background, the core idea behind the proposal is offering a standard logging API, not a standard logger implementation. in the server ecosystem, you want strong decoupling between the logging API and the implementation for two main reasons:

  1. you dont control the operating system and architecture, so may want to plug-in architecture/os specific logger implementation when writing cross-platform systems, eg use os_log based logger implementation on darwin, and something else on linux

  2. large enough server deployments typically integrate with log indexing systems such as splunk, or elk, or similar systems that enable scalable monitoring / operation of the fleet. the integration with such systems may impose certain requirements on how log entries are dealt with, for example what format they are written in (so they are easier to index), or if they are written to files, or posted over HTTP to a remote sever, or put in a queue, etc. this means that making opinionated logging decisions at the calling site (API) can get in the way of the application trying to integrate with such systems. similarity, if the application dependencies (ie libraries) make opinionated logging decisions or assumptions, that will get in the way of the application trying to integrate with such systems.

for this reasons you want to:

  1. define a logging API that is unopinionated enough so that all libraries in the ecosystem can rely on it without getting in the way of composition

  2. define a pluggable logging system that allows application to make the opinionated decisions by bootstrapping with a logger implementation that fits their operational model. eg some may want to write to files in a human readable format, but others may want to send to one of the log indexing systems mentioned above in json or key-value format. similarity, you may want to do different things when developing/debugging vs when in testing vs when in production

tying this back to the proposed API:

  1. Logger is the unopinionated calling site API which users are expected to interact with the most. its designed for libraries (eg database drivers, web frameworks) developers so they can provide transparency to their users, as well as applications developers to log throughout their application code

  2. Logging is the pluggable logging system, designed to allow the application owner to plug the logger implementation suited to their operational needs. this is done by calling Logging.bootstrap in the application main

  3. LogHandler is the API designed for developers of logger implementation, aka logging libraries. iow, this API is not designed to be used directly by library (eg database drivers, web frameworks) or application developers. there will probably only ever be a handful implementations of this API in the ecosystem

one other point of reference for the reviewers of this proposal are the unit tests (GlobalLoggerTest, LocalLoggerTest). they include different implementations and usage of these APIs that could answer some/many of the questions raised on this thread

4 Likes

I suggest adding the parameter

data: [(name: String, value: CustomStringConvertible?)]? = nil

to each of the methods used to emit a message (i.e. trace(), debug() etc.). This makes it easy for the developer to provide additional information with the message structured in a way that enables the LogHandler to format it consistently. The only real "gotcha" for application developers would be that the CustomStringConvertible would need to be thread safe or value types. Of course, LogHandlers would need to extract their strings and release that reference promptly. I find having the "name" and "value" labels useful when specifying data which contains several data tuples but of course that's personal taste.

I also think its worth considering that "context" is one or more name/value tuples that a Logger prepends to any data received with each message. Adding a feature

public struct Logger {
     forwarder (level: LogLevel, context: [(name: String, value: CustomStringConvertible?)]? = nil) -> Logger {}
}

which returns a Logger that prepends its context to the data and forwards the message along to its parent (or filters it) could also be useful.

A design which enabled adding arbitrary log levels through extensions would be desirable, but the non enum based implementation required might introduce too many issues. Without doing the analysis needed to really understand the trade offs, I think I would prefer library fixed levels. However, I would suggest a somewhat more granular set of options:

↓ Levels that would generally be activated in development, maybe in testing

  • trace
  • debug

↓ Levels that would generally be activated in testing, maybe in production

  • info: normally expected event
  • warning: an event which may or may not indicate a problem

↓ Levels that would generally be activated in production

  • business: normally expected event whose entry is consumed by other systems
  • error: an event which indicates a problem
  • emergency: the system requires immediate attention

The task of isolating logging output by subsystem strikes me as orthogonal to log levels. Having application developers provide different loggers (set to different log levels) to different subsystems strikes me as a simpler approach than enabling the log system to filter by both level and subsystem. The "forwarder" function I propose above would make this easier.

One thing I see about these APIs is that they violate the recommendation of Ole Begemann: Don’t use CustomStringConvertible as a constraint, because "anything is printable".

1 Like

Yeah, that was one of my concerns with using CustomStringConvertible - though in a "logger" context, "anything is printable" may actually be the desired behavior. I'm not sure if that really applies to metadata, though, especially when the desire is specifically to have metadata that can be used to machine-parse the log files.

sure, but with CustomStringConvertible the common case:

var index = 2
[(name: "index", value: index)]

is a bit nicer than

[(name: "index", value: "\(index)")]

and you can still use string interpolation or String(description:) on those rare occasions when you have something which is not CustomStringConvertible.