Logging levels for Swift's server-side logging APIs and new os_log APIs

Hi all,

As some of you may know, two new logging APIs are currently under development/evolution: a general-purpose logging API for Swift discussed here aimed at server-side programming, and a new Swift API for Apple's os logging system which is discussed here. The former is aimed at server-side development and needs to accommodate different logging backends and platforms, whereas the latter is meant for Apple's systems/devices and is meant to interface with Apple's native os-logging system. While there are differences between server and device-side logging, it would be desirable to provide a common interface and a consistent user model for functionality that is common across server and devices. A step towards this is to standardize the log levels, such as info, debug, error, that would be supported on both server and devices.

The aim of this post is to converge on this common set of log levels along with their recommended semantics. We propose to adopt the log levels defined by syslog (standardized by RFC 5424), which are listed below. They mostly only add to those available in os_log and the current server-side logging proposal. The levels also have a numerical code which is standardized by syslog and is listed below.

Level Numerical code Recommended semantics
Debug 7 Messages meant to be useful only during development. This is meant to be disabled in shipping code.
Info 6 Informational messages that are not essential for troubleshooting errors. These can be discarded by the logging system, especially if there are resource constraints.
Notice 5 Conditions that are not error conditions, but that may require special handling or that are likely to lead to an error. These messages will be stored by the logging system unless it runs out of the storage quota.
Warning 4 Abnormal conditions that do not prevent the program from completing a specific task. These are meant to be persisted (unless the system runs out of storage quota).
Error 3 Error conditions
Critical 2 Logging at this level or higher could have a significant performance cost. The logging system may collect and store enough information such as stack shot etc. that may help in debugging these critical errors.
Alert 1 Action must be taken immediately
Emergency 0 System is unusable

The semantics description proposed above is only meant to be a guideline and is borrowed from the syslog standard and os_log documentation. Each logging backend may more accurately define the levels in accordance with this recommendation. In the case of Apple's os logging system, the levels such as notice, warning, critical and above do not have a namesake in the system. They will be mapped on to levels that closely matches their recommendation. In this case, those are probably default, error and fault.

We welcome feedback on this.

cc @Ben_Cohen

15 Likes

This sounds quite pragmatic. It is indeed mostly adding to what already is there in the current proposal.

  • +1 on using syslog as our baseline here, I'm not in love with it, but it is true that it is well established and the log levels make sense™
  • +1 for the adding Error / Critical / Alert / Emergency if you think those will work well for os_log.
  • Q: Do you think we could keep/add the Trace = 8 level (present in current proposal, modeled after slf4j)? For libraries this is quite useful.
    • The specific use case I have in mind and have been struggling with for a long time (with an API locked into Debug being the lowest level) is the following: Without Trace available the library resorts to logging "everything I could possibly need to diagnose a problem" on Debug. This results in too much log output for end users under normal conditions, so they end up running on Info log level in production. Cutting off valuable debug logging, whereas I "had to" log on Debug since there was no other more fine grained level available. Workarounds are adding additional flags for logging in the library "log on debug those additional things" but this gets complex for end users while fire-fighting when I'd have to explain "turn on those 4 add-to-debug-logging flags" and someone would miss one etc.
1 Like

I very much approve of matching an existing IETF RFC.

6 Likes

This additional level makes sense to me. The only trouble I can see is if syslog separately also adds such levels, and they don't quite match what we define (say, they add an 8 and a 9 with different meanings). This risk of this seems very small though.

2 Likes

Thanks @ravikandhadai for starting this thread. Definitely one of the things we need to nail down before we can start using this.

Agree with @Ben_Cohen, think the risk is small enough that we can do this.

Big +1. I don't know all the server-side use cases, but if syslog is a superset of them, then this is a clear-cut win. Ship it!

@ktoso Thanks for the inputs. I have a few questions to understand the motivation for this level a bit more. And have added some comments on what this unification of levels will not guarantee.

  • As per the above recommendation (which is also a subject of discussion), Debug is meant to be log information that may be disabled in shipping code. syslog standard and os_log documentation seem to agree on this. So, one question is that if there is information that needs to be available in "end user" logs, why it couldn't be "info" or "notice". debug is meant to be disabled anyway in end-user logs. And, why not log the messages that you would like to log at trace level, at debug level instead and disable it in the end user logs. In other words, is the semantics you are associating with each level consistent with the above?

  • The levels here are meant to facilitate writing code that will compile on server and devices. But, the full semantics of the levels or to what extent they are fine tunable are all specific to a backend. (It is very hard to standardize this without changing existing systems.) What this means is, even if we create a "trace" level, when the backend is os_log or a syslogd it will be mapped to "debug" as there is nothing else available. So in these systems we can either enable both "trace" and "debug", or disable both. For this reason, the recommended semantics is very loosely defined and don't state whether disabling or enabling each individual level should be possible (as there may not be a one-to-one mapping for an existing logging backend).

  • Finally, there can be levels specific to a backend or those that are only available on servers and not on devices. This will not limit adding such levels. But, using any of those levels means that the code may need some rewriting to port it to devices or a different logging backend. This may be perfectly fine for applications that don't require this flexibility. (Applications that require this flexibility will have to compromise a bit and restrict itself to this common APIs.)

5 Likes

+1

one point worth clarifying / agreeing on is that logging, even at emergency level, is not expected to have any side effects on the process lifecycle. iow, backends implementing this, should never terminate the program or otherwise get involved in the process lifecycle

3 Likes

Thanks for digging into this Ravi, and especially the reminder to look closely at the proposed associated semantics.

Right, I guess it even means "should be disabled in shipping code"; which is how I read the "meant to" in the Debug level semantics:

Messages meant to be useful only during development. This is meant to be disabled in shipping code.

This sounds good and matches what I had thought of as Trace. It is hard sometimes to re-associate a words meaning in one's head... so I had still in my head the meaning of Debug I've been used to.

I think it all works out in the end thanks to an level/severity between Info and Warning: the Notice level. We did not have anything between Info and Warning the current logging proposal (which was modeled after slf4j and similar ones).

This has API impact so we need to decide if we allow this or not, so let's make sure we counciously pick what we want here:

Since Level is nicely modeled as an enumeration such "levels specific to backend" means we'd require being able to represent them as "custom" Level which was initially decided against in the SSWG Logging thread. This can be considered again, but should be a conscious decision that we do allow these.

If we want to allow such "specific to platform/backend/anything" log levels then the enum Level would have to be gain case custom(Int) (requiring that int to be > 7). Otherwise in the user facing APIs (.log(level:message:...)) there would not be a way to pass in a different value. It has some "abuse potential" technically speaking... Though it could be seen as an escape hatch, and if an "unknown level" is seen by any LogHandler it would flatten it down to Debug anyway for example. No strong opinion here to be honest, just wanting to make sure we specifically allow or disallow this (with current proposal being on the disallowing side).

Summing up: For my, and the general "library which wants to log a lot of debug info" use cases, I think the above proposed semantics can work. :+1: With the proposed semantics what people may be used to as "trace" from e.g. Java logging ecosystems here would be "debug." The trace question though may have become an "do we allow custom" which was discussed a bit already and decided against in the logging proposal.

"Very verbose logging" can be then Debug, and things to look at during normal operations can be split up between Info and Notice.

1 Like

I see. Thanks for pointing that out. I was thinking, since users can extend the struct Logger (of the server-side logging proposal), creating custom log levels could be possible. But I noticed that Logger.logHandler has internal access, and as you point out the log levels are also enums. So I see that creating custom levels is really difficult (if not impossible) in the server-side logging proposal.

What I would also like to convey is that server and device side loggers are free to enrich their APIs and provide more levels/functions that makes sense only on one side, if needed. That shouldn't be limited by this standardization.

I'm not sure I agree with this. Logging systems have a diverse set of design criteria and a history of competing "standards" to attempt to unify back-end systems. In this case, I would see the difference as:

  • os_log focuses on performance, and itself considers itself a unifying system (piping NSLog and syslog on Apple platforms into the new system).
  • general purpose logging interfaces focus on flexibility for code reuse, such as having components shared on client and server or across platforms, or deployed in customer environments with differing logging requirements.

A generalized interface indicates which features must be exposed and which features can't be exposed. For example, syslog backed by the apple os_log system (itself being a unifying interface) cannot directly distinguish between critical, alert, and emergency levels. Conversely, features such as diagnostic contexts, markers, and privacy protection of dynamic parameters are challenging to expose via a common interface as they don't have equivalent functionality under all logging systems you may want to map into.

Using a generalized interface also results in performance/memory impacts that you may not have by using a logging system directly. This includes serializing to a string rather than using a packed format, creation of closures to defer creation of that string, dispatch to a protocol rather than using final types, and missed opportunities for the compiler to optimize (such as potentially stripping debug and trace statements from release builds).

Thats not to say that either a generalized or specific interface isn't valid; both are valid interfaces to expose, and thus I disagree with the assertion that syslog and os_log should be exposed via a 'common' interface. Rather, I think using os_log or syslog directly would be appropriate for some applications, while a generalized interface would be appropriate for others.

This also potentially allows for said generalized interface to be simplified, since the more specific interfaces to directly utilize a logging system's advanced features is available.

Thanks for raising those important points. Server-side logging APIs and os_log will remain as two different logging libraries. This post is aimed at using same names and similar semantics for log levels that will be common between the two APIs, to reduce confusion when porting code from one system to the other.

It wouldn't be desirable for an often-used level such as debug to have one meaning/behavior on the server-side API and a very different one in the os_log API. We would like to converge on some high-level recommendations for the log levels to reduce confusion and prevent such divergences.

Just to add further clarity, the following are some important differences between these two logging libraries:

  • There won't any dependency from server-side logging library to os_log or vice versa. os_log will not be implementing any protocol of server-side logging library such as LogHandler. They are two different logging libraries sharing some common names.
  • There will not be any common binary interface between the two logging APIs. In fact, the log calls such as info, debug, log etc. will have very different types in the two APIs.
  • The plan described here for improving os_log applies to the new os_log APIs. Only the names of the log calls will change.
  • The new os_log APIs will expose all functionality offered by os_log using Swift 5 extended string interpolation syntax as described in that thread. E.g. it will expose privacy qualifiers, custom formatting etc. The os log calls will be optimized using compile-time interpretation.

What this unification of log levels will enable

An existing os log call of the form os_log(.info, customLogger, "message: %lf", timeInterval) will now look like customLogger.info("message \(timeInterval)").

The type of customLogger has to be established at compile-time. This will enforce at compile-time whether info is an os_log call or a call to the server-side logging API.

If customLogger is defined as an os_log logger (by importing suitable libraries), when the program is compiled and executed, the string interpolation passed to the log call will be deconstructed into a format string and argument buffer and will be passed to the os logging system (see here for details on how this can be accomplished).

On the other hand, if customLogger is an instance of the type Logger of the server-side logging proposal, the string interpolation will get serialized to strings and the message will be logged through the backend used in the initialization of Logger.

What this unification of log levels will not enable

Log calls using features that are supported by only one of the two libraries will not compile with the other.

E.g. if a user uses customLogger.info("message \(timeInterval, is: .private)") indicating that the timeInterval is a private value, compiling the code on the server-side (by making customLogger an instance of Logger) will trigger a type error. (The syntax for specifying privacy in the os_log APIs is not fully finalized yet.)

Similarly, a log call of the form customLogger.info("message \(timeInterval)", metadata: myMetadata) will compile with the server-side logging library but not with os_log.

However, it is to be noted that sever-side logging APIs may evolve in the future to support formatting options for logged data, and to support privacy qualifiers. When such features are added, it presents an opportunity to unify those functionalities between the two logging libraries.

5 Likes

Are you saying that the os_log(.level, ...) interface will change to customLogger.info(...)? If yes, how would we initialise the customLogger? Is all this documented somewhere? I can't see any mention of it on the os_log thread.

Anyway, this sounds great as it would allow greater reuse between the two systems. :heart:

That is right. Please find here a PR that defines the new APIs for very basic logging. The APIs are very much under development and can change. They are also not fully functional yet. They will be enhanced through incremental PRs. They are intended to be used only in tests and for experimentation purposes. In the current model,

Logger() creates a default logger instance i.e., OSLog.default. Logger(subsystem:,category:) creates a custom logger instance. So you would do,

  let customLogger = Logger(subsystem: "com.org.App", category: "network")
  customLogger.info("A number \(10)")
  customLogger.debug("A number \(20)")
  ...

The levels can also be passed as parameters to a log method: customLogger.log(.info, "A number \(10)").

2 Likes

Awesome! Loving the convergence between the two APIs. Btw @johannesweiss, could the Server Logging framework also benefit from the custom String interpolation?

Edit: removed an idea I had as it's not feasible

1 Like

Yes! In fact, I'm moving repos right now and the struct Logger.Message as the message type is already in there. We're not taking advantage of it really yet but we will be able to without breaking API. Stay tuned :slight_smile:.

EDIT: Same for the logging levels, switching to syslog levels as proposed here.

3 Likes

I mostly agree with @ktoso comment:

  • +1 for syslog as baseline
  • +1 for adding levels above error
  • +1 for adding Trace

Things I'd like to add:

  • Description for Trace: “Messages that might to be useful when searching hard to track bugs. This is purely opt-in, it is disabled by default even in standard debug builds. (E.g. Things like Entering/Leaving function)”

  • Naming of the levels: What's worse, an alerting situation or a critical situation?

    • Rename Alert to Fatal: Alerts are dialogs that users ignore. This does not fit here.
    • Rename Emergency to Panic: Kernel-panics are well known and this is the equivalent.
1 Like

To clarify though: In my latest summary I noted that lack of Trace is likely fine, since the presence of Notice and Info offering more granularity than if we just had Info already.

Trace would be nice (and I like @dhoepfl's proposed wording of it), but sticking directly to syslog levels is also good... Either works I think, so leaving it up to owners of the logging APIs and this proposal to decide :slight_smile:

Either way, very much looking forward to this landing!

@ravikandhadai/@jrose/@Ben_Cohen/@ktoso/@Michael_Ilseman/@tomerd/@dwaite/@hartbit/@dhoepfl Sorry to mention literally everybody on this thread but it has gone quiet for a while...

As you might know, we have adopted the proposed syslog levels for swift-log and we haven't heard anything that might suggest the proposed log levels might not be okay, so I think we're all good here :+1:.

I just have one question left regarding the 'numerical codes'. Should we really expose the log level as integral values? The reason I ask is: Before the switch to syslog log levels, swift-log used the numerical codes the other way around. So debug's numerical value was less than than error's whereas syslog has debug's numerical value greater than error's.

Now I don't want to discuss which system is 'right' or 'better' but I think with these two examples show that people seem to have different assumptions about log levels. After thinking about it, I would propose to completely hide the numerical value and to not offer the < operator/Comparable conformance and instead have only one function

extension Logger.Level {
    @inlinable
    func isMoreSevere(than: Logger.Level) -> Bool { ... }
}

It feels like everybody would agree that Logger.Level.error.isMoreSevere(than: Logger.Level.debug) should be true. Needless to say we'd still have the Equatable conformance.

Does that sound okay or do we feel we should also keep syslog's numerical codes?

5 Likes

The only place I can see a standard numerical value be beneficial is for serializong log messages in a binary format. If someone were to define such format I think they would have to define much more then just the number of values so it becomes their problem how to represent those logs anyway => so even in the one case where this could matter for someone I don’t think it wins much...

I’d be okey with hiding the numeric values :+1:

The wording for the method seems ok... though would loggers want:

if incomingLevel.isMoreSevereOrEqual(myConfiguredLogLevelAtWhichIShouldLog) { 
   handler.log(...) 
}

(which currently we have as if self.logLevel >= level { in the API itself.

?

1 Like