[Feedback] Server Logging API (with revisions)

I agree that allowing uninitialized use as well as multiple initialization are both problematic designs. The design of this logging API is vulnerable to exactly these problems and they are part of the reason I am arguing against this design!

These problems arise when a library exposes globally visible APIs rather than requiring explicit initialization of instances and it also requires initialization of some kind . The best libraries I have used avoid this kind of global state and use explicit instantiation instead before making their APIs available. This avoids the possibility of use before initialization and also avoids multiple initializations applicable to the same instance. This is the kind of design I am advocating for here.

I haven't seen your questions answered for the proposed logging API. How should it behave if someone requests a logger before bootstrap is called. Is it a trap or do they get a no-op logger or a logger with some default handler? What if bootstrap is called more than once? Does the application's handler get overwritten and every subsequent logger that is created get routed to the "wrong" handler (i.e. not the one the application bootstrapped with)?

My argument is precisely that we should not adopt a problematic design for this API because we think it will avoid trouble for other libraries. We should adopt a well-designed API with clean semantics and hope that other libraries doing the same will win out in our ecosystem.

For starters, see the questions Cory asked above which to my knowledge have not been answered for this library and particularly with regard to #2, I don't believe a satisfactory answer exists. As far as I can tell, as designed a library has the ability to stomp on an application's handler (as well as arbitrary application code that has no business messing with the handler). The same issues apply to mutating metadata, setting the log level, etc.

Can people ship functioning applications using designs that have semantics like this? Of course they can. Does that mean it's a good idea to design APIs with semantics like this? No it does not.

I am not arguing against the ability for applications to do things like this! Using the approach I have suggested, an application could trivially support a global makeDebugLogger() function. All I am arguing against is the idea that we should impose this capability on all code that uses this logging API.

Are you really suggesting that we should make important semantic design choices of an ecosystem standard logging API based on the stated indifference about logging of an application developer "who doesn't care at all about logging"? Sure, they might grumble about having to pass a logger to a library but people grumble about all kinds of things and then move on.

My point is that there is no good design choice that can be made here. If you support dynamic reconfiguration then the application's handler can get overwritten. If you don't, code that expects to specify a handler is broken.

Both of these have multiple perfectly reasonable valid answers (I'm not going into them here, happy to do so off thread if you'd like to follow-up on it). The advantage with this proposal is that there is only one place that needs to answer them. Failure to initialise should get a default behaviour: what, exactly, that default behaviour is can be decided by this working group. Multiple initialisation should be reasonably straightforward to avoid by way of noting that the only place that should ever call bootstrap is whichever Swift module owns main.swift, but if it is not avoided we can choose any failure mode we believe to be reasonable, up to and including trapping.

In the multiple initialisation case, however, the likelihood of this happening is almost zero. It would require that a library choose to call bootstrap, which they should not be doing. Your proposal does not prevent this either: in a hypothetical proposal where there is no global logger factory, libraries should not be manufacturing their own isolated logging subsystems, but nothing prevents them from doing so if they want to. This API discourages that behaviour by making it extremely straightforward to get hold of a logger.

I personally strongly recommend that libraries be willing to be passed loggers, rather than magicking them out of thin air. However in my view there is simply too much prior art in favour of "magic loggers that appear from nowhere due to the magic of mutable global state" to decide to ditch that, even when as purists we would prefer to.

How is this industry ever going to move forward if this is the attitude adopted when defining a standard API for a very young ecosystem? Are we supposed to be forever wedded to that prior art which many of us agree is not a great idea?

It would be straightforward in application code (by recreating from scratch a facility this proposal provides), but wouldn't it be a bigger headache in library code (I mean when debugging library code used in an application)?

By picking our battles.

When designing this standard API we need to wrestle with the fact that, while the ecosystem is young, it has running code already in production. Abandoning those early adopters or forcing them to change is sometimes valuable, when that change unlocks important new abilities for them, but in this instance I don't believe it does. The downside of taking such a hard-line approach is that it fails

Logging is simply a library, it costs nothing more than code to build and use an alternative, and most of the ecosystem has already built their way of doing it. If we don't support them, they will simply continue to use their own model, and this new API has failed before it started.

Worse, the "injected logger" approach is trivial to support when using it from "magic global logger" frameworks and libraries, as they can simply wrap their magic global logger in an instance logger. This means that users adopting the new API that want to use "magic global logger" libraries just get no logging (or worse, two incompatible logging infrastructures), while users keeping their "magic global logger" can trivially also use libraries and frameworks with the new API at full functionality. This hardly provides an incentive to change your code!

I think there is a time and a place for refusing to support disfavoured ways of doing things, but in my view the tradeoffs here don't support that course of action. Of course, reasonable people may disagree. :slightly_smiling_face:

There are a quite a few approaches to meeting the desire for configuring different log levels for different parts of the codebase without allowing the global level to be set from anywhere. They all rely on using the derived logger approach rather than mutating an existing logger.

The smallest change to the current design would be to simply allow a new log level to be set when deriving a new logger, although that gives up all upstream control over the log level. A small enhancement to this approach would be to allow the application to specify a minimum and maximum level when configuring the root logger (and wiring up the handler). When a new logger is derived from an existing logger, an optional log level could be requested (perhaps along with a new requested minimum and maximum). These values would be clamped to the min and max of the existing logger.

The downside of this approach is that the derived logger may not actually get the level it asks for, but that behavior would be made clear by the names used in the API. The upside of this approach is that an application still has the ability to control log levels in a centralized manner if desired (just set min and max to the same value) or to provide a bound on the configurability of the log level. Applications that don't care would just not specify the min and max log levels.

Another approach would be to make Logger generic over a phantom Capabilities type. Using this approach, you would be able to use constrained extensions to provide different APIs depending on the specified capabilities. This approach introduces some complexity, but would scale up to support lots of different behaviors (and absence of behaviors!) allowing the API to adapt itself to more usage scenarios. For example, you could also make the ability to set thread local metadata a capability. Typealiases could be used for common configurations so if it desirable.

A simpler, but more rigid approach than making Logger generic would be to just hard-code a second concrete logger type that supports the ability to specify the log level, leaving the "basic logger" without that capability and only supporting it on an opt-in basis by user the LoggerSupportingDynamicLevels (or whatever you want to call it). The premise of this approach would be that changing the level is the only capability you really care about differentiating and that most of the time you don't want code to do that, but do need to do it in a few places.

Another approach I have seen used in some places is to specify a "context" alongside the log level when emitting messages and support setting a log level for each "context" at the root. This approach would require the ability to optionally map the context when deriving new loggers. It's probably not the right approach for an API like this, but I mention it as another point in the design space that I have seen people use.

You're right that it could be, depending on the design of the library. But how often would this actually be needed in a shipping version of a library?

Absolutely! I am taking the long-term view here.

One option we haven't discussed is layering the API so that the baseline functionality has clean semantics, but people can also choose to import GlobalLogging. Libraries could choose whether they require users to bootstrap the global logging facility or whether they use an injected logger. This would make the tradeoffs explicit and applications wishing to avoid this could seek out libraries that do not rely on it.

This concern could be addressed (while still supporting dynamic reconfiguration) by adding a UUID parameter to bootstrap defaulting to a new UUID. The private key would be set at first bootstrap and any subsequent attempts to reinitialize without that same key would be logged and ignored.

This issue could be partially mitigated by having the bootstrap routine emit a warning (to the new handler) if the logging system has already received messages at bootstrap time.

Hello everyone,
I'll try to shift gears a little bit here.

I've spent a considerable amount of time implementing and testing the proposal in a library and making sure it's ready for future direction and extensions.

There's some nice small additions being raised in this thread (e.g. adding a Fatal log error I certainly agree with, let's make a ticket for it), and I do believe those can be applied as the project lives on in the sandbox (and onwards). The general idea and tradeoffs taken are solid enough to start out building an ecosystem around it, esp. since it needs to make a lot of various people/communities/styles happy™.

I also strongly agree that this proposal is a great step forward and enables building out an ecosystem around logging. A thing being missed from the discussions it seems to me how tremendously valuable it is to have a defined minimal API for LogHandler, which is where we as a community can build up a lot of value – by building adapters to external systems (splunk or whatnot), and async file logging or similar facilities. oslog has been mentioned, and while this API is not (and can not) be exactly the same, it can definitely be used to target oslog.

We can't win all battles in this proposal, since it has to serve existing users and thus ends up more flexible than a single library would be. Truth is, logging backends always come with their own small caveats (sync / async, etc), so there's always some info that end users of a system should take into account.


Having that said, hopping into the questions:

What is your evaluation of the proposal?

I am in favor of the proposal :+1:

I am confident we can evolve the minor additions during its life in the sandbox if there was a need to. It does take various tradeoffs, as the environment it has to serve is rather diverse. The tradeoffs are balanced and overall quite good I believe.

I also assessed performance aspects of the API and I think we have enough wiggle room to make things highly optimized.

Is the problem being addressed significant enough?

Yes, these APIs will unlock the ecosystem's ability to provide shared log handler implementations, and make it possible for libraries to use logging backends without having to hard-bake-them-in.

Without such common API for Logger and LogHandlers we are holding back the server side community from growing a serious ecosystem.

Does this proposal fit well with the feel and direction of Swift on Server?

Yes. It is consistent with the Metrics proposal in the way a backend is configured. The types and feel of the library seem agreeable to me. No major concerns other than things we already addressed.

If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?

I'm very familiar with the Java and Scala ecosystems where I've been part of creating foundational libraries for the ecosystem, including some which had their of Logging APIs and have made sure that mistakes we made back there are not carried over to this API.

How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

I spent a few full days between the initial proposal and the current revised thread using the API and embedding in real libraries and exploring future evolution concerns. I have implemented a number of metadata patterns and log-handler patterns one would want to be able to pull off using such API and can confirm all those are now possible.

All in all, very much looking forward to this and what we can unlock by having such shared types.
Keep in mind also that we're talking about a project entering the sandbox phase, so small things can still change and I think doing so via PRs and as normal open source project work will be more efficient.

5 Likes

thanks @ktoso. love to see following reviewers stick to the feedback format as you did

@anandabits thanks for all your feedback. some really interesting ideas there. if you have an alternative proposal in mind, i suggest you throw it up on github and share with all of us. there are unit tests against the specific use cases this proposal is targeting/covering, so i would ask your proposal demonstrates via similar tests how they are covered

1 Like

here is a proposal to enforce single bootstrapping: enforce that logging system can only be bootstrapped once by tomerd · Pull Request #32 · weissi/swift-server-logging-api-proposal · GitHub

1 Like

here is a proposal to add support for creating loggers using custom factory as an escape hatch for the global factory design. we think this should be used very rarely but worth adding for flexibility: support custom logging factories by tomerd · Pull Request #33 · weissi/swift-server-logging-api-proposal · GitHub

2 Likes

+1 to this. I think it is essential that any modern server logging framework consider privacy. Marking private data means it can be masked and easily anonymised later.

I support that and will merge in a few days assuming no one has any troubles with that.

1 Like

I think this is important. Assuming no one has any problems with that, I'll merge that too in a few days (and update the proposal).

1 Like

Would you be interested in putting together a separate proposal which explores the idea you think would work best?

Thank you for this invitation, unfortunately I don't have time available to put together a full proposal. That said, I would be happy to discuss the directions I mentioned upthread further if anyone else is interested in pursuing them further.

Yes, I think that's a good idea. In the previous discussion thread I already mentioned that we would leverage the features that come for os_log. Therefore I definitely think we should also adopt custom string interpolation instead of plain String s.

Sounds great.

Do you think we should do this now or wait for the Apple's OS logging API to land and then use the same/similar type?

That is an interesting question. I think it will be difficult to reuse that same custom string interpolation type for this application as that type has to do many things that are specific to OS log (such as supporting os-log specific formatting options, constructing C-style format strings etc.) If we reuse it, we may end up customizing large parts of that type. I think it would be better to design a new custom string interpolation type for this application. However, the interface of these types could be made similar, especially on how formatting is specified, how privacy is/can be specified etc. It would be great if the log calls have the same syntax and are syntactically compatible with each other. (There could be options that are only available on one platform, but I guess those can be separated and the rest can be unified.) Ideally, it should just suffice to change the creation of the logger (e.g. using if'ery) and compile it on the server or on the device.

cc @Ben_Cohen @Devin_Coughlin @moiseev

3 Likes

I did just a very quick read but this:

let logger = Logging.make("com.example.app")
logger.warning("uh oh, this is unexpected")

Per Swift API Guidelines and OOP best practices:

  • make should use an argument label (and init is preferred)
  • logger (in the example) and Logger/Logging are the Log
  • LogHandler is the Logger
  • methods should generally be verbs, e.g. warning is not a verb

I would expect something like:

let mainLog = Log(identifier: "com.example.app")
mainLog.add(.warning, "uh oh, this is unexpected")

And the types:

- public struct Logger
+ public struct Log

- public protocol LogHandler
+ public protocol Logger

As for MetadataValue, do we need the Value suffix? It says nothing.

5 Likes