Logging

I’m not going to argue a hypothetical logging message.

This person is just straight out trolling. Do not feed the troll.

I'm dealing with Vogel about their posts here; please continue with the discussion.

15 Likes

I'm a bit out of my depth here, but I think lazy rendering would still be helpful in the context of sending logs to Splunk, etc.

As I understand it (and I may be totally wrong), Log4J version 1 wrote out log messages synchronously. We were writing structured log messages to a file, and a daemon would read those files and move them over to Splunk for indexing. Most of the time this worked just fine. However, we ran into a situation once where the logging slowed down so much that our server would run out of threads to allocate to requests. We switched over to Log4J version 2, which I believe buffers log messages instead and writes them out to the file on a separate thread, and the problem went away.

IMO, this is the way to go. Java does something similar as well. You have a factory to create loggers, so you'd have something like this at the beginning of most classes:

private static final Logger LOGGER = Logger.getLogger(Some.class);

The LOGGER object inherits all of its configuration from the global factory. The logger is named after Some (with the package name prefixed).

And at the log site something like this:

LOGGER.warn("some message");
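
For comparison, a rough Swift analogue of that factory pattern might look something like the sketch below; every name in it (LoggerFactory, LogHandler, PrintLogHandler) is made up for illustration and not taken from any existing proposal.

protocol LogHandler {
    func log(level: String, loggerName: String, message: String)
}

struct PrintLogHandler: LogHandler {
    func log(level: String, loggerName: String, message: String) {
        print("[\(level)] \(loggerName): \(message)")
    }
}

struct Logger {
    let name: String
    let handler: LogHandler

    func warn(_ message: String) {
        handler.log(level: "WARN", loggerName: name, message: message)
    }
}

enum LoggerFactory {
    // Global configuration: the application installs a handler once at startup,
    // and every logger created afterwards inherits it.
    static var handler: LogHandler = PrintLogHandler()

    static func getLogger(_ type: Any.Type) -> Logger {
        // String(reflecting:) yields "Module.Some", analogous to the
        // package-prefixed name in the Java example.
        Logger(name: String(reflecting: type), handler: handler)
    }
}

// Usage, mirroring the Java example:
struct Some {
    private static let logger = LoggerFactory.getLogger(Some.self)

    func doWork() {
        Self.logger.warn("some message")
    }
}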
1 Like

I think the general idea behind this thread is great. I wonder if it would make sense to polish the stdlib part in that section and then start building the logging API on top of it. It would be great if the streaming API could be extended or overhauled so that we can inject listeners to destination streams at link time. Quite some time ago I started a thread in that direction.

2 Likes

More general back-pressure-aware streaming would be very useful before starting to write libraries, which otherwise have to do all of that on their own in some way. That "logging thread" is nothing but a custom solution here (and kinda a step back w.r.t. async...).

The console implementation in Noze.io is a little outdated and not super nice (and I don't want to bring it up as the "solution"), but it actually uses async streams to do the logging. I.e. the logging doesn't block the event loop; rather, it schedules the log writes on the same loop and makes use of the same async underpinnings.

In that context, FileChannels and PipeChannels would be required / nice to have.

None of the above implementations seem to account for structured logging. This has been incredibly useful to me in the past.

Python's structlog has a handy page that explains why: Why… - structlog 22.1.0 documentation.

Java achieves something similar with MDC.

IMO, it'd be great to include this as part of the base logging abstraction.
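
To make that concrete, here is a small hypothetical sketch of what structured logging in the base abstraction could look like, loosely in the spirit of structlog's bind() and Java's MDC; the StructuredLogger type and its API are invented for illustration only.

struct StructuredLogger {
    var metadata: [String: String] = [:]

    // Return a copy with extra context bound, structlog-style.
    func bind(_ extra: [String: String]) -> StructuredLogger {
        var copy = self
        copy.metadata.merge(extra) { _, new in new }
        return copy
    }

    func info(_ message: String) {
        // Every log call emits the bound key/value pairs alongside the message.
        let context = metadata.map { "\($0.key)=\($0.value)" }.sorted().joined(separator: " ")
        print("INFO \(context) message=\(message)")
    }
}

// Usage: bind request-scoped fields once; every subsequent log line carries them.
let base = StructuredLogger()
let requestLogger = base.bind(["requestId": "some-request-id", "path": "/users"])
requestLogger.info("Response sent")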

A problem that neither Python nor Java solve well (AFAIK) is the ability to "extract" structured log data from exceptions. I rarely want to log when I throw an exception, but I would ideally like the exception to include structured log data that gets written when the exception is actually logged (likely in some error-handling middleware). I don't really have a good solution though...

What I've done in the past is to write exception message strings in some predictable format (like say key1="value1", key2="value2") that gets parsed when logging, but that's ugly and slow.
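
One possible direction (just a sketch, with all names hypothetical) is to let errors conform to a protocol that exposes their structured data, so the throw site only attaches the data and the error-handling middleware is what actually logs it:

protocol StructuredError: Error {
    // Key/value pairs to attach to the log event when this error is eventually logged.
    var logMetadata: [String: String] { get }
}

struct DatabaseError: StructuredError {
    let logMetadata: [String: String]
}

// Throw site: attach structured data, but don't log here.
func loadUser(id: String) throws {
    throw DatabaseError(logMetadata: ["userID": id, "table": "users"])
}

// Error-handling middleware: pull the metadata out and log it exactly once.
func handle(_ error: Error) {
    let metadata = (error as? StructuredError)?.logMetadata ?? [:]
    let rendered = metadata.map { "\($0.key)=\($0.value)" }.sorted().joined(separator: " ")
    print("ERROR \(rendered) error=\(error)")  // stand-in for a real logger call
}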

1 Like

Do you have good documentation links on what form of lazy rendering Splunk supports when ingesting data? I'm still unconvinced that the benefit is large enough to add that extra complexity given the variety of log ingesting solutions people use. But I'm very happy to be convinced otherwise if it's a thing that most solutions support and is shown to have actual benefit.

Yes, doing all logging synchronously from the calling thread is probably not a good idea. I'd also recommend using a 'large enough' buffer. The 'running out of threads' issue is a separate issue though, right?

I think the Apache example is essentially "structured logging", just not a general-purpose one. (And I agree with you, it makes sense to think about this. Logging doesn't have to be reduced to string lines.)

AFAIK, Splunk itself doesn't support it. I'm likely not being very clear here, or I may be misunderstanding what "lazy rendering" means in this context. Let me try again.

In the setup we used, a log like this:

logger.info("Response sent");

would get written to a file like this:

[start event]
timestamp=<some timestamp>
thread=<some thread>
level=INFO 
logger=<some logger>
clientIP=Unknown
method=POST
msDB=79
msRequest=134
numDB=1
path=<some path>
queryParams=
requestId=88472589-16e0-4240-bba3-1c7073d49cca
statusCode=200
userAgent=python-requests/2.18.1
message=Response sent
[end event]

A separate process parses this file, and pushes data into Splunk's indexers (I don't know how exactly this worked since I didn't write it).

IMO it could be useful for the conversion from "Response sent" to the large String that actually gets logged to not happen on the thread that calls logger.info?

Makes sense. FWIW, in the event of the buffer filling up, my preference would be to block new logs rather than to drop older ones.

Yes - I only mentioned it because it was a consequence of synchronous logging.

Agreed! Structured logging would be great to have but it's somewhat orthogonal to the question of how we reach the logger itself (local v. global). Not sure if we should discuss both axes at the same time or not.

Maybe how you "reach it" should actually be framework specific? General purpose libraries would probably need to get it injected. (and this project would essentially just provide the protocol/API?)

Good point, we never really defined what 'lazy rendering' actually is. Let's say we use os_log to log

os_log_error(OS_LOG_DEFAULT, "%d %d", 1, 2);

it doesn't actually do the string interpolation when the message is logged. It will just ship the static string "%d %d" and the 1 and the 2 as binary data to the logging system. The string interpolation is done when you actually look at the message using the log utility or Console.app or similar. That's really quite cool because you shift some of the cost from the producer of log messages to the consumer. On the server, however, there are lots of standard solutions which expect textual/structured output in a certain format, so there's not necessarily that much value in it. Again, very happy to change my mind if there's evidence that it will help.

@Joe_Groff, was it something like this:

[app thread 1] \
                \
[app thread 2]   +===logging firehose (1)===> [logger daemon thread] ==(2)==> logging mechanism (ie. file/db/stdout/...)
                /
[app thread 3] /

where we have essentially two pipes: the logging firehose (1) and then a rendering stage (2). I think Joe was suggesting to support lazy rendering in (1) but not in (2). That way we could shift the rendering overhead from the log message producers (app thread N) to the logger daemon thread. The benefit would be that, in case the logging mechanism supports lazy rendering, we could just forward it.
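
A minimal sketch of that two-stage shape (all names hypothetical, with a DispatchQueue standing in for the logger daemon thread) could look like this:

import Foundation

struct LogEvent {
    let template: String      // e.g. "%d %d", shipped unrendered from the app thread
    let arguments: [CVarArg]  // captured cheaply at the log site
}

final class FirehoseLogger {
    // The "logger daemon thread": a single serial queue draining the firehose.
    private let queue = DispatchQueue(label: "logger-daemon")

    // Stage (1): called from app threads; no string interpolation happens here.
    func log(_ template: String, _ arguments: CVarArg...) {
        let event = LogEvent(template: template, arguments: arguments)
        queue.async {
            // Stage (2): rendering happens off the app thread. If the logging
            // mechanism supported lazy rendering we could forward `event` as-is
            // instead of rendering it here.
            let rendered = String(format: event.template, arguments: event.arguments)
            print(rendered)  // stand-in for file/db/stdout
        }
    }
}

let logger = FirehoseLogger()
logger.log("%d %d", 1, 2)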

4 Likes

:+1: This makes sense to me.

Injection works, but it is more "setup/boilerplate", which is often annoying (but not insurmountable). I think it would be easier to have how you reach it be defined here, and have all general-purpose libraries use the same mechanism.

Yes, that's how I would imagine it too. An HTTP server carries around some context anyway, so it'd be able to hand you the logger from there. For SwiftNIO we can build a feature that lets you attach context to a Channel. I just wanted to be upfront because there's a lot of 'general purpose' libraries and carrying around extra context is invasive (but IMHO what we need to do).
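
Purely as an illustration of that shape (the types below are made up and not SwiftNIO API), the framework-carries-the-context idea could look roughly like this:

import Foundation

struct Logger {
    let label: String
    func info(_ message: String) { print("[INFO] \(label): \(message)") }
}

// Per-request context built by the framework, e.g. attached to a Channel.
struct RequestContext {
    let requestID: String
    let logger: Logger
}

// User code and general-purpose libraries log through the injected context.
func handleRequest(_ context: RequestContext) {
    context.logger.info("Response sent")
}

// What the framework would do for each accepted request:
let requestID = UUID().uuidString
let context = RequestContext(
    requestID: requestID,
    logger: Logger(label: "http.request.\(requestID)")
)
handleRequest(context)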

In a previous life I worked on the "trace engine" inside IBM's Java Virtual Machine. It uses a somewhat similar model.

Each thread in the JVM (Java threads that are running bytecode etc. and internal JVM threads) has an in-memory trace buffer. When an enabled tracepoint is hit (either via a hook from the running Java code or directly from the JVM's C++ code) the tracepoint data is written into the thread's local buffer. This data includes the tracepoint identifier, optionally a high-res timestamp, and the tracepoint's parameter data (the final three parameters in your os_log_error() example). Because the trace buffer is thread-local, writing to it is extremely fast and does not require a lock.

When the trace buffer fills, the thread grabs a new buffer from a pool (using CAS), and ships the full buffer to a separate "buffer writer" thread for processing. If the user has enabled file tracing, the buffer writer thread writes the raw unformatted trace buffer to disk (along with some metadata).

There is a separate offline "trace formatter" tool which processes the binary trace file and generates a text file output for analysis.

It's obviously quite a complex subsystem, but it has proven invaluable over the years, for debugging both problems in user code (because you can pipe arbitrary Java methods to the trace engine, including support for globs) and bugs in the JVM (which is very well instrumented internally).
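
A heavily simplified sketch of that per-thread buffer idea (illustrative only; the real engine swaps buffers from a pool with a CAS, whereas this stand-in simply starts a fresh buffer and hands the full one to a writer queue):

import Foundation

final class TraceEngine {
    // The "buffer writer" thread: a serial queue that processes full buffers.
    private let writerQueue = DispatchQueue(label: "trace-buffer-writer")
    private let bufferCapacity = 1024
    private let threadBufferKey: NSString = "trace.buffer"

    func trace(_ tracepoint: String) {
        // Fast path: append to the current thread's own buffer, no lock needed.
        var buffer = Thread.current.threadDictionary[threadBufferKey] as? [String] ?? []
        buffer.append(tracepoint)

        if buffer.count >= bufferCapacity {
            let full = buffer
            buffer = []  // start a fresh buffer for this thread
            writerQueue.async {
                // e.g. write the raw, unformatted buffer to disk (plus metadata)
                // for the offline trace formatter to process later.
                print("flushing \(full.count) tracepoints")
            }
        }
        Thread.current.threadDictionary[threadBufferKey] = buffer
    }
}

let engine = TraceEngine()
for i in 0..<2048 {
    engine.trace("tracepoint \(i)")
}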

If anyone is interested in learning more, have a look at IBM Documentation

3 Likes

Oh cool, very good to have you over here! Just so I fully understand: if the thread-local buffer has filled up and a fresh one is created using an atomic compare-and-swap, what happens to the old one? Will it still be processed? If yes: what happens if the production of messages (into the buffers) is continuously faster than the consumption (out of the buffers)? Does that just balloon in memory, or is there some back-pressure mechanism?

Yes, the full buffer is shipped to a separate thread for processing.

By default, if the production of tracepoints into buffers is faster than they are processed, memory usage increases over time. There are diagnostics emitted which indicate that this may be happening.

Alternatively, if the user chooses to use "non-dynamic" buffers, each thread is allowed a maximum of 2 buffers: one that it's using to write tracepoints, and one that is being processed. If the current buffer fills while the other buffer is still being processed, tracepoints are dropped and a diagnostic is emitted.