Side note: See your structured logs in action
Head over to Better Stack and watch your JSON logs transform into filterable, searchable data in real-time with our live tail feature.
Log formatting plays an important role in capturing and organizing application event details. It encompasses concerns such as the structure of each record, how timestamps and severity levels are represented, and which contextual details accompany each message.
Sifting through poorly formatted logs is an exercise in frustration, akin to finding a needle in a haystack. Prioritizing proper log formatting is therefore essential: it enhances the effectiveness of your logging strategy and paves the way for more efficient log analysis and management.
Before discussing specific formatting guidelines, let's take a brief look at the three main categories of log formats.
When it comes to formatting application logs, there are three primary approaches:
Unstructured logs are the freeform artists of the logging world. They defy predefined formats to offer flexibility and ease of reading in development.
Consider these examples:
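```
Server started.
Connected to database successfully
User login failed - invalid password
Processing file upload... done
```

(These entries are illustrative; any freeform, inconsistently shaped output falls into this category.)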
Each entry tells a story, but without a consistent structure or sufficient context. As your application scales, the lack of a uniform format turns troubleshooting specific issues with these logs into a Herculean task.
Command-line tools like sed, awk, and grep can help filter messages or extract key information, but this is more of a makeshift solution than a sustainable one.
Semi-structured logs are a step up from unstructured logs. They follow some recognizable conventions, but lack a format that machines can parse reliably. For instance, consider the following records:
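```
2023-07-30 14:22:05.123 INFO [3842] [main] my-app: User 4921 logged in from 203.0.113.7
2023-07-30 14:22:06.047 WARN [3842] [worker-2] my-app: Payment for order #1288 took 3120ms to settle
2023-07-30 14:22:08.511 ERROR [3842] [main] my-app: Failed to connect to smtp.example.com after 3 retries
```

(Illustrative records; the exact fields and layout vary between frameworks.)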
In each entry, elements like the timestamp, log level, process and thread IDs, and application name are standardized, but the log message retains a flexible, narrative style that embeds all other contextual details.
Structured logging is a contemporary and highly effective approach to logging. Each log entry adheres to a recognizable and consistent format that facilitates automated searching, analysis, and monitoring using log management tools.
The most widely embraced structured format is JSON:
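```json
{
  "timestamp": "2023-07-30T14:22:05.123Z",
  "level": "INFO",
  "message": "User logged in",
  "user_id": 4921,
  "source_ip": "203.0.113.7"
}
```

(An illustrative record; field names are a matter of convention.)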
Structured log entries present well-defined fields and values that capture precise information about the logged event or message.
The main downside to structured logging is reduced human readability, which matters most in development environments.
This can usually be resolved by configuring your framework to output a semi-structured and colorized format in development, while defaulting to JSON output in production.
To enhance the utility of your logs, begin by adjusting the format of the data included in each log entry. Below are nine practices to follow for more effective logging:
| Best practice | Impact | Difficulty |
|---|---|---|
| Use structured JSON logging | ⭐⭐⭐⭐⭐ | ⭐⭐ |
| Standardize on string-based log levels | ⭐⭐⭐ | ⭐ |
| Log timestamps as ISO-8601 | ⭐⭐⭐⭐ | ⭐⭐ |
| Include log source information | ⭐⭐⭐ | ⭐ |
| Add the build version or Git commit hash | ⭐⭐⭐ | ⭐ |
| Capture a stack trace when logging errors | ⭐⭐⭐⭐⭐ | ⭐ |
| Standardize your contextual fields | ⭐⭐⭐ | ⭐⭐⭐ |
| Use a correlation ID to group related logs | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ |
| Selectively log object fields | ⭐⭐⭐ | ⭐⭐⭐ |
The easiest way to generate structured logs is to adopt a framework that provides this ability natively or through a plugin. Most frameworks support this requirement, and many modern libraries even default to it. Here's an example of structured logging in Go:
Your application dependencies can also usually be configured to produce structured logs. For instance, both PostgreSQL and Nginx support logging in JSON format:
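For Nginx, a JSON access log can be defined with the `log_format` directive (the field names here are a matter of preference):

```nginx
log_format json_logs escape=json
  '{'
    '"timestamp":"$time_iso8601",'
    '"remote_addr":"$remote_addr",'
    '"request":"$request",'
    '"status":$status,'
    '"body_bytes_sent":$body_bytes_sent,'
    '"request_time":$request_time,'
    '"http_user_agent":"$http_user_agent"'
  '}';

access_log /var/log/nginx/access.json.log json_logs;
```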
With the above Nginx access log configuration, you'll go from the archaic Combined Log Format representation:
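```
203.0.113.7 - - [30/Jul/2023:14:22:05 +0000] "GET /orders HTTP/1.1" 200 1784 "-" "Mozilla/5.0 (X11; Linux x86_64)"
```

(An illustrative entry in the traditional combined format.)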
To one with clearly defined fields:
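```json
{
  "timestamp": "2023-07-30T14:22:05+00:00",
  "remote_addr": "203.0.113.7",
  "request": "GET /orders HTTP/1.1",
  "status": 200,
  "body_bytes_sent": 1784,
  "request_time": 0.003,
  "http_user_agent": "Mozilla/5.0 (X11; Linux x86_64)"
}
```

(Illustrative output; the exact fields depend on your `log_format` configuration.)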
Log levels signify the recorded event's severity, and logging frameworks usually represent these levels internally with integers. For instance, here are Go's default levels and their integer representations:
Usually, the level is emitted as a string for clarity, regardless of its internal representation:
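```json
{"time":"2023-07-30T14:22:05.123Z","level":"ERROR","msg":"Database connection lost"}
```

(An illustrative record.)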
However, some frameworks like Node.js's Pino use their integer representations directly:
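```json
{"level":60,"time":1690726925123,"pid":3842,"hostname":"web-1","msg":"Unrecoverable error, shutting down"}
```

(An illustrative Pino record; `time` is a Unix epoch in milliseconds by default.)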
In Pino, 60 corresponds to the FATAL level, per this schema:
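```json
{ "trace": 10, "debug": 20, "info": 30, "warn": 40, "error": 50, "fatal": 60 }
```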
To minimize confusion and ensure log uniformity across all the applications in your environment, it's better to normalize log formatting to always use string levels.
This approach mitigates ambiguity arising from differing interpretations of integer levels in various languages.
Learn more: Log Levels Explained and How to Use Them
When formatting timestamps, I recommend using the ISO-8601 format (or its stricter counterpart, RFC 3339) as it provides a human-readable and unambiguous representation of the date and time (up to nanosecond precision) with optional timezone information.
Some examples:
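```
2023-07-30T14:22:05Z              // UTC, second precision
2023-07-30T14:22:05.123Z          // UTC, millisecond precision
2023-07-30T14:22:05.123456789Z    // UTC, nanosecond precision
2023-07-30T16:22:05.123+02:00     // the same instant with a +02:00 offset
```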
Ensure your timestamps are normalized to Coordinated Universal Time (UTC) to establish a standard reference across all your services. If the local time zone is relevant to an event, express it as a UTC offset in the timestamp rather than logging a bare local time.
Every log entry should contain information about its source or origin to help identify where it was generated. Typically, logging frameworks can automatically include the source file, line number, and function in the log with minimal setup:
In general, include whatever information helps you quickly pinpoint where the log message originated. In distributed systems, the hostname, container ID, and other relevant identifiers can also be immensely valuable for identifying issues on specific nodes.
Incorporating the build version number or commit hash into your log entries associates logs with a specific version of your application. This connection is crucial for reproducing and troubleshooting issues, as it allows you to identify the exact state of your project when the log was generated.
As codebases evolve, source information like function names and line numbers may change, making it challenging to correlate older logs with the updated code. Including version information resolves this by providing a clear reference point.
Consider this log entry example:
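```json
{
  "timestamp": "2023-07-30T14:22:05.123Z",
  "level": "ERROR",
  "msg": "failed to update user record",
  "source": {
    "function": "main.updateUser",
    "file": "app/users.go",
    "line": 142
  }
}
```

(An illustrative entry: it pins the event to a file and line, but not to a version of the code.)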
If the associated code is refactored or relocated, matching this log with the current codebase could be confusing. However, with the application version or commit hash included, you can quickly revert to the specific state of the project (using commands like git checkout) for a more accurate investigation.
Another relevant detail is the compiler or runtime version used to compile or execute the program. This can also help ensure perfect reproducibility when investigating problems in the code.
Including stack traces in your error logs is necessary for swiftly pinpointing the problem's source. Most well-designed frameworks will automatically capture stack trace details for exceptions, but some may need extra configuration or additional packages.
Here's an example of a stack trace produced by Python's standard logging module coupled with python-json-logger:
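The exact fields depend on the formatter configuration, but the output looks roughly like this, with the whole traceback serialized into a single `exc_info` string:

```json
{
  "asctime": "2023-07-30 14:22:05,123",
  "levelname": "ERROR",
  "message": "division by zero",
  "exc_info": "Traceback (most recent call last):\n  File \"app.py\", line 12, in divide\n    return a / b\nZeroDivisionError: division by zero"
}
```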
The entire stack trace is included as a single string property in this instance. Some frameworks, such as Structlog, support structured stack traces in JSON format, which is highly preferable:
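With Structlog's `dict_tracebacks` processor, for example, each frame becomes an object you can query (abridged, illustrative output):

```json
{
  "event": "division by zero",
  "level": "error",
  "exception": [
    {
      "exc_type": "ZeroDivisionError",
      "exc_value": "division by zero",
      "frames": [
        { "filename": "app.py", "lineno": 12, "name": "divide" }
      ]
    }
  ]
}
```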
Where possible, prefer a structured stack trace as shown above, but even a large string block is better than nothing.
To make your logs more informative and actionable, add contextual fields to each entry. These fields should help answer key questions about the event: what happened, when, where, why, and which entity was responsible.
In the past, logging APIs typically required embedding contextual details directly into the log message string:
This approach has a few downsides: the embedded details are hard to extract and query automatically, the message format tends to drift between call sites, and you cannot filter or aggregate on the interpolated values.
Instead, log contextual information as key/value pairs:
This results in logs that look like this:
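```json
{"time":"2023-07-30T14:22:05.123Z","level":"INFO","msg":"user logged in","user_id":4921,"source_ip":"203.0.113.7"}
```

(Timestamp abridged for illustration.)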
This approach also makes it easy to bind data attributes to your loggers to ensure they are present in subsequent logging calls:
This way, the user_id will be included in all subsequent log records:
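```json
{"time":"2023-07-30T14:22:05.123Z","level":"INFO","msg":"user logged in","user_id":4921}
{"time":"2023-07-30T14:22:06.001Z","level":"INFO","msg":"cart updated","user_id":4921,"items":3}
```

(Timestamps abridged for illustration.)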
To maintain log consistency, establish standards for field names and content types to prevent situations where user IDs are logged as user, user_id, or userID in various places.

Additionally, consider including units in field names for integer values (e.g. execution_time_ms or response_size_bytes) to eliminate ambiguity.
A single log entry often represents just one event within a larger operation or workflow. For instance, when handling an HTTP request, numerous log entries may be generated at various stages of the request's lifecycle.
To understand and analyze these logs collectively, it's crucial to include a correlation ID that ties together all the log entries related to a specific request.
Such IDs can be generated at the edge of your infrastructure and propagated throughout the entire request lifecycle. This ensures each related log message carries this identifier, enabling you to easily group and analyze logs from a single request.
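A sketch of the idea as Go HTTP middleware. The `X-Correlation-ID` header name is a common convention rather than a standard, and the ID format here is an arbitrary choice:

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"log/slog"
	"net/http"
	"os"
)

// newCorrelationID returns a random 16-byte hex string.
func newCorrelationID() string {
	b := make([]byte, 16)
	rand.Read(b) // error ignored for brevity
	return hex.EncodeToString(b)
}

// loggingMiddleware attaches a correlation ID to every request's logger.
func loggingMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Reuse an upstream ID if one was propagated; otherwise mint one here.
		id := r.Header.Get("X-Correlation-ID")
		if id == "" {
			id = newCorrelationID()
		}
		logger := slog.New(slog.NewJSONHandler(os.Stdout, nil)).With("correlation_id", id)
		logger.Info("request received", "method", r.Method, "path", r.URL.Path)
		next.ServeHTTP(w, r)
	})
}

func main() {
	http.Handle("/", loggingMiddleware(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok"))
	})))
	// http.ListenAndServe(":8080", nil) // uncomment to actually serve
}
```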
To reduce the verbosity of your logs and safeguard against accidentally exposing sensitive data, it's necessary to control which aspects of your custom objects/structs are logged.
This is often achievable by implementing a method (usually toString(), String(), or similar) in your objects that specifies which fields are safe to log. By doing so, you can ensure that only necessary object fields are included in your logs.
Here's an example using Go's Slog package:
By implementing the LogValuer interface, you effectively control the logging output of the User struct. Here, only the ID field will be included in the log record whenever a User instance is logged.
This approach not only reduces log clutter but also prevents inadvertent logging of sensitive fields (including those added in the future).
Learn more: Best Logging Practices for Safeguarding Sensitive Data
Stop using grep and regex. Better Stack transforms your structured JSON logs into interactive dashboards—filter by any field instantly, spot patterns at a glance, and track metrics that matter.
Once you've started generating structured and well-formatted logs, the next step is to aggregate and centralize them in a log management service that provides an easy way to filter, visualize, and set up alert rules to promptly notify you about specific events or patterns that warrant attention.
For further reading on log management, read our newest article.
Sound interesting? Spin up a free trial of Better Stack to see how easy log analytics can be.
Thanks for reading, and happy logging!