Minimum Requirements for a Custom Logger, with an Integration Test Checklist

Tags: Windows Development, Logging, Integration Testing, Test Design, Reliability

If you can use an off-the-shelf logging framework, that is the safer choice. Even so, application constraints or operational circumstances sometimes make a custom logger unavoidable. The first difficulty is then deciding how much to implement so that the design is “neither too sloppy nor too heavy.”

In this article, we narrow the target to application logs used for failure investigation. Rather than taking on audit trails, distributed tracing, a metrics platform, and cloud aggregation all at once, we first define a minimum configuration that is useful in the field, and then lay out the integration test angles needed to make that configuration genuinely trustworthy.

The Conclusion First

The essentials to nail in the first version are these.

  • Use UTF-8 JSON Lines as the format
  • Never break the one-record-per-line rule
  • Required fields are timestamp, level, category, message, fields (structured data), sessionId, and processId
  • The baseline is one file per process
  • Use synchronous writes at low volume; at higher volume use single writer + bounded queue
  • Synchronously flush Error / Critical and the session start/end records
  • Include rotation and retention from v1
  • When the log destination is unavailable, do not silently divert to another location

Narrowing things to about this level makes both the implementation and the operations much less likely to collapse.

First, Narrow the Scope

Custom loggers tend to become difficult because they try to handle everything from the start. Combining diagnostic logs, audit logs, performance measurement, distributed tracing, and user behavior analytics into one mechanism makes the requirements explode.

The target here is diagnostic logs used to isolate application failures. That is, we prioritize being able to trace afterwards “when,” “in which operation,” “what happened,” and “what the context was at the time.” Just this narrowing makes the initial design decisions considerably easier.

The Minimum Requirements

1. The format is UTF-8 JSON Lines

You can keep logs as concatenated plain text, but they become hard to process mechanically later. Conversely, starting with a heavy proprietary binary format hurts observability in operations.

The convenient middle ground is UTF-8 JSON Lines. With one record per line, the file is readable as text and easy to analyze later with scripts and tools. As a practical advantage, even if a write is cut off midway, it is easy to isolate which line broke.
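
As a minimal sketch of that property, the following Python snippet writes one record per line and reads a file back while reporting which lines are broken. The function names encode_record and read_records are hypothetical, chosen only for this illustration.

```python
import json

def encode_record(record: dict) -> bytes:
    """Serialize one record as a single UTF-8 line terminated by a newline."""
    # json.dumps escapes control characters inside strings (a newline
    # becomes the two characters backslash + n), so the output never
    # contains a raw line break. ensure_ascii=False keeps non-ASCII
    # text readable in the file.
    line = json.dumps(record, ensure_ascii=False, separators=(",", ":"))
    return (line + "\n").encode("utf-8")

def read_records(data: bytes):
    """Return (records, broken_line_numbers) for a JSON Lines payload."""
    records, broken = [], []
    for number, raw in enumerate(data.split(b"\n"), start=1):
        if not raw:
            continue  # the empty piece after the final newline
        try:
            records.append(json.loads(raw.decode("utf-8")))
        except (UnicodeDecodeError, json.JSONDecodeError):
            broken.append(number)  # e.g. a write that was cut off midway
    return records, broken
```

Because a damaged line only affects itself, a reader like this can keep every intact record and report the broken line numbers separately.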

2. Fix the required fields up front

The minimum set of fields to have in place is these seven.

  • timestamp
  • level
  • category
  • message
  • fields
  • sessionId
  • processId

A log that carries only a message string becomes a problem once search criteria multiply. Conversely, too many fields sharply increase the burden on the call sites. It is safest to fix the set at about this size at first, and consider additions only when genuinely needed.
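
One way to keep the set fixed is to build every record through a single function. The sketch below is an illustration only: make_record, SESSION_ID, and REQUIRED_KEYS are hypothetical names, and the UTC ISO 8601 timestamp and random per-launch session ID are assumptions that the article does not prescribe.

```python
import os
import uuid
from datetime import datetime, timezone

# Assumption: one random session ID per process launch.
SESSION_ID = uuid.uuid4().hex

REQUIRED_KEYS = ("timestamp", "level", "category", "message",
                 "fields", "sessionId", "processId")

def make_record(level: str, category: str, message: str, fields=None) -> dict:
    """Build a record that always contains the seven required fields."""
    return {
        # Assumption: UTC, ISO 8601, millisecond precision.
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="milliseconds"),
        "level": level,
        "category": category,
        "message": message,
        # Structured context goes here instead of into the message string.
        "fields": dict(fields) if fields else {},
        "sessionId": SESSION_ID,
        "processId": os.getpid(),
    }
```

A call site then looks like make_record("Error", "db", "query failed", {"table": "orders", "elapsedMs": 1200}), which keeps searchable values out of the message text.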

3. Make one file per process the baseline

A design where multiple processes append to the same file carries more accident potential than it appears to. Mutual exclusion, partial writes, rotation timing, and handling of abnormal termination all become difficult at once.

Start with one file per process as the baseline. If you want to combine multiple processes, it is safer to aggregate downstream, or to stand up a dedicated aggregation process explicitly.
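
A simple way to enforce this baseline is to put the process ID and session ID into the file name and to create the file exclusively. The following is a sketch under assumptions: open_process_log and the naming scheme are hypothetical, not something the article specifies.

```python
import os
from datetime import datetime, timezone
from pathlib import Path

def open_process_log(directory: Path, app_name: str, session_id: str):
    """Create a log file that belongs to this process only."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    # Hypothetical naming scheme: app, launch time, pid, session ID.
    name = f"{app_name}_{stamp}_pid{os.getpid()}_{session_id}.jsonl"
    path = directory / name
    # Mode "x" raises FileExistsError if the file already exists, so this
    # call never ends up appending to a file that some other process made.
    handle = open(path, "x", encoding="utf-8", newline="\n")
    return path, handle
```

If the exclusive create fails, that is a signal worth surfacing rather than working around, because it means the uniqueness assumption behind the file name was wrong.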

4. Split the write strategy by load

While log volume is low, synchronous writes are easier to understand and easier to investigate failures with. Forcing things asynchronous can lose the logs written just before exit, or leave the flush conditions on exceptions ambiguous.

On the other hand, if log volume grows and synchronous I/O becomes the bottleneck, adopt single writer + bounded queue. What matters then is deciding the overflow policy in advance. Do not leave it vague whether you drop old logs, drop new logs, or emit a warning.
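
The single writer + bounded queue arrangement can be sketched as follows. The overflow policy shown here (drop the newest record and count the drops) is only one of the options mentioned above, picked as an assumption for the example; QueuedWriter and its methods are hypothetical names.

```python
import queue
import threading

class QueuedWriter:
    """One background thread owns the sink; callers only enqueue."""

    def __init__(self, sink, capacity=1000):
        self._sink = sink                         # callable taking one line
        self._queue = queue.Queue(maxsize=capacity)
        self._dropped = 0
        self._lock = threading.Lock()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def write(self, line) -> bool:
        """Enqueue without blocking. Returns False if the record was dropped."""
        try:
            self._queue.put_nowait(line)
            return True
        except queue.Full:
            # Assumed policy: drop the NEW record and remember that we did.
            with self._lock:
                self._dropped += 1
            return False

    def _run(self):
        while True:
            item = self._queue.get()
            if item is None:                      # sentinel from close()
                return
            self._sink(item)

    def close(self) -> int:
        """Drain everything already queued, stop the thread, return drop count."""
        self._queue.put(None)                     # waits for room if the queue is full
        self._thread.join()
        return self._dropped
```

Returning the drop count from close() gives the caller a place to emit a final "N records dropped" warning, so the loss is at least visible in the log.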

5. Decide the flush conditions

Synchronously flushing Error and Critical, plus the session start and end logs, pays off in failure investigations. Flushing every record, including routine Info, slows the application down, so the realistic choice is to treat levels differently.
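
A level-dependent flush can be as small as the sketch below. FlushingFile, SYNC_LEVELS, and the force_sync parameter (intended for session start and end records) are hypothetical names, and using os.fsync as the definition of "synchronously flushed" is an assumption for this example.

```python
import os

# Levels that are pushed to disk immediately.
SYNC_LEVELS = {"Error", "Critical"}

class FlushingFile:
    def __init__(self, path):
        self._file = open(path, "a", encoding="utf-8", newline="\n")

    def write_line(self, level: str, line: str, force_sync: bool = False):
        """Write one line; sync it to disk for important records."""
        self._file.write(line + "\n")
        if force_sync or level in SYNC_LEVELS:
            self._file.flush()                # Python buffer -> OS
            os.fsync(self._file.fileno())     # OS cache -> storage device

    def close(self):
        # Final flush so buffered routine records are not lost on exit.
        self._file.flush()
        os.fsync(self._file.fileno())
        self._file.close()
```

Routine records stay in the buffer until the next sync or close, which is the trade-off the flush conditions make explicit.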

6. Include rotation and retention from v1

Rotation is often considered something to “add later,” but it is a feature whose absence suddenly hurts once you are in operations. The scheme can be size-based, daily, or per-launch, but at minimum two things must be decided: that the logs do not grow without bound, and how many files are kept.
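
As one possible shape for size-based rotation with a retention limit, consider the following sketch. The function name rotate_if_needed and the numbered-suffix scheme (app.jsonl.1 is the newest rotated file) are assumptions for illustration.

```python
from pathlib import Path

def rotate_if_needed(current: Path, max_bytes: int, keep: int) -> bool:
    """Rotate `current` once it reaches max_bytes, keeping `keep` old files.

    The caller is expected to close its handle on `current` first and to
    reopen the file afterwards (renaming an open file can fail on Windows).
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    if not current.exists() or current.stat().st_size < max_bytes:
        return False
    # Shift older generations up by one: .1 -> .2, .2 -> .3, ...
    # The file at position `keep` is overwritten, which enforces retention.
    for index in range(keep - 1, 0, -1):
        older = current.with_name(f"{current.name}.{index}")
        if older.exists():
            older.replace(current.with_name(f"{current.name}.{index + 1}"))
    current.replace(current.with_name(f"{current.name}.1"))
    return True
```

With keep=2, the directory never holds more than the active file plus two rotated ones, so both "does not grow without bound" and "how many files are kept" have concrete answers.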

7. No improvised fallback storage on save failure

A design that silently writes somewhere else when the log destination is unavailable makes later investigation difficult. The mere fact that the logs are not in “the place they should be” delays the operations team’s initial response to an incident.

If saving fails, surface the failure through an explicitly visible channel: an in-app notification, the event log, standard error, or the like. At the very least, avoid the state of “no one knows where the logs went.”
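
In code, the rule amounts to catching the I/O error, reporting it, and deliberately not retrying elsewhere. The sketch below uses standard error as the visible channel; append_line and the message wording are hypothetical, and a real application might report to the event log or an in-app notification instead.

```python
import sys

def append_line(path, line: str, report=None) -> bool:
    """Append one line to the log. On failure, report it and return False."""
    try:
        with open(path, "a", encoding="utf-8", newline="\n") as f:
            f.write(line + "\n")
        return True
    except OSError as exc:
        # Deliberately no fallback to a temp folder or another drive:
        # the failure is made visible instead of being hidden.
        stream = report if report is not None else sys.stderr
        print(f"LOGGER FAILURE: cannot write to {path}: {exc}", file=stream)
        return False
```

The boolean return value lets the caller decide whether to escalate further, while the message ensures the failure is recorded somewhere a person will look.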

A Minimal v1 Configuration

For the first version, something like the following is often enough.

  • UTF-8 JSON Lines
  • One file per process
  • Per-session file names
  • Size-based or per-launch rotation
  • An upper bound on retained files
  • Synchronous flush of Error / Critical
  • An API that accepts structured fields

For anything beyond this, adding features only after real operations reveal “what actually hurt” results in a logger that is easier to maintain.

Common Anti-Patterns

Here are the typical patterns to avoid.

  • Cramming everything into the message string
  • Sharing the same file across multiple processes
  • Going fully asynchronous without deciding the flush conditions
  • Postponing rotation and retention
  • Silently diverting to another folder on save failure
  • Putting network transmission or local DB storage into v1

Each looks convenient at a glance, but all of them tend to make fault isolation and operations harder.

Think of Integration Tests in Terms of Real Files, Real Threads, Real Processes

Unit tests alone are not enough to trust a logger. Verifying only the string formatting and JSON serialization misses what actually causes problems in production: I/O, concurrency, rotation, flush at shutdown, and permission errors.

So integration tests need to verify with real files, real threads, and, where necessary, real processes. At minimum, you want to avoid the state of “passes day to day, but cannot be trusted during an incident.”

Integration Test Items Worth Running

Health of a single write

  • Is each line exactly one JSON record?
  • Can it be re-read as UTF-8?
  • Are the required fields present every time?
  • Has an embedded newline broken a record across multiple lines?
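
The checks above can be bundled into a small helper that an integration test runs against the real file the logger produced. This is a sketch: find_problems is a hypothetical name, and the required key set mirrors the seven fields listed earlier.

```python
import json

REQUIRED_KEYS = {"timestamp", "level", "category", "message",
                 "fields", "sessionId", "processId"}

def find_problems(path) -> list:
    """Return human-readable problems found in a JSON Lines log file."""
    with open(path, "rb") as f:
        data = f.read()
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return [f"not valid UTF-8: {exc}"]
    problems = []
    if text and not text.endswith("\n"):
        problems.append("last line is not newline-terminated")
    # Split on "\n" only; str.splitlines() would also split on characters
    # such as U+2028 that may legitimately appear inside a JSON string.
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()
    for number, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            problems.append(f"line {number}: not a JSON value")
            continue
        if not isinstance(record, dict):
            problems.append(f"line {number}: not a JSON object")
            continue
        missing = REQUIRED_KEYS - record.keys()
        if missing:
            problems.append(f"line {number}: missing {sorted(missing)}")
    return problems
```

A record broken by a raw embedded newline shows up as two consecutive "not a JSON value" lines, which makes that particular bug easy to recognize.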

Concurrency within the same process

  • Do records stay intact when multiple threads write simultaneously?
  • Does the record count match exactly, with nothing lost or duplicated?
  • With a queue in use, do ordering and loss behave per the specification?
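
A concurrency check of this kind might look like the sketch below, which uses real threads and a real file. LockedFileLogger is a hypothetical stand-in for the logger under test (one lock, one write call per record); in practice the real logger would take its place.

```python
import json
import threading

class LockedFileLogger:
    """Hypothetical minimal logger used only to demonstrate the test."""

    def __init__(self, path):
        self._file = open(path, "a", encoding="utf-8", newline="\n")
        self._lock = threading.Lock()

    def log(self, record: dict):
        line = json.dumps(record, ensure_ascii=False) + "\n"
        with self._lock:              # one whole record per critical section
            self._file.write(line)

    def close(self):
        with self._lock:
            self._file.close()

def run_concurrency_check(path, threads=8, per_thread=500):
    """Write from several threads, then return (record_count, unique_count)."""
    logger = LockedFileLogger(path)

    def worker(thread_id):
        for seq in range(per_thread):
            logger.log({"thread": thread_id, "seq": seq, "message": "x" * 50})

    workers = [threading.Thread(target=worker, args=(t,)) for t in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    logger.close()

    # json.loads raises if any line was interleaved or truncated.
    with open(path, encoding="utf-8") as f:
        records = [json.loads(line) for line in f]
    unique = {(r["thread"], r["seq"]) for r in records}
    return len(records), len(unique)
```

Both numbers should equal threads * per_thread: a smaller total means lost records, and a unique count below the total means duplicates.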

Flush and shutdown behavior

  • Are Error / Critical reflected immediately?
  • Is the queue empty after a normal shutdown?
  • Do the necessary final logs survive on exit paths that are close to abnormal termination?

Rotation and retention

  • Does the logger switch to a new file when the rotation condition is met?
  • Are old files beyond the retention limit deleted per the specification?
  • Do JSON lines stay intact immediately before and after rotation?

Failure paths

  • Behavior when the destination directory does not exist
  • Behavior when write permission is missing
  • Notification or return value when a write fails under disk-full-like conditions
  • Behavior on queue overflow

Handling multiple processes

If the specification is one file per process, then it is worth verifying that a second process never writes into the same file. Conversely, with an aggregation-process scheme, verification needs to include handoff failures to that process.
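
For the one-file-per-process case, a test with real processes can be sketched as follows. The child script, the app_pid naming scheme, and run_children are all hypothetical; the child stands in for an application that uses the logger.

```python
import json
import subprocess
import sys
from pathlib import Path

# Hypothetical child program: writes its own file, named after its pid.
CHILD = r'''
import json, os, sys
path = os.path.join(sys.argv[1], f"app_pid{os.getpid()}.jsonl")
with open(path, "x", encoding="utf-8", newline="\n") as f:
    for i in range(int(sys.argv[2])):
        f.write(json.dumps({"processId": os.getpid(), "seq": i}) + "\n")
'''

def run_children(directory: Path, processes=3, records=100):
    """Start real child processes and collect the pids found in each file."""
    children = [
        subprocess.Popen([sys.executable, "-c", CHILD, str(directory), str(records)])
        for _ in range(processes)
    ]
    exit_codes = [child.wait() for child in children]
    pids_by_file = {}
    for path in directory.glob("app_pid*.jsonl"):
        with open(path, encoding="utf-8") as f:
            pids_by_file[path.name] = {json.loads(line)["processId"] for line in f}
    return exit_codes, pids_by_file
```

The test passes when every child exits with 0, the number of files equals the number of processes, and each file contains records from exactly one process ID.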

The Minimum Set of Tests to Pass in v1

Trying to do everything at first makes the tests too heavy. The minimum to pass in v1 is about these six.

  1. Normal writes from a single thread
  2. Simultaneous writes from multiple threads
  3. Flush of Error / Critical
  4. Rotation and retention
  5. Failure notification when the destination is unavailable
  6. Drain and final flush on normal shutdown

Even with just these six passing, you are already a long way from “a logger that emits strings but cannot be trusted in operations.”

Summary

The first goal of a custom logger is not feature richness but “being believable during an incident.” To get there, it is effective to fix the format as UTF-8 JSON Lines, keep the required fields tight, make one file per process the baseline, and decide flush, rotation, retention, and failure behavior early.

And whether that design actually works has to be verified with integration tests that use real files, real threads, and real processes. Before growing the implementation, lock down the minimum configuration and the minimum test set first, and the logger becomes easy to grow without strain later.

Author Profile

Go Komura

Representative of KomuraSoft LLC

Focused on Windows software development, technical consulting, and investigations into failures that are difficult to reproduce.