{
"$type": "site.standard.document",
"canonicalUrl": "https://rednafi.com/shards/2026/04/no-stacked-loglines/",
"description": "Why logging at every layer of a service produces noise, and how to log only at the handler level while propagating context from below.",
"path": "/shards/2026/04/no-stacked-loglines/",
"publishedAt": "2026-04-07T00:00:00.000Z",
"site": "at://did:plc:fgtm2c26vfcj74rfmeggbyqj/site.standard.publication/3mnl6f7ob462z",
"tags": [
"Go",
"Distributed Systems",
"Observability"
],
"textContent": "Consider a typical layered Go service. There's a repository layer that talks to the\ndatabase, a service layer that contains the business logic, and a handler that deals with\nprotocols like HTTP or gRPC.\n\nOne common thing I often see is emitting log lines from all three layers. When an error\noccurs, each layer logs it as it bubbles up, producing a stack of duplicate lines for the\nsame failure. At low throughput this is just noisy. At high throughput it taxes the logging\npipeline. We've seen stacked logging from a 250k rps service put enough pressure on the o11y\ninfrastructure to cause its own incidents.\n\nOne error, three log lines\n\nThe repository hits a database timeout and logs it:\n\nThe service catches the wrapped error and logs it again:\n\nThe handler logs it a third time before returning a 500:\n\nOne database timeout, three error log lines. You search during an incident, see 3,000\nerrors, and think three thousand things broke. In reality it was one thousand requests that\neach logged the same failure three times.\n\n> [!IMPORTANT]\n>\n> There's a related but separate problem in GetByID and GetUser: they log the error\n> _and_ return it. Dave Cheney in [Let's talk about logging] warns us against it: \"if you\n> choose to handle the error by logging it, by definition it's not an error any more \\- you\n> handled it.\"\n>\n> The [Uber Go Style Guide] says the same: \"the caller should not, for example, log the\n> error and then return it, because its callers may handle the error as well.\" Either log it\n> or return it, not both.\n\nMove the log line to the top\n\nLower layers should return errors with context and leave logging to the handler. Wrap each\nerror on the way up:\n\nThe handler gets\nuserService.GetUser: userRepo.GetByID: query user abc123: connection refused. The full\ncall chain is in the error string and no layer had to log independently.\n\nThe handler logs the error once with request context:\n\nOne error, one log line. The handler has the request context and the full wrapped error\nchain, which is all you need for error logging.\n\nCollecting log fields on the way up\n\nError wrapping solves the failure case. But what about non-error diagnostic data - query\ntiming, cache hit/miss, downstream call latency? The handler only has access to the request\nand the error string. It can't know how long the database query took unless the repository\ntells it somehow.\n\nYou can stash a mutable field collector in context.Context at the start of a request.\nLower layers append to it. A middleware reads it back and emits one structured line when the\nrequest completes.\n\nStart with a thread-safe container for log fields:\n\nThen a helper that lower layers call to attach fields without logging:\n\nThe middleware creates the collector, passes it down through context, and emits the log line\nafter the handler chain completes:\n\nHere:\n\n- (1) stashes an empty field collector in context\n- (2) runs the handler chain; lower layers may call AddLogField during this\n- (3) merges request fields with whatever lower layers attached\n- (4) only two log levels: Info for normal requests, Error for 5xx\n- (5) emit the log line\n\nNow each layer can inject fields into the context without logging. The repository attaches\nquery timing:\n\nThe service layer could do the same for cache hit/miss, downstream call latency, or which\ncode path a feature flag selected. None of them emit a log line. They just call\nAddLogField and move on. If a lower layer genuinely needs to record a decision - like\nfalling back from a primary to a secondary node - that's fine. It's a separate concern from\nthe duplicate error logging this post is about.\n\nThe middleware at the top collects all of it and emits one line. If the repository called\nAddLogField(ctx, \"db_duration\", ...) and the request returned a 200, the output looks\nsomething like:\n\nOne request, one log line, with context from every layer.\n\nThe Kubernetes API server does exactly this with [AddKeyValue] and [respLogger]. Caddy's\ndeferred [logRequest] and HashiCorp Nomad's [wrap] follow the same pattern.\n\nCanonical log line\n\nStripe took this further with what they call the [canonical log line]: one structured line\nper request, but with every dimension you might want to query on. Auth type, DB query count,\nrate limit status, team ownership:\n\nFields like database_queries=34 come from lower layers injecting into context, exactly the\nmiddleware pattern from the previous section. The canonical log line carries queryable\ndimensions. For the full error chain you still have the wrapped error that the handler\nlogged. Brandur Leach, who worked on this at Stripe, called it \"[the single, simplest, best\nmethod of getting easy insight into production that there is].\" [go-chi/httplog] implements\nthe same idea in Go on top of log/slog.\n\n---\n\nA few things that came up when I [posted this on Reddit]:\n\n_\"What about debug logs in lower layers?\"_ - Debug-level logs in the repository or service\nlayer are fine as long as they're disabled by default in production. The problem is when\ninfo or error level logs fire from every layer on every request. Debug logs you flip on for\na specific investigation are a different thing.\n\n_\"Some lower-layer logs aren't duplicates though.\"_ - True. A service layer logging that it\nfell back from primary to secondary, or that a feature flag changed the code path, is\nrecording a decision and adds information. Those are worth keeping. The problem this post\ntargets is the same error or the same \"processing request\" message repeated at every layer\nwith no new information. Nevertheless, hoisting your log lines to the upper layer is a\ngeneral rule of thumb.\n\n_\"Canonical log lines are tracing at home.\"_ - They're not a replacement for distributed\ntracing. They fill a different niche: request-level telemetry you can query ad-hoc without a\ntrace viewer. They work best alongside metrics and traces.\n\n_\"You still need normal logs for verbose error details.\"_ - Yes. The canonical log line\ncarries queryable dimensions like database_queries=34 or rate_allowed=false. The full\nerror chain still comes from the wrapped error that the handler logged separately.\n\n_\"Won't the canonical log line get too big?\"_ - It can. Most log aggregators have a max line\nsize. The canonical log line should carry structured fields like counts, durations, and\nidentifiers - not full request/response bodies or long error messages. Keep the fields\nnarrow and let the error wrapping carry the verbose details separately.\n\n\n\n\n[respLogger]:\n https://github.com/kubernetes/apiserver/blob/master/pkg/server/httplog/httplog.go#L259-L296\n\n[AddKeyValue]:\n https://github.com/kubernetes/apiserver/blob/master/pkg/server/httplog/httplog.go#L234-L238\n\n[logRequest]:\n https://github.com/caddyserver/caddy/blob/master/modules/caddyhttp/server.go#L380\n\n[wrap]:\n https://github.com/hashicorp/nomad/blob/main/command/agent/http.go#L738-L815\n\n[Let's talk about logging]:\n https://dave.cheney.net/2015/11/05/lets-talk-about-logging\n\n[Uber Go Style Guide]:\n https://github.com/uber-go/guide/blob/master/style.md#handle-errors-once\n\n[canonical log line]:\n https://stripe.com/blog/canonical-log-lines\n\n[the single, simplest, best method of getting easy insight into production that there is]:\n https://brandur.org/nanoglyphs/025-logs\n\n[go-chi/httplog]:\n https://github.com/go-chi/httplog\n\n[posted this on Reddit]:\n https://www.reddit.com/r/golang/comments/1sfvks6/stacked_log_lines_considered_harmful/",
"title": "Stacked log lines considered harmful"
}