{
  "$type": "site.standard.document",
  "canonicalUrl": "https://rednafi.com/go/context-cancellation-cause/",
  "description": "How Go 1.20's WithCancelCause and Go 1.21's WithTimeoutCause let you attach a reason to context cancellation, plus a gotcha with manual cancel and the stdlib pattern that covers every path.",
  "path": "/go/context-cancellation-cause/",
  "publishedAt": "2026-02-24T00:00:00.000Z",
  "site": "at://did:plc:fgtm2c26vfcj74rfmeggbyqj/site.standard.publication/3mnl6f7ob462z",
  "tags": [
    "Go",
    "Error Handling",
    "Concurrency"
  ],
  "textContent": "I've spent way more hours than I'd like to admit debugging context canceled and\ncontext deadline exceeded errors. These errors usually tell you that a context was\ncanceled, but not exactly why. In a typical client-server scenario, the reason could be any\nof the following:\n\n- The client disconnected\n- A parent deadline expired\n- The server started shutting down\n- Some code somewhere called cancel() explicitly\n\nGo 1.20 and 1.21 added cause-tracking functions to the context package that fix this, but\nthere's a subtlety with WithTimeoutCause that most examples skip.\n\nWhat \"context canceled\" actually tells you\n\nHere's a function that processes an order by calling three services under a shared 5-second\ntimeout:\n\n- (1) creates a derived context that automatically cancels after 5 seconds\n- (2) cleans up the timer when the function returns, standard practice per the [context\n  package documentation]\n- (3) if anything goes wrong, including a context cancellation, the error is returned as-is\n\nWhen a context gets canceled, the underlying reason is either context.Canceled or\ncontext.DeadlineExceeded. Libraries wrap these in their own types (url.Error for\nnet/http, gRPC status codes for grpc), but errors.Is still matches the sentinel.\n\nSo if checkInventory makes an HTTP call and the client disconnects while it's in flight,\nthe error that bubbles all the way up is:\n\nIf the 5-second timeout fires while chargePayment is waiting on a slow payment gateway:\n\nTwo sentinel errors. No reason, no origin, nothing. The caller of processOrder has no idea\nwhat actually happened.\n\nYou'd think wrapping the error helps:\n\nNow the log says:\n\nBetter. You know it happened during the inventory check. But you still don't know _why_ the\ncontext was canceled. Was it the 5-second timeout? A parent context's deadline? The client\nhanging up? A graceful shutdown signal? 
The error doesn't say.\n\nWithout the cause, you can't tell whether to retry, alert, or ignore, and your logs don't\ngive on-call enough to triage.\n\nWhen this happens in production, you end up scanning logs for other errors around the same\ntimestamp, hoping something nearby gives you a clue. If the logs don't help, you trace the\ncontext from where it was created, through every function that receives it, looking for\ncancel calls and timeouts. In a small service this takes a few minutes. In a larger codebase\nwith middleware, interceptors, and nested timeouts, it can take a lot longer.\n\nThis has been a known pain point in the Go community for years. Bryan C. Mills noted this in\n[issue #26356] back in 2018:\n\n> I've seen this sort of issue crop up several times now. I wonder if context.Context\n> should record a bit of caller information... Then we could add a debugging hook to\n> interrogate _why_ a particular context.Context was cancelled.\n>\n> -- [bcmills on #26356]\n\nOn [proposal #51365], which eventually led to the cause APIs, bullgare described the\nproduction experience:\n\n> I had a case when on production I got random \"context canceled\" log messages. And in the\n> case like that you don't even know where to dig and how to investigate it further. Or how\n> to reproduce it on a local machine.\n>\n> -- [bullgare on #51365]\n\nThat proposal led to the cause APIs that shipped in [go 1.20].\n\nAttaching a cause with WithCancelCause\n\ncontext.WithCancelCause gives you a CancelCauseFunc that takes an error instead of a\nplain CancelFunc. Here's the same processOrder rewritten to use it:\n\n- (1) cancel(nil) as the default, sets the cause to context.Canceled\n- (2) before returning the error, records a specific reason that includes the original error\n  via %w\n\nNow you can read the cause with context.Cause(ctx). If checkInventory fails because of a\nconnection error, the cause comes back as:\n\nInstead of just context canceled. 
You know it was the inventory check, you know it was a\nconnection error, and because the original error is wrapped with %w, the full error chain\nis preserved for programmatic inspection.\n\nThe first call to cancel wins. Once a cause is recorded, subsequent calls are no-ops. So\ndefer cancel(nil) only takes effect if nothing else canceled the context first. This means\nthe most specific cancel, the one closest to the actual failure, is what gets recorded. If\ncheckInventory sets a cause and then defer cancel(nil) runs on the way out, the\ninventory cause is preserved.\n\ncontext.Cause is a standalone function rather than a method on Context because Go's\ncompatibility promise means the Context interface can't add new methods. Err() will\nalways return nil, Canceled, or DeadlineExceeded. If you call context.Cause on a\ncontext that wasn't created with one of the cause-aware functions, it returns whatever\nctx.Err() returns. On an uncanceled context, it returns nil.\n\nThis handles explicit cancellation, but the function still has no timeout. The original\nversion used WithTimeout for the 5-second deadline. To label that timeout with a cause, Go\n1.21 added WithTimeoutCause:\n\nWhen the timer fires, context.Cause(ctx) returns the custom error instead of a bare\ncontext.DeadlineExceeded. There's also WithDeadlineCause, which is the same thing but\ntakes an absolute time.Time. If all you need is a label on the timeout path,\nWithTimeoutCause works. But there's a subtlety in how it interacts with defer cancel()\nthat can silently discard your cause.\n\nWhy defer cancel() discards the cause\n\nWithTimeoutCause returns (Context, CancelFunc), not (Context, CancelCauseFunc). The\ncancel function you get back doesn't accept an error argument. 
[Proposal #56661] defined it\nthis way explicitly:\n\nThink about what happens when processOrder finishes normally in 100ms, well before the\n5-second timeout:\n\n- (1) cancel() fires on return, before the timer\n\nIf the timer fires first (the function ran too long), the context is canceled with\nDeadlineExceeded and context.Cause(ctx) returns your custom message. That path works\ncorrectly.\n\nBut if the function returns first, which is the common case, defer cancel() fires. Since\nit's a plain CancelFunc, it can't take a cause argument. The Go source shows what it does\ninternally:\n\nIt passes Canceled with a nil cause. Your custom cause only gets recorded when the\ninternal timer fires. On the normal return path, the cause is just context.Canceled.\n\nThis isn't a bug. WithTimeoutCause is a new function, so it could have returned\nCancelCauseFunc. The Go team chose not to. rsc explained the reasoning when closing\n[proposal #51365]:\n\n> WithDeadlineCause and WithTimeoutCause require you to say ahead of time what the cause\n> will be when the timer goes off, and then that cause is used in place of the generic\n> DeadlineExceeded. The cancel functions they return are plain CancelFuncs (with no\n> user-specified cause), not CancelCauseFuncs, the reasoning being that the cancel on one\n> of these is typically just for cleanup and/or to signal teardown that doesn't look at the\n> cause anyway.\n>\n> -- [rsc on #51365]\n\nHe also acknowledged that this creates a subtle distinction between the two APIs:\n\n> That distinction makes sense, but it makes WithDeadlineCause and WithTimeoutCause\n> different in an important, subtle way from WithCancelCause. We missed that in the\n> discussion...\n>\n> -- [rsc on #51365]\n\nSo WithTimeoutCause only carries the custom cause when the timeout actually fires. On the\nnormal return path and on any explicit cancellation path, defer cancel() discards it. 
If\nyou have a middleware that logs context.Cause(ctx) for every request, it'll see\ncontext.Canceled instead of something useful on the most common path.\n\nCovering every path with a manual timer\n\nThe way around this is to skip WithTimeoutCause and wire the timer yourself using\nWithCancelCause. Since there's only one CancelCauseFunc, every path goes through the\nsame door, and first-cancel-wins handles the rest. Here's processOrder one more time:\n\n- (1) one CancelCauseFunc for everything\n- (2) the default cause if nothing else cancels first\n- (3) the timer fires with a timeout-specific cause\n- (4) stop the timer on normal return\n\nThree possible paths, one cancel function. If the timer fires, context.Cause(ctx) returns:\n\nIf checkInventory fails with a connection error:\n\nOn normal completion:\n\nThis is actually what the stdlib does internally; WithDeadline uses time.AfterFunc under\nthe hood.\n\nThe trade-off is that ctx.Err() always returns context.Canceled, never\ncontext.DeadlineExceeded, because you're using WithCancelCause instead of WithTimeout.\nctx.Deadline() also returns the zero value, which matters if downstream code or frameworks\nuse it to propagate deadlines (gRPC, for example, sends the deadline across service\nboundaries via ctx.Deadline()). 
If downstream code branches on\nerrors.Is(err, context.DeadlineExceeded), that check won't match either.\n\nWhen you also need DeadlineExceeded\n\nIf downstream code relies on errors.Is(err, context.DeadlineExceeded) to distinguish\ntimeouts from explicit cancellations, stack a WithCancelCause on top of a\nWithTimeoutCause:\n\n- (1) outer context for error-path and normal-completion causes\n- (2) inner context with a timeout cause for the deadline path\n- (3) deferred first, runs last (LIFO), cleans up the inner timeout context\n- (4) deferred second, runs first (LIFO), cancels the outer context with a cause\n\nWhen the timeout fires, the inner context gets canceled with DeadlineExceeded and the\ncustom cause. errors.Is(ctx.Err(), context.DeadlineExceeded) works as expected. On the\nerror path, cancelCause(specificErr) cancels the outer context, which propagates to the\ninner. On normal completion, cancelCause(\"processOrder completed\") runs first because of\nLIFO defer ordering, canceling the outer and propagating to the inner. Then\ncancelTimeout() finds the inner already canceled and does nothing.\n\n> [!NOTE]\n>\n> Notice the defer ordering. cancelCause must be deferred _after_ cancelTimeout so it\n> runs _before_ it (LIFO). If you reverse them, cancelTimeout() cancels the inner context\n> with context.Canceled before cancelCause gets a chance to set a meaningful cause.\n\nOne subtlety: after line (2), ctx points to the inner context. Cancellation propagates from\nparent to child together with its cause, so context.Cause(ctx) on the inner context after a\ncancelCause(specificErr) call returns the specific error, as long as the inner context\nwasn't already canceled. If the timeout fired first, the inner context keeps its timeout\ncause, because the first cancellation wins and the later cancelCause call can't overwrite\nit. The outer context never sees the timeout cause, since cancellation doesn't propagate\nfrom child to parent. This is worth knowing if you add logging inside processOrder itself.\n\nThe manual timer pattern is simpler and covers most cases. 
This stacked approach is for when\ndownstream code specifically relies on errors.Is(err, context.DeadlineExceeded).\n\nReading and logging the cause\n\ncontext.Cause returns an error, so the full errors.Is and errors.As machinery works\non it. Since the cause in processOrder wraps the original error with %w, you can unwrap\nthrough it to reach the underlying error.\n\nIf checkInventory failed because the inventory service refused the connection, the cause\nis \"order ord-123: inventory check failed: connection refused\", and the wrapped error is a\nnet.OpError. You can pull it out:\n\nerrors.Is works the same way. If the timer cause had wrapped context.DeadlineExceeded\n(e.g., with fmt.Errorf(\"order timeout: %w\", context.DeadlineExceeded)), you could check\nfor it:\n\nFor logging, ctx.Err() and context.Cause(ctx) serve different purposes. ctx.Err()\ngives you the category (cancellation or timeout), and context.Cause(ctx) gives you the\nspecific reason. Keeping them as separate structured log fields makes them easy to query:\n\nThat produces:\n\nA useful pattern is wrapping the request context with WithCancelCause at the middleware\nlevel so every handler downstream gets automatic cause tracking. The cancel function is\nstashed in the context via WithValue so handlers can pull it out and set a specific cause:\n\n- (1) wrap the request context with WithCancelCause\n- (2) default cause for normal completion\n- (3) stash the cancel function so downstream handlers can reach it\n- (4) only fires if the context was canceled _during_ request handling (client disconnect,\n  handler cancel), not on normal completion; defer cancel(...) 
hasn't run yet at this\n  point\n\nAny handler can pull the cancel function out and set a cause:\n\nFirst cancel wins, so the most specific reason is what shows up in the middleware log.\n[streamingfast/substreams] uses this approach in production, storing a CancelCauseFunc in\nthe request context so worker pools downstream can cancel with a specific error.\n\nOne thing to know: the stdlib's HTTP server and most third-party libraries cancel contexts\nwithout setting a cause, since they predate Go 1.20. If a client disconnects,\ncontext.Cause(ctx) will return context.Canceled, not a custom error. The cause APIs are\nmost useful for reasons set by your own code.\n\nClosing words\n\nMost of the time, WithCancelCause is all you need. It covers explicit cancellation with a\nspecific reason, and context.Cause gives you a way to read it back. If you also need a\ntimeout, WithTimeoutCause labels the deadline path without extra wiring. The gotcha is\nthat defer cancel() on the normal return path discards the cause, so if you need causes on\nevery path, including normal completion, the manual timer pattern fills that gap. The\nstacked approach on top of that is for when downstream code also needs DeadlineExceeded.\n\nThe cause APIs have seen steady adoption since Go 1.20. golang.org/x/sync/errgroup uses\nWithCancelCause internally since v0.3.0, so context.Cause(ctx) on an errgroup-canceled\ncontext returns the actual goroutine error. [docker cli] uses it to distinguish OS signals\nfrom normal cancellation. [kubernetes cluster-api] migrated its codebase to the *Cause\nvariants. 
gRPC-Go had a [proposal] to use it for distinguishing client disconnects from gRPC\ntimeouts and connection closures.\n\nRunnable examples:\n\n- [playground: the debugging problem]\n- [playground: attaching a cause]\n- [playground: the timeout gotcha]\n- [playground: manual timer pattern]\n- [playground: stacked contexts]\n- [playground: reading and logging]\n\n\n\n\n[context package documentation]:\n    https://pkg.go.dev/context#WithCancel\n\n[go 1.20]:\n    https://go.dev/doc/go1.20#context\n\n[proposal #51365]:\n    https://github.com/golang/go/issues/51365\n\n[proposal #56661]:\n    https://github.com/golang/go/issues/56661\n\n[issue #26356]:\n    https://github.com/golang/go/issues/26356\n\n[bcmills on #26356]:\n    https://github.com/golang/go/issues/26356#issuecomment-404870718\n\n[bullgare on #51365]:\n    https://github.com/golang/go/issues/51365#issuecomment-1064461434\n\n[rsc on #51365]:\n    https://github.com/golang/go/issues/51365#issuecomment-1307812595\n\n[docker cli]:\n    https://github.com/docker/cli/blob/419e5d136cc8785f9aae7b36f068decedb9115e0/cmd/docker/docker.go#L56\n\n[kubernetes cluster-api]:\n    https://github.com/kubernetes-sigs/cluster-api/issues/11280\n\n[proposal]:\n    https://github.com/grpc/grpc-go/issues/7541\n\n[streamingfast/substreams]:\n    https://github.com/streamingfast/substreams/blob/develop/reqctx/context.go\n\n[playground: the debugging problem]:\n    https://go.dev/play/p/sxY1R_yD15S\n\n[playground: attaching a cause]:\n    https://go.dev/play/p/zGkd2EzoYRS\n\n[playground: the timeout gotcha]:\n    https://go.dev/play/p/GfEv42EdKRc\n\n[playground: manual timer pattern]:\n    https://go.dev/play/p/WmX6WywiL7o\n\n[playground: stacked contexts]:\n    https://go.dev/play/p/ASYa0IngONt\n\n[playground: reading and logging]:\n    https://go.dev/play/p/l7XlYaAg0Qw",
  "title": "What canceled my Go context?"
}