Early return and goroutine leak

Redowan Delowar September 7, 2025
At work, a common mistake I notice when reviewing candidates' home assignments is how they wire goroutines to channels and then return early. The pattern usually looks like this:

- start a few goroutines
- each goroutine sends a result to its own unbuffered channel
- in the main goroutine, read from those channels one by one
- if any read contains an error, return early

The trap is the early return. With an unbuffered channel, a send blocks until a receiver is ready. If you return before reading from the remaining channels, the goroutines writing to them block forever. That's a goroutine leak.

Here's how the bug appears in a tiny example: one worker intentionally fails, causing the main goroutine to bail early. That early return skips the receive from `ch2`, leaving the sender on `ch2` stuck.

One simple fix is to make sure you always read from both channels before you decide what to do. This guarantees that every send has a matching receive and no goroutine gets stuck.

This is safe, but it means you always wait for both workers even when the first one already failed and the second result is irrelevant.

If you want to return early without leaking, another option is to use buffered channels so the producers don't block on send. A buffer of size one is enough for this pattern.

Buffered channels remove the blocked send, but they also make it easier to forget that a second result exists at all. If that second value carries data you must process, you should still receive it. If it is truly fire-and-forget, buffering is fine.

Often the cleanest approach is to drop the channel plumbing when you only need to run tasks and aggregate errors. The [errgroup] package lets each goroutine return an error while the group does the waiting. There is nothing to forget to receive, so there is nothing to leak.

Sometimes you also want peers to stop once one task fails. `errgroup.WithContext` gives you a context that gets canceled as soon as any task returns an error.
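errgroup lives outside the standard library, so as a rough stdlib-only sketch of the same cancel-on-first-error idea (not errgroup's actual implementation), the hypothetical `runAll` helper below cancels a shared context when any task fails and waits for every task before returning. The names `runAll`, `failFast`, `slowWorker`, `demo`, and `errBoom` are invented for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// errBoom is a made-up sentinel error for the failing task.
var errBoom = errors.New("boom")

// runAll is a hypothetical helper, not errgroup's implementation.
// It runs every task with a shared context, cancels that context on the
// first error, waits for all tasks to finish, and returns the first error.
func runAll(parent context.Context, tasks ...func(context.Context) error) error {
	ctx, cancel := context.WithCancel(parent)
	defer cancel()

	var (
		wg       sync.WaitGroup
		once     sync.Once
		firstErr error
	)
	for _, task := range tasks {
		wg.Add(1)
		go func(task func(context.Context) error) {
			defer wg.Done()
			if err := task(ctx); err != nil {
				once.Do(func() {
					firstErr = err // record only the first failure
					cancel()       // signal the peers to stop
				})
			}
		}(task)
	}
	wg.Wait() // no task goroutine outlives this call
	return firstErr
}

// failFast fails immediately.
func failFast(ctx context.Context) error { return errBoom }

// slowWorker pretends to work for 10 seconds but exits as soon as the
// shared context is canceled.
func slowWorker(ctx context.Context) error {
	select {
	case <-time.After(10 * time.Second):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// demo reports whether runAll returned well before slowWorker's timer,
// along with the error it returned.
func demo() (fast bool, err error) {
	start := time.Now()
	err = runAll(context.Background(), failFast, slowWorker)
	return time.Since(start) < time.Second, err
}

func main() {
	fast, err := demo()
	fmt.Println("error:", err, "returned quickly:", fast)
}
```

In real code, `errgroup.WithContext` packages this bookkeeping for you; the sketch only shows why the slow task can stop early.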
You pass that context into your workers and have them check `ctx.Done()` so they can exit quickly.

At this point it is natural to ask if tools can catch the original bug for you.

`go vet` cannot. Vet is static analysis that runs at build time. Whether a send blocks depends on runtime control flow and timing. Vet cannot prove in general that the function returns before a particular receive, so it doesn't flag this pattern.

`go test -race` cannot either. The race detector detects unsynchronized concurrent memory access. A goroutine stuck on a channel send isn't a data race. You may see a test hang until timeout, but the tool won't point to a leaking goroutine.

You can turn this into a failing test with [goleak] from Uber. goleak fails if goroutines are still alive when a test ends. It snapshots all goroutines via the runtime, filters out the standard background ones, and reports the rest. Wire it into a test that triggers the early return and you will see the blocked sender's stack in the output.

A test that exercises the leaky implementation fails and prints the goroutine stack stuck in the send to `ch2`. If you switch the implementation to a fixed version, such as the draining fix, the test passes.

If you prefer suite-wide enforcement, add goleak to your `TestMain`. This way your entire test run fails if any test leaks goroutines.

If you start goroutines that send on channels, think carefully about early returns. An unbuffered send waits for a receive, and if you return before that receive happens, you've leaked a goroutine. You can avoid this by:

- always draining all channels
- buffering intentionally so sends don't block
- or using errgroup, with or without context, so tasks return errors and cooperate on cancellation

Add goleak to your tests so leaks surface early during development.

[errgroup]: pkg.go.dev/golang.org/x/sync/errgroup
[goleak]: github.com/uber-go/goleak
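As a stdlib-only illustration of the leak itself, the sketch below runs the early-return pattern with unbuffered and with size-one buffered channels and compares `runtime.NumGoroutine` before and after each call. This is a crude stand-in for the stack-based check goleak performs, not goleak's API; `fetchBoth` and `leaked` are hypothetical names, and the 100 ms settle time is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
	"time"
)

// fetchBoth starts two workers, each sending to its own channel with the
// given capacity, and returns as soon as the first result is an error.
// With buf == 0 the early return strands the sender on ch2.
func fetchBoth(buf int) error {
	ch1 := make(chan error, buf)
	ch2 := make(chan error, buf)

	go func() { ch1 <- errors.New("worker 1 failed") }() // fails on purpose
	go func() { ch2 <- nil }()

	if err := <-ch1; err != nil {
		return err // early return: ch2 is never received from
	}
	return <-ch2
}

// leaked reports how many more goroutines exist after f than before it.
// It sleeps briefly so goroutines that can finish get the chance to.
func leaked(f func()) int {
	before := runtime.NumGoroutine()
	f()
	time.Sleep(100 * time.Millisecond)
	return runtime.NumGoroutine() - before
}

func main() {
	fmt.Println("unbuffered leaked:", leaked(func() { _ = fetchBoth(0) }))
	fmt.Println("buffered leaked:", leaked(func() { _ = fetchBoth(1) }))
}
```

On a typical run the unbuffered call leaves one extra goroutine behind and the buffered call leaves none. A raw count cannot tell you which goroutine is stuck or where, which is the information goleak's stack output adds.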
