{
  "$type": "site.standard.document",
  "canonicalUrl": "https://rednafi.com/go/splintered-failure-modes/",
  "description": "Simplify Go error handling by consolidating validation and system errors. Learn when to return boolean vs error for clearer failure modes.",
  "path": "/go/splintered-failure-modes/",
  "publishedAt": "2025-11-30T00:00:00.000Z",
  "site": "at://did:plc:fgtm2c26vfcj74rfmeggbyqj/site.standard.publication/3mnl6f7ob462z",
  "tags": [
    "Go",
    "Error Handling"
  ],
  "textContent": "> A man with a watch knows what time it is. A man with two watches is never sure.\n>\n> -- [Segal's Law]\n\nTake this example:\n\nThis function returns two signals: a boolean to indicate if the string is valid, and an\nerror to explain any problem the function might run into.\n\nThe issue is that these two signals are independent. Put together, they produce four\npossible combinations:\n\n1.  true, nil: The input is valid and the function encountered no issues. This is the only\n    obvious mode.\n2.  false, nil: Implies the function didn't hit a system error but the input was invalid.\n    However, in many codebases, this combination is accidentally used to hide real errors\n    that were swallowed.\n3.  true, err: A contradiction. The function claims success and failure at the same time.\n4.  false, err: Looks like a clean failure, but it creates a priority trap. The Go\n    convention dictates you must check the error first. If a caller checks the boolean\n    first, they might see false and treat a major system crash as a simple validation\n    failure.\n\nIn this specific case, we never return true, err, but the caller doesn't know that. They\nhave to read the code to understand which subset of the possible combinations the function\nactually uses.\n\nSplintered failure modes\n\nFor lack of a better term, I call this splintered failure modes. It is one of the cases\nthat the adage _[make illegal state unrepresentable]_ aims to prevent.\n\nIn our case, validate encodes the success/failure state in _two_ places. These two signals\ncan disagree. 
The boolean tries to express validity, and the error tries to express system\nfailure, yet both attempt to answer the same question: _did this succeed?_\n\nWhen combinations like false, nil or true, err appear, the caller needs to know how to\nreconcile the conflicting states.\n\nRepresent failure modes exclusively via the error\n\nWe fix the ambiguity by removing the boolean status flag entirely.\n\nIn this refactored version, the error assumes total responsibility for the function's\nstate (success vs. failure). The first return value becomes purely the payload.\n\nThe caller checks one place and one place only: the error.\n\nThis makes the call site trivial because the state is no longer split. If the error is\nnon-nil, the operation failed. If it is nil, the operation succeeded.\n\nDistinguishing failure types within the error\n\nSometimes the caller of a function needs to take different actions depending on the type of\nan error. In that case, just knowing whether a function succeeded or failed isn't enough.\n\nRemoving the boolean removes the ambiguity, but it introduces a new question: _How do we\ndistinguish between \"validation error\" and \"system failure\"?_\n\nPreviously, the boolean represented validation outcome (valid/invalid), and the error\nrepresented the system failures (crash/upstream). Now that we have consolidated everything\ninto error, we need a way to differentiate the _kind_ of failure without re-introducing a\nsecond return value.\n\nSentinel errors\n\nWe can use [sentinel errors] to encode multiple failure modes into one error variable. The\nerror return value remains the single source of truth for \"did it fail?\", but the\n_content_ of that error tells us \"how it failed.\"\n\nWe have unified the failure state (it is always just an error), but we haven't lost the\ngranularity. 
The caller can now use errors.Is to switch between the failure modes:\n\nError types\n\nIf sentinels aren't enough (for example, if you need to know _which_ field failed\nvalidation), you can use [error types]. This allows the single error value to carry\nstructured metadata while still adhering to the standard error interface.\n\nHere, we map both \"Empty\" and \"Corrupted\" to a ValidationError type, while leaving system\nerrors as standard errors.\n\nThe caller can then use errors.As to inspect the failure mode in detail:\n\nBy sticking to the error value as the single indicator of failure, we eliminate the _\"two\nwatches\"_ paradox. Whether the failure is a simple validation error or a catastrophic system\ncrash, all the failure modes are encapsulated inside the single error value itself.\n\n\n\n\n\n[segal's law]:\n    https://en.wikipedia.org/wiki/Segal%27s_law\n\n[make illegal state unrepresentable]:\n    https://khalilstemmler.com/articles/typescript-domain-driven-design/make-illegal-states-unrepresentable/\n\n[sentinel errors]:\n    https://dave.cheney.net/2016/04/27/dont-just-check-errors-handle-them-gracefully#:~:text=three%20core%20strategies.-,Sentinel%20errors,-The%20first%20category\n\n[error types]:\n    https://dave.cheney.net/2016/04/27/dont-just-check-errors-handle-them-gracefully#:~:text=will%20discuss%20next.-,Error%20types,-Error%20types%20are",
  "title": "Splintered failure modes in Go"
}