{
"$type": "site.standard.document",
"canonicalUrl": "https://rednafi.com/go/circuit-breaker/",
"description": "Build a production-ready circuit breaker in Go from scratch with closed, open, and half-open states to prevent cascading failures.",
"path": "/go/circuit-breaker/",
"publishedAt": "2024-10-06T00:00:00.000Z",
"site": "at://did:plc:fgtm2c26vfcj74rfmeggbyqj/site.standard.publication/3mnl6f7ob462z",
"tags": [
"Networking",
"Go",
"Design Patterns"
],
"textContent": "Besides retries, [circuit breakers] are probably one of the most commonly employed\nresilience patterns in distributed systems. While writing a retry routine is pretty simple,\nimplementing a circuit breaker needs a little bit of work.\n\nI realized that I usually just go for off-the-shelf libraries for circuit breaking and\nhaven't written one from scratch before. So, this is an attempt to create a sloppy one in\nGo. I picked Go instead of Python because I didn't want to deal with sync-async\nidiosyncrasies or abstract things away under a soup of decorators.\n\nCircuit breakers\n\nA circuit breaker acts like an automatic switch that prevents your application from\nrepeatedly trying to execute an operation that's likely to fail. In a distributed system,\nyou don't want to bombard a remote service when it's already failing, and circuit breakers\nprevent that.\n\nIt has three states: Closed, Open, and Half-Open. Here's a diagram that shows\nthe state transitions:\n\n\n\n{{< mermaid >}}\nstateDiagram-v2\n [] --> Closed: Start\n Closed --> Open: Failure threshold reached\n Open --> HalfOpen: Recovery period expired\n HalfOpen --> Closed: Success threshold reached\n HalfOpen --> Open: Request failed\n\n note right of Closed: All requests are allowed\n note right of Open: Requests are blocked\n note right of HalfOpen: Limited requests allowed to check recovery\n{{</ mermaid >}}\n\n\n\n1. Closed: This is the healthy operating state where all requests are allowed to pass\n through to the service. If a certain number of consecutive requests fail (reaching a\n failure threshold), the circuit breaker switches to the Open state.\n\n2. Open: In this state, all requests are immediately blocked, and an error is returned\n to the caller without attempting to contact the failing service. This prevents\n overwhelming the service and gives it time to recover. After a predefined recovery\n period, the circuit breaker transitions to the Half-Open state.\n\n3. 
Half-Open: The circuit breaker allows a limited number of test requests to see if the\n service has recovered. If these requests succeed, it transitions back to the Closed\n state. If any of them fail, it goes back to the Open state.\n\nBuilding one in Go\n\nHere's a simple circuit breaker in Go.\n\nDefining states\n\nFirst, we'll define the constants for our states and create the circuitBreaker struct,\nwhich holds all the configurable knobs.\n\nThis struct includes:\n\n- mu: A mutex to ensure thread-safe access to the circuit breaker.\n- state: The current state of the circuit breaker (Closed, Open, or HalfOpen).\n- failureCount: The current count of consecutive failures.\n- lastFailureTime: The timestamp of the last failure.\n- halfOpenSuccessCount: The number of successful requests in the HalfOpen state.\n- failureThreshold: The number of consecutive failures that trips the circuit open.\n- recoveryTime: The cool-down period before the circuit breaker transitions from Open to\n HalfOpen.\n- halfOpenMaxRequests: The number of successful requests required in the HalfOpen state\n to close the circuit.\n- timeout: The maximum duration to wait for a request to complete.\n\nInitializing the breaker\n\nNext, we provide a constructor function to initialize a new circuitBreaker instance.\n\nThis function sets the initial state to Closed and initializes the thresholds and timeout.\n\nImplementing the Call method\n\nThe Call method is the primary interface for executing functions through the circuit\nbreaker. It dispatches to the appropriate state handler based on the current state.\n\nWe use a mutex to protect against concurrent access since the circuit breaker might be used\nby multiple goroutines. Inside Call, a switch statement delegates the function call to the\nhandler for the current state.\n\nHandling closed states\n\nIn the Closed state, all requests are allowed to pass through. 
We monitor the requests for\nfailures to decide when to trip the circuit breaker.\n\nIn this function:\n\n- We attempt to execute the provided function fn using runWithTimeout to handle possible\n timeouts.\n\n- If the function call fails, we increment the failureCount and update lastFailureTime.\n- If the failureCount reaches the failureThreshold, we transition the circuit to the\n Open state.\n- If the function call succeeds, we reset the circuit breaker to the Closed state by\n calling resetCircuit.\n\nResetting the breaker\n\nWhen a request succeeds, we reset the failure count and keep the circuit in the Closed\nstate.\n\nHandling open states\n\nIn the Open state, all requests are blocked to prevent further strain on the failing\nservice. We check if the recovery period has expired before transitioning to the HalfOpen\nstate.\n\nHere:\n\n- We check if the recovery period (recoveryTime) has passed since the last failure.\n- If it has, we transition to the HalfOpen state and reset the counters.\n- If not, we block the request and return an error immediately.\n\nHandling half-open states\n\nIn the HalfOpen state, we allow a limited number of requests to test if the service has\nrecovered.\n\nIn this function:\n\n- We attempt to execute the provided function fn.\n- If the function call fails, we transition back to the Open state.\n- If the function call succeeds, we increment halfOpenSuccessCount.\n- Once the success count reaches halfOpenMaxRequests, we reset the circuit breaker to the\n Closed state.\n\nRunning functions with timeout\n\nTo prevent the circuit breaker from hanging on slow or unresponsive functions, we implement\na timeout mechanism. 
You probably noticed that inside each state handler we called the\nwrapped functions with runWithTimeout.\n\nThis function:\n\n- Creates a context with a timeout using context.WithTimeout.\n- Executes the provided function fn in a separate goroutine.\n- Waits for either the result or the timeout.\n- Returns an error if the function takes longer than the specified timeout.\n\nTaking it for a spin\n\nLet's test our circuit breaker with an unreliable service that sometimes fails.\n\nIn the main function, we'll create a circuit breaker and make several calls to the\nunreliable service.\n\nThis loop simulates multiple service calls, using the circuit breaker to handle failures and\ntransitions between states.\n\nThis prints:\n\nThe log messages will give you a sense of what's happening when we retry an intermittently\nfailing function wrapped in a circuit breaker.\n\nThe API could be better\n\nOne limitation of Go generics, at least at the time of writing, is that a method can't\ndeclare type parameters of its own; it can only use the ones declared on its receiver type.\nThis means you can't define a method like\nfunc (cb *circuitBreaker) Call[T any](fn func() (T, error)) (T, error).\n\nMaking the struct itself generic, as in circuitBreaker[T], would compile, but it would tie\neach breaker instance to a single return type. To keep one breaker usable with functions\nthat return different types, we use any (an alias for interface{}) as the return type in\nour function signatures. While this sacrifices some type safety, it allows us\nto create a flexible circuit breaker that can handle functions returning different types.\n\nHandling incompatible function signatures\n\nWhat if the function you want to wrap doesn't match the func() (any, error) signature? 
You\ncan easily adapt it by wrapping your function to fit the required signature.\n\nSuppose you have a function like this:\n\nYou can wrap it like this:\n\nNow, wrappedFunc matches the func() (any, error) signature and can be used with our\ncircuit breaker.\n\nHere's the [complete implementation on GitHub] with tests.\n\n\n\n\n\n[circuit breakers]:\n https://martinfowler.com/bliki/CircuitBreaker.html\n\n[complete implementation on GitHub]:\n https://github.com/rednafi/circuit-breaker",
"title": "Writing a circuit breaker in Go"
}