{
"$type": "site.standard.document",
"canonicalUrl": "https://rednafi.com/misc/etag-and-http-caching/",
"description": "Implement client-side HTTP caching with ETag headers. Learn If-None-Match, 304 Not Modified responses, and weak validation in Go servers.",
"path": "/misc/etag-and-http-caching/",
"publishedAt": "2024-04-10T00:00:00.000Z",
"site": "at://did:plc:fgtm2c26vfcj74rfmeggbyqj/site.standard.publication/3mnl6f7ob462z",
"tags": [
"API",
"Go",
"Web"
],
"textContent": "One neat use case for the HTTP ETag header is client-side HTTP caching for GET requests.\nAlong with the ETag header, the caching workflow requires you to fiddle with other\nconditional HTTP headers like If-Match or If-None-Match. However, their interaction can\nfeel a bit confusing at times.\n\nEvery time I need to tackle this, I end up spending some time browsing through the relevant\nMDN docs on [ETag], [If-Match], and [If-None-Match] to jog my memory. At this point, I've\ndone it enough times to justify spending the time to write this.\n\nCaching the response of a GET endpoint\n\nThe basic workflow goes as follows:\n\n- The client makes a GET request to the server.\n- The server responds with a 200 OK status, including the content requested and an ETag\n header.\n- The client caches the response and the ETag value.\n- For subsequent requests to the same resource, the client includes the If-None-Match\n header with the ETag value it has cached.\n- The server regenerates the ETag independently and checks if the ETag value sent by the\n client matches the generated one.\n - If they match, the server responds with a 304 Not Modified status, indicating that\n the client's cached version is still valid, and the client serves the resource from\n the cache.\n - If they don't match, the server responds with a 200 OK status, including the new\n content and a new ETag header, prompting the client to update its cache.\n\nWe can test this workflow with GitHub's REST API suite via the [GitHub CLI]. If you've\ninstalled the CLI and authenticated yourself, you can make a request like this:\n\nThis asks for the data associated with the user rednafi. The response looks as follows:\n\nI've truncated the response body and omitted the headers that aren't relevant to this\ndiscussion. 
You can see that the HTTP status code is 200 OK and the server has included an\nETag header.\n\nThe W/ prefix indicates that a [weak validator] is used to validate the content of the\ncache. A weak validator asserts that two responses are semantically equivalent, not that\nthey are identical byte for byte. So, if your response is JSON, then changing\nthe format of the JSON need not change the value of the ETag header since two JSON payloads\nwith the same content but with different formatting are semantically the same thing.\n\nLet's see what happens if we make the same request again while passing the value of the\nETag in the If-None-Match header.\n\nThis returns:\n\nThis means that the cached response in the client is still valid and it doesn't need to\nrefetch that from the server. So, the client can be coded to serve the previously cached\ndata to users whenever they ask for it.\n\nA few key points to keep in mind:\n\n- Always wrap your ETag values in double quotes when sending them with the If-None-Match\n header, just as the [spec says for conditional header values].\n\n- Using the If-None-Match header to pass the ETag value means that the client request is\n considered successful when the ETag value from the client doesn't match that of the\n server. When the values match, the server will return 304 Not Modified with no body.\n\n- If we're writing a compliant server, when it comes to If-None-Match, the [spec tells us\n to use weak comparison for ETags]. This means that the client will still be able to\n validate the cache with weak ETags, even if there have been slight changes to the\n representation of the data.\n\n- If the client is a browser, it'll automatically manage the cache and send conditional\n requests without any extra work.\n\nWriting a server that enables client-side caching\n\nIf you're serving static content, you can configure your load balancer to enable this\ncaching workflow. 
But for dynamic GET requests, the server needs to do a bit more work to\nallow client-side caching.\n\nHere's a simple server in Go that enables the above workflow for a dynamic GET request:\n\n- The server generates a weak ETag for its content by creating a SHA-256 hash and adding\n W/ to the front, indicating it's meant for weak comparison.\n\n You could make the calculateETag function format-agnostic, so the hash stays the same\n if the JSON format changes but the content does not. The current calculateETag\n implementation is susceptible to format changes, and I kept it that way to keep the code\n shorter.\n\n- When delivering content, the server includes this weak ETag in the response headers,\n allowing clients to cache the content along with the ETag.\n\n- For subsequent requests, the server checks if the client has sent an ETag in the\n If-None-Match header and weakly compares it with the current content's ETag by\n independently generating the hash.\n\n- If the ETags match, indicating no significant content change, the server replies with a\n 304 Not Modified status. Otherwise, it sends the content again with a 200 OK status\n and updates the ETag. 
On a 304 Not Modified response, the client knows that the existing cache is\n still warm and can be served without any changes to it.\n\nYou can spin up the server by running go run main.go and, from a different console, start\nmaking requests to it like this:\n\nThis will return the ETag header along with the JSON response:\n\nNow, you can make another request with the value of the ETag in the If-None-Match\nheader:\n\nThis will return a 304 Not Modified response with no body:\n\nIn a real-life scenario, you'll probably factor out the caching part into middleware so that\nall of your HTTP GET requests can be cached on the client side without repeating the logic.\n\nOne thing to look out for\n\nWhile writing a cache-enabled server, make sure the system is set up so that the server\nalways sends back the same ETag for the same content, even when there are multiple servers\nworking behind a load balancer. If these servers give out different ETags for the same\ncontent, it can mess up how clients cache that content.\n\nClients use ETags to decide if content has changed. If the ETag value hasn't changed, they\nknow the content is the same and don't download it again, saving bandwidth and speeding up\naccess. 
But if ETags are inconsistent across servers, clients might download content they\nalready have, wasting bandwidth and slowing things down.\n\nThis inconsistency also means servers end up dealing with more requests for content that\nclients could have just used from their cache if ETags were consistent.\n\n\n\n\n[etag]:\n https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/ETag\n\n[if-match]:\n https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/If-Match\n\n[if-none-match]:\n https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/If-None-Match\n\n[github cli]:\n https://cli.github.com/\n\n[weak validator]:\n https://developer.mozilla.org/en-US/docs/Web/HTTP/Conditional_requests#weak_validation\n\n[spec says for conditional header values]:\n https://www.rfc-editor.org/rfc/rfc7232#section-3.2\n\n[spec tells us to use weak comparison for ETags]:\n https://www.rfc-editor.org/rfc/rfc7232#section-2.3.2",
"title": "ETag and HTTP caching"
}