Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreieeya7f5hx3ifzcxr4zbyybpoazd477kdscv5vmji3eyqlsmo6nca",
    "uri": "at://did:plc:25rdn5elo5izoxrmtis34zuk/app.bsky.feed.post/3moj2wmgvluo2"
  },
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreiduwmmcomsjfncolsivbr3spknzhjrs7xsft5zkj5m6spnahtxlga"
    },
    "mimeType": "image/webp",
    "size": 179876
  },
  "path": "/wilfridterry/200-ok-is-not-the-same-as-it-works-14cd",
  "publishedAt": "2026-06-17T19:19:33.000Z",
  "site": "https://dev.to",
  "tags": [
    "javascript",
    "monitoring",
    "testing",
    "webdev",
    "NorthDuty",
    "free to use right now"
  ],
  "textContent": "A few months ago a team I know shipped a routine Friday deploy. Every monitor stayed green all weekend. On Monday they discovered the signup form had been throwing a JavaScript error since Friday afternoon. The server was returning `200 OK` the whole time. The page loaded. The HTML was valid. And not a single person could create an account for three days.\n\nNobody filed a bug. Customers don't file bugs. They hit a wall and leave.\n\nThis is the uncomfortable truth about most monitoring setups: **they answer \"is the server responding?\" when the question that actually matters is \"can a customer do the thing they came to do?\"** Those are not the same question, and the gap between them is where revenue quietly leaks out.\n\n##  Why `200 OK` lies\n\nA classic uptime check does roughly this:\n\n\n\n    curl -s -o /dev/null -w \"%{http_code}\" https://yoursite.com/\n    # 200\n\n\nGreen. Ship it. But `200` only tells you the origin returned _something_. It says nothing about whether that something is usable. All of the following return `200` while being completely broken for a real human:\n\n  * A page that renders blank because a JS bundle 404'd and the framework never hydrated.\n  * A checkout button that disappeared after a CSS refactor changed a class name.\n  * A login form that submits to an endpoint now returning `500`, but the page itself loads fine.\n  * A third-party script (payments, analytics, a chat widget) that fails and takes the rest of the page down with it.\n  * A layout that \"works\" but pushes the CTA below a broken hero image, so conversions crater.\n\n\n\nThe server is healthy. The _experience_ is dead. And the longer your front end leans on client-side rendering, third-party scripts, and multi-step flows, the wider this gap gets.\n\n##  Three layers, not one\n\nClosing the gap means monitoring at three levels, each catching a class of failure the others miss.\n\n###  1. 
Health — but the deep kind\n\nPinging a URL is table stakes. A genuinely useful health check on a single request should also surface:\n\n  * **HTTP, SSL, DNS, redirects** — the boring stuff that still takes you down at 2 a.m. when a cert expires.\n  * **Blank-page / empty-render detection** — did the DOM actually paint meaningful content, or did you ship an empty `<div id=\"app\">`?\n  * **Broken resources** — any sub-resource (JS, CSS, images, fonts) that failed to load.\n  * **Console JavaScript errors** — the silent killers, since a thrown error can break interactivity without changing the status code.\n  * **First-party API calls** — did the XHR/fetch calls the page depends on actually succeed?\n  * **Core Web Vitals, security headers, basic a11y and SEO** — slower-moving signals, but cheap to grab in the same pass.\n\n\n\nThe key shift: stop treating \"responded\" as \"healthy.\" Healthy means _rendered and interactive_.\n\n###  2. Visual regression — catch what you can't assert\n\nSome breakage has no clean assertion. A button moved. The hero image is 404ing so the layout collapsed. A font swap pushed everything 40px down. You can't easily `expect()` your way to \"the page looks right.\"\n\nSo you do what humans do — you look. Programmatically:\n\n  1. Capture a screenshot on a schedule (daily/weekly for stable pages).\n  2. Diff it pixel-by-pixel against the previous baseline.\n  3. Surface the changed percentage and the diff image so a human can glance and decide: intended change, or regression?\n\n\n\nThis is the same idea behind tools like Percy or BackstopJS, applied continuously to production rather than only in CI. A 2% diff after a deploy you didn't ship is a great early-warning signal.\n\n###  3. Journey monitoring — test the verbs\n\nHealth checks test nouns (the page). Journeys test verbs (the actions). 
This is where the real money lives:\n\n  * **Search → add to cart → checkout** for ecommerce.\n  * **Signup → verify → onboard** for SaaS.\n  * **Login → load dashboard → key action** for everything.\n\n\n\nA journey monitor drives a real (headless) browser through these steps on a schedule and reports failure _at the step level_ — so you don't just learn \"checkout is broken,\" you learn \"step 4, clicking 'Place order,' timed out.\" Historically this meant maintaining brittle Playwright/Cypress scripts that break every time a selector changes. The newer approach is to describe the flow in plain language and let the tooling resolve the steps, which dramatically lowers the maintenance cost that kills most synthetic-monitoring efforts.\n\n##  Where this lands in practice\n\nYou can absolutely assemble this yourself: a cron'd headless-Chrome script for health, BackstopJS for visual diffs, Playwright for journeys, and something to route alerts. I've gone down that road; the wiring and the _upkeep_ are the expensive parts. Selectors rot, baselines drift, and the alerting glue becomes its own side project.\n\nThe other option is a tool that bundles the three layers. NorthDuty is one I looked at recently that's built squarely around this \"up but broken\" thesis — it runs health checks (every 5 minutes by default), screenshot-based visual diffs, and user-journey checks on the same project, and notably lets you define journeys as plain text instead of scripts, plus AI-suggests a handful of likely happy-path flows per site. It's free to use right now, so it's easy to point it at a site and see what your current monitoring has been missing. There are others in the adjacent space (Checkly leans script-first and developer-heavy, Better Stack and Pingdom lean uptime-first, Visualping is visual-only). 
The point isn't the brand — it's that you should be covering all three layers, however you get there.\n\n##  A pragmatic starting point\n\nIf you want to close the biggest part of the gap with the least effort, in order:\n\n  1. **Upgrade your health check** to detect blank renders, console errors, and failed sub-resources — not just status codes. This alone catches a surprising share of \"green but broken\" incidents.\n  2. **Add one journey** for your single most revenue-critical flow (checkout or signup). One good journey beats ten URL pings.\n  3. **Add visual diffs** on your 3–5 highest-traffic, rarely-changing pages, where an unexpected diff is almost always a regression.\n  4. **Set thresholds, not just on/off** — alert on response time, health score, SSL expiry, and journey failure, and route them somewhere your team already reads (Slack/Discord/Teams), with maintenance windows to mute planned-work noise.\n\n\n\n##  The takeaway\n\n`200 OK` is a promise from your server, not from your product. The deploys that hurt most are rarely the ones that take the site _down_ — they're the ones that leave it _up and quietly broken_, where every dashboard is green and your customers are the only ones who know the truth.\n\nMonitor the experience, not just the endpoint.\n\n_How does your team catch \"up but broken\" today — custom scripts, a hosted tool, or do you find out from support tickets? Curious what's actually working for people._",
  "title": "200 OK Is Not the Same as \"It Works\""
}