{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreiehbmhsbvo6s5cjx73mpekm6qsiy5akssmnm5dgc5xyopjhhionlq",
"uri": "at://did:plc:25rdn5elo5izoxrmtis34zuk/app.bsky.feed.post/3mohlvu7jqud2"
},
"coverImage": {
"$type": "blob",
"ref": {
"$link": "bafkreihiiyfeaga2gn2xim5rhdpjjqrzawolf2w7hyrhl436xs6b7nayaq"
},
"mimeType": "image/webp",
"size": 82112
},
"path": "/bluewhale-quant-lab/how-to-benchmark-api-latency-to-any-endpoint-polymarket-case-study-512o",
"publishedAt": "2026-06-17T04:51:36.000Z",
"site": "https://dev.to",
"tags": [
"python",
"performance",
"networking",
"devops"
],
"textContent": "\"Just ping it\" is bad latency advice. ICMP is often deprioritized behind CDNs and tells you almost nothing about real request latency. This is how to benchmark API latency _properly_, with a real case study: finding where Polymarket's order book lives.\n\n## Why `ping` lies\n\n`ping` measures ICMP echo round-trip time. But:\n\n * CDNs and load balancers often **rate-limit or deprioritize ICMP**, so the number is noisy or misleadingly high or low.\n * It ignores **TLS handshake** cost, which dominates short HTTPS requests.\n * It tells you nothing about **server processing time** (TTFB).\n\nFor an API, measure what the API actually does: TCP connect, TLS, and time-to-first-byte.\n\n## A proper latency harness (Python)\n\n    import json, math, socket, ssl, time, statistics, http.client\n\n    def percentiles(xs):\n        xs = sorted(xs); n = len(xs)\n        # Nearest-rank percentile: round the rank up so small samples don't understate the tail.\n        rank = lambda p: xs[math.ceil(n * p) - 1]\n        return {\n            \"min\": round(xs[0], 2),\n            \"p50\": round(statistics.median(xs), 2),\n            \"p95\": round(rank(0.95), 2),\n            \"p99\": round(rank(0.99), 2),\n            \"max\": round(xs[-1], 2),\n        }\n\n    def tcp_connect_ms(host, port=443):\n        # Timing includes the DNS lookup done by create_connection, not just the TCP handshake.\n        t = time.perf_counter()\n        s = socket.create_connection((host, port), timeout=5); s.close()\n        return (time.perf_counter() - t) * 1000\n\n    def ttfb_ms(host, path=\"/\"):\n        # Fresh connection per call: TCP connect + TLS handshake + request + first response byte.\n        t = time.perf_counter()\n        c = http.client.HTTPSConnection(host, 443, timeout=5,\n                                        context=ssl.create_default_context())\n        c.request(\"GET\", path); r = c.getresponse(); r.read(1); c.close()\n        return (time.perf_counter() - t) * 1000\n\n    def bench(host, n=200):\n        return {\n            \"tcp_connect\": percentiles([tcp_connect_ms(host) for _ in range(n)]),\n            \"ttfb\": percentiles([ttfb_ms(host) for _ in range(n)]),\n        }\n\n    print(json.dumps(bench(\"clob.polymarket.com\"), indent=2))\n\n## Read p99 and jitter, not just the average\n\nThe **average** is marketing. What kills a trading bot is the **p99**: the tail latency you hit during the volatile windows you actually trade in. Always report `min / p50 / p95 / p99 / max`. 
A box with p50=1.2 ms but p99=12 ms is worse than a box that holds a steady p50=3 ms.\n\nCheck jitter over time, too (the ICMP caveats above still apply, so treat this as a rough signal):\n\n    ping -i 0.5 -c 600 clob.polymarket.com | tail -3  # watch min/avg/max/mdev spread\n\n## The case study: where is Polymarket's CLOB?\n\nI ran the harness from VPS boxes in four regions:\n\nRegion | TCP connect p50 | TTFB p50\n---|---|---\n**Amsterdam** | **~1.4 ms** | **~6 ms**\nFrankfurt | ~9 ms | ~16 ms\nUS-East | ~90 ms | ~110 ms\nSingapore | ~168 ms | ~195 ms\n\nA ~1.4 ms TCP connect is only possible within roughly 100 km: light in fiber covers ~200 km per ms one way, so each ms of round trip buys at most ~100 km of distance, and routing overhead eats into that. So the endpoint answering the connection is in the **Amsterdam** metro, bounded by physics, not vibes. (Whether the matching engine itself is co-located or sits behind an edge is an inference, though the low TTFB makes co-location plausible; the hosting decision is the same either way.)\n\n## Turning the benchmark into a decision\n\nThe whole point of benchmarking is to _act_ on it. For Polymarket, the data says: host in Amsterdam. I moved my bot to an AMS-metro VPS and the connect time went from ~90 ms to ~1.2 ms. The box I use: **the Amsterdam box I use**\n_Disclosure: affiliate link. I earn a referral. The numbers above are from this box._\n\n## Reusable checklist\n\n * ✅ Measure **TCP connect + TTFB**, not just ICMP.\n * ✅ Report **percentiles**, especially p99.\n * ✅ Test **jitter** over minutes, at different times of day.\n * ✅ Compare **multiple regions** with hourly-billed VPS boxes.\n * ✅ Convert sub-2 ms numbers into \"same metro\" conclusions via the fiber speed limit.\n\nThis harness works for any endpoint: exchanges, RPCs, your own APIs. Polymarket just happens to have a satisfying answer: Amsterdam.\n\n_Numbers from my own 2026 tests. Not financial advice._",
"title": "How to Benchmark API Latency to Any Endpoint (Polymarket Case Study)"
}