{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreia2qzapj2nzlymeqlk2c6oj3enhuwgysnuatproes6giif7zoy5qe",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mgrbayou72a2"
  },
  "path": "/t/technical-blog-post-streaming-algorithms-and-numerical-stability-in-ml-systems/174162#post_1",
  "publishedAt": "2026-03-11T01:42:02.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "huggingface.co",
    "FlashAttention, Streaming Algorithms, and Numerical Stability in Modern ML..."
  ],
  "textContent": "Hey everyone — I published a technical blog post today on streaming algorithms and numerical stability in ML systems, using FlashAttention as the main example:\n\n**What it covers:**\nFlashAttention’s tiled computation avoids materializing the full attention matrix — but doing so requires maintaining numerically stable running statistics (running max, normalization constant, output accumulator). This turns out to be the same design constraint behind stable softmax, log-sum-exp, and Welford’s variance algorithm.\n\nThe post traces that common pattern and includes two small experiments:\n\n  * Variance: four mathematically equivalent formulas that produce wildly different results under float32 (including one that returns -65,542 when the correct answer is ~1)\n  * Softmax: naive vs. subtract-max, showing overflow propagation to NaN\n\n\n\n**Why I wrote it:**\nA lot of the “implementation details” in ML infrastructure aren’t really details — they’re load-bearing. I wanted to write something that made that concrete rather than just asserting it.\n\nWould love feedback, especially:\n\n  * Are there other examples of this pattern I should have included?\n  * Anything in the numerical stability section that could be sharper?\n\n\n\nto post:\n\nhuggingface.co\n\n### FlashAttention, Streaming Algorithms, and Numerical Stability in Modern ML...\n\nA Blog post by Jen Wei on Hugging Face\n\n, Jen",
  "title": "Technical blog post -- streaming algorithms and numerical stability in ML systems"
}