Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreigna3xsu7n6ovf6dbfh5fv6romkkespjcje4q64xop3lzoduwkvnq",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mf6h2vgqzjn2"
  },
  "path": "/t/wave-field-llm-o-n-log-n-attention-via-wave-equation-dynamics-within-5-of-standard-transformer/173625#post_1",
  "publishedAt": "2026-02-18T18:58:52.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "GitHub - badaramoni/wave-field-llm: An O(n log n) language model architecture using wave equation dynamics instead of O(n²) self-attention. Within 5% of standard transformer quality."
  ],
  "textContent": "I’ve been working on an alternative attention mechanism that treats language as a physical field system instead of using standard O(n²) self-attention.\n\n**How it works:**\n\n  * Tokens are mapped onto a continuous 1D field\n  * Information propagates via a damped wave kernel: k(t) = exp(-α·t)·cos(ω·t + φ)\n  * Each attention head has just 3 learnable physics parameters (frequency ω, damping α, phase φ)\n  * The convolution is computed via FFT in O(n log n)\n  * Heads self-organize into different roles (local grammar, medium context, long-range)\n\n**Results (WikiText-2, 6M params, character tokenizer):**\n\nModel | PPL | Accuracy | Complexity\n---|---|---|---\nStandard Transformer | 5.9 | 51.0% | O(n²)\nWave Field V3.5 | 6.2 | 50.5% | O(n log n)\n\nAt longer sequences the savings grow: 31x at 2K tokens, 107x at 8K, 367x at 32K.\n\n**Known limitations:**\n\n  * With a BPE tokenizer (8K vocab), there’s a significant capacity gap vs. the standard transformer\n  * This is a model capacity issue at small scale, not an architecture flaw\n  * Currently scaling to 100M params to see if the gap closes\n\n**What’s unique:**\n\n  * Every bug during development was found through physics-based diagnostics (energy flow, conservation, causality tests), not guessing\n  * Cross-head field coupling and wave interference for information routing\n  * Not a Mamba/Hyena variant; a different approach entirely\n\nCode: GitHub - badaramoni/wave-field-llm\n\nHappy to answer questions about the physics, architecture decisions, or results.",
  "title": "Wave Field LLM — O(n log n) attention via wave equation dynamics, within 5% of standard transformer"
}
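
The kernel and FFT step described in the post's `textContent` can be illustrated with a minimal NumPy sketch. This is not the author's implementation: the function names (`wave_kernel`, `causal_fft_conv`, `causal_direct_conv`) and the parameter values are hypothetical, only a single head is modeled, and the cross-head coupling mentioned in the post is left out. It assumes the kernel is sampled at integer token offsets and applied as a causal convolution, and it compares the O(n log n) FFT path against a direct O(n²) sum.

```python
import numpy as np


def wave_kernel(n, alpha, omega, phi):
    """Sample k(t) = exp(-alpha*t) * cos(omega*t + phi) at t = 0..n-1."""
    t = np.arange(n, dtype=np.float64)
    return np.exp(-alpha * t) * np.cos(omega * t + phi)


def causal_fft_conv(x, k):
    """Causal convolution y[i] = sum_{j<=i} k[i-j] * x[j] along axis 0, via FFT.

    x has shape (n, d): n token positions, d feature channels.
    k has shape (n,): one kernel shared by all channels (one hypothetical head).
    """
    n = x.shape[0]
    size = 2 * n  # zero-pad so the circular FFT product does not wrap around
    x_f = np.fft.rfft(x, n=size, axis=0)
    k_f = np.fft.rfft(k, n=size)
    y = np.fft.irfft(x_f * k_f[:, None], n=size, axis=0)
    return y[:n]


def causal_direct_conv(x, k):
    """Reference O(n^2) implementation of the same causal convolution."""
    n = x.shape[0]
    y = np.zeros_like(x)
    for i in range(n):
        for j in range(i + 1):
            y[i] += k[i - j] * x[j]
    return y


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 64, 4
    x = rng.standard_normal((n, d))
    # Illustrative parameter values only; in the post these are learned per head.
    k = wave_kernel(n, alpha=0.1, omega=0.5, phi=0.0)

    y_fft = causal_fft_conv(x, k)
    y_ref = causal_direct_conv(x, k)
    print("max abs difference:", np.max(np.abs(y_fft - y_ref)))

    # Causality check: changing a later token must not change earlier outputs.
    x2 = x.copy()
    x2[40] += 1.0
    y2 = causal_fft_conv(x2, k)
    print("earlier outputs unchanged:", np.allclose(y_fft[:40], y2[:40]))
```

The zero-padding to length 2n is what keeps the FFT product equivalent to a linear (non-circular) convolution. The perturbation check at the end is one simple form a causality test could take, in the spirit of the diagnostics the post mentions.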