External Publication
Visit Post

Attention Is All We Had — But Not What We Needed: Language generation without attention via iterative energy-based state refinement

Hugging Face Forums [Unofficial] May 29, 2026
Source

New Version

Zenodo

Attention Is All We Had — But Not What We Needed: Convergent State Machine...

We introduce the Convergent State Machine (CSM), a novel architecture for language generation that replaces attention entirely with energy-based iterative state refinement. Three models trained (66M, 150M, 331M), all with zero attention layers. CSM...

Discussion in the ATmosphere

Loading comments...