Anti-LLM Sentiment Considered Harmful
Haskell Community [Unofficial]
May 16, 2026
swamp-agr:
> “temperature” as a parameter to tune reproducible responses
Lowering the temperature doesn’t make the output completely reproducible, but it makes the program more likely to pick the higher-probability next tokens. The distribution becomes more “peaked”. Say you have some possible next tokens, given the previous ones, where
* P(A) = 0.50
* P(B) = 0.20
* P(C) = 0.15
then at temp 1 there’s no change (it rolls a d20 and on a 1–10 it picks A), while if you lower the temperature you might get something like
* P(A) = 0.70
* P(B) = 0.15
* P(C) = 0.08
(I don’t know about the exact numbers here, but notice how more likely next-tokens become even more likely while the unlikely ones become even more unlikely.)
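To put rough numbers on that, here is a minimal sketch of the usual way temperature is applied: divide the logits by T before the softmax, which is the same as raising each probability to the power 1/T and renormalising. The `apply_temperature` helper and the "other" bucket (holding the leftover 0.15 so everything sums to 1) are my own assumptions for illustration. A real model spreads that remainder over many tokens, so the exact figures will differ.

```python
def apply_temperature(probs, temperature):
    """Rescale a probability distribution by a sampling temperature.

    Equivalent to dividing the logits by `temperature` before softmax:
    each probability is raised to the power 1/temperature, then renormalised.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive; T -> 0 is just argmax")
    weights = {tok: p ** (1.0 / temperature) for tok, p in probs.items()}
    total = sum(weights.values())
    return {tok: w / total for tok, w in weights.items()}

# The three tokens from the example, plus a hypothetical "other" bucket
# for the remaining 0.15 (an assumption, not something a real model has).
probs = {"A": 0.50, "B": 0.20, "C": 0.15, "other": 0.15}

for t in (1.0, 0.7, 0.3):
    scaled = apply_temperature(probs, t)
    print(t, {tok: round(p, 3) for tok, p in scaled.items()})
```

At T = 1 the distribution comes back unchanged, and as T shrinks, A takes a larger and larger share while everything else is squeezed out.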
If you lower it all the way, you’d just be picking the most likely next token every time, which tends to lead to quite boring output.
It’s still not reproducible. For one, you would have to fix the random seed.
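As a toy illustration of the seed part (not how any particular inference stack does it), the sketch below samples from a fixed distribution with Python's standard `random` module. The `sample_tokens` helper and the "other" bucket are hypothetical names I made up for the example.

```python
import random

def sample_tokens(probs, n, seed=None):
    """Draw n tokens from `probs`, using a private RNG seeded with `seed`."""
    rng = random.Random(seed)
    tokens = list(probs)
    weights = list(probs.values())
    return rng.choices(tokens, weights=weights, k=n)

# Same toy distribution as above; "other" is a made-up bucket for the rest.
probs = {"A": 0.50, "B": 0.20, "C": 0.15, "other": 0.15}

run1 = sample_tokens(probs, 10, seed=42)
run2 = sample_tokens(probs, 10, seed=42)
print(run1 == run2)  # True: same seed and same probabilities give the same draws
```

The catch is the "same probabilities" part: the seed only pins down the dice, not the numbers the dice are compared against.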
But even with a fixed seed, there’s something going on with parallelisation and rounding on GPUs where logically the same operation can lead to different results: floating-point addition is not associative, so adding the same numbers in a different order can round differently.
Since GPUs may reorder operations to distribute load efficiently, and LLM outputs are very dependent on these floats, you are not guaranteed the exact same result. (In general. There are probably people working on making reproducible LLMs too, though I have the feeling there are fewer resources going into that than into making them bigger and faster etc.)
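The rounding effect is easy to see on a CPU too. This is just a small sketch of floating-point addition depending on the order of operations; it doesn't model what a GPU actually does, and `sum_in_order` is a helper I wrote for the example (a plain left-to-right loop, so no compensated summation gets in the way).

```python
def sum_in_order(values):
    """Add floats strictly left to right."""
    total = 0.0
    for v in values:
        total += v
    return total

# Same three numbers, different grouping.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False

# Same three numbers, different order: the 1.0 is lost when it is
# added to 1e16 first, but survives when the big values cancel first.
print(sum_in_order([1e16, 1.0, -1e16]))  # 0.0
print(sum_in_order([1e16, -1e16, 1.0]))  # 1.0
```

A parallel reduction that splits the work differently from one run to the next is effectively choosing a different grouping each time, and a tiny difference in a logit can be enough to change which token wins.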