{
  "path": "/posts/2024/llm-rand/index",
  "site": "at://did:plc:mracrip6qu3vw46nbewg44sm/site.standard.publication/self",
  "tags": [
    "language_models"
  ],
  "$type": "site.standard.document",
  "title": "Language model random number generator",
  "updatedAt": "2024-10-22T00:00:00.000Z",
  "publishedAt": "2024-10-22T00:00:00.000Z",
  "textContent": "I had the idea to try and use a language model as a random number generator.\nI didn't expect it to actually work as a uniform random number generator but was curious to see what the distribution of numbers would look like.\n\nMy goal was to prompt the model to generate a number between 1 and 100.\nI could also vary the temperature to see how that changed the distribution of the numbers.\n\nFirst, we import a bunch of libraries we'll use later\n\nNext, we'll define functions to call ollama using the openai client, a prompt to generate a number between 1 and 100 and a parsing function that will deal with the messiness of the model outputs.\nThe parser is very permissive.\nIt grabs the first run of digits it finds in the output string, even if it's part of another string.\nFor example:\n\n    123\n\nHere's an example of the whole thing end to end with llama3.2 and temperature 1.5\n\n    Random number: 53\n\n    53\n\nWe're also going to do some runs with high temperatures, like 9 and 11.\nThe outputs from the models using those temperatures are weird, but we can parse them some of the time.\n\n    randomumber：Observers1BekK\n\n    1\n\nThere are lots of good, small models we can try this experiment on.\nWe're going to loop through them and several different temperature values, attempting to generate 500 values for each model-temperature combination.\nIf we can't parse a number from the model response, we just move on.\nWe're going to count the number of successful samples later.\nWe write the results to jsonl files, which makes it easier to resume the experiment in case something goes wrong along the way.\nRunning this takes a while - tens of minutes on my M3 MBP.\n\nOnce we have the data in jsonl files, we can take a look at the distributions of the generated numbers\n\n!png\n\nThat's a lot of data.\nMy takeaways:\n\n- Models with simple prompts, at low temperatures, generally output the same couple of numbers, even though we ask for a \"random\" number\n- llama3.2 outputs a pretty stable distribution of numbers across temperatures\n- All the models show some sort of bimodal behavior between temperatures 3 and 7\n- My permissive number parsing probably masks some pretty bad behavior by the model\n\nLet's take a look at that last point now.\nWe're going to plot the number of valid outputs per model across temperature to see how well they followed instructions and actually output an integer.\nKeep in mind, our integer parsing is much more permissive than our instructions, so an output that can't be parsed is a pretty serious failure by the model.\nHere's an example of what these look like:\n\n    random value in java script without additional logic:\\`\n    No number found in output: random value in java script without additional logic:\\`\n\nYikes!\nThis is why we use models at temperatures in the 0-2 range.\nWith that, let's see how these models break down.\n\n!png\n\nAs we'd expect, we start to have problems between temperatures 1 and 3.\n\nHopefully, you never find yourself in a position where you're using a language model to generate a random number, but it was a fun experiment nonetheless.",
  "canonicalUrl": "https://www.danielcorin.com/posts/2024/llm-rand/index"
}