{
"$type": "site.standard.document",
"content": "---\ntitle: \"LLMs Unplugged: teaching language models with pen, paper, and dice\"\ndescription: \"A hands-on teaching resource where learners build bigram language models with\n pen, paper, and dice---no computers required. Tested with school students,\n public servants, and tech professionals.\"\ntags:\n - teaching\n - ai\n - llm\n---\n\n:::tip\n\nSome of this blog post is copied from the actual website copy at\n[LLMs Unplugged](https://www.llmsunplugged.org). But it doesn't count as\nplagiarism because I wrote all the content on that website too :)\n\n:::\n\nLarge language models (LLMs) are everywhere, but how many people actually\nunderstand how they work? Not the hand-wavy \"they predict the next word\"\nexplanation, but the actual mechanics of how. That's what\n[LLMs Unplugged](https://www.llmsunplugged.org), a new teaching resource I've\nbeen working on for the last year or so, is for.\n\nThe core insight is dead simple: language models predict the next word by\ncounting patterns. A bigram model asks \"after seeing word X, what usually comes\nnext?\" You can teach this with pen, paper, and dice---no computers required.\nLearners manually count word transitions, record them in a grid, then roll dice\nto generate text. The probabilistic nature becomes tangible instead of\nmysterious.\n\nThe \"unplugged\" framing is borrowed: this builds on the\n[CS Unplugged](https://www.csunplugged.org/) tradition of teaching computer\nscience concepts without computers. That project has brilliant\nactivities for teaching algorithms, binary numbers, and compression. But there's\na gap when it comes to language models specifically[^gap].\n\n[^gap]:\n CS Unplugged does have some machine learning activities, but they focus on\n classification rather than generative models.\n\nThe unplugged approach has real pedagogical advantages. 
The activity is\ntransparent: when you're physically counting transitions and filling in a\ngrid, there's no black box, and you can see exactly where the probabilities\ncome from. It's accessible, too: you don't need computers, coding skills, or\nexpensive infrastructure, just paper and dice. Rolling dice to generate text\nis also surprisingly fun (honestly I think this is just as good as a team\nbonding or bucks'-night activity, but the learning is real). And once you've\nbuilt a bigram model by hand, the jump to understanding GPT becomes\nconceptual rather than magical, which is the transfer we're actually aiming\nfor.\n\nWe've tested this with hundreds of participants across wildly different\ncontexts: school students (12+), undergrads, senior public servants, educators,\ntech professionals. It works for all of them, though for different reasons.\n\nFor school students, it demystifies AI and makes probability concrete. For\npublic servants, it provides a mental model for understanding the AI systems\nthey're being asked to use and regulate. Even for tech professionals, it's\nsurprisingly good as a team-building exercise---there's something equalising\nabout everyone sitting down with pencils and dice[^team-building].\n\n[^team-building]:\n Also turns out that explaining language models to your colleagues without\n using jargon is harder than you think. Good practice.\n\nParents have used it at home with their kids. Educators have remixed it for\ntheir own classrooms (it's CC BY-NC-SA licensed specifically for that). But I\nreally think that this is just the beginning, and I (along with the team at the\nCybernetic Studio at the ANU School of Cybernetics) have some big plans for LLMs\nUnplugged in 2026.\n\nThe core activity itself goes like this:\n\n1. pick a simple text corpus (we provide booklets with pre-selected texts)\n2. manually count word transitions: if \"the\" appears 10 times, and \"cat\" follows\n it 3 times, record that\n3. 
fill in a grid showing the counts for each word pair\n4. convert counts to probabilities (or just use the counts directly---dice don't\n care)\n5. roll dice to randomly select the next word based on those probabilities\n6. write down these words that \"come out\" of the model; marvel that they make\n sense, or laugh at their glorious nonsensicality\n\nThe \"why is this nonsense?\" discussion is where the learning happens. Bigrams\nhave no long-term memory, so they can't track sentence structure or maintain\ncoherent topics. This limitation becomes _super_ obvious when you generate \"The\ncat sat on the cat sat on the...\"\n\nThen you can talk about how real language models address these limitations:\nbigger context windows, attention mechanisms, and training on trillions of\ntokens instead of a few paragraphs. Suddenly the path from your\npencil-and-paper model to ChatGPT or Claude or Gemini is a little clearer.\n\nThis approach teaches the fundamental mechanism of language models, but it\ndoesn't capture everything: the scale really is incomparable (a few hundred\nwords vs billions of parameters).\n\nBut those limitations are pedagogically useful. Once you understand bigrams\ndeeply, the extensions to more sophisticated models become natural questions.\n\"What if we looked at more context?\" leads to trigrams and N-grams. \"What if\nwords had relationships beyond just sequence?\" leads to embeddings. \"What if we\ncould focus on relevant parts of the context?\" leads to attention. And there are\nLLMs Unplugged lessons which cover all these topics and more. The unplugged\napproach gives you the conceptual foundation: rather than replicating\nproduction LLMs, it tries to make the core ideas understandable.\n\nThe fundamentals are solid, but there's more work to do---in particular to\nroad test this in a wide range of classrooms (and boardrooms). 
The bones of the\nactivity are the same, but there are always going to be tweaks which can help it\nland for a particular audience.\n\nIf you're an educator, check out the [materials](https://www.llmsunplugged.org/)\nand use them. If you teach a workshop, I'd love to hear how it goes. If you find\ngaps or confusion points, [send me an email](mailto:ben.swift@anu.edu.au) or\nopen an issue on the\n[GitHub repo](https://github.com/ANUcybernetics/llms-unplugged). This is meant\nto be a living resource, not a finished product.\n\n:::info\n\nThe LLMs Unplugged site is at\n[llmsunplugged.org](https://llmsunplugged.org). All materials are CC\nBY-NC-SA licensed for educational use. The code's on\n[GitHub](https://github.com/ANUcybernetics/llms-unplugged) if you want to dig\ninto the implementation details or contribute.\n\n:::\n\nThe best way to understand how something works is to build it yourself\n(honestly, this is my approach to software development as well). Even if your\nversion is a dramatically simplified pencil-and-paper sketch, the act of\nconstruction creates an understanding that no amount of explanation can match.\nThat's what _LLMs Unplugged_ is about: giving people the tools to build their\nown understanding of language models, one dice roll at a time.\n",
"createdAt": "2026-05-13T23:14:42.407Z",
"description": "A hands-on teaching resource where learners build bigram language models with pen, paper, and dice---no computers required. Tested with school students, public servants, and tech professionals.",
"path": "/blog/2025/12/10/llms-unplugged-teaching-language-models-with-pen-paper-and-dice",
"publishedAt": "2025-12-10T00:00:00.000Z",
"site": "at://did:plc:tevykrhi4kibtsipzci76d76/site.standard.publication/self",
"tags": [
"teaching",
"ai",
"llm"
],
"textContent": "A hands-on teaching resource where learners build bigram language models with pen, paper, and dice---no computers required. Tested with school students, public servants, and tech professionals.",
"title": "LLMs Unplugged: teaching language models with pen, paper, and dice"
}