{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreihwrbtxw7jyrx5hhfvc6qoj4t24xvsejwcetzotivnce2zpjvoita",
    "uri": "at://did:plc:ekor5xcatltxseja2u6t44zf/app.bsky.feed.post/3mmhyrkwflct2"
  },
  "path": "/archives/2026-mar-jun.html#22_May_2026_(Complete_copies_of_books_from_LLMs)",
  "publishedAt": "2026-05-23T02:37:45.000Z",
  "site": "https://stallman.org",
  "tags": [
    "deliver up nearly complete copies",
    "reproduce the whole text of the GNU GPL version 3 that way"
  ],
  "textContent": "Some well-known LLMs have been proved to be able to deliver up nearly complete copies of the text of some well-known books.\n\nThey may, as a result, be found to infringe the copyright on those books.\n\nPrecisely why and how this happens is a factual question, but this article does not tell us the answer. In particular, it does not prove that their developers intentionally and specifically stored large parts of any specific book's text verbatim. It could be that the writing style of that book is so distinctive that continuing repeatedly from any portion of the book always finds the text that comes next in the book.\n\nA couple of years ago I heard that someone had made Copi(a)lot reproduce the whole text of the GNU GPL version 3 that way. GitHub surely did not intend for it to do that! And, of course, it omitted the crucial _license notice_ which ought to say that the program is released under the GNU GPL, version 3 or later.",
  "title": "Complete copies of books from LLMs"
}