Complete copies of books from LLMs
Some well-known LLMs have been shown to be able to reproduce nearly complete copies of the text of some well-known books.
They may, as a result, be found to infringe the copyright on those books.
Precisely why and how this happens is a factual question, but this article does not tell us the answer. In particular, it does not prove that their developers intentionally and specifically stored large parts of any specific book's text verbatim. It could be that the writing style of that book is so distinctive that continuing repeatedly from any portion of the book always finds the text that comes next in the book.
A couple of years ago I heard that someone had made Copi(a)lot reproduce the whole text of the GNU GPL version 3 that way. GitHub surely did not intend for it to do that! And, of course, it omitted the crucial license notice which ought to say that the program is released under the GNU GPL, version 3 or later.
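To illustrate the possibility that repeated continuation alone could reproduce a text, here is a toy sketch. It is not an LLM and says nothing about how any real system works: it is a simple table of word sequences, and the sample sentence and the names `build_table` and `continue_text` are made up for this example. When every two-word context in the text is followed by only one possible word, always choosing the most likely next word regenerates the whole text from its first two words, even though the table holds only short fragments and no single complete copy.

```python
from collections import Counter, defaultdict


def build_table(words, k):
    """Map each k-word context to a count of the words that follow it."""
    table = defaultdict(Counter)
    for i in range(len(words) - k):
        context = tuple(words[i:i + k])
        table[context][words[i + k]] += 1
    return table


def continue_text(table, seed, k, max_words):
    """Extend seed by repeatedly appending the most frequent next word."""
    out = list(seed)
    for _ in range(max_words):
        followers = table.get(tuple(out[-k:]))
        if not followers:
            break  # this context was never seen, so stop
        out.append(followers.most_common(1)[0][0])
    return out


# Hypothetical stand-in for a "distinctive" text: no two-word
# context occurs more than once in it.
text = ("the quick brown fox jumps over the lazy dog while "
        "a small grey cat watches from the garden wall").split()

K = 2
table = build_table(text, K)
regenerated = continue_text(table, text[:K], K, len(text))
reproduced_verbatim = regenerated == text
print(reproduced_verbatim)
```

With a one-word context the same text is not reproduced, because "the" is followed by three different words and the continuation goes astray. In this toy, at least, verbatim reproduction depends on how distinctive each context is, not on anyone having stored the text as a whole.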