Platform Development • Re: Tentative LLM contribution guidelines.
Another difficulty with LLM use in open-source programming might have to do with licensing. If an employed LLM draws on source code licensed under the GPL, the derivative work would most likely need to follow that licence, regardless of the developer’s wishes. If it draws on both GPL code and code under incompatible licences which it finds in the wild, I understand the result might be illegal. Attribution would be a complicated matter in any case, thanks to the nature of LLMs. How do you plan to address this?
Well, I believe that legally an LLM trained on code is considered to have read it rather than copied it verbatim, so the code it outputs is not licensed under the original license of whatever it was trained on. If that weren't the case, then basically no project could use LLMs for coding. And a lot of big names who definitely have legal teams are in fact using them, which suggests to me this isn't the concern you're implying.
From what I've gathered, GPL obligations only arise from reproducing GPL code verbatim, not from using a tool that was trained on it. Without verbatim reproduction, there’s no derivative work and no licensing issue. And modern LLMs are specifically trained to avoid verbatim reproduction, making the only scenario where GPL obligations could apply extremely unlikely.