Thomas Claburn, The Register; OpenAI's ChatGPT may face a copyright quagmire after 'memorizing' these books
"Tyler Ochoa, a professor in the Law department at Santa Clara University in California, told The Register he fully expects to see lawsuits against the makers of large language models that generate text, including OpenAI, Google, and others.
Ochoa said the copyright issues with AI text generation are exactly the same as the issues with AI image generation. First: is copying large amounts of text or images for training the model fair use? The answer to that, he said, is probably yes.
Second: if the model generates output that's too similar to the input – what the paper refers to as 'memorization' – is that copyright infringement? The answer to that, he said, is almost certainly yes.
And third: if the output of an AI text generator is not a copy of an existing text, is it protected by copyright?
Under current law, said Ochoa, the answer is no – because US copyright law requires human creativity, though some countries will disagree and will protect AI-generated works. However, he added, activities like selecting, arranging, and modifying AI model output make copyright protection more plausible."
