Copyright, AI, and Provenance

O'Reilly Media

Another group of cases involving text (typically novels and novelists) argues that using copyrighted texts as part of the training data for a Large Language Model (LLM) is itself copyright infringement [1], even if the model never reproduces those texts as part of its output. That’s a nice image, but it is fundamentally wrong.

AI 101