John Hines of The Sedona Conference, The National Law Review; U.S. Copyright Office to Begin Issuing Further AI Guidance in January 2025
"Parts 2 and 3, which have not yet been released, will be of heightened interest to content creators and to individuals and businesses involved in developing and deploying AI technologies. Ultimate regulatory and legislative determinations could materially recalibrate the scope of ownership and protection afforded to works of authorship, and the stakes are extremely high...
Part 2 of the report, which the Copyright Office expects to publish “after the New Year Holiday,” will address the copyrightability of AI-generated works, and more specifically, how the nature and degree of AI use affect copyrightability and registrability. Current law is clear that to be copyrightable, a work must be created by a human. E.g., Thaler v. Perlmutter, 687 F. Supp. 3d 140 (D.D.C. 2023), on appeal. However, assistive tools are used in virtually all creation, from pencils to cameras to photo-editing software. In the context of registrability, the Copyright Office offered the following distinction in its March 2023 guidance: “[W]hether the ‘work’ is basically one of human authorship, with the computer [or other device] merely being an assisting instrument, or whether the traditional elements of authorship in the work (literary, artistic, or musical expression or elements of selection, arrangement, etc.) were actually conceived and executed not by man but by a machine.” In Part 2, the Copyright Office will have an additional opportunity to explore these and related issues – this time with the advantage of the many comments offered through the Notice of Inquiry process.
Part 3 of the report, which the Copyright Office anticipates releasing “in the first quarter of 2025,” will focus on issues associated with training data. AI models, depending on their size and scope, may train on millions of documents—many of which are copyrighted or copyrightable—acquired from the Internet or from various robust databases. Users of “trained” AI technologies typically input written prompts to generate written content or images, depending on the model (Sora is now available to generate video). The output is essentially a prediction based on correlating values in the model (extracted from the training data) with values derived from the user’s prompt.
Numerous lawsuits, perhaps most notably the case The New York Times filed against Microsoft and OpenAI, have alleged that the use of data to train AI models constitutes copyright infringement. In many cases there may be little question that copying occurs when data are uploaded to train the models. Among the variety of issues presented, a core common question will be whether the use of the data for training purposes is fair use. Content creators, of course, point out that they have built their livelihoods and businesses around their creations and that they should be compensated for what they contend is a violation of their exclusive rights."