Issues and developments related to IP, AI, and OM, examined in the IP and tech ethics graduate courses I teach at the University of Pittsburgh School of Computing and Information. My Bloomsbury book "Ethics, Information, and Technology", coming in Summer 2025, includes major chapters on IP, AI, OM, and other emerging technologies (IoT, drones, robots, autonomous vehicles, VR/AR).
Kip Currier, PhD, JD
Thursday, January 16, 2025
In AI copyright case, Zuckerberg turns to YouTube for his defense; TechCrunch, January 15, 2025
Wednesday, January 15, 2025
'The New York Times' takes OpenAI to court. ChatGPT's future could be on the line; NPR, January 14, 2025
Bobby Allyn, NPR; 'The New York Times' takes OpenAI to court. ChatGPT's future could be on the line
"A group of news organizations, led by The New York Times, took ChatGPT maker OpenAI to federal court on Tuesday in a hearing that could determine whether the tech company has to face the publishers in a high-profile copyright infringement trial.
Three publishers' lawsuits against OpenAI and its financial backer Microsoft have been merged into one case. Leading the three combined cases are the Times, The New York Daily News and the Center for Investigative Reporting.
Other publishers, like the Associated Press, News Corp. and Vox Media, have reached content-sharing deals with OpenAI, but the three litigants in this case are taking the opposite path: going on the offensive."
Monday, January 6, 2025
OpenAI holds off on promise to creators, fails to protect intellectual property; The American Bazaar, January 3, 2025
Vishnu Kamal, The American Bazaar; OpenAI holds off on promise to creators, fails to protect intellectual property
"OpenAI may yet again be in hot water as it seems that the tech giant may be reneging on its earlier assurances. Reportedly, in May, OpenAI said it was developing a tool to let creators specify how they want their works to be included in—or excluded from—its AI training data. But seven months later, this feature has yet to see the light of day.
Called Media Manager, the tool would “identify copyrighted text, images, audio, and video,” OpenAI said at the time, to reflect creators’ preferences “across multiple sources.” It was intended to stave off some of the company’s fiercest critics, and potentially shield OpenAI from IP-related legal challenges...
OpenAI has faced various legal challenges related to its AI technologies and operations. One major issue involves the privacy and data usage of its language models, which are trained on large datasets that may include publicly available or copyrighted material. This raises concerns over privacy violations and intellectual property rights, especially regarding whether the data used for training was obtained with proper consent.
Additionally, there are questions about the ownership of content generated by OpenAI’s models. If an AI produces a work based on copyrighted data, it is tricky to determine who owns the rights—whether it’s OpenAI, the user who prompted the AI, or the creators of the original data.
Another concern is the liability for harmful content produced by AI. If an AI generates misleading or defamatory information, legal responsibility could fall on OpenAI."
Friday, January 3, 2025
U.S. Copyright Office to Begin Issuing Further AI Guidance in January 2025; The National Law Review, January 2, 2025
John Hines of The Sedona Conference, The National Law Review; U.S. Copyright Office to Begin Issuing Further AI Guidance in January 2025
"Parts 2 and 3, which have not yet been released, will be of heightened interest to content creators and to individuals and businesses involved in developing and deploying AI technologies. Ultimate regulatory and legislative determinations could materially recalibrate the scope of ownership and protection afforded to works of authorship, and the stakes are extremely high...
Part 2 of the report, which the Copyright Office expects to publish “after the New Year Holiday,” will address the copyrightability of AI-generated works, and more specifically, how the nature and degree of such use affects copyrightability and registrability. Current law is clear that to be copyrightable, a work must be created by a human. E.g., Thaler v. Perlmutter, 687 F. Supp. 3d 140 (D.D.C. 2023), on appeal. However, assistive tools are used in virtually all creation, from pencils to cameras to photo-editing software programs. In the context of registrability, the Copyright Office offered the following distinction in its March 2023 guidance: “[W]hether the ‘work’ is basically one of human authorship, with the computer [or other device] merely being an assisting instrument, or whether the traditional elements of authorship in the work (literary, artistic, or musical expression or elements of selection, arrangement, etc.) were actually conceived and executed not by man but by a machine.” In Part 2, the Copyright Office will have an additional opportunity to explore these and related issues – this time with the advantage of the many comments offered through the Notice of Inquiry process.
Part 3 of the report, which the Copyright Office anticipates releasing “in the first quarter of 2025,” will focus on issues associated with training data. AI models, depending on their size and scope, may train on millions of documents—many of which are copyrighted or copyrightable— acquired from the Internet or through acquisition of various robust databases. Users of “trained” AI technologies will typically input written prompts to generate written content or images, depending on the model (Sora is now available to generate video). The output is essentially a prediction based on a correlation of values in the model (extracted from the training data) and values that are derived from the user prompts.
Numerous lawsuits, perhaps most notably the case that The New York Times filed against Microsoft and OpenAI, have alleged that the use of data to train AI models constitutes copyright infringement. In many cases there may be little question of copying in the course of uploading data to train the models. Among a variety of issues, a core common issue will be whether the use of the data for training purposes is fair use. Content creators, of course, point to the fact that they have built their livelihoods and/or businesses around their creations and that they should be compensated for what is a violation of their exclusive rights."
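A brief aside on the mechanics described above: the report's claim that model output is "essentially a prediction based on a correlation of values" can be made concrete with a toy sketch. Everything below (the vocabulary, the vectors, the function names) is hypothetical and drastically simplified; it is not how any production model is implemented.

```python
import numpy as np

# Toy sketch of generation as "a correlation of values": the token vectors
# stand in for parameters learned from training data, and the prompt vector
# stands in for the encoded user prompt. All names and numbers are hypothetical.
rng = np.random.default_rng(0)

vocab = ["the", "court", "ruled", "fair", "use"]      # toy vocabulary
token_vectors = rng.normal(size=(len(vocab), 8))      # stand-in for trained parameters

def next_token_distribution(prompt_vector):
    """Score each token by its dot product (correlation) with the prompt
    representation, then normalize with a softmax into probabilities."""
    logits = token_vectors @ prompt_vector
    exp = np.exp(logits - logits.max())               # numerically stable softmax
    return exp / exp.sum()

prompt_vector = rng.normal(size=8)                    # stand-in for an encoded prompt
probs = next_token_distribution(prompt_vector)
print(dict(zip(vocab, probs.round(3))))               # the "prediction" over next tokens
```

The sketch shows only the mechanical sense in which output correlates model values with prompt-derived values; it takes no position on the fair use questions the lawsuits raise.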
Tuesday, December 31, 2024
Column: A Faulkner classic and Popeye enter the public domain while copyright only gets more confusing; Los Angeles Times, December 31, 2024
Michael Hiltzik, Los Angeles Times; Column: A Faulkner classic and Popeye enter the public domain while copyright only gets more confusing
"The annual flow of copyrighted works into the public domain underscores how the progressive lengthening of copyright protection is counter to the public interest—indeed, to the interests of creative artists. The initial U.S. copyright act, passed in 1790, provided for a term of 28 years including a 14-year renewal. In 1909, that was extended to 56 years including a 28-year renewal.
In 1976, the term was changed to the creator’s life plus 50 years. In 1998, Congress passed the Copyright Term Extension Act, which is known as the Sonny Bono Act after its chief promoter on Capitol Hill. That law extended the basic term to life plus 70 years; works for hire (in which a third party owns the rights to a creative work), pseudonymous and anonymous works were protected for 95 years from first publication or 120 years from creation, whichever is shorter.
Along the way, Congress extended copyright protection from written works to movies, recordings, performances and ultimately to almost all works, both published and unpublished.
Once a work enters the public domain, Jenkins observes, “community theaters can screen the films. Youth orchestras can perform the music publicly, without paying licensing fees. Online repositories such as the Internet Archive, HathiTrust, Google Books and the New York Public Library can make works fully available online. This helps enable both access to and preservation of cultural materials that might otherwise be lost to history.”"
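For readers who want the arithmetic behind this year's crop of 1929 works, here is a minimal sketch assuming the common case for older published works: a fixed 95-year term running from first publication, with public domain entry on the following January 1. Real determinations turn on many more facts (renewal, notice, work-for-hire status, the author's death date), so treat this as illustration only.

```python
# Minimal sketch of publication-based term arithmetic; assumes a fixed term
# measured from first publication, running through the end of its final year.
def public_domain_year(first_published: int, term_years: int = 95) -> int:
    """January 1 of the returned year is when the work enters the public domain."""
    return first_published + term_years + 1

# Works first published in 1929 (Faulkner's The Sound and the Fury, the first
# Popeye strips) entered the US public domain on January 1, 2025:
print(public_domain_year(1929))  # -> 2025
```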
Anthropic Agrees to Enforce Copyright Guardrails on New AI Tools; Bloomberg Law, December 30, 2024
Annelise Levy, Bloomberg Law; Anthropic Agrees to Enforce Copyright Guardrails on New AI Tools
"Anthropic PBC must apply guardrails to prevent its future AI tools from producing infringing copyrighted content, according to a Monday agreement reached with music publishers suing the company for infringing protected song lyrics.
Eight music publishers—including..."
Monday, December 30, 2024
Key IP Issues for the Next President and Congress to Tackle: AI and Patent Subject Matter Eligibility; IP Watchdog, December 29, 2024
Ryan J. Malloy, IP Watchdog; Key IP Issues for the Next President and Congress to Tackle: AI and Patent Subject Matter Eligibility
"The debates surrounding the 2024 election focused on “hot button” issues like abortion, immigration, and transgender rights. But several important IP issues also loom over the next administration and Congress. These issues include AI-generated deepfakes, the use of copyrighted works for AI training, the patentability of AI-assisted inventions, and patent subject matter eligibility more generally. We might see President Trump and the 119th Congress tackle some or all of these issues in the next term."
Sunday, December 29, 2024
AI's assault on our intellectual property must be stopped; Financial Times, December 21, 2024
Kate Mosse, Financial Times; AI's assault on our intellectual property must be stopped
"Imagine my dismay, therefore, to discover that those 15 years of dreaming, researching, planning, writing, rewriting, editing, visiting libraries and archives, translating Occitan texts, hunting down original 13th-century documents, becoming an expert in Catharsis, apparently counts for nothing. Labyrinth is just one of several of my novels that have been scraped by Meta's large language model. This has been done without my consent, without remuneration, without even notification. This is theft...
AI companies present creators as being against change. We are not. Every artist I know is already engaging with AI in one way or another. But a distinction needs to be made between AI that can be used in brilliant ways — for example, medical diagnosis — and the foundations of AI models, where companies are essentially stealing creatives' work for their own profit. We should not forget that the AI companies rely on creators to build their models. Without strong copyright law that ensures creators can earn a living, AI companies will lack the high-quality material that is essential for their future growth."
Friday, December 27, 2024
The AI Boom May Be Too Good to Be True; Wall Street Journal, December 26, 2024
Josh Harlan, Wall Street Journal; The AI Boom May Be Too Good to Be True
"Investors rushing to capitalize on artificial intelligence have focused on the technology—the capabilities of new models, the potential of generative tools, and the scale of processing power to sustain it all. What too many ignore is the evolving legal structure surrounding the technology, which will ultimately shape the economics of AI. The core question is: Who controls the value that AI produces? The answer depends on whether AI companies must compensate rights holders for using their data to train AI models and whether AI creations can themselves enjoy copyright or patent protections.
The current landscape of AI law is rife with uncertainty... How these cases are decided will determine whether AI developers can harvest publicly available data or must license the content used to train their models."
Tech companies face tough AI copyright questions in 2025; Reuters, December 27, 2024
Blake Brittain, Reuters; Tech companies face tough AI copyright questions in 2025
"The new year may bring pivotal developments in a series of copyright lawsuits that could shape the future business of artificial intelligence.
The AI revolution is running out of data. What can researchers do?; Nature, December 11, 2024
Nicola Jones, Nature; The AI revolution is running out of data. What can researchers do?
"A prominent study1 made headlines this year by putting a number on this problem: researchers at Epoch AI, a virtual research institute, projected that, by around 2028, the typical size of data set used to train an AI model will reach the same size as the total estimated stock of public online text. In other words, AI is likely to run out of training data in about four years’ time (see ‘Running out of data’). At the same time, data owners — such as newspaper publishers — are starting to crack down on how their content can be used, tightening access even more. That’s causing a crisis in the size of the ‘data commons’, says Shayne Longpre, an AI researcher at the Massachusetts Institute of Technology in Cambridge who leads the Data Provenance Initiative, a grass-roots organization that conducts audits of AI data sets...
Several lawsuits are now under way attempting to win compensation for the providers of data being used in AI training. In December 2023, The New York Times sued OpenAI and its partner Microsoft for copyright infringement; in April this year, eight newspapers owned by Alden Global Capital in New York City jointly filed a similar lawsuit. The counterargument is that an AI should be allowed to read and learn from online content in the same way as a person, and that this constitutes fair use of the material. OpenAI has said publicly that it thinks The New York Times lawsuit is “without merit”.
If courts uphold the idea that content providers deserve financial compensation, it will make it harder for both AI developers and researchers to get what they need — including academics, who don’t have deep pockets. “Academics will be most hit by these deals,” says Longpre. “There are many, very pro-social, pro-democratic benefits of having an open web,” he adds."
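The Epoch AI projection quoted above is, at bottom, compound-growth arithmetic: a geometrically growing training-set size crossing a roughly fixed stock of public text. Here is a back-of-envelope version with placeholder figures of my own; they are not Epoch's actual estimates, though they happen to land near the same "around 2028" answer.

```python
import math

# Back-of-envelope data-exhaustion projection. All three figures below are
# illustrative assumptions, not Epoch AI's published numbers.
stock_of_public_text = 3e14   # assumed total stock of public online text, in tokens
dataset_size_2024 = 2e13      # assumed typical training-set size in 2024, in tokens
annual_growth = 2.0           # assumed yearly growth multiplier for training sets

# Solve dataset_size_2024 * annual_growth**t >= stock_of_public_text for t:
t = math.log(stock_of_public_text / dataset_size_2024) / math.log(annual_growth)
print(f"Crossover in ~{t:.1f} years, i.e. around {2024 + math.ceil(t)}")  # ~3.9 -> 2028
```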
Thursday, December 26, 2024
Harvard’s Library Innovation Lab launches Institutional Data Initiative; Harvard Law Today, December 12, 2024
Scott Young, Harvard Law Today; Harvard’s Library Innovation Lab launches Institutional Data Initiative
"At the Institutional Data Initiative (IDI), a new program hosted within the Harvard Law School Library, efforts are already underway to expand and enhance the data resources available for AI training. At the initiative’s public launch on Dec. 12, Library Innovation Lab faculty director, Jonathan Zittrain ’95, and IDI executive director, Greg Leppert, announced plans to expand the availability of public domain data from knowledge institutions — including the text of nearly one million books scanned at Harvard Library — to train AI models...
Harvard Law Today: What is the Institutional Data Initiative?
Greg Leppert: Our work at the Institutional Data Initiative is focused on finding ways to improve the accessibility of institutional data for all uses, artificial intelligence among them. Harvard Law School Library is a tremendous repository of public domain books, briefs, research papers, and so on. Regardless of how this information was initially memorialized — hardcover, softcover, parchment, etc. — a considerable amount has been converted into digital form. At the IDI, we are working to ensure these large data sets of public domain works, like the ones from the Law School library that comprise the Caselaw Access Project, are made open and accessible, especially for AI training. Harvard is not alone in terms of the scale and quality of its data; similar sets exist throughout our academic institutions and public libraries. AI systems are only as diverse as the data on which they’re trained, and these public domain data sets ought to be part of a healthy diet for future AI training.
HLT: What problem is the Institutional Data Initiative working to solve?
Leppert: As it stands, the data being used to train AI is often limited in terms of scale, scope, quality, and integrity. Various groups and perspectives are massively underrepresented in the data currently being used to train AI. As things stand, outliers will not be served by AI as well as they should be, and otherwise could be, by the inclusion of that underrepresented data. The country of Iceland, for example, undertook a national, government-led effort to make materials from their national libraries available for AI applications. That is because they were seriously concerned the Icelandic language and culture would not be represented in AI models. We are also working towards reaffirming Harvard, and other institutions, as the stewards of their collections. The proliferation of training sets based on public domain materials has been encouraging to see, but it’s important that this doesn’t leave the material vulnerable to critical omissions or alterations. For centuries, knowledge institutions have served as stewards of information for the purpose of promoting the public good and furthering the representation of diverse ideas, cultural groups, and ways of seeing the world. So, we believe these institutions are the exact kind of sources for AI training data if we want to optimize its ability to serve humanity. As it stands today, there is significant room for improvement."
Saturday, December 21, 2024
Every AI Copyright Lawsuit in the US, Visualized; Wired, December 19, 2024
Kate Knibbs, Wired; Every AI Copyright Lawsuit in the US, Visualized
"WIRED is keeping close tabs on how each of these lawsuits unfold. We’ve created visualizations to help you track and contextualize which companies and rights holders are involved, where the cases have been filed, what they’re alleging, and everything else you need to know."
Thursday, December 19, 2024
Getty Images Wants $1.7 Billion From its Lawsuit With Stability AI; PetaPixel, December 19, 2024
Matt Growcoot, PetaPixel; Getty Images Wants $1.7 Billion From its Lawsuit With Stability AI
"Getty, one of the world’s largest photo agencies, launched its lawsuit in January 2023. Getty suspects that Stability AI may have used as many as 12 million of its copyrighted photos to train the AI image generator Stable Diffusion. Getty is seeking $150,000 per infringement and 12 million photos equates to a staggering $1.8 trillion.
However, according to Stability AI’s latest company accounts as reported by Sifted, Getty is seeking damages for 11,383 works at $150,000 per infringement which comes to a total of $1.7 billion. Stability AI has previously reported that Getty was seeking damages for 7,300 images so that number has increased. But Stability AI says Getty hasn’t given an exact number it wants for the lawsuit to be settled, according to Sifted."
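Both dollar figures in the excerpt are straight multiplication of the $150,000 ceiling for willful statutory damages under US copyright law by the number of works claimed; a quick check of the arithmetic:

```python
# Verifying the two damages figures quoted above.
STATUTORY_MAX_PER_WORK = 150_000  # US statutory-damages ceiling per work, willful infringement

print(f"${12_000_000 * STATUTORY_MAX_PER_WORK:,}")  # 12 million photos -> $1,800,000,000,000 (~$1.8 trillion)
print(f"${11_383 * STATUTORY_MAX_PER_WORK:,}")      # 11,383 works -> $1,707,450,000 (~$1.7 billion)
```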
Friday, December 13, 2024
To Whom Does the World Belong? The battle over copyright in the age of ChatGPT.; Boston Review, December 10, 2024
Alexander Hartley, Boston Review; To Whom Does the World Belong? The battle over copyright in the age of ChatGPT
"Who, if anyone, owns the copyright to a paragraph produced by a chatbot? As I write, nobody knows."
Wednesday, December 11, 2024
Paul McCartney warns AI ‘could take over’ as UK debates copyright laws; The Guardian, December 10, 2024
Robert Booth UK technology editor, The Guardian; Paul McCartney warns AI ‘could take over’ as UK debates copyright laws
"Paul McCartney has backed calls for laws to stop mass copyright theft by companies building generative artificial intelligence, warning AI “could just take over”.
The former Beatle said it would be “a very sad thing indeed” if young composers and writers could not protect their intellectual property from the rise of algorithmic models, which so far have learned by digesting mountains of copyrighted material.
He spoke out amid growing concern that the rise of AI is threatening income streams for music, news and book publishers. Next week the UK parliament will debate amendments to the data bill that could allow creators to decide whether or not their copyrighted work can be used to train generative AI models."
Sunday, December 8, 2024
The Copyrighted Material Being Used to Train AI; The Bulwark, December 7, 2024
Sonny Bunch, The Bulwark; The Copyrighted Material Being Used to Train AI
"On this week’s episode, I talked to Alex Reisner about his pieces in the Atlantic highlighting the copyrighted material being hoovered into large language models to help AI chatbots simulate human speech. If you’re a screenwriter and would like to see which of your work has been appropriated to aid in the effort, click here; he has assembled a searchable database of nearly 140,000 movie and TV scripts that have been used without permission. (And you should read his other stories about copyright law reaching its breaking point and “the memorization problem.”) In this episode, we also got into the metaphysics of art and asked what sort of questions need to be asked as we hurtle toward the future. If you enjoyed this episode, please share it with a friend!"
Wednesday, December 4, 2024
OpenAI Must Hand Over Execs' Social Media DMs in Copyright Suits; Bloomberg Law, December 3, 2024
"
Tuesday, December 3, 2024
Getty Images CEO Calls AI Training Models ‘Pure Theft’; PetaPixel, December 3, 2024
Matt Growcoot, PetaPixel; Getty Images CEO Calls AI Training Models ‘Pure Theft’
"The CEO of Getty Images has penned a column in which he calls the practice of scraping photos and other content from the open web by AI companies “pure theft”.
Writing for Fortune, Craig Peters argues that fair use rules must be respected and that AI training practices are in contravention of those rules...
“I am responsible for an organization that employs over 1,700 individuals and represents the work of more than 600,000 journalists and creators worldwide,” writes Peters. “Copyright is at the very core of our business and the livelihood of those we employ and represent.”"
Friday, November 29, 2024
Major Canadian News Outlets Sue OpenAI in New Copyright Case; The New York Times, November 29, 2024
Matina Stevis-Gridneff, The New York Times; Major Canadian News Outlets Sue OpenAI in New Copyright Case
"A coalition of Canada’s biggest news organizations is suing OpenAI, the maker of the artificial intelligence chatbot, ChatGPT, accusing the company of illegally using their content in the first case of its kind in the country.
Five of the country’s major news companies, including the publishers of its top newspapers, newswires and the national broadcaster, filed the joint suit in the Ontario Superior Court of Justice on Friday morning...
The Canadian outlets, which include the Globe and Mail, the Toronto Star and the CBC — the Canadian Broadcasting Corporation — are seeking what could add up to billions of dollars in damages. They are asking for 20,000 Canadian dollars, or $14,700, per article they claim was illegally scraped and used to train ChatGPT.
They are also seeking a share of the profits made by what they claim is OpenAI’s misuse of their content, as well as for the company to stop such practices in the future."
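How CA$20,000 per article "could add up to billions" is simple multiplication. The excerpt does not say how many articles the publishers claim were scraped, so the counts below are purely illustrative:

```python
# Illustrative scale of the Canadian publishers' claim; the article counts are
# hypothetical, since the filing's total is not given in the excerpt.
PER_ARTICLE_CAD = 20_000

for articles in (50_000, 100_000, 250_000):
    print(f"{articles:>7,} articles -> CA${articles * PER_ARTICLE_CAD:,}")
# Even 50,000 articles would already imply CA$1 billion.
```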