Intellectual Property (IP), Artificial Intelligence (AI), Open Movements (OM) : LLMs

Showing posts with label LLMs. Show all posts

Monday, November 24, 2025

Minister indicates sympathy for artists in debate over AI and copyright; The Guardian, November 23, 2025

Robert Booth, The Guardian; Minister indicates sympathy for artists in debate over AI and copyright

"The technology secretary, Liz Kendall, has indicated she is sympathetic to artists’ demands not to have their copyrighted works scraped by AI companies without payment and said she wanted to “reset” the debate.

In remarks that suggest a change in approach from her predecessor, Peter Kyle, who had hoped to require artists to actively opt out of having their work ingested by generative AI systems, she said “people rightly want to get paid for the work that they do” and “we have to find a way that both sectors can grow and thrive in future”.

The government has been consulting on a new intellectual property framework for AI which, in the case of the most common large language models (LLMs), requires vast amounts of training data to work effectively.

The issue has sparked impassioned protests from some of Britain’s most famous artists. This month Paul McCartney released a silent two-minute 45 second track of an empty studio on an album protesting against copyright grabs by AI firms as part of a campaign also backed by Kate Bush, Sam Fender, the Pet Shop Boys and Hans Zimmer."

Tuesday, September 23, 2025

Screw the money — Anthropic’s $1.5B copyright settlement sucks for writers; TechCrunch, September 5, 2025

Amanda Silberling , TechCrunch; Screw the money — Anthropic’s $1.5B copyright settlement sucks for writers

"But writers aren’t getting this settlement because their work was fed to an AI — this is just a costly slap on the wrist for Anthropic, a company that just raised another $13 billion, because it illegally downloaded books instead of buying them.

In June, federal judge William Alsup sided with Anthropic and ruled that it is, indeed, legal to train AI on copyrighted material. The judge argues that this use case is “transformative” enough to be protected by the fair use doctrine, a carve-out of copyright law that hasn’t been updated since 1976.

“Like any reader aspiring to be a writer, Anthropic’s LLMs trained upon works not to race ahead and replicate or supplant them — but to turn a hard corner and create something different,” the judge said.

It was the piracy — not the AI training — that moved Judge Alsup to bring the case to trial, but with Anthropic’s settlement, a trial is no longer necessary."

Monday, August 25, 2025

Medical triage as an AI ethics benchmark; Nature, August 22, 2025

Nathalie Maria Kirch,

Konstantin Hebenstreit &

Matthias Samwald

, Nature; Medical triage as an AI ethics benchmark

"We present the TRIAGE benchmark, a novel machine ethics benchmark designed to evaluate the ethical decision-making abilities of large language models (LLMs) in mass casualty scenarios. TRIAGE uses medical dilemmas created by healthcare professionals to evaluate the ethical decision-making of AI systems in real-world, high-stakes scenarios. We evaluated six major LLMs on TRIAGE, examining how different ethical and adversarial prompts influence model behavior. Our results show that most models consistently outperformed random guessing, with open source models making more serious ethical errors than proprietary models. Providing guiding ethical principles to LLMs degraded performance on TRIAGE, which stand in contrast to results from other machine ethics benchmarks where explicating ethical principles improved results. Adversarial prompts significantly decreased accuracy. By demonstrating the influence of context and ethical framing on the performance of LLMs, we provide critical insights into the current capabilities and limitations of AI in high-stakes ethical decision making in medicine."

Wednesday, July 30, 2025

Insuring Intellectual Property – Examining AI and Fair Use; The National Law Review, July 29, 2025

Michael S. Levine, Geoffrey B. Fehling, Armin Ghiam, Madalyn "Mady" Moore of Hunton Andrews Kurth - Publications, The National Law Review; Insuring Intellectual Property – Examining AI and Fair Use

"The frequency of lawsuits involving the development and deployment of AI technologies is increasing by the day. Recent lawsuits seeking to hold companies directly and secondarily liable for “joint enterprises” based on use (or alleged misuse) of copyrighted works for training AI models serve as important reminders about the protections that intellectual property (IP) insurance can offer to cover the risks associated with copyright infringement claims.

Recently, a California federal district court ruled that it was “fair use” for an AI software company to use copyrighted books to train its large language models (LLMs). However, the court also found the company’s unauthorized possession of over seven million pirated books that it downloaded from the internet (apparently for free) amounted to copyright infringement independent from whether the books were ultimately used to train the LLMs. In contrast, where the company purchased books before scanning them into digital files, the use was a permissible “fair use.”

The court’s order in Bartz et al. v. Anthropic PBC, No. 3:24-cv-05417 (N.D. Cal. June 23, 2025), highlights the nuanced permissible use of copyrighted training data and underscores why policyholders engaged in the use of copyrighted material should acquire and maintain robust IP insurance that will reliably respond to claims of alleged infringement."

Thursday, July 24, 2025

Donald Trump Says AI Companies Can’t Be Expected To Pay For All Copyrighted Content Used In Their Training Models: “Not Do-Able”; Deadline, July 23, 2025

Ted Johnson, Tom Tapp, Deadline; Donald Trump Says AI Companies Can’t Be Expected To Pay For All Copyrighted Content Used In Their Training Models: “Not Do-Able”

[Kip Currier: Don't be fooled by the flimflam rhetoric in Trump's AI Action Plan unveiled yesterday (July 23, 2025). Where Trump's AI Action Plan says “We must ensure that free speech flourishes in the era of AI and that AI procured by the Federal government objectively reflects truth rather than social engineering agendas", it's actually the exact opposite: the Trump plan is censorious and will "cancel out" truth (e.g. on climate science, misinformation and disinformation, etc.) in Orwellian fashion.]

[Excerpt]

"The plan is a contrast to Trump’s predecessor, Joe Biden, who focused on the government’s role in ensuring that the technology was safe.

The Trump White House plan also recommends updating federal procurement guidelines “to ensure that the government only contracts with frontier large language model (LLM) developers who ensure that their systems are objective and free from top-down ideological bias.” Also recommended is revising the National Institute of Standards and Technology AI Risk Management Framework to remove references to misinformation, DEI and climate change.

“We must ensure that free speech flourishes in the era of AI and that AI procured by the Federal government objectively reflects truth rather than social engineering agendas,” the plan says."

Wednesday, July 23, 2025

AI chatbots remain overconfident -- even when they’re wrong; EurekAlert!, July 22, 2025

CARNEGIE MELLON UNIVERSITY, EurekAlert!; AI chatbots remain overconfident -- even when they’re wrong

"Artificial intelligence chatbots are everywhere these days, from smartphone apps and customer service portals to online search engines. But what happens when these handy tools overestimate their own abilities?

Researchers asked both human participants and four large language models (LLMs) how confident they felt in their ability to answer trivia questions, predict the outcomes of NFL games or Academy Award ceremonies, or play a Pictionary-like image identification game. Both the people and the LLMs tended to be overconfident about how they would hypothetically perform. Interestingly, they also answered questions or identified images with relatively similar success rates.

However, when the participants and LLMs were asked retroactively how well they thought they did, only the humans appeared able to adjust expectations, according to a study published today in the journal Memory & Cognition.

“Say the people told us they were going to get 18 questions right, and they ended up getting 15 questions right. Typically, their estimate afterwards would be something like 16 correct answers,” said Trent Cash, who recently completed a joint Ph.D. at Carnegie Mellon University in the departments of Social Decision Science and Psychology. “So, they’d still be a little bit overconfident, but not as overconfident.”

“The LLMs did not do that,” said Cash, who was lead author of the study. “They tended, if anything, to get more overconfident, even when they didn’t do so well on the task.”

The world of AI is changing rapidly each day, which makes drawing general conclusions about its applications challenging, Cash acknowledged. However, one strength of the study was that the data was collected over the course of two years, which meant using continuously updated versions of the LLMs known as ChatGPT, Bard/Gemini, Sonnet and Haiku. This means that AI overconfidence was detectable across different models over time.

“When an AI says something that seems a bit fishy, users may not be as skeptical as they should be because the AI asserts the answer with confidence, even when that confidence is unwarranted,” said Danny Oppenheimer, a professor in CMU’s Department of Social and Decision Sciences and coauthor of the study."

Wednesday, July 2, 2025

Eminem, AI and me: why artists need new laws in the digital age; The Guardian, July 2, 2025

Alexander Hurst , The Guardian; Eminem, AI and me: why artists need new laws in the digital age

"Song lyrics, my publisher informs me, are subject to notoriously strict copyright enforcement and the cost to buy the rights is often astronomical. Fat chance as well, then, of me quoting Eminem to talk about how Lose Yourself seeped into the psyche of a generation when he rapped: “You only get one shot, do not miss your chance to blow, this opportunity comes once in a lifetime.”

Oh would it be different if I were an AI company with a large language model (LLM), though. I could scrape from the complete discography of the National and Eminem, and the lyrics of every other song ever written. Then, when a user prompted something like, “write a rap in the style of Eminem about losing money, and draw inspiration from the National’s Bloodbuzz Ohio”, my word correlation program – with hundreds of millions of paying customers and a market capitalisation worth tens if not hundreds of billions of dollars – could answer:

“I still owe money to the money to the money I owe,

But I spit gold out my throat when I flow,

So go tell the bank they can take what they like

I already gave my soul to the mic.”

And that, according to rulings last month by the US courts, is somehow “fair use” and is perplexingly not copyright infringement at all, despite no royalties having been paid to anyone in the process."

Tuesday, June 3, 2025

Artificial Intelligence—Promises and Perils for Humans’ Rights; Harvard Law School Human Rights Program, June 10, 2025 10:30 AM EDT

Harvard Law School Human Rights Program; Artificial Intelligence—Promises and Perils for Humans’ Rights

"In recent years, rapid advances in Artificial Intelligence (AI) technology, significantly accelerated by the development and deployment of deep learning and Large Language Models, have taken center stage in policy discussions and public consciousness. Amidst a public both intrigued and apprehensive about AI’s transformative potential across workplaces, families, and even broader political, economic, and geopolitical structures, a crucial conversation is emerging around its ethical, legal, and policy dimensions.

This webinar will convene a panel of prominent experts from diverse fields to delve into the critical implications of AI for humans and their rights. The discussion will broadly address the anticipated human rights harms stemming from AI’s increasing integration into society and explore potential responses to these challenges. A key focus will be on the role of international law and human rights law in addressing these harms, considering whether this legal framework can offer the appropriate tools for effective intervention."

Emerging Issues in the Use of Generative AI: Ethics, Sanctions, and Beyond; The Federalist Society, June 3, 2025 12 PM EDT

The Federalist Society; Emerging Issues in the Use of Generative AI: Ethics, Sanctions, and Beyond

"The idea of Artificial Intelligence has long presented potential challenges in the legal realm, and as AI tools become more broadly available and widely used, those potential hurdles are becoming ever more salient for lawyers in their day-to-day operations. Questions abound, from what potential risks of bias and error may exist in using an AI tool, to the challenges related to professional responsibility as traditionally understood, to the risks large language learning models pose to client confidentiality. Some contend that AI is a must-use, as it opens the door to faster, more efficient legal research that could equip lawyers to serve their clients more effectively. Others reject the use of AI, arguing that the risks of use and the work required to check the output it gives exceed its potential benefit.

Join us for a FedSoc Forum exploring the ethical and legal implications of artificial intelligence in the practice of law.

Featuring:

Laurin H. Mills, Member, Werther & Mills, LLC
Philip A. Sechler, Senior Counsel, Alliance Defending Freedom
Prof. Eugene Volokh, Gary T. Schwartz Distinguished Professor of Law Emeritus, UCLA School of Law; Thomas M. Siebel Senior Fellow, Hoover Institution, Stanford University
(Moderator) Hon. Brantley Starr, District Judge, United States District Court for the Northern District of Texas"

Wednesday, May 21, 2025

Dark LLMs: The Growing Threat of Unaligned AI Models; Cornell University, May 15, 2025

Michael Fire, Yitzhak Elbazis, Adi Wasenstein, Lior Rokach , Cornell University; Dark LLMs: The Growing Threat of Unaligned AI Models

"Large Language Models (LLMs) rapidly reshape modern life, advancing fields from healthcare to education and beyond. However, alongside their remarkable capabilities lies a significant threat: the susceptibility of these models to jailbreaking. The fundamental vulnerability of LLMs to jailbreak attacks stems from the very data they learn from. As long as this training data includes unfiltered, problematic, or 'dark' content, the models can inherently learn undesirable patterns or weaknesses that allow users to circumvent their intended safety controls. Our research identifies the growing threat posed by dark LLMs models deliberately designed without ethical guardrails or modified through jailbreak techniques. In our research, we uncovered a universal jailbreak attack that effectively compromises multiple state-of-the-art models, enabling them to answer almost any question and produce harmful outputs upon request. The main idea of our attack was published online over seven months ago. However, many of the tested LLMs were still vulnerable to this attack. Despite our responsible disclosure efforts, responses from major LLM providers were often inadequate, highlighting a concerning gap in industry practices regarding AI safety. As model training becomes more accessible and cheaper, and as open-source LLMs proliferate, the risk of widespread misuse escalates. Without decisive intervention, LLMs may continue democratizing access to dangerous knowledge, posing greater risks than anticipated."

Monday, March 24, 2025

How to tell when AI models infringe copyright; The Washington Post, March 24, 2024

Editorial Board, The Washington Post; How to tell when AI models infringe copyright

"Fair use has been a big part of AI companies’ defense. No matter how well a plaintiff manages to argue that a given AI model infringes copyright, the AI maker can usually point to the doctrine of fair use, which requires consideration of multiple factors, including the purpose of the use (here, criticism, comment and research are favored) and the effect of the use on the marketplace. If, in using a copied work, an AI model adds “something new,” it is probably in the clear."

Should AI be treated the same way as people are when it comes to copyright law? ; The Hill, March 24, 2025

NICHOLAS CREEL, The Hill ; Should AI be treated the same way as people are when it comes to copyright law?

"The New York Times’s lawsuit against OpenAI and Microsoft highlights an uncomfortable contradiction in how we view creativity and learning. While the Times accuses these companies of copyright infringement for training AI on their content, this ignores a fundamental truth: AI systems learn exactly as humans do, by absorbing, synthesizing and transforming existing knowledge into something new."

Monday, February 17, 2025

Copyright battles loom over artists and AI; Financial Times, February 16, 2025

louise.lucas@ft.com, Financial Times ; Copyright battles loom over artists and AI

"Artists are the latest creative industry to gripe about the exploitative nature of artificial intelligence. More than 3,000 have written to protest against plans by Christie’s to auction art created using AI."

Wednesday, January 15, 2025

'The New York Times' takes OpenAI to court. ChatGPT's future could be on the line; NPR, January 14, 2025

Bobby Allyn, NPR; 'The New York Times' takes OpenAI to court. ChatGPT's future could be on the line

"A group of news organizations, led by The New York Times, took ChatGPT maker OpenAI to federal court on Tuesday in a hearing that could determine whether the tech company has to face the publishers in a high-profile copyright infringement trial.

Three publishers' lawsuits against OpenAI and its financial backer Microsoft have been merged into one case. Leading each of the three combined cases are the Times, The New York Daily News and the Center for Investigative Reporting.

Other publishers, like the Associated Press, News Corp. and Vox Media, have reached content-sharing deals with OpenAI, but the three litigants in this case are taking the opposite path: going on the offensive."

Monday, January 6, 2025

OpenAI holds off on promise to creators, fails to protect intellectual property; The American Bazaar, January 3, 2025

Vishnu Kamal, The American Bazaar; OpenAI holds off on promise to creators, fails to protect intellectual property

"OpenAI may yet again be in hot water as it seems that the tech giant may be reneging on its earlier assurances. Reportedly, in May, OpenAI said it was developing a tool to let creators specify how they want their works to be included in—or excluded from—its AI training data. But seven months later, this feature has yet to see the light of day.

Called Media Manager, the tool would “identify copyrighted text, images, audio, and video,” OpenAI said at the time, to reflect creators’ preferences “across multiple sources.” It was intended to stave off some of the company’s fiercest critics, and potentially shield OpenAI from IP-related legal challenges...

OpenAI has faced various legal challenges related to its AI technologies and operations. One major issue involves the privacy and data usage of its language models, which are trained on large datasets that may include publicly available or copyrighted material. This raises concerns over privacy violations and intellectual property rights, especially regarding whether the data used for training was obtained with proper consent.

Additionally, there are questions about the ownership of content generated by OpenAI’s models. If an AI produces a work based on copyrighted data, it is tricky to determine who owns the rights—whether it’s OpenAI, the user who prompted the AI, or the creators of the original data.

Another concern is the liability for harmful content produced by AI. If an AI generates misleading or defamatory information, legal responsibility could fall on OpenAI."

Sunday, December 29, 2024

AI's assault on our intellectual property must be stopped; Financial Times, December 21, 2024

Kate Mosse, Financial Times; AI's assault on our intellectual property must be stopped

"Imagine my dismay, therefore, to discover that those 15 years of dreaming, researching, planning, writing, rewriting, editing, visiting libraries and archives, translating Occitan texts, hunting down original 13th-century documents, becoming an expert in Catharsis, apparently counts for nothing. Labyrinth is just one of several of my novels that have been scraped by Meta's large language model. This has been done without my consent, without remuneration, without even notification. This is theft...

AI companies present creators as being against change. We are not. Every artist I know is already engaging with AI in one way or another. But a distinction needs to be made between AI that can be used in brilliant ways -- for example, medical diagnosis -- and the foundations of AI models, where companies are essentially stealing creatives' work for their own profit. We should not forget that the AI companies rely on creators to build their models. Without strong copyright law that ensures creators can earn a living, AI companies will lack the high-quality material that is essential for their future growth."

Sunday, December 8, 2024

There’s No Longer Any Doubt That Hollywood Writing Is Powering AI; The Atlantic, November 18, 2024

Alex Reisner , The Atlantic; There’s No Longer Any Doubt That Hollywood Writing Is Powering AI

"Editor’s note: This analysis is part of The Atlantic’s investigation into the OpenSubtitles data set. You can access the search tool directly here. Find The Atlantic's search tool for books used to train AI here.

For as long as generative-AI chatbots have been on the internet, Hollywood writers have wondered if their work has been used to train them. The chatbots are remarkably fluent with movie references, and companies seem to be training them on all available sources. One screenwriter recently told me he’s seen generative AI reproduce close imitations of The Godfather and the 1980s TV show Alf, but he had no way to prove that a program had been trained on such material.

I can now say with absolute confidence that many AI systems have been trained on TV and film writers’ work. Not just on The Godfather and Alf, but on more than 53,000 other movies and 85,000 other TV episodes: Dialogue from all of it is included in an AI-training data set that has been used by Apple, Anthropic, Meta, Nvidia, Salesforce, Bloomberg, and other companies. I recently downloaded this data set, which I saw referenced in papers about the development of various large language models (or LLMs). It includes writing from every film nominated for Best Picture from 1950 to 2016, at least 616 episodes of The Simpsons, 170 episodes of Seinfeld, 45 episodes of Twin Peaks, and every episode of The Wire, The Sopranos, and Breaking Bad. It even includes prewritten “live” dialogue from Golden Globes and Academy Awards broadcasts."

The Copyrighted Material Being Used to Train AI; The Bulwark, December 7, 2024

SONNY BUNCH, The Bulwark; The Copyrighted Material Being Used to Train AI

"On this week’s episode, I talked to Alex Reisner about his pieces in the Atlantic highlighting the copyrighted material being hoovered into large language models to help AI chatbots simulate human speech. If you’re a screenwriter and would like to see which of your work has been appropriated to aid in the effort, click here; he has assembled a searchable database of nearly 140,000 movie and TV scripts that have been used without permission. (And you should read his other stories about copyright law reaching its breaking point and “the memorization problem.”) In this episode, we also got into the metaphysics of art and asked what sort of questions need to be asked as we hurtle toward the future. If you enjoyed this episode, please share it with a friend!"

Tuesday, November 5, 2024

The Heart of the Matter: Copyright, AI Training, and LLMs; SSRN, November 1, 2024

Daniel J. Gervais, Vanderbilt University - Law School

Noam Shemtov, Queen Mary University of London, Centre for Commercial Law Studies

Haralambos Marmanis, Copyright Clearance Center

Catherine Zaller Rowland, Copyright Clearance Center

SSRN; The Heart of the Matter: Copyright, AI Training, and LLMs

"Abstract

This article explores the intricate intersection of copyright law and large language models (LLMs), a cutting-edge artificial intelligence technology that has rapidly gained prominence. The authors provide a comprehensive analysis of the copyright implications arising from the training, fine-tuning, and use of LLMs, which often involve the ingestion of vast amounts of copyrighted material. The paper begins by elucidating the technical aspects of LLMs, including tokenization, word embeddings, and the various stages of LLM development. This technical foundation is crucial for understanding the subsequent legal analysis. The authors then delve into the copyright law aspects, examining potential infringement issues related to both inputs and outputs of LLMs. A comparative legal analysis is presented, focusing on the United States, European Union, United Kingdom, Japan, Singapore, and Switzerland. The article scrutinizes relevant copyright exceptions and limitations in these jurisdictions, including fair use in the US and text and data mining exceptions in the EU. The authors highlight the uncertainties and challenges in applying these legal concepts to LLMs, particularly in light of recent court decisions and legislative developments. The paper also addresses the potential impact of the EU's AI Act on copyright considerations, including its extraterritorial effects. Furthermore, it explores the concept of "making available" in the context of LLMs and its implications for copyright infringement. Recognizing the legal uncertainties and the need for a balanced approach that fosters both innovation and copyright protection, the authors propose licensing as a key solution. They advocate for a combination of direct and collective licensing models to provide a practical framework for the responsible use of copyrighted materials in AI systems.

This article offers valuable insights for legal scholars, policymakers, and industry professionals grappling with the copyright challenges posed by LLMs. It contributes to the ongoing dialogue on adapting copyright law to technological advancements while maintaining its fundamental purpose of incentivizing creativity and innovation."

Monday, November 4, 2024

What AI knows about you; Axios, November 4, 2024

Ina Friend, Axios; What AI knows about you

"Most AI builders don't say where they are getting the data they use to train their bots and models — but legally they're required to say what they are doing with their customers' data.

The big picture: These data-use disclosures open a window onto the otherwise opaque world of Big Tech's AI brain-food fight.

In this new Axios series, we'll tell you, company by company, what all the key players are saying and doing with your personal information and content.

Why it matters: You might be just fine knowing that picture you just posted on Instagram is helping train the next generative AI art engine. But you might not — or you might just want to be choosier about what you share.

Zoom out: AI makers need an incomprehensibly gigantic amount of raw data to train their large language and image models.

The industry's hunger has led to a data land grab: Companies are vying to teach their baby AIs using information sucked in from many different sources — sometimes with the owner's permission, often without it — before new laws and court rulings make that harder.

Zoom in: Each Big Tech giant is building generative AI models, and many of them are using their customer data, in part, to train them.

In some cases it's opt-in, meaning your data won't be used unless you agree to it. In other cases it is opt-out, meaning your information will automatically get used unless you explicitly say no.
These rules can vary by region, thanks to legal differences. For instance, Meta's Facebook and Instagram are "opt-out" — but you can only opt out if you live in Europe or Brazil.
In the U.S., California's data privacy law is among the laws responsible for requiring firms to say what they do with user data. In the EU, it's the GDPR."