Showing posts with label AI training data.

Friday, July 25, 2025

Trump’s Comments Undermine AI Action Plan, Threaten Copyright; Publishers Weekly, July 23, 2025

Ed Nawotka, Publishers Weekly; Trump’s Comments Undermine AI Action Plan, Threaten Copyright

"Senate bill proposes 'opt-in' legislation

Trump's comments come on the heels of the introduction, by U.S. senators Josh Hawley (R-Mo.) and Richard Blumenthal (D-Conn.), of the AI Accountability and Personal Data Protection Act this past Monday following a hearing last week on AI companies' copyright infringement. The bipartisan legislation aims to hold AI firms liable for using copyrighted works or personal data without acquiring explicit consent to train AI models. It would empower individuals—including writers, artists, and content creators—to sue companies in federal court if their data or copyrighted works are used without consent. It also supports class action lawsuits and advocates for violators to pay robust penalties.

"AI companies are robbing the American people blind while leaving artists, writers, and other creators with zero recourse," said Hawley. "It’s time for Congress to give the American worker their day in court to protect their personal data and creative works. My bipartisan legislation would finally empower working Americans who now find their livelihoods in the crosshairs of Big Tech’s lawlessness."

"This bill embodies a bipartisan consensus that AI safeguards are urgent—because the technology is moving at accelerating speed, and so are dangers to privacy," added Blumenthal. "Enforceable rules can put consumers back in control of their data, and help bar abuses. Tech companies must be held accountable—and liable legally—when they breach consumer privacy, collecting, monetizing or sharing personal information without express consent. Consumers must be given rights and remedies—and legal tools to make them real—not relying on government enforcement alone."

Thursday, July 24, 2025

Donald Trump Is Fairy-Godmothering AI; The Atlantic, July 23, 2025

Matteo Wong, The Atlantic; Donald Trump Is Fairy-Godmothering AI

"In a sense, the action plan is a bet. AI is already changing a number of industries, including software engineering, and a number of scientific disciplines. Should AI end up producing incredible prosperity and new scientific discoveries, then the AI Action Plan may well get America there faster simply by removing any roadblocks and regulations, however sensible, that would slow the companies down. But should the technology prove to be a bubble—AI products remain error-prone, extremely expensive to build, and unproven in many business applications—the Trump administration is more rapidly pushing us toward the bust. Either way, the nation is in Silicon Valley’s hands...

Once the red tape is gone, the Trump administration wants to create a “dynamic, ‘try-first’ culture for AI across American industry.” In other words, build and test out AI products first, and then determine if those products are actually helpful—or if they pose any risks.

Trump gestured toward other concessions to the AI industry in his speech. He specifically targeted intellectual-property laws, arguing that training AI models on copyrighted books and articles does not infringe upon copyright because the chatbots, like people, are simply learning from the content. This has been a major conflict in recent years, with more than 40 related lawsuits filed against AI companies since 2022. (The Atlantic is suing the AI company Cohere, for example.) If courts were to decide that training AI models with copyrighted material is against the law, it would be a major setback for AI companies. In their official recommendations for the AI Action Plan, OpenAI, Microsoft, and Google all requested a copyright exception, known as “fair use,” for AI training. Based on his statements, Trump appears to strongly agree with this position, although the AI Action Plan itself does not reference copyright and AI training.

Also sprinkled throughout the AI Action Plan are gestures toward some MAGA priorities. Notably, the policy states that the government will contract with only AI companies whose models are “free from top-down ideological bias”—a reference to Sacks’s crusade against “woke” AI—and that a federal AI-risk-management framework should “eliminate references to misinformation, Diversity, Equity, and Inclusion, and climate change.” Trump signed a third executive order today that, in his words, will eliminate “woke, Marxist lunacy” from AI models...

Looming over the White House’s AI agenda is the threat of Chinese technology getting ahead. The AI Action Plan repeatedly references the importance of staying ahead of Chinese AI firms, as did the president’s speech: “We will not allow any foreign nation to beat us; our nation will not live in a planet controlled by the algorithms of the adversaries,” Trump declared...

But whatever happens on the international stage, hundreds of millions of Americans will feel more and more of generative AI’s influence—on salaries and schools, air quality and electricity costs, federal services and doctor’s offices. AI companies have been granted a good chunk of their wish list; if anything, the industry is being told that it’s not moving fast enough. Silicon Valley has been given permission to accelerate, and we’re all along for the ride."

Donald Trump Says AI Companies Can’t Be Expected To Pay For All Copyrighted Content Used In Their Training Models: “Not Do-Able”; Deadline, July 23, 2025

Ted Johnson and Tom Tapp, Deadline; Donald Trump Says AI Companies Can’t Be Expected To Pay For All Copyrighted Content Used In Their Training Models: “Not Do-Able”

 

[Kip Currier: Don't be fooled by the flimflam rhetoric in Trump's AI Action Plan, unveiled yesterday (July 23, 2025). Where the plan says “We must ensure that free speech flourishes in the era of AI and that AI procured by the Federal government objectively reflects truth rather than social engineering agendas,” it is actually the exact opposite: the Trump plan is censorious and will "cancel out" truth (e.g., on climate science, misinformation, and disinformation) in Orwellian fashion.]


[Excerpt]

"The plan is a contrast to Trump’s predecessor, Joe Biden, who focused on the government’s role in ensuring that the technology was safe.

The Trump White House plan also recommends updating federal procurement guidelines “to ensure that the government only contracts with frontier large language model (LLM) developers who ensure that their systems are objective and free from top-down ideological bias.” Also recommended is revising the National Institute of Standards and Technology AI Risk Management Framework to remove references to misinformation, DEI and climate change.

“We must ensure that free speech flourishes in the era of AI and that AI procured by the Federal government objectively reflects truth rather than social engineering agendas,” the plan says."

Wednesday, July 23, 2025

Trump derides copyright and state rules in AI Action Plan launch; Politico, July 23, 2025

Mohar Chatterjee, Politico; Trump derides copyright and state rules in AI Action Plan launch

"President Donald Trump criticized copyright enforcement efforts and state-level AI regulations Wednesday as he launched the White House’s AI Action Plan on a mission to dominate the industry.

In remarks delivered at a “Winning the AI Race” summit hosted by the All-In Podcast and the Hill and Valley Forum in Washington, Trump said stringent copyright enforcement was unrealistic for the AI industry and would kneecap U.S. companies trying to compete globally, particularly against China.

“You can’t be expected to have a successful AI program when every single article, book or anything else that you’ve read or studied, you’re supposed to pay for,” he said. “You just can’t do it because it’s not doable. ... China’s not doing it.”

Trump’s comments were a riff as his 28-page AI Action Plan did not wade into copyright and administration officials told reporters the issue should be left to the courts to decide.

Trump also signed three executive orders. One will fast track federal permitting, streamline reviews and “do everything possible to expedite construction of all major AI infrastructure projects,” Trump said. Another expands American exports of AI hardware and software. A third order bans the federal government from procuring AI technology “that has been infused with partisan bias or ideological agendas,” as Trump put it...

Trump echoed tech companies’ complaints about state AI laws creating a patchwork of regulation. “You can’t have one state holding you up,” he said. “We need one common sense federal standard that supersedes all states, supersedes everybody.”"

Wave of copyright lawsuits hit AI companies like Cambridge-based Suno; WBUR, July 23, 2025

 

 WBUR; Wave of copyright lawsuits hit AI companies like Cambridge-based Suno

"Suno, a Cambridge company that generates AI music, faces multiple lawsuits alleging it illegally trained its model on copyrighted work. Peter Karol of Suffolk Law School and Bhamati Viswanathan of Columbia University Law School's Kernochan Center for Law, Media, and the Arts join WBUR's Morning Edition to explain how the suits against Suno fit into a broader legal battle over the future of creative work.

This segment aired on July 23, 2025. Audio will be available soon."

Tuesday, July 22, 2025

Commentary: A win-win-win path for AI in America; The Post & Courier, July 22, 2025

Keith Kupferschmid, The Post & Courier; Commentary: A win-win-win path for AI in America

"Contrary to claims that these AI training deals are impossible to make at scale, a robust free market is already emerging in which hundreds (if not thousands) of licensed deals between AI companies and copyright owners have been reached. New research shows it is possible to create fully licensed data sets for AI.

No wonder one federal judge recently called claims that licensing is impractical “ridiculous,” given the billions at stake: “If using copyrighted works to train the models is as necessary as the companies say, they will figure out a way to compensate copyright holders.” Just like AI companies don’t dispute that they have to pay for energy, infrastructure, coding teams and the other inputs their operations require, they need to pay for creative works as well.

America’s example to the world is a free-market economy based on the rule of law, property rights and freedom to contract — so, let the market innovate solutions to these new (but not so new) licensing challenges. Let’s construct a pro-innovation, pro-worker approach that replaces the false choice of the AI alarmists with a positive, pro-America pathway to leadership on AI."

Senators Introduce Bill To Restrict AI Companies’ Unauthorized Use Of Copyrighted Works For Training Models; Deadline, July 21, 2025

Ted Johnson , Deadline; Senators Introduce Bill To Restrict AI Companies’ Unauthorized Use Of Copyrighted Works For Training Models

"Sen. Josh Hawley (R-MO) and Sen. Richard Blumenthal (D-CT) introduced legislation on Monday that would restrict AI companies from using copyrighted material in their training models without the consent of the individual owner.

The AI Accountability and Personal Data Protection Act also would allow individuals to sue companies that use their personal data or copyrighted works without their “express, prior consent.”

The bill addresses a raging debate between tech and content owners, one that has already led to extensive litigation. Companies like OpenAI have argued that the use of copyrighted materials in training models is a fair use, while figures including John Grisham and George R.R. Martin have challenged that notion."

Sunday, July 20, 2025

AI guzzled millions of books without permission. Authors are fighting back.; The Washington Post, July 19, 2025

The Washington Post; AI guzzled millions of books without permission. Authors are fighting back.


[Kip Currier: I've written this before on this blog and I'll say it again: technology companies would never allow anyone to freely vacuum up their content and use it without permission or compensation. Period. Full Stop.]


[Excerpt]

"Baldacci is among a group of authors suing OpenAI and Microsoft over the companies’ use of their work to train the AI software behind tools such as ChatGPT and Copilot without permission or payment — one of more than 40 lawsuits against AI companies advancing through the nation’s courts. He and other authors this week appealed to Congress for help standing up to what they see as an assault by Big Tech on their profession and the soul of literature.

They found sympathetic ears at a Senate subcommittee hearing Wednesday, where lawmakers expressed outrage at the technology industry’s practices. Their cause gained further momentum Thursday when a federal judge granted class-action status to another group of authors who allege that the AI firm Anthropic pirated their books.

“I see it as one of the moral issues of our time with respect to technology,” Ralph Eubanks, an author and University of Mississippi professor who is president of the Authors Guild, said in a phone interview. “Sometimes it keeps me up at night.”

Lawsuits have revealed that some AI companies had used legally dubious “torrent” sites to download millions of digitized books without having to pay for them."

Judge Rules Class Action Suit Against Anthropic Can Proceed; Publishers Weekly, July 18, 2025

Jim Milliot , Publishers Weekly; Judge Rules Class Action Suit Against Anthropic Can Proceed

"In a major victory for authors, U.S. District Judge William Alsup ruled July 17 that three writers suing Anthropic for copyright infringement can represent all other authors whose books the AI company allegedly pirated to train its AI model as part of a class action lawsuit.

In late June, Alsup, of the Northern District of California, ruled in Bartz v. Anthropic that the AI company's training of its Claude LLMs on authors' works was "exceedingly transformative," and therefore protected by fair use. However, Alsup also determined that the company's practice of downloading pirated books from sites including Books3, Library Genesis, and Pirate Library Mirror (PiLiMi) to build a permanent digital library was not covered by fair use.

Alsup’s most recent ruling follows an amended complaint from the authors looking to certify classes of copyright owners in a “Pirated Books Class” and in a “Scanned Books Class.” In his decision, Alsup certified only a LibGen and PiLiMi Pirated Books Class, writing that “this class is limited to actual or beneficial owners of timely registered copyrights in ISBN/ASIN-bearing books downloaded by Anthropic from these two pirate libraries.”

Alsup stressed that “the class is not limited to authors or author-like entities,” explaining that “a key point is to cover everyone who owns the specific copyright interest in play, the right to make copies, either as the actual or as the beneficial owner.” Later in his decision, Alsup makes it clear who is covered by the ruling: “A beneficial owner...is someone like an author who receives royalties from any publisher’s revenues or recoveries from the right to make copies. Yes, the legal owner might be the publisher but the author has a definite stake in the royalties, so the author has standing to sue. And, each stands to benefit from the copyright enforcement at the core of our case however they then divide the benefit.”"

US authors suing Anthropic can band together in copyright class action, judge rules; Reuters, July 17, 2025

Reuters; US authors suing Anthropic can band together in copyright class action, judge rules

"A California federal judge ruled on Thursday that three authors suing artificial intelligence startup Anthropic for copyright infringement can represent writers nationwide whose books Anthropic allegedly pirated to train its AI system.

U.S. District Judge William Alsup said the authors can bring a class action on behalf of all U.S. writers whose works Anthropic allegedly downloaded from "pirate libraries" LibGen and PiLiMi to create a repository of millions of books in 2021 and 2022."

Thursday, July 17, 2025

The Art (and Legality) of Imitation: Navigating the Murky Waters of Fair Use in AI Training; The National Law Review, July 16, 2025

Sarah C. Reasoner, Ashley N. Higginson, Anita C. Marinelli, and Kimberly A. Berger of Miller Canfield, The National Law Review; The Art (and Legality) of Imitation: Navigating the Murky Waters of Fair Use in AI Training

"The legal landscape for artificial intelligence is still developing, and no outcome can yet be predicted with any sort of accuracy. While some courts appear poised to accept AI model training as transformative, other courts do not. As AI technology continues to advance, the legal system must adapt to address the unique challenges it presents. Meanwhile, businesses and creators navigating this uncertain terrain should stay informed about legal developments and consider proactive measures to mitigate risks. As we await further rulings and potential legislative action, one thing is clear: the conversation around AI and existing intellectual property protection is just beginning."

Wednesday, July 16, 2025

Can Gen AI and Copyright Coexist?; Harvard Business Review, July 16, 2025

Harvard Business Review; Can Gen AI and Copyright Coexist?

"We’re experts in the study of digital transformation and have given this issue a lot of thought. We recently served, for example, on a roundtable of 10 economists convened by the U.S. Copyright Office to study the implications of gen AI on copyright policy. We recognize that the two decisions are far from the last word on this topic; both will no doubt be appealed to the Ninth Circuit and then subsequently to the Supreme Court. But in the meantime, we believe there are already many lessons to be learned from these decisions about the implications of gen AI for business—lessons that will be useful for leaders in both the creative industries and gen AI companies."

Wednesday, July 9, 2025

Viewpoint: Don’t let America’s copyright crackdown hand China global AI leadership; Grand Forks Herald, July 5, 2025

Kent Conrad and Saxby Chambliss, Grand Forks Herald; Viewpoint: Don’t let America’s copyright crackdown hand China global AI leadership


[Kip Currier: The assertion by anti-AI regulation proponents, like the former U.S. congressional authors of this think-piece, that requiring AI tech companies to secure permission and pay for AI training data will kill or hobble U.S. AI entrepreneurship is hyperbolic catastrophizing. AI tech companies can license training data from creators who are willing to participate in licensing frameworks. Such frameworks already exist for music copyrights, for example. AI tech companies just don't want to pay for something if they can get it for free.

AI tech companies would never permit users to scrape up, package, and sell their IP content for free. Copyright holders shouldn't be held to a different standard and be required to let tech companies monetize their IP-protected works without permission and compensation.]

[Excerpt]

"If these lawsuits succeed, or if Congress radically rewrites the law, it will become nearly impossible for startups, universities or mid-size firms to develop competitive AI tools."

Why the new rulings on AI copyright might actually be good news for publishers; Fast Company, July 9, 2025

Pete Pachal, Fast Company; Why the new rulings on AI copyright might actually be good news for publishers

"The outcomes of both cases were more mixed than the headlines suggest, and they are also deeply instructive. Far from closing the door on copyright holders, they point to places where litigants might find a key...

Taken together, the three cases point to a clearer path forward for publishers building copyright cases against Big AI:

Focus on outputs instead of inputs: It’s not enough that someone hoovered up your work. To build a solid case, you need to show that what the AI company did with it reproduced it in some form. So far, no court has definitively decided whether AI outputs are meaningfully different enough to count as “transformative” in the eyes of copyright law, but it should be noted that courts have ruled in the past that copyright violation can occur even when small parts of the work are copied—if those parts represent the “heart” of the original.

Show market harm: This looks increasingly like the main battle. Now that we have a lot of data on how AI search engines and chatbots—which, to be clear, are outputs—are affecting the online behavior of news consumers, the case that an AI service harms the media market is easier to make than it was a year ago. In addition, the emergence of licensing deals between publishers and AI companies is evidence that there’s market harm by creating outputs without offering such a deal.

Question source legitimacy: Was the content legally acquired or pirated? The Anthropic case opens this up as a possible attack vector for publishers. If they can prove scraping occurred through paywalls—without subscribing first—that could be a violation even absent any outputs."

Saturday, July 5, 2025

Two Courts Rule On Generative AI and Fair Use — One Gets It Right; Electronic Frontier Foundation (EFF), June 26, 2025

Tori Noble, Electronic Frontier Foundation (EFF); Two Courts Rule On Generative AI and Fair Use — One Gets It Right

 "Gen-AI is spurring the kind of tech panics we’ve seen before; then, as now, thoughtful fair use opinions helped ensure that copyright law served innovation and creativity. Gen-AI does raise a host of other serious concerns about fair labor practices and misinformation, but copyright wasn’t designed to address those problems. Trying to force copyright law to play those roles only hurts important and legal uses of this technology.

In keeping with that tradition, courts deciding fair use in other AI copyright cases should look to Bartz, not Kadrey."

Thursday, July 3, 2025

Cloudflare Sidesteps Copyright Issues, Blocking AI Scrapers By Default; Forbes, July 2, 2025

Emma Woollacott, Forbes; Cloudflare Sidesteps Copyright Issues, Blocking AI Scrapers By Default

"IT service management company Cloudflare is striking back on behalf of content creators, blocking AI scrapers by default.

Web scrapers are bots that crawl the internet, collecting and cataloguing content of all types, and are used by AI firms to collect material that can be used to train their models.

Now, though, Cloudflare is allowing website owners to choose if they want AI crawlers to access their content, and decide how the AI companies can use it. They can opt to allow crawlers for certain purposes—search, for example—but block others. AI companies will have to obtain explicit permission from a website before scraping."
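[For readers curious about the mechanics behind "blocking AI scrapers," the short Python sketch below illustrates the general opt-out pattern the article describes: a site publishes per-crawler rules, and a compliant crawler checks them before fetching a page. This is a generic, hypothetical illustration using the standard-library robots.txt parser, not Cloudflare's actual product; the example rules and URL are assumptions for demonstration only, though GPTBot, ClaudeBot, and CCBot are real, publicly documented AI crawler user agents.]

```python
# Generic illustration (not Cloudflare's implementation): how a site can
# signal that search crawlers are welcome while AI training crawlers are
# not, and how a compliant bot would check that signal before crawling.
from urllib import robotparser

# Hypothetical robots.txt: allow a search crawler, disallow known AI
# training crawlers (GPTBot, ClaudeBot, CCBot).
EXAMPLE_ROBOTS_TXT = """
User-agent: Googlebot
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /
""".strip().splitlines()

parser = robotparser.RobotFileParser()
parser.parse(EXAMPLE_ROBOTS_TXT)

# A compliant crawler asks permission for a specific URL before fetching it.
for bot in ("Googlebot", "GPTBot", "ClaudeBot", "CCBot"):
    allowed = parser.can_fetch(bot, "https://example.com/articles/some-story")
    print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

[Note that this robots.txt mechanism is only a voluntary signal; Cloudflare's default blocking goes further by refusing requests from identified AI crawlers at the network edge, whether or not the bot chooses to honor such rules.]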

Wednesday, July 2, 2025

Fair Use or Foul Play? The AI Fair Use Copyright Line; The National Law Review, July 2, 2025

Jodi Benassi of McDermott Will & Emery, The National Law Review; Fair Use or Foul Play? The AI Fair Use Copyright Line

"Practice note: This is the first federal court decision analyzing the defense of fair use of copyrighted material to train generative AI. Two days after this decision issued, another Northern District of California judge ruled in Kadrey et al. v. Meta Platforms Inc. et al., Case No. 3:23-cv-03417, and concluded that the AI technology at issue in his case was transformative. However, the basis for his ruling in favor of Meta on the question of fair use was not transformation, but the plaintiffs’ failure “to present meaningful evidence that Meta’s use of their works to create [a generative AI engine] impacted the market” for the books."

Eminem, AI and me: why artists need new laws in the digital age; The Guardian, July 2, 2025

The Guardian; Eminem, AI and me: why artists need new laws in the digital age

"Song lyrics, my publisher informs me, are subject to notoriously strict copyright enforcement and the cost to buy the rights is often astronomical. Fat chance as well, then, of me quoting Eminem to talk about how Lose Yourself seeped into the psyche of a generation when he rapped: “You only get one shot, do not miss your chance to blow, this opportunity comes once in a lifetime.”

Oh would it be different if I were an AI company with a large language model (LLM), though. I could scrape from the complete discography of the National and Eminem, and the lyrics of every other song ever written. Then, when a user prompted something like, “write a rap in the style of Eminem about losing money, and draw inspiration from the National’s Bloodbuzz Ohio”, my word correlation program – with hundreds of millions of paying customers and a market capitalisation worth tens if not hundreds of billions of dollars – could answer:

“I still owe money to the money to the money I owe,

But I spit gold out my throat when I flow,

So go tell the bank they can take what they like

I already gave my soul to the mic.”

And that, according to rulings last month by the US courts, is somehow “fair use” and is perplexingly not copyright infringement at all, despite no royalties having been paid to anyone in the process."

Tuesday, July 1, 2025

The Court Battles That Will Decide if Silicon Valley Can Plunder Your Work; Slate, June 30, 2025

Slate; The Court Battles That Will Decide if Silicon Valley Can Plunder Your Work

"Last week, two different federal judges in the Northern District of California made legal rulings that attempt to resolve one of the knottiest debates in the artificial intelligence world: whether it’s a copyright violation for Big Tech firms to use published books for training generative bots like ChatGPT. Unfortunately for the many authors who’ve brought lawsuits with this argument, neither decision favors their case—at least, not for now. And that means creators in all fields may not be able to stop A.I. companies from using their work however they please...

What if these copyright battles are also lost? Then there will be little in the way of stopping A.I. startups from utilizing all creative works for their own purposes, with no consideration as to the artists and writers who actually put in the work. And we will have a world blessed less with human creativity than one overrun by second-rate slop that crushes the careers of the people whose imaginations made that A.I. so potent to begin with."

Hollywood Confronts AI Copyright Chaos in Washington, Courts; The Wall Street Journal, July 1, 2025

Amrith Ramkumar and Jessica Toonkel, The Wall Street Journal; Hollywood Confronts AI Copyright Chaos in Washington, Courts

Technology firms say using copyrighted materials to train AI models is key to America’s success; creatives want their work protected