Showing posts with label AI training data.

Monday, December 22, 2025

OpenAI, Anthropic, xAI Hit With Copyright Suit from Writers; Bloomberg Law, December 22, 2025

 Annelise Levy, Bloomberg Law; OpenAI, Anthropic, xAI Hit With Copyright Suit from Writers

"Writers including Pulitzer Prize-winning journalist John Carreyrou filed a copyright lawsuit accusing six AI giants of using pirated copies of their books to train large language models.

The complaint, filed Monday in the US District Court for the Northern District of California, claims Anthropic PBC, Google LLC, OpenAI Inc., Meta Platforms Inc., xAI Corp., and Perplexity AI Inc. committed a “deliberate act of theft.”

It is the first copyright lawsuit against xAI over its training process, and the first suit brought by authors against Perplexity...

Carreyrou is among the authors who opted out of a $1.5 billion class-action settlement with Anthropic."

Sunday, December 21, 2025

Launch, Train, Settle: How Suno And Udio’s Licensing Deals Made Copyright Infringement Profitable; Forbes, December 18, 2025

Virginie Berger, Forbes; Launch, Train, Settle: How Suno And Udio’s Licensing Deals Made Copyright Infringement Profitable

"The Precedent That Pays

Perhaps most concerning is what these partial settlements teach other AI companies: copyright infringement can be a viable business strategy, as long as you only have to answer to those with the resources to sue.

The calculus is straightforward. Build your product using copyrighted material without permission. Grow quickly while competitors who might try to license properly struggle with costs and complexity.

If you get big enough, those with sufficient resources will eventually sue. At that point, negotiate from strength because your technology is already deployed, your users are already dependent on it, and dismantling what you've built would be costly.

The worst case isn't court-ordered damages or shutdown anymore but will be a licensing deal where you finally pay something. But far less than you would have paid to license properly from the start, and only to the major players who could force you to the table. And you keep operating with legitimacy.

Both Suno and Udio can now market themselves as "responsibly licensed" platforms, pointing to their deals with major labels as proof of legitimacy. The narrative shifts from "they stole content to build this" to "they're innovative partners in the future of music.""

Australian culture, resources and democracy for $4,300 a year? Thanks for the offer, tech bros, but no thanks; The Guardian, December 15, 2025

The Guardian; Australian culture, resources and democracy for $4,300 a year? Thanks for the offer, tech bros, but no thanks

"According to the Tech Council, AI will deliver $115bn in annual productivity (or about $4,300 per person), rubbery figures generated by industry-commissioned research based on estimates on hours saved with no regard for jobs lost, the distribution of the promised dividend benefit or how the profits will flow.

In return for this ill-defined bounty, Farquhar says our government will need to allow the tech industry to do three things: build a data and text mining exemption to copyright law, rapidly scale data centre infrastructure and allow foreign companies to use these centres without regard for local laws. This is a proposition that demands closer scrutiny.

The use of copyrighted content to train AI has been a burning issue since 2023 when a massive data dredge saw more than 190,000 authors (including me) have our works plundered without our consent to train AI. Musicians and artists too have had their work scraped and repurposed.

This theft has been critical in training the large language models to portray something approaching empathy. It has also allowed paid users to take this stolen content and ape creators, devaluing and diminishing their work in the process. Nick Cave has described this as “replication as travesty”, noting “songs arise out of suffering … data doesn’t suffer. ChatGPT has no inner being, it has been nowhere, it has endured nothing.”

The sense of grievance among creators over the erasure of culture is wide and deep. A wave of creators from Peter Garrett to Tina Arena, Anna Funder and Trent Dalton have determined this is the moment to take a stand.

It is not just the performers; journalists, academics, voiceover and visual artists are all being replaced by shittier but cheaper automated products built on the theft of their labour, undermining the integrity of their work and ultimately taking their jobs.

Like fossil fuels, what is being extracted and consumed is the sum of our accumulated history. It goes from metaphor to literal when it comes to the second plank of Farquhar’s pitch: massive spending on industrial infrastructure to accommodate AI.

This imperative to power AI is the justification used by Donald Trump to recharge the mining of fossil fuels, while the industry is beating the “modular nuclear” drum for a cleaner AI revolution. Meanwhile, the OpenAI CEO, Sam Altman, is reassuring us that we don’t need to stress because AI will solve climate change anyway!

The third and final element of Farquhar’s pitch is probably its most revealing. If Australia wants to build this AI nirvana, foreign nations should be given diplomatic immunity for the data centres built and operated here. This quaint notion of the “data embassy” overriding national sovereignty reinforces a growing sense that the tech sector is moving beyond the idea of the nation state governing corporations to that of a modern imperial power.

That’s the premise of Karen Hao’s book The Empire of AI, which chronicles the rise of OpenAI and the choices it made to trade off safety and the public good in pursuit of scale and profit."

Proposal to allow use of Australian copyrighted material to train AI abandoned after backlash; The Guardian, December 19, 2025

The Guardian; Proposal to allow use of Australian copyrighted material to train AI abandoned after backlash

"The Productivity Commission has abandoned a proposal to allow tech companies to mine copyrighted material to train artificial intelligence models, after a fierce backlash from the creative industries.

Instead, the government’s top economic advisory body recommended the government wait three years before deciding whether to establish an independent review of Australian copyright settings and the impact of the disruptive new technology...

In its interim report on the digital economy, the commission floated the idea of granting a “fair dealing” exemption to copyright rules that would allow AI companies to mine data and text to develop their large language models...

The furious response from creative industries to the commission’s idea included music industry bodies saying it would “legitimise digital piracy under guise of productivity”."

Monday, December 15, 2025

Government's AI consultation finds just 3% support copyright exception; The Bookseller, December 15, 2025

Maia Snow, The Bookseller; Government's AI consultation finds just 3% support copyright exception

"The initial results of the consultation found that the majority of respondents (88%) backed licences being required in all cases where data was being used for AI training. Just 3% of respondents supported the government’s preferred options, which would allow data mining by AI companies and require rights holders to opt-out."

Sunday, December 14, 2025

The Disney-OpenAI tie-up has huge implications for intellectual property; Fast Company, December 11, 2025

Chris Stokel-Walker, Fast Company; The Disney-OpenAI tie-up has huge implications for intellectual property

"Walt Disney and OpenAI make for very odd bedfellows: The former is one of the most-recognized brands among children under the age of 18. The near-$200 billion company’s value has been derived from more than a century of aggressive safeguarding of its intellectual property and keeping the magic alive among innocent children.

OpenAI, which celebrated its first decade of existence this week, is best known for upending creativity, the economy, and society with its flagship product, ChatGPT. And in the last two months, it has said it wants to get to a place where its adult users can use its tech to create erotica.

So what the hell should we make of a just-announced deal between the two that will allow ChatGPT and Sora users to create images and videos of more than 200 characters, from Mickey and Minnie Mouse to the Mandalorian, starting from early 2026?"


Saturday, December 13, 2025

Authors Ask to Update Meta AI Copyright Suit With Torrent Claim; Bloomberg Law, December 12, 2025

Bloomberg Law; Authors Ask to Update Meta AI Copyright Suit With Torrent Claim

"Authors in a putative class action copyright suit against Meta Platforms Inc. asked a federal judge for permission to amend their complaint to add a claim over Meta’s use of peer-to-peer file-sharing unveiled in discovery."

Thursday, December 11, 2025

Disney says Google AI infringes copyright “on a massive scale”; Ars Technica, December 11, 2025

Ryan Whitwam, Ars Technica; Disney says Google AI infringes copyright “on a massive scale”

"Disney has sent a cease and desist to Google, alleging the company’s AI tools are infringing Disney’s copyrights “on a massive scale.”

According to the letter, Google is violating the entertainment conglomerate’s intellectual property in multiple ways. The legal notice says Google has copied a “large corpus” of Disney’s works to train its gen AI models, which is believable, as Google’s image and video models will happily produce popular Disney characters—they couldn’t do that without feeding the models lots of Disney data.

The C&D also takes issue with Google for distributing “copies of its protected works” to consumers."

Has Cambridge-based AI music upstart Suno 'gone legit'?; WBUR, December 11, 2025

WBUR; Has Cambridge-based AI music upstart Suno 'gone legit'?

"The Cambridge-based AI music company Suno, which has been besieged by lawsuits from record labels, is now teaming up with behemoth label Warner Music. Under a new partnership, Warner will license music in its catalogue for use by Suno's AI.

Copyright law experts Peter Karol and Bhamati Viswanathan join WBUR's Morning Edition to discuss what the deal between Suno and Warner Music means for the future of intellectual property."

Wednesday, December 10, 2025

EU investigates Google over AI-generated summaries in search results; BBC, December 8, 2025

Liv McMahon, BBC; EU investigates Google over AI-generated summaries in search results

"The Commission's investigation comes down to whether Google has used the work of other people published online to build its own AI tools which it can profit from."

AI firms began to feel the legal wrath of copyright holders in 2025; NewScientist, December 10, 2025

Chris Stokel-Walker, NewScientist; AI firms began to feel the legal wrath of copyright holders in 2025

"The three years since the release of ChatGPT, OpenAI’s generative AI chatbot, have seen huge changes in every part of our lives. But one area that hasn’t changed – or at least, is still trying to maintain pre-AI norms – is the upholding of copyright law.

It is no secret that leading AI firms built their models by hoovering up data, including copyrighted material, from the internet without asking for permission first. This year, major copyright holders struck back, buffeting AI companies with a range of lawsuits alleging copyright infringement."

Saturday, December 6, 2025

The New York Times sues Perplexity for producing ‘verbatim’ copies of its work; The Verge, December 5, 2025

Emma Roth, The Verge; The New York Times sues Perplexity for producing ‘verbatim’ copies of its work

"The New York Times has escalated its legal battle against the AI startup Perplexity, as it’s now suing the AI “answer engine” for allegedly producing and profiting from responses that are “verbatim or substantially similar copies” of the publication’s work.

The lawsuit, filed in a New York federal court on Friday, claims Perplexity “unlawfully crawls, scrapes, copies, and distributes” content from the NYT. It comes after the outlet’s repeated demands for Perplexity to stop using content from its website, as the NYT sent cease-and-desist notices to the AI startup last year and most recently in July, according to the lawsuit. The Chicago Tribune also filed a copyright lawsuit against Perplexity on Thursday."

Friday, December 5, 2025

The New York Times is suing Perplexity for copyright infringement; TechCrunch, December 5, 2025

Rebecca Bellan, TechCrunch; The New York Times is suing Perplexity for copyright infringement

"The New York Times filed suit Friday against AI search startup Perplexity for copyright infringement, its second lawsuit against an AI company. The Times joins several media outlets suing Perplexity, including the Chicago Tribune, which also filed suit this week."

Thursday, December 4, 2025

OpenAI loses fight to keep ChatGPT logs secret in copyright case; Reuters, December 3, 2025

Reuters; OpenAI loses fight to keep ChatGPT logs secret in copyright case

"OpenAI must produce millions of anonymized chat logs from ChatGPT users in its high-stakes copyright dispute with the New York Times and other news outlets, a federal judge in Manhattan ruled.

U.S. Magistrate Judge Ona Wang in a decision made public on Wednesday said that the 20 million logs were relevant to the outlets' claims and that handing them over would not risk violating users' privacy."

Lawsuit or License?; Columbia Journalism Review, December 4, 2025

Columbia Journalism Review; Lawsuit or License?

"Today, the Tow Center for Digital Journalism is releasing a tracker that monitors developments between news publishers and AI companies—including lawsuits, deals, and grants—based on publicly available information."

Wednesday, December 3, 2025

‘The biggest decision yet’; The Guardian, December 2, 2025

The Guardian; ‘The biggest decision yet’

"Humanity will have to decide by 2030 whether to take the “ultimate risk” of letting artificial intelligence systems train themselves to become more powerful, one of the world’s leading AI scientists has said.

Jared Kaplan, the chief scientist and co-owner of the $180bn (£135bn) US startup Anthropic, said a choice was looming about how much autonomy the systems should be given to evolve.

The move could trigger a beneficial “intelligence explosion” – or be the moment humans end up losing control...

He is not alone at Anthropic in voicing concerns. One of his co-founders, Jack Clark, said in October he was both an optimist and “deeply afraid” about the trajectory of AI, which he called “a real and mysterious creature, not a simple and predictable machine”.

Kaplan said he was very optimistic about the alignment of AI systems with the interests of humanity up to the level of human intelligence, but was concerned about the consequences if and when they exceed that threshold."

Bannon, top conservatives urge White House to reject Big Tech’s ‘fair use’ push to justify AI copyright theft: ‘Un-American and absurd’; New York Post, December 1, 2025

Thomas Barrabi, New York Post; Bannon, top conservatives urge White House to reject Big Tech’s ‘fair use’ push to justify AI copyright theft: ‘Un-American and absurd’

"Prominent conservatives including Steve Bannon are urging the Trump administration to reject an increasingly popular argument that tech giants are using to rip off copyrighted material to train artificial intelligence.

So-called “fair use” doctrine – which argues that the use of copyrighted content without permission is legally justified if it is done in the public interest – has become a common defense for AI firms like Google, Mark Zuckerberg’s Meta and Microsoft who have been accused of ripping off work.

The argument’s biggest backers also include White House AI czar David Sacks, who has warned that Silicon Valley firms “would be crippled” in a crucial race against AI firms in China unless they can rely on fair use protection...

Bannon and his allies threw cold water on such claims in a Monday letter addressed to US Attorney General Pam Bondi and Michael Kratsios, who heads the White House’s Office of Science and Technology Policy.

“This is un-American and absurd,” the conservatives argued in the letter, which was exclusively obtained by The Post. “We must compete and win the global AI race the American way — by ensuring we protect creators, children, conservatives, and communities.”...

The conservatives point to clear economic incentives to back copyright-protected industries, which contribute more than $2 trillion to the US GDP, carry an average annual wage of more than $140,000 and account for a $37 billion trade surplus, according to the letter...

The letter notes that money is no object for the companies leading the AI boom, which “enjoy virtually unlimited access to financing” and are each valued at hundreds of billions, if not trillions of dollars.

“In a free market, businesses pay for the inputs they need,” the letter said. “Imagine if AI CEOs claimed they needed free access to semiconductors, energy, researchers, and developers to build their products. They would be laughed out of their boardrooms.”...

The letter is the latest salvo in a heated policy divide as AI models gobble up data from the web. Critics accuse companies like Google, Microsoft, OpenAI and Meta of essentially seeking a “license to steal” from news outlets, artists, authors and others that produce original work."

Tuesday, December 2, 2025

Two AI copyright cases, two very different outcomes – here’s why; The Conversation, December 1, 2025

Reader in Intellectual Property Law, Brunel University of London, The Conversation; Two AI copyright cases, two very different outcomes – here’s why

"Artificial intelligence companies and the creative industries are locked in an ongoing battle, being played out in the courts. The thread that pulls all these lawsuits together is copyright.

There are now over 60 ongoing lawsuits in the US where creators and rightsholders are suing AI companies. Meanwhile, we have recently seen decisions in the first court cases from the UK and Germany – here’s what happened in those...

Although the circumstances of the cases are slightly different, the heart of the issue was the same. Do AI models reproduce copyright-protected content in their training process and in generating outputs? The German court decided they do, whereas the UK court took a different view.

Both cases could be appealed and others are underway, so things may change. But the ending we want to see is one where AI and the creative industries come together in agreement. This would preferably happen with the use of copyright licences that benefit them both.

Importantly, it would also come with the consent of – and fair payment to – creators of the content that makes both their industries go round."

Tuesday, November 25, 2025

Huckabee’s Copyright Claim Over AI Advances Against Bloomberg; Bloomberg Law, November 25, 2025

Bloomberg Law; Huckabee’s Copyright Claim Over AI Advances Against Bloomberg

 "A federal judge declined to dismiss a copyright-infringement claim in a proposed class action led by Mike Huckabee, accusing Bloomberg LP of using a pirated dataset to train its AI model.

Judge Margaret M. Garnett said she couldn’t evaluate Bloomberg’s defense that its use of authors’ books to train BloombergGPT was fair use under US copyright law without a factual record, denying its motion to dismiss in a Monday opinion filed in the US District Court for the Southern District of New York."

Monday, November 24, 2025

Minister indicates sympathy for artists in debate over AI and copyright; The Guardian, November 23, 2025

The Guardian; Minister indicates sympathy for artists in debate over AI and copyright

 "The technology secretary, Liz Kendall, has indicated she is sympathetic to artists’ demands not to have their copyrighted works scraped by AI companies without payment and said she wanted to “reset” the debate.

In remarks that suggest a change in approach from her predecessor, Peter Kyle, who had hoped to require artists to actively opt out of having their work ingested by generative AI systems, she said “people rightly want to get paid for the work that they do” and “we have to find a way that both sectors can grow and thrive in future”.

The government has been consulting on a new intellectual property framework for AI which, in the case of the most common large language models (LLMs), requires vast amounts of training data to work effectively.

The issue has sparked impassioned protests from some of Britain’s most famous artists. This month Paul McCartney released a silent two-minute 45 second track of an empty studio on an album protesting against copyright grabs by AI firms as part of a campaign also backed by Kate Bush, Sam Fender, the Pet Shop Boys and Hans Zimmer."