Showing posts with label AI training tools. Show all posts

Friday, September 6, 2024

A Shocking Amount of the Web is Machine Translated: Insights from Multi-Way Parallelism; Arxiv, 2024

Brian Thompson, Mehak Preet Dhaliwal, Peter Frisch, Tobias Domhan, Marcello Federico; AWS AI Labs, UC Santa Barbara, Amazon

brianjt@amazon.com, Arxiv; A Shocking Amount of the Web is Machine Translated: Insights from Multi-Way Parallelism

"Abstract

We show that content on the web is often translated into many languages, and the low quality of these multi-way translations indicates they were likely created using Machine Translation (MT). Multi-way parallel, machine generated content not only dominates the translations in lower resource languages; it also constitutes a large fraction of the total web content in those languages. We also find evidence of a selection bias in the type of content which is translated into many languages, consistent with low quality English content being translated en masse into many lower resource languages, via MT. Our work raises serious concerns about training models such as multilingual large language models on both monolingual and bilingual data scraped from the web."

Sunday, June 18, 2023

Generative AI is a minefield for copyright law; The Conversation, June 15, 2023

JD-PhD Student, Massachusetts Institute of Technology (MIT); Lecturer on Law, Harvard Law School; PhD Student in Media Arts and Sciences, Massachusetts Institute of Technology (MIT), The Conversation; Generative AI is a minefield for copyright law

"While copyright law tends to favor an all-or-nothing approach, scholars at Harvard Law School have proposed new models of joint ownership that allow artists to gain some rights in outputs that resemble their works.

In many ways, generative AI is yet another creative tool that allows a new group of people access to image-making, just like cameras, paintbrushes or Adobe Photoshop. But a key difference is this new set of tools relies explicitly on training data, and therefore creative contributions cannot easily be traced back to a single artist. 

The ways in which existing laws are interpreted or reformed – and whether generative AI is appropriately treated as the tool it is – will have real consequences for the future of creative expression."

Friday, January 20, 2023

Do Androids Dream Of Copyright Registration? – AI Art And Copyright; Dunlap Bennett & Ludwig PLLC via JDSupra, January 18, 2023

Thuan Tran, Dunlap Bennett & Ludwig PLLC via JDSupra; Do Androids Dream Of Copyright Registration? – AI Art And Copyright, Part 1

"Two big questions arise when considering AI art: 1) Can AI art be copyrighted; and 2) What about the artists who are having their art “sampled” (though some prefer “stolen”) to supply the data for these diffusion models?

The article today focuses on the first question."

Wednesday, January 18, 2023

AI Trained on Copyrighted Works: When Is It Fair Use?; Lexology, January 16, 2023

Diana Bikbaeva, Lexology; AI Trained on Copyrighted Works: When Is It Fair Use?

"We recently published an article that got much traction about whether machine learning on copyrighted materials is fair use. We now offer a deep analysis with new ideas and suggestions on legal risk mitigation and principles for more ethical AI systems.

AI is still a relatively new, although rapidly evolving technology, and some of its legal implications (especially in copyright law) remain a gray area, creating uncertainty on its use and development.

AI and machine learning technology are not one-size-fits-all and have diverse structures and algorithms specific to the tasks they are programmed to solve. So, any discussion of the legal implications of machine learning and resulting artificial intelligence needs to avoid sweeping conclusions on the technology in general and should consider the underlying technology and its treatment of copyrighted materials on a case-by-case basis."