Nvidia accused of paying for access to pirated content to train its AIs

The world of artificial intelligence is once again shaken by a legal controversy involving Nvidia.

According to recent revelations disclosed in a pending class action, the California-based company did not merely collect data freely accessible on the web, but allegedly sought to pay to obtain privileged access to one of the world’s largest shadow libraries: Anna’s Archive.

The accusations, supported by documents published by the specialized portal TorrentFreak, paint a picture in which the hunger for data to train Large Language Models (LLMs) seems to have outweighed respect for copyright laws.

Nvidia and the alleged deal with Anna’s Archive

At the center of the scandal are internal communications that appear to implicate Nvidia’s Data Strategy Team. According to documents filed during the discovery phase of the case, company representatives allegedly contacted the administrators of Anna’s Archive to negotiate “high-speed access”.

Anna’s Archive is known for being a massive aggregator of copyrighted material, including books, scientific articles and texts taken from other pirate portals such as Bibliotik.

What makes the situation particularly critical for Nvidia’s defense is the timing and the apparent awareness behind the decision. The emails suggest that, despite being warned about the illegal nature of the collections hosted by the archive, management allegedly gave the “green light” to the operation within a week.

The goal was to download about 500 terabytes of data, an enormous amount of human knowledge necessary to refine the cognitive abilities of generative AIs.

Although there is not yet definitive proof that the payment was actually processed or that the transaction went through, the intent to establish a commercial relationship with a pirate site represents a significant precedent.

A defense that is crumbling

This new wave of evidence has led the authors who filed the lawsuit against Nvidia to modify and significantly expand their complaint.

Initially, the accusation focused on the use of the Books3 dataset, a collection containing thousands of pirated literary works, also used by other industry giants such as Meta and Anthropic.

So far, Silicon Valley’s standard defense has relied on the doctrine of “fair use”, arguing that training AIs falls into a grey area of copyright law that allows the transformative use of works.

However, the new evidence makes Nvidia’s position much more precarious than that of its rivals. If it is confirmed that the company actively sought to finance an illicit activity to gain a competitive advantage, the fair use thesis could collapse.

The plaintiffs’ lawyers argue that Nvidia not only used stolen material, but also offered its corporate customers automatic access to contaminated datasets like The Pile, which includes the Books3 collection. This behavior would demonstrate, according to the accusation, total indifference to others’ intellectual property rights.

The irony of intellectual property theft

The case raises a fundamental ethical question that has not escaped industry observers. Companies like Nvidia, which jealously protect their technological patents and trade secrets with armies of lawyers, seem not to hesitate when it comes to appropriating the creative work of writers and researchers.

While Nvidia continues to rake in record profits from selling its graphics accelerators, authors see their works swallowed by machines without receiving any compensation or being asked for permission.

At the moment, Anna’s Archive remains online, although its growing notoriety has made it the target of ongoing DMCA takedown notices, forcing operators into a continuous cat-and-mouse game to keep the servers running.

The class action against Nvidia, enriched by these new emails, could lead to a turning point in the regulation of AI training, establishing clearer boundaries on what is permissible in the race for digital gold.

Luca Zaninello

A lifelong enthusiast of the mobile phone world, he has spent more than a decade testing products hands-on and sharing his experiences with a web audience. An amateur photographer, he has a soft spot for the most over-the-top camera phones.
