AI scandal around Amazon: The company is accused of harvesting YouTube content

© ROOT-NATION.com - Use of content is permitted with a backlink.

Three YouTube creators have filed a class action lawsuit against Amazon in federal court in Seattle, US: the company behind H3H3 Productions and H3 Podcast Highlights, a solo golf creator, and a golf channel.

The complaint alleges that the company circumvented YouTube’s technical defenses by using virtual machines and IP address rotation to collect videos in bulk without permission. According to the plaintiffs, the collected content was used to train a generative video model called Nova Reel, which is available through Amazon Bedrock.

The lawsuit was filed by Ted Entertainment Inc., the company that operates H3H3 Productions and H3 Podcast Highlights; the MrShortGame Golf channel, hosted by Matt Fisher; and Golfholics Inc. Together, these parties account for more than 2.6 million YouTube subscribers, approximately 4 billion views, and more than 5,800 original videos. Amazon is the defendant in the case, and the central target of the claims is the Nova Reel model, which the plaintiffs say was created in part using their content.

The claim rests on Section 1201 of the Digital Millennium Copyright Act (DMCA), which prohibits circumvention of technological protection mechanisms that control access to protected materials. The plaintiffs argue that YouTube’s systems for protecting video files are such mechanisms, and that Amazon’s mass collection of data was a deliberate circumvention of them. If the court agrees with this position, downloading videos from YouTube for AI training may be recognized as a DMCA violation even when the content is publicly accessible, because what is violated is the technical barrier put in place by the platform.

The lawsuit also describes the mechanics of data collection. The focus is on two academic datasets: HD VILA 100M, created by Microsoft Research Asia in 2021, and HD VG 130M, prepared by researchers from Peking University and Microsoft. Both sets contain the URLs of YouTube videos, not the video files themselves. This is legally significant because using such sets for AI training requires downloading the videos from YouTube, and the plaintiffs claim that Amazon did exactly that.

The lawsuit states that Amazon did not limit itself to simple downloads. According to the plaintiffs, the company used automated systems, virtual machines, and regular IP address changes to avoid detection and blocking by YouTube. The combination of these tools is described as an intentional circumvention of the technical protections applied by the platform. Similar methods were alleged in a previous lawsuit filed by the same group against Nvidia, which concerned the downloading of 38.5 million video URLs using similar infrastructure.

Nova Reel is Amazon’s generative AI video model, launched in December 2024 and available through Amazon Bedrock. It accepts text prompts and images and generates videos ranging in length from 6 seconds to 2 minutes. It also features a watermarking function, which Amazon positions as a content authentication tool. The model is part of the Nova family, which the company is actively expanding across text, images, and video amid growing competition in AI services for the enterprise segment.

Competitive pressure in the video generation segment is significant. Nova Reel is seen as Amazon’s answer to systems such as OpenAI’s Sora and Google’s Veo for enterprise tasks. At the same time, Amazon is expanding its AI infrastructure investments, including a partnership with Uber to use Trainium chips for large-scale model training via AWS. This reflects the broader competition in generative AI, where fast access to computing resources and data is becoming a key factor.

The plaintiffs are part of a broader wave of litigation against AI companies. In 2025, disputes over AI training data moved from isolated conflicts into a systemic phase. In December 2025, Ted Entertainment, Matt Fisher, and Golfholics filed a lawsuit against Nvidia in California, claiming that the company used the same HD VILA 100M and HD VG 130M sets and similar data collection methods to train the Cosmos model. In January 2026, the same group filed lawsuits against Meta, ByteDance, and Snap. In early April, parallel lawsuits were filed against OpenAI and Apple in the Northern District of California. The lawsuit against Amazon filed in Seattle is the latest in this series.

At the same time, the total number of copyright cases against AI companies in the US is growing and has exceeded 100. Among them is a lawsuit filed in March 2026 by Encyclopaedia Britannica and Merriam-Webster against OpenAI, which claims that almost 100,000 Britannica articles were used as training data without permission. As in the YouTube creators’ case, the key allegation is systematic content extraction that is then commercialized in AI products.

At the center of the legal debate is the way academic datasets are used. The plaintiffs are trying to challenge the distinction between publishing lists of URLs for scientific purposes and actually downloading the video files required to train models. This issue will become increasingly important in 2026 amid tighter control over the origin of training data in generative AI. If the court accepts the plaintiffs’ interpretation, the use of academic video URL datasets could legally amount to direct unauthorized downloading of content. Amazon, like other defendants in similar cases, has not publicly commented on the lawsuit.
