Apple Faces Lawsuit Over Allegedly Training Its AI Models on Copyrighted Books

Two authors have accused Apple of using a dataset called Books3, which allegedly contained pirated versions of their books.

Written by Akash Dutta, Edited by Ketan Pratap | Updated: 6 September 2025 17:19 IST

Highlights

The AI model in question is OpenELM, released last year
OpenELM was also accused of using YouTube’s subtitle data
The plaintiffs are asking for a jury trial and monetary damages

Apple’s OpenELM AI models have up to 3 billion parameters

Photo Credit: Reuters

A fresh lawsuit was filed against Apple on Friday over allegedly training its artificial intelligence (AI) on copyrighted books. Two authors have alleged that the Cupertino-based tech giant used datasets containing pirated versions of their books to train the OpenELM AI model, which the company released as an open-source model last year. This particular large language model (LLM) also came under fire in 2024, after a report claimed that part of its training data included subtitles from YouTube videos.

Authors File Lawsuit Against Apple Over AI Training

The lawsuit, filed in the US federal court in Northern California on Friday, proposes a class action suit against Apple. Authors Grady Hendrix and Jennifer Robertson have accused the tech giant of using illegally obtained copyrighted books to train its LLM, OpenELM.

As per the lawsuit, Apple's model card for OpenELM, which was published on Hugging Face, states that RedPajama was one of the datasets used to train the model. The company obtained the dataset from the Internet, where many annotators release public datasets with license-free content.

RedPajama, based on the allegations, contained a dataset called Books3, which is claimed to be “a known body of pirated books.” The authors claim that their books were also part of that dataset.

The plaintiffs are now asking the court to let the lawsuit proceed as a class action against the iPhone maker. Following a jury trial, the suit also seeks class statutory damages, compensatory damages, restitution, disgorgement, and other forms of relief. It also asks the court to order the destruction of any of Apple's AI models that were trained on this data.

Notably, last year, the company had said that OpenELM does not power either its AI features under the Apple Intelligence branding or other machine learning features in its devices. Apple highlighted that the model was created as a “contribution to the research community.”

Separately, Anthropic disclosed in a court filing on Friday that it has now agreed to pay $1.5 billion (roughly Rs. 13,200 crore) to settle the ongoing class action from a group of authors. These authors had sued the AI startup for training its AI models on their copyrighted work without consent. Notably, the Claude-maker did not admit any liability as part of the settlement.
