Two authors have accused Apple of using a dataset dubbed Books3, which contained pirated versions of their books.
Apple’s OpenELM AI models have up to 3 billion parameters
Photo Credit: Reuters
A fresh lawsuit was filed against Apple on Friday over allegedly training its artificial intelligence (AI) on copyrighted books. Two authors have alleged that the Cupertino-based tech giant used datasets containing pirated versions of their books to train the OpenELM AI model, which the company released as an open-source model last year. This particular large language model (LLM) also came under fire in 2024, after a report claimed that part of its training data contained subtitles from YouTube videos.
The lawsuit, filed in the US federal court in Northern California, is a proposed class action against Apple. Authors Grady Hendrix and Jennifer Roberson have accused the tech giant of using illegally obtained copyrighted books to train its LLM, OpenELM.
According to the lawsuit, Apple's model card for OpenELM, which was published on Hugging Face, lists RedPajama as one of the datasets used to train the model. The company obtained the dataset from the Internet, where many annotators release public datasets with license-free content.
RedPajama, according to the allegations, contained a dataset called Books3, which is claimed to be “a known body of pirated books.” The authors claim that their books were also part of that dataset.
The plaintiffs are now asking the court to allow the lawsuit to proceed as a class action against the iPhone maker. Following a jury trial, the suit also seeks class statutory damages, compensatory damages, restitution, disgorgement, and other forms of relief. It also asks the court to order the destruction of any of Apple's AI models that were trained on this data.
Notably, the company said last year that OpenELM does not power either its AI features under the Apple Intelligence branding or other machine learning features in its devices. Apple highlighted that the model was created as a “contribution to the research community.”
Separately, Anthropic disclosed in a court filing on Friday that it has agreed to pay $1.5 billion (roughly Rs. 13,200 crore) to settle a class action brought by a group of authors. These authors had sued the AI startup for training its AI models on their copyrighted work without consent. Notably, the Claude-maker did not admit any liability as part of the settlement.