Meta CEO Mark Zuckerberg Allegedly Permitted Llama AI Models’ Training on Copyrighted Materials

Several authors have filed a lawsuit against Meta alleging it used pirated e-books and articles to train its AI models.

Written by Akash Dutta, Edited by Siddharth Suvarna | Updated: 10 January 2025 18:49 IST
Highlights
  • The lawsuit was filed with the US District Court on Wednesday
  • During depositions, Meta revealed that it torrented LibGen
  • LibGen is a link aggregator that provides access to copyrighted works

The lawsuit alleges that Meta also tried to conceal its copyright infringement

Photo Credit: Unsplash/Dima Solomin

Meta is facing a copyright lawsuit over its alleged use of copyrighted works to train its artificial intelligence (AI) models. The lawsuit was filed by multiple plaintiffs, including several bestselling authors. The primary allegation against the tech giant is that it used pirated e-books and articles to train older versions of its Llama AI models, violating copyright laws. The filings also accuse company CEO Mark Zuckerberg of allowing the Llama AI team to torrent content from a questionable link aggregator to obtain the copyrighted materials.

The information comes from two separate documents filed with the US District Court for the Northern District of California on Wednesday. The documents, from plaintiffs including authors Sarah Silverman and Ta-Nehisi Coates, highlight testimony Meta gave in late 2024, which revealed that Zuckerberg permitted the use of a dataset called LibGen to train the company's Llama AI models.

Notably, LibGen (short for Library Genesis) is a file-sharing platform that offers free access to academic and general-interest content. Many consider it a pirate library as it gives access to copyrighted works that are otherwise either available behind a paywall or are not digitised at all. The platform has faced several lawsuits and has been ordered to shut down in the past.

The filings claim that Meta used the LibGen dataset knowing full well that it contained pirated content and violated copyright laws. The document also cites a memo to Meta's AI decision-makers stating that, after “escalation to MZ,” Meta's AI team “has been approved to use LibGen”. Here, MZ is shorthand for the Meta CEO's name.

The memo also mentioned that executives were warned that public knowledge of Meta using “a dataset we know to be pirated such as LibGen” could undermine the company's negotiating position with regulators. The social media giant was also accused of stripping copyright information from the dataset's text and metadata to conceal its infringement.

As per the filings, Nikolay Bashlykov, a research engineer working in Meta's AI division, allegedly removed copyright information from the LibGen dataset. To further hide evidence of using the dataset, “Meta's programmers included ‘supervised samples’ of data when fine-tuning Llama to ensure Llama's output would include less incriminating answers when answering prompts regarding the source of Meta's AI training data,” the document stated.

Further, the plaintiffs alleged that Meta committed another kind of copyright infringement through the way it accessed LibGen. The filings claimed that the tech giant torrented the LibGen dataset. Torrenting involves both downloading and uploading (also known as seeding) content. The uploading can be considered distribution of copyrighted materials and constitutes a violation, the filings claimed.

“Had Meta bought Plaintiffs' works in a bookstore or borrowed them from a library and trained its Llama models on them without a license, it would have committed copyright infringement. Meta's decision to bypass lawful methods of acquiring books and become a knowing participant in an illegal torrenting network establishes a CDAFA [California Comprehensive Computer Data Access and Fraud Act] violation and serves as proof of copyright infringement,” the filings stated.

Currently, the copyright lawsuit is open and a ruling is pending. Meta is yet to make its arguments, which are likely to be based on fair use. The court will have to decide whether the AI model's generative capabilities can be considered transformative enough to support that argument.

 
