Counsel for plaintiffs in a copyright lawsuit filed against Meta allege that Meta CEO Mark Zuckerberg gave the team behind the company’s Llama AI models the green light to use a data set of pirated ebooks and articles for training.
The case, Kadrey v. Meta, is one of many against tech giants developing AI that accuse the companies of training models on copyrighted works without permission. For the most part, defendants like Meta have asserted that they’re shielded by fair use, the U.S. legal doctrine that allows for the use of copyrighted works to make something new as long as it’s sufficiently transformative. Many creators reject that argument.
In newly unredacted documents filed with the U.S. District Court for the Northern District of California late Wednesday, plaintiffs in Kadrey v. Meta, who include bestselling authors Sarah Silverman and Ta-Nehisi Coates, recount testimony Meta gave late last year, which revealed that Zuckerberg approved Meta’s use of a data set called LibGen for Llama-related training.
LibGen, which describes itself as a “links aggregator,” provides access to copyrighted works from publishers including Cengage Learning, Macmillan Learning, McGraw Hill, and Pearson Education. LibGen has been sued a number of times, ordered to shut down, and fined tens of millions of dollars for copyright infringement.