
You Can Now Peruse All the Books Meta Pirated to Train its AI
Our access to free literature is being abused from two ends: on one side is a U.S. government taken over by tech oligarchs, and on the other are some of those oligarchs and other big tech firms. AI developed by companies like Meta has gobbled up millions upon millions of books from piracy sites. But if you don’t want to read AI-generated garbage, the federal government under President Donald Trump is looking to kill one of the major sources of funding for public libraries. It’s a bad time if you love reading.
Over the past two years, The Atlantic has been analyzing and creating repositories of publicly available data troves used to train AI. The site set its sights on LibGen, an archive of pirated media that includes millions of books, academic papers, and other articles. Recently the site released its findings alongside a tool for searching through the archive of millions upon millions of pirated works. With that, you can look up your favorite authors to find out whether their work has been used to train AI models from the likes of OpenAI, Mistral, and Meta.
LibGen, a shortened name for Library Genesis, is what’s referred to online as a “shadow library” for its illicit but open nature. It includes nearly 7.5 million books and 81 million academic papers, according to The Atlantic’s report. While it contains a hoard of copyrighted material, that belies its actual benefits to society. Library Genesis has also been used by scientists to access academic works without paying exorbitant fees to publishers. Other shadow libraries like Sci-Hub have been recognized by groups like the Electronic Frontier Foundation as an objective good for the progress of science.
Gizmodo reached out to Meta for comment, but we did not immediately hear back. We also asked Mistral and OpenAI to comment on their use of LibGen. In a statement to Gizmodo, an OpenAI spokesperson said, “The models powering ChatGPT and our API today were not developed using these datasets. These datasets, created by former employees who are no longer with OpenAI, were last used in 2021.”
But while LibGen might not be at the heart of OpenAI’s work now, it’s also clear where it and other AI companies stand, and it looks like a pirate ship. Last year, a former OpenAI employee said he felt the company was breaking copyright law, though OpenAI has defended itself in court over copyright lawsuits, claiming that using copyrighted works for AI training was fair use. Sites like The Verge have already covered Meta’s plans to use LibGen in an effort to beat OpenAI and Mistral. The latest court records from a class action suit headlined by comedian Sarah Silverman mention Meta senior researcher Melanie Kambadur saying Meta would need books “ASAP” since “books are actually more important than web data” for training AI. More documents reveal company staff had considered licensing books to train its AI, but opted for a pirated archive instead. One director of engineering said if they license “one single book,” the company couldn’t then use the legal argument for “fair use.”
If you were wondering how high up the brazen “borrowing” might go, another email document references “escalation to MZ,” which could refer to CEO Mark Zuckerberg as the final decider. The Atlantic further claims that Meta used a torrent to download LibGen, which would have seeded the files to other people in a direct knock against copyright law. Meta, on the other hand, was more than happy to note earlier this week that people have downloaded its Llama AI model 1 billion times.
While the law still hasn’t worked out whether AI’s guzzling of copyrighted data is legal, it’s clear where the creative community stands. Michael Chabon sued Meta for using his copyrighted work to train AI. The Atlantic’s latest revelations have left authors not too pleased. Author Michael Livingston wrote on Bluesky that he found 16 of his books and more articles used for training Llama 3. Nebula award-winning author Aliette de Bodard said “all my books are in LibGen, and I’m not happy about it.”
The irony of pirating books to train AI is becoming more stark as the administration of President Donald Trump works to destroy the apparatus that financially supports public libraries while leaning on AI for many services traditionally performed by humans. On March 14, Trump issued an executive order that would effectively kill the Institute of Museum and Library Services. As its name suggests, the agency offers grants and other funding to public libraries across the U.S. On Thursday, Trump appointed Keith E. Sonderling to the position of acting director for the IMLS.
State and local taxes normally help pay for libraries, but many institutions in the U.S. rely on federal grant funding for basic services. This extends to digital services promoted by libraries, which is what gives us apps like Libby and Hoopla that let users check out e-books or audiobooks from their local libraries. Hoopla Digital President Jeff Jankowski told NPR that without federal funding some libraries may scale back or kill their digital services. Expect longer wait times for e-books to become available, or else find that the one book you were hoping to read isn’t available at all.
Musk and DOGE seem to think replacing fired staff with AI will somehow make the government more efficient. Sure, chatbots can reproduce iterative responses based on a prompt, but it’s unlikely AI will be able to accomplish any of what a federal agency can do when fully staffed. The result of all this meddling by tech oligarchs will be to suppress our access to literature, first by hurting the books industry by stealing authors’ work, then by limiting people’s access to books altogether.