A San Francisco federal court has ruled that training AI on copyrighted books can be legal, yet the same ruling forces AI firm Anthropic to confront allegations that it built its system on a foundation of pirated material. The decision carves out a complex new reality for AI developers and creators alike.
The Legality of Learning
A pivotal victory for the AI industry came as U.S. District Judge William Alsup deemed the process of training AI to be “fair use.” In a case brought by authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson against Anthropic, which is notably backed by Amazon and Alphabet, the judge found the AI’s learning method to be “exceedingly transformative.”
He elaborated that the AI model, Claude, didn’t just copy the books but learned from them to create something new, much as a human writer develops their own style. This is the first ruling of its kind, offering a significant legal defense for companies like OpenAI and Meta Platforms, which argue their technology fosters human creativity.
A Library Built on Piracy
Despite the win on the training front, Anthropic faces a significant hurdle. Judge Alsup condemned the company for creating a “central library” containing over 7 million pirated books. He rejected the company’s argument that the origin of its training data was irrelevant. The judge was clear that downloading from pirate sites is not a defensible act.
This finding means Anthropic must now prepare for a trial scheduled for December. The company could face substantial financial penalties for willful copyright infringement, potentially as high as $150,000 for each copyrighted work it used without permission. This part of the ruling serves as a stark warning to the industry about the importance of sourcing data legally.
The split decision establishes a crucial benchmark. It suggests that while the mechanics of AI training may be protected, the methods used to acquire that training data are not exempt from copyright law. The ruling makes clear that innovation in artificial intelligence cannot be built upon a foundation of illegally obtained content.