The Anthropic website and mobile phone app are shown in this photo on July 5, 2024. A judge ruled in the AI company’s favor in a copyright infringement case brought last year by a group of authors. Richard Drew/AP
AI companies could have the legal right to train their large language models on copyrighted works — as long as they obtain copies of those works legally.
That’s the upshot of a first-of-its-kind ruling by a federal judge in San Francisco on Monday in an ongoing copyright infringement case that pits a group of authors against a major AI company.
The ruling is significant because it represents the first substantive decision on how fair use applies to generative AI systems.
The fair use doctrine allows third parties to use copyrighted works without the copyright holder’s consent in certain circumstances, such as illustrating a point in a news article. AI companies commonly invoke fair use to justify training their generative AI models on copyrighted works. But authors and other creative-industry plaintiffs have been pushing back with a slew of lawsuits.
In their 2024 class action lawsuit, authors Andrea Bartz, Charles Graeber and Kirk Wallace Johnson alleged that Anthropic used the contents of millions of digitized copyrighted books, including at least two works by each plaintiff, to train the large language models behind its chatbot, Claude. The company also bought some hard-copy books and scanned them before ingesting them into its models.
“Rather than obtaining permission and paying a fair price for the creations it exploits, Anthropic pirated them,” the authors’ complaint states.
In Monday’s order, Senior U.S. District Judge William Alsup sided with Anthropic, finding that the company’s use of the plaintiffs’ books to train its AI models constituted fair use.
“The training use was a fair use,” he wrote. “The use of the books at issue to train Claude and its precursors was exceedingly transformative.”
The judge said the digitization of the books purchased in print form by Anthropic could also be considered fair use, “because all Anthropic did was replace the print copies it had purchased for its central library with more convenient space-saving and searchable digital copies for its central library — without adding new copies, creating new works, or redistributing existing copies.”
However, Alsup also acknowledged that not all books were paid for. He wrote Anthropic “downloaded for free millions of copyrighted books in digital form from pirate sites on the internet” as part of its effort “to amass a central library of ‘all the books in the world’ to retain ‘forever.'”
Alsup rejected Anthropic’s argument “that the pirated library copies must be treated as training copies,” and he is allowing the authors’ piracy claims to proceed to trial.
“We will have a trial on the pirated copies used to create Anthropic’s central library and the resulting damages, actual or statutory (including for willfulness),” Alsup stated.
Alsup’s split decision drew similarly divided responses from those involved in the case and from industry stakeholders.
In a statement to NPR, Anthropic praised the judge’s recognition that using works to train large language models was “transformative — spectacularly so.” The company added: “Consistent with copyright’s purpose in enabling creativity and fostering scientific progress, Anthropic’s large language models are trained upon works not to race ahead and replicate or supplant them, but to turn a hard corner and create something different.”
However, Anthropic also said it disagrees with the court’s decision to proceed with a trial.
“We believe it’s clear that we acquired books for one purpose only — building large language models — and the court clearly held that use was fair,” the company stated.
A member of the plaintiffs’ legal team declined to speak publicly about the decision.
The Authors Guild, a major professional writers’ advocacy group, did share a statement: “We disagree with the decision that using pirated or scanned books for training large language models is fair use.”
In an interview with NPR, the guild’s CEO, Mary Rasenberger, added that authors need not be too concerned with the ruling.
“The impact of this decision for book authors is actually quite good,” Rasenberger said. “The judge understood the outrageous piracy. And that comes with statutory damages for intentional copyright infringement, which are quite high per book.”
According to the Copyright Alliance, U.S. copyright law allows statutory damages of up to $150,000 per infringed work for willful infringement. The ruling states that Anthropic pirated more than 7 million copies of books, so the damages awarded at the upcoming trial could be enormous.
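For a rough sense of scale, here is a back-of-envelope sketch of the theoretical maximum exposure. It assumes, purely for illustration, that every one of the roughly 7 million pirated works would qualify for the $150,000 willful-infringement cap; in practice, courts award statutory damages within a range, and the number of eligible works will be determined at trial.

```python
# Illustrative back-of-envelope only: actual awards are set per work at trial
# and are usually far below the willful-infringement cap.
willful_cap_per_work = 150_000   # statutory maximum for willful infringement, in USD
pirated_works = 7_000_000        # "more than 7 million" books cited in the ruling

theoretical_max = willful_cap_per_work * pirated_works
print(f"Theoretical maximum exposure: ${theoretical_max:,}")
# Theoretical maximum exposure: $1,050,000,000,000
```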
Content adapted by the team from the original source: https://www.npr.org/2025/06/25/nx-s1-5445242/federal-rules-in-ai-companys-favor-in-landmark-copyright-infringement-lawsuit-authors-bartz-graeber-wallace-johnson-anthropic