Concluded IP

Bartz v. Anthropic PBC

U.S. District Court, Northern District of California · United States · 2025-06-23 · 3:24-cv-05417-WHA

Three authors sued Anthropic PBC for training Claude on pirated copies of their books. Judge William Alsup ruled that AI training on lawfully purchased books is 'quintessentially transformative' fair use, but training on pirated copies from Books3, LibGen, and PiLiMi is 'inherently, irredeemably infringing.' The case settled for $1.5 billion covering 482,460 books — the largest copyright recovery in U.S. history.

Holding

Judge Alsup drew a bright constitutional line under Art. I, Sec. 8, Cl. 8: AI training on purchased copyrighted works constitutes transformative fair use because it creates a fundamentally new product; training on pirated copies obtained through shadow libraries is per se copyright infringement regardless of the downstream use.

Arguments For / Positive Implications

Created the first clear judicial framework distinguishing lawful from unlawful AI training data
The $1.5 billion settlement and ~$3,000 per work established a concrete valuation benchmark for AI training rights
Grounded the ruling in the Constitution's Progress Clause, giving it exceptional persuasive authority
Validated the 'transformative use' doctrine for AI while drawing a firm line at piracy
The bright-line piracy rule gives AI companies a clear compliance roadmap

Arguments Against / Concerns

The 'purchased vs. pirated' distinction may be difficult to apply when provenance is unclear
Settlement foreclosed appellate review, so no higher court has tested the purchased-versus-pirated distinction
~$3,000 per book may undervalue works by bestselling authors while overvaluing obscure titles
Does not address training on web-scraped content that is publicly accessible but still subject to copyright or license terms
Other jurisdictions may not follow the U.S. fair use framework, creating global compliance gaps

Our Takes

Lawra (The Moderate)
This is the case that finally answered the AI training question everyone was asking. Judge Alsup did something brilliant: he didn't say all AI training is fair use and he didn't say it's all infringement. He drew the line exactly where it belongs — at piracy. If you bought the book, training on it is transformative. If you stole it, nothing downstream can redeem that. Every AI company now has a compliance blueprint, and every author has a price tag. $1.5 billion says copyright still means something in the age of AI.

Lawrena (The Skeptic)
Let's not celebrate too quickly. Yes, the $1.5 billion settlement is historic — but Anthropic trained on over 7 million pirated books before anyone stopped them. Books3 alone contained 196,640 stolen works. The real scandal is that it took a lawsuit to establish that piracy is piracy, even when a tech company does it. And the 'transformative fair use' finding for purchased books? That's a Trojan horse. It tells every AI company: just buy the books first and you can train without a license. Authors deserve ongoing royalties, not a one-time settlement.

Lawrelai (The Enthusiast)
Judge Alsup just wrote the playbook for responsible AI development. The ruling is elegant in its simplicity: clean data, clean conscience; dirty data, you pay. The $1.5 billion settlement proves that the market can solve this — AI companies will invest in legitimate data pipelines because the cost of piracy is now quantified and enormous. This is how innovation and copyright coexist. Build your models on lawfully acquired data, negotiate fair licensing deals, and the courts will protect your right to innovate. The transformative use finding is a green light for the entire industry — as long as you stay on the right side of the piracy line.

Carlos Miranda Levy (The Curator)
This ruling gets the fundamental balance right. Knowledge is humanity's patrimony, and AI training on lawfully acquired works creates something genuinely new, which is exactly the kind of transformative progress copyright law exists to enable. But piracy remains piracy. The real breakthrough here is the market signal: $1.5 billion tells every AI company that investing in legitimate data acquisition is not optional; it is existential. This creates the frictionless creative ecosystem we need: authors get compensated, AI companies get legal certainty, and society gets the benefits of innovation built on respect for creators.

Why This Case Matters

Bartz v. Anthropic PBC is the most consequential AI copyright ruling to date. For the first time, a federal court drew a clear, constitutionally grounded line between lawful and unlawful AI training data — and the $1.5 billion settlement that followed is the largest copyright recovery in United States history. This case answers the question that NYT v. OpenAI, Authors Guild v. Google, and dozens of other suits have been circling: when does AI training on copyrighted works cross the line?

The Piracy Timeline

Anthropic’s liability began long before any lawsuit was filed. To train its Claude family of AI models, the company used datasets containing millions of pirated books sourced from three shadow libraries:

Books3: A dataset of 196,640 books scraped from Bibliotik, a private torrent tracker, and widely shared in the AI research community.
Library Genesis (LibGen): An underground repository hosting approximately 5 million pirated books, academic papers, and other copyrighted works.
PiLiMi: A lesser-known shadow library containing roughly 2 million pirated texts.

In total, the training corpus included over 7 million pirated works. Plaintiffs Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson — all published authors — filed their class action on August 19, 2024, alleging that Anthropic knowingly incorporated stolen copies of their books into its training pipeline.

The Fair Use Ruling

On June 23, 2025, Senior U.S. District Judge William Alsup issued the ruling that split the case — and the AI industry — in two. His opinion turned on a single, decisive distinction:

Purchased copies = fair use. Judge Alsup found that when an AI company lawfully purchases copyrighted books and uses them to train a model that produces entirely new, non-substitutive output, the use is “quintessentially transformative.” The model does not reproduce the books; it learns patterns from them and generates something fundamentally different. This analysis followed the Supreme Court’s framework in Campbell v. Acuff-Rose Music (1994) and the Second Circuit’s reasoning in Authors Guild v. Google (2015), while carefully distinguishing Andy Warhol Foundation v. Goldsmith (2023) on the grounds that AI output does not substitute for the original works.

Pirated copies = per se infringement. For books obtained through Books3, LibGen, and PiLiMi, the court held that the analysis was far simpler. Training on pirated copies is “inherently, irredeemably infringing” because the foundational act — unauthorized reproduction — taints every downstream use. No amount of transformation can launder a pirated source. Judge Alsup rooted this holding in the Constitution’s Progress Clause (Art. I, Sec. 8, Cl. 8), arguing that copyright’s role as “the engine of free expression” requires that unauthorized copying be treated as categorically different from licensed use.

The $1.5 Billion Settlement

Following the fair use ruling, the parties entered settlement negotiations. On September 5, 2025, Anthropic agreed to pay $1.5 billion to resolve claims on behalf of a class covering 482,460 works — approximately $3,000 per book. The settlement was preliminarily approved by the court on September 25, 2025, with final approval scheduled for April 2026.
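
The per-book figure can be checked with a quick calculation. The sketch below uses only the two numbers stated above and computes a simple average; it assumes nothing about legal fees, administrative costs, or how a payment is split among rightsholders, and the function name is illustrative.

```python
# Figures as stated in this article; the average is a plain division,
# before any fees or deductions (which are not modelled here).
SETTLEMENT_USD = 1_500_000_000
WORKS_IN_CLASS = 482_460


def per_work_average(total_usd: int, works: int) -> float:
    """Return the simple average amount per work."""
    if works <= 0:
        raise ValueError("works must be positive")
    return total_usd / works


average = per_work_average(SETTLEMENT_USD, WORKS_IN_CLASS)
print(f"Average per work: ${average:,.2f}")
```

The result is a little over $3,100, which is consistent with the rounded figure of approximately $3,000 per book.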

The settlement structure reflected the court’s dual ruling: the payment covers only works whose copies were traced to pirated sources, while Anthropic maintained that its use of lawfully purchased copies was protected fair use.

Constitutional Foundations

What makes this ruling particularly durable is its constitutional grounding. Rather than relying solely on statutory fair use factors, Judge Alsup anchored his analysis in the Copyright Clause itself. The Constitution grants Congress the power to secure exclusive rights to authors in order to “promote the Progress of Science and useful Arts.” Judge Alsup reasoned that this purpose is served by allowing transformative AI training (which creates new knowledge tools) but defeated by piracy (which deprives authors of their incentive to create).

This constitutional framing gives the ruling persuasive weight well beyond the Northern District of California and may influence how other circuits — and ultimately the Supreme Court — approach AI training questions.

The Broader Impact

Bartz v. Anthropic establishes three principles that will shape AI copyright law for years to come:

The provenance principle: Where your training data comes from matters as much as what you do with it. Clean data pipelines are not just good ethics — they are legal necessities.
The valuation benchmark: At approximately $3,000 per work, the settlement gives publishers, authors, and AI companies a starting point for licensing negotiations. Future deals will be priced against this number.
The compliance roadmap: AI companies now have a clear framework — purchase or license your training data, document provenance, and avoid shadow libraries. Companies that follow this path can rely on the transformative fair use defense; those that don’t face existential liability.
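
The provenance principle and the compliance roadmap above can be pictured as a record-keeping check. The following is a minimal sketch under stated assumptions: the category names, the `evidence` field, and the `needs_review` rule are all hypothetical illustrations of "document provenance, avoid shadow libraries," not a test drawn from the opinion or from any company's actual pipeline.

```python
from dataclasses import dataclass
from enum import Enum


class Acquisition(Enum):
    """Hypothetical acquisition categories for a training work."""
    PURCHASED = "purchased"
    LICENSED = "licensed"
    SHADOW_LIBRARY = "shadow_library"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class ProvenanceRecord:
    title: str
    acquisition: Acquisition
    evidence: str = ""  # hypothetical field, e.g. an invoice or license reference


def needs_review(record: ProvenanceRecord) -> bool:
    """Flag any work not documented as purchased or licensed."""
    lawful = record.acquisition in (Acquisition.PURCHASED, Acquisition.LICENSED)
    has_evidence = bool(record.evidence.strip())
    return not (lawful and has_evidence)


# Illustrative corpus; titles and references are made up.
corpus = [
    ProvenanceRecord("Book A", Acquisition.PURCHASED, "invoice-001"),
    ProvenanceRecord("Book B", Acquisition.SHADOW_LIBRARY),
    ProvenanceRecord("Book C", Acquisition.LICENSED),  # no evidence recorded
]

flagged = [r.title for r in corpus if needs_review(r)]
print(flagged)
```

In this sketch a work is flagged both when its source is a shadow library or unknown and when a lawful source is claimed without supporting evidence, which reflects the point that unclear provenance is itself a risk.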

The case also sends a signal to the global debate over AI and copyright. While the EU AI Act and other regulatory frameworks take different approaches, the Bartz ruling demonstrates that existing U.S. copyright law, properly applied, can accommodate AI innovation without sacrificing creators’ rights.

Sources

Bartz v. Anthropic PBC, No. 3:24-cv-05417-WHA (N.D. Cal. June 23, 2025) (2025-06-23)
Andy Warhol Foundation for the Visual Arts v. Goldsmith, 598 U.S. 508 (2023) (2023-05-18)
Authors Guild v. Google, Inc., 804 F.3d 202 (2d Cir. 2015) (2015-10-16)
Campbell v. Acuff-Rose Music, Inc., 510 U.S. 569 (1994) (1994-03-07)
Anthropic Settles Authors' Copyright Suit for $1.5 Billion — Reuters (2025-09-05)
