Welcome back to Law&Code, your favorite corner of the internet where copyright meets code, and lawsuits are just a plot twist in the AI drama series we didn’t know we needed. Today’s episode: Authors vs. Anthropic, or as I like to call it – “The Great Book Heist (Allegedly).”
The Plot
Three authors—Andrea Bartz (We Were Never Here), Charles Graeber (The Good Nurse), and Kirk Wallace Johnson (The Feather Thief)—have dragged AI startup Anthropic into the literary courtroom spotlight. Why? They claim Anthropic copied and used their books without permission to train its large language model, Claude.
And not just snippets, mind you. The complaint alleges that Claude was able to generate “verbatim excerpts” of their copyrighted works. In short:
“Dear Claude, please summarize my book,”
Claude: “Sure, here’s the whole thing. Chapter one…”
Anthropic, however, isn’t nervously flipping through copyright law books just yet. They’re invoking the classic defense: fair use—arguing their use of the texts was transformative. According to them, feeding books into an AI to teach it how to speak like a human isn’t the same as reading them for fun. It’s more like turning novels into neural juice. (Appetizing.)
From Bookshelves to Courtrooms: How It All Began
So how did this literary showdown come about?
Well, it started with a growing unease among authors watching the AI boom take off like a rocket—with their words as fuel. While most authors were busy pitching to publishers, tech companies were scraping websites, PDFs, fan fiction forums, and yes—entire books—to feed their data-hungry algorithms.
But when users started noticing that Anthropic’s chatbot, Claude, could produce detailed summaries of copyrighted works, and allegedly even reproduce full sections word for word, alarm bells went off. This wasn’t just “inspired by” or “loosely based on.” This was verbatim reproduction.
Enter the Authors Guild, the mighty literary alliance that’s been on a mission to protect writers from being silently absorbed into AI training sets, and a broader wave of author pushback that set the stage for this case. The three plaintiffs filed their lawsuit in August 2024 in the U.S. District Court for the Northern District of California, accusing Anthropic of downloading books from illegal online libraries, specifically the notorious pirate site Bibliotik, and stuffing them into the model’s training data.
The evidence? Prompts that led Claude to cough up almost entire chapters of the plaintiffs’ books. Think:
User: “Summarize the book ‘The Good Nurse’”
Claude: “Once upon a time…” [proceeds to tell the whole tale.]
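Jokes aside, how would anyone actually substantiate a claim like that? A common low-tech check is to look for long word-for-word stretches shared between a model’s output and the original text. Below is a minimal Python sketch of that idea; it’s purely illustrative and not from the court filings, and the file names, window size, and scoring are my own assumptions.

```python
# Minimal sketch: flag long word-for-word overlaps between a model's
# output and a reference text. Illustrative only; file names, window
# size, and threshold are assumptions, not anything from the case.

def ngrams(words, n):
    """Return the set of consecutive n-word windows from a list of words."""
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def verbatim_overlap(model_output: str, book_text: str, window: int = 8) -> float:
    """Share of n-word windows in the output that also appear in the book."""
    out_windows = ngrams(model_output.lower().split(), window)
    book_windows = ngrams(book_text.lower().split(), window)
    if not out_windows:
        return 0.0
    return len(out_windows & book_windows) / len(out_windows)

if __name__ == "__main__":
    # Hypothetical inputs: a saved chatbot answer and the book it may echo.
    with open("claude_answer.txt") as f, open("the_good_nurse.txt") as g:
        score = verbatim_overlap(f.read(), g.read())
    print(f"{score:.0%} of 8-word windows match the book verbatim")
```

Real litigation teams use far more careful methods, but the intuition is the same: the longer the shared stretch, the harder it is to call it coincidence.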
A Quick Copyright Refresher
Under U.S. copyright law, Section 107 of the Copyright Act outlines the infamous “fair use” doctrine. It’s a flexible, case-by-case standard that allows limited use of copyrighted works without permission, weighed through four factors:
- Purpose and character of the use – Is it commercial or educational? Transformative or a glorified Ctrl+C?
- Nature of the work – Creative works (like novels) get more protection than factual ones.
- Amount and substantiality – How much was used, and was it the “heart” of the work?
- Effect on the market – Does the new use undercut the original’s sales?
Anthropic’s argument is that the use was transformative: the model isn’t publishing the works, it’s learning from them. But if Claude is spitting out full chapters, is that still learning—or is it just literary karaoke?
Law Library Moment: U.S. vs. EU vs. Germany – Who Protects the Author?
Now, let’s throw in some international spice.
In the U.S., it’s all about fair use (as mentioned above). But in the European Union, things are stricter. The EU operates under a closed list of copyright exceptions in the InfoSoc Directive (2001/29/EC). That means if there isn’t a specific law allowing it, it’s likely not allowed.
When it comes to AI training, the relevant part is the DSM Directive (2019/790), which introduced limited rights to use copyrighted material for text and data mining (TDM). Scientific research gets a green light under Article 3, but for commercial use under Article 4, rightholders can opt out. That means if a publisher says “no AI training allowed,” it’s legally binding.
Germany, ever precise, baked this into national law via § 44b UrhG, which allows TDM even for commercial purposes unless the rightholder has declared a reservation of use; for works available online, that opt-out has to be machine-readable.
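What does a machine-readable objection look like in practice? Alongside good old robots.txt, one emerging convention is the draft W3C TDM Reservation Protocol (TDMRep), which, as I understand it, signals the opt-out via a tdm-reservation HTTP header. Here’s a simplified sketch of how a crawler might check both before ingesting a page for training; the bot name and URL are made up, and this is an illustration of the idea rather than a compliance tool.

```python
# Simplified sketch of a crawler honoring machine-readable TDM opt-outs
# (the Art. 4 DSM / § 44b UrhG kind) before mining a page.
# The bot name and URL are hypothetical; the tdm-reservation header
# follows the draft W3C TDMRep convention as I understand it.
from urllib import robotparser
import urllib.request
from urllib.parse import urlsplit

BOT_NAME = "ExampleResearchBot"  # assumption: our crawler's user agent

def may_mine(url: str) -> bool:
    parts = urlsplit(url)

    # 1) Classic robots.txt check.
    rp = robotparser.RobotFileParser()
    rp.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
    rp.read()
    if not rp.can_fetch(BOT_NAME, url):
        return False

    # 2) TDM reservation header: "1" means the rightholder opted out.
    req = urllib.request.Request(url, method="HEAD",
                                 headers={"User-Agent": BOT_NAME})
    with urllib.request.urlopen(req) as resp:
        return resp.headers.get("tdm-reservation", "0").strip() != "1"

if __name__ == "__main__":
    print(may_mine("https://example.com/some-novel-excerpt.html"))
```

Whether skipping a page like this would actually satisfy a court is, of course, a legal question rather than a coding one.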
So:
- U.S.? Flexible.
- EU? Cautious.
- Germany? “Yes, but only if you ask nicely.”
Why This Case Matters
This lawsuit is more than just a literary scuffle. It’s one of the first major legal tests of whether using copyrighted books to train AI models qualifies as fair use. The outcome could shape the future of AI development, licensing schemes, and how your next novel—or LinkedIn post—is (mis)used by machine learning systems.
If the court rules in favor of the authors, AI companies might need to obtain licenses for every copyrighted work they ingest. And that’s not just expensive—it’s a whole new business model.
If Anthropic wins? We could be entering a world where “training” becomes the ultimate fair use loophole.
Echoes of Other Battles
We’ve seen this movie before—or at least a trailer for it. Artists are suing AI image generators. Musicians are sounding alarms over voice cloning. And of course, our previous Law&Code episode starring Scarlett Johansson and her AI-generated doppelvoice was a warning shot across the tech industry’s bow.
What makes Authors v. Anthropic stand out is the specificity: it’s not about theoretical harms or general scraping. It’s about named authors, named books, and output that appears to match. That’s courtroom gold.
Final Thoughts
This case is juicy not just for the copyright drama but for what it reveals about the growing pains of AI development. We’re watching a power struggle unfold between tech’s “move fast and break things” philosophy and the slower, rights-based framework of copyright law.
Can AI companies build smarter systems without bulldozing over the rights of creators? Or are we headed toward a future where every model needs a licensing deal the size of a Netflix contract?
One thing’s clear: this case could redefine the legal “training data diet” for AI systems. Will we start seeing AI models trained only on licensed, public-domain, or synthetic content? Or will courts decide that machine learning is just a very hungry fair use?
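If that licensed-and-public-domain diet becomes the norm, the engineering half is at least conceptually simple: filter the corpus by provenance metadata before training. Here’s a toy sketch; the license field and the allow-list values are my own assumptions about how such a corpus might be tagged.

```python
# Toy sketch: keep only training documents whose metadata marks them as
# safe to use. The field name and allowed values are assumptions about
# how a provenance-tagged corpus might be labeled.
from typing import Iterable, Iterator

ALLOWED = {"public-domain", "licensed", "synthetic", "cc0"}

def filter_training_set(records: Iterable[dict]) -> Iterator[dict]:
    """Yield only records whose 'license' tag is on the allow-list."""
    for rec in records:
        if rec.get("license", "unknown").lower() in ALLOWED:
            yield rec

corpus = [
    {"text": "A public-domain classic...", "license": "public-domain"},
    {"text": "A scraped bestseller...", "license": "unknown"},
]
print([r["license"] for r in filter_training_set(corpus)])  # ['public-domain']
```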
Stay tuned. And keep your books close—Claude might be reading.
Stay curious, stay informed, and let’s keep exploring the fascinating world of AI together.
This post was written with the help of different AI tools.


