
Recent developments in the world of copyright have been making many writers rethink their attitudes toward copyright registration and reversion of publishing rights.
Because many artificial intelligence companies used pirated books to train their large language models, there is now a growing number of copyright infringement class action lawsuits against them. While it is still undetermined whether these companies’ use of the copyrighted material was fair use, it has become clear that taking copyrighted material from pirate libraries is a no-no, especially when the method involves torrenting, which means the companies also participated in redistributing the materials.
The first of these lawsuits, Bartz v. Anthropic, just held the final Fairness Hearing on a class action settlement. Although some minor factors delayed Judge Martinez-Olguin’s approval, it looks as if the settlement will be approved and $1.5 billion will eventually be paid out to claimants who meet the definition of the class.
Needless to say, this will be an unprecedented class action settlement involving copyright. As currently calculated, the claimants for each copyrighted work that was pirated by Anthropic will share $3,100. If the work was self-published or its rights had reverted to the author, the author will receive the entire amount. It’s safe to say this is the first time the average writer will benefit from their copyright registration in any substantial way.
But not every writer benefited, for a number of reasons. The primary one was that the book had to have had its copyright registered with the US Copyright Office. To be in the class for the Anthropic class action, a work had to:
- have been downloaded by Anthropic from LibGen or PiLiMi;
- have an International Standard Book Number (ISBN) or Amazon Standard Identification Number (ASIN);
- have been registered with the United States Copyright Office within five years of the work’s first publication; and
- have been registered before being downloaded by Anthropic, or within three months of the work’s first publication.
Of the estimated seven million works that were pirated by Anthropic, fewer than 500,000 were part of the class, or about 7% of the pirated works. As of the May 14 settlement hearing, the number of works claimed was 447,576, a little over 6%.
The requirement that a work have an ISBN or ASIN is essentially unfair because it only recognizes individual books, but at least it doesn’t discriminate against self-published works. There is nothing in US copyright law that distinguishes between “books” and other literary works that may have a registered copyright. The requirement exists only to make it easier for the settlement administrators to identify works and verify authors and publishers. Still, most books do have one or the other, even if the ASIN is connected to a long-out-of-print book being sold used. Presumably some book authors have managed to avoid Amazon entirely, but they must be few.
Another similar class action lawsuit, Elsevier Inc. v. Meta Platforms, Inc., was filed on May 5 by a bunch of publishers, with Scott Turow as the only named author. It restricts the class even further. The proposed class definition is:
All legal or beneficial owners of registered copyrights, in whole or in part, for any book possessing an International Standard Book Number (ISBN) or journal article possessing a Digital Object Identifier (DOI) or International Standard Serial Number (ISSN), that Meta, without such owner’s authorization, (1) reproduced by downloading during torrenting and/or copying of web scrapes; or (2) distributed during torrenting; or (3) reproduced in connection with the development and/or training of a Llama Model. For purposes of this definition, copyrighted works are limited to those registered with the United States Copyright Office (a) within five years of the work’s publication and before being reproduced or distributed by Meta, or (b) within three months of publication.
The main difference from the Anthropic class is that books must have ISBNs. ASINs don’t count, which cuts out a large majority of self- and indie-published works, even if they do have registered copyrights. You can understand, I suppose, why the plaintiff publishers want the class restricted to the kind of books that they publish, but it’s grossly unfair to ebooks that were published without ISBNs, because ISBNs are only important for physical book distribution. It’s hard to justify limiting a class action this way when, for all practical purposes, the fairer Anthropic settlement’s class definition worked (fingers crossed).
So will there now be a rush by indie authors to purchase ISBNs? It makes sense if, for example, you claimed a book without an ISBN in the Anthropic settlement, since there’s a good chance it will turn up again in the Turow class. As with copyright registration and rights reversion, the effort and outlay start to look worthwhile. Large copyright class actions and settlements change everything.
Postscript. The fundamental problem is that there is no comprehensive registry for published works.

Why can’t we go after the pirate sites, such as LibGen or PiLiMi? Why wait until they are scraped into AI and then go after the AI company?