Four Major Publishers Sue the Internet Archive Over Unauthorized Book Scanning

This past March, I wrote about the Internet Archive’s National Emergency Library–a spinoff from the IA’s massive Open Library project, which makes scanned print books available to the public for free in various digital formats.

While many of these books are in the public domain, many are not: they are in-copyright and commercially available, and have been scanned and uploaded without authors’ or publishers’ permission, violating copyright law and potentially interfering with authors’ income. To create the National Emergency Library, the IA has used the fig leaf of the coronavirus pandemic as justification to remove even the minimal restrictions on borrowing that governed the Open Library–abandoning one of the key provisions of the legal theory that it and others created to justify what amounts to massive copyright violation.

Though controversy erupted over the Open Library a few years ago, with the IA’s actions condemned by authors, publishers, and authors’ groups, and many authors contacting the IA to have their books removed, no legal action followed. Given the outcome of the Authors Guild’s long legal fight against Google’s book scanning project, I’m guessing it might never have done, had the IA not thrown down the gauntlet of the National Emergency Library. Now, four publishers have called the IA’s bluff.

This morning, Hachette, HarperCollins, John Wiley & Sons, and Penguin Random House filed suit against the Internet Archive in the United States District Court for the Southern District of New York, alleging “systematic mass scanning and distribution of literary works.” The suit asks the court to declare that the Open Library constitutes “willful copyright infringement”, to enjoin the IA from further infringing activities relating to the plaintiffs, and to impose payment of statutory damages.

From the Association of American Publishers’ press release on the lawsuit:

From Publishers Lunch (which also points out that the IA sells its book scanning and digitization capabilities commercially, generating millions in revenue):

The full complaint can be seen here. The list of titles cited in the complaint gives a taste of the breadth of the IA’s copyright violation.

UPDATE 6/11/20: The Internet Archive has announced that it will be shuttering the National Emergency Library two weeks early, on June 16, and “returning to traditional controlled digital lending.”

It claims to be doing so because “the vast majority of people use digitized books on the Internet Archive for a very short time”, and also because “four commercial publishers chose to sue Internet Archive during a global pandemic”. Many people suspect that a lawsuit was exactly what the IA wanted, in hopes of getting legal rulings that will validate its disputed Controlled Digital Lending theory.

UPDATE 11/8/21: The latest on the ongoing lawsuit: The IA is attempting to gum up the works with extremely broad discovery demands, and plaintiffs are objecting.

Additionally, there’s new concern over the New Zealand National Library’s donation of hundreds of print books to the IA for “storage and preservation” (read: digitization). The books–many of them still in copyright–have been culled from the National Library’s Overseas Published Collections (in other words, they are books the Library no longer wishes to house).

As usual where the “controlled digital lending” theory is involved, authors are automatically opted in, and if they don’t wish their books to go to the IA, must opt out. That procedure is here. The entire list of books to be donated can be downloaded here.

Personal note: two of my father’s books are on the list, one of which is still in copyright (though no longer in print). I’m not his literary executor, unfortunately.

UPDATE 6/14/22: The Internet Archive and the publishers are seeking summary judgment in the case. Opening briefs are due July 7.

UPDATE 3/27/23: Judge John G. Koeltl has ruled in favor of the publishers. The judge’s opinion is a resounding rejection of the IA’s controlled digital lending theory.

At bottom, IA’s fair use defense rests on the notion that lawfully acquiring a copyrighted print book entitles the recipient to make an unauthorized copy and distribute it in place of the print book, so long as it does not simultaneously lend the print book. But no case or legal principle supports that notion. Every authority points the other direction.

The IA is still able to scan and lend books that are in the public domain.


  1. Hi, I like the Internet Archive. From them I have obtained some old manuscripts I couldn't obtain elsewhere. But here is a thought: in the old days of Hemingway you needed to be published to be read. If you were not published your stuff would be lost. That is no longer true today.

  2. E-books are actually a form of publishing. E-book services are a part of the publishing industry. And self-publishing has been around as long as there have been books; it was the original form of publishing and has persisted ever since.

  3. I like the Internet Archive. From them I have obtained some old manuscripts I couldn't obtain elsewhere. But here is a thought: in the old days of Hemingway you needed to be published to be read. If you were not published your stuff would be lost. That is no longer true today. Today you don't need to be published to be read. Free electronic books are a new option by which to bypass the publishing industry in order to be read. Regarding self-publishing I have had experience with Xlibris and Maple Leaf. I never made any contract with either of these, and I am glad about that. My 2014 book is today a free electronic book, and so will be the new book I am writing.

  4. All my books are listed for Open Library. One as "author unknown," but brought up by my name in a search. The books are listed as unavailable for lending. Where does Open Library get these entries? Does that mean someone submitted all my books for scanning and they have been entered in the database but the files are not yet there? Or is Open Library scooping up database entries from somewhere else, some library or book sales site?

  5. Yes, I saw that WaPo editorial, and rolled my eyes.

    Frances Grimble, thanks for the comment. I've added a link to the announcement to my post.

  6. Frances, the bad news is that there is never any compensation for authors, and the best thing that will happen is that the NYC publishers will financially bleed this site to death via expensive lawyer fights. Sadly, this site is so well-financed by the anti-copyright jerks that I doubt this will happen. They want this to go up to the US Supreme Court in the hopes of destroying copyright. Not going to happen but jerks can dream.

    If you want to learn how to get pirate sites to take down your works, etc., join the Yahoogroups list, AuthorsAgainstE-BookTheft. Send an email to

    This is a great resource for dealing with pirate sites, and members are a font of information.

  7. I have questions . . . I self-publish. If Internet Archive scans my books, does this suit give anyone outside the publisher suing any leverage? Will there be, or is there, any injunction against IA continuing to scan or post books not in the public domain that are not published by the publishers suing? Will anyone else get any compensation, if there is any? In other words, what about all the other authors?

  8. Sorry for the dribble effect, but as an author, I am so, so tired of being told "what authors want." What I want is to get paid. Yes, there are authors who want to give away their work. Though often, only a limited and carefully chosen amount of work. But authors also need to pay for their groceries just like everyone else. The idea that everyone *needs* pirated books because they are bored during a pandemic is absurd. People can legitimately download public domain free books, they can download e-books legitimately licensed by libraries, they can (gasp!) actually buy books online as e-books, print books (new or used), and audiobooks. Or you know, they can legitimately stream movies or do home repairs or . . . The idea that the books scanned for the "Open Library" are all rare books is absurd. Erin Morgenstern's The Night Circus is on that scanned list. Such a bestseller that I alone have three (legitimate!) paper copies because people keep giving it to me for Christmas.

  9. Yes, yes, yes! IA was taking advantage of the shutdown when at least some courts and law offices were closed, but I did hope for a suit when things "opened up."

  10. These IA stans/trolls are just as insufferable as they are numerous. They seemingly neither understand nor want to understand how IA is both operating illegally and has refused to respond to previous legal communications requesting that they remove the material they have no right to distribute (meaning the next legal action is to file a lawsuit); they just want to accuse the victims whose work has been stolen by Open Library users of wanting to kill the Wayback Machine and other unrelated projects. If the cost of settling this lawsuit takes down all of IA, that will be a huge loss to the internet community but this is on IA for not following the law. (There's a reason Project Gutenberg is still doing just fine and why IA should not have attempted to replace them by setting up basically the same thing, plus the Twilight Saga.)

    Many of them seem to think they're socialist messiahs demanding free exchange of art and ideas among the people but what they're actually advocating for is getting the things they want for as little money as they're possibly able to pay for it. I hate to point this out (no I don't) but taking the product of someone else's labor in exchange for no payment is the most exploitative flavor of capitalism that exists, my dudes.

    I'm actually sad the owner of IA decided to ignore the law and force the publishers to bring suit instead of just removing the books as requested, and is now begging for money from its users to cover legal fees. I don't know if this is a bold fundraising strat or soulless clout-chasing or what the point was but it's gross the way they're manipulating people who don't know better into defending them and paying for their mistake.

  11. Dear Anonymous at 12:40am:

    Anonymous, I strongly suggest that you learn something about what's behind and implied by your rhetoric before you bring it to a writer's board. There _are_ arguments to be made against the existing publishing system; there _are_ arguments to be made against lack of free library access to all; but targeting the people who create books in the first place with "Anarchy in the UK!" as not just a slogan, but a way of life, isn't an argument — it's blind ideology. Especially when not engaging in those arguments is choosing the easy way to "solve" difficult problems, presuming that those difficult problems have no side effects (like, say, mid-20th-century improved refrigeration and air conditioning having no effect on the ozone layer).

    The response to Marilynn is ignorant at best. The IA has not once, in its history, imposed a restriction on anything after it had announced wider availability of anything. The response is fundamentally a claim that "we're only endorsing looting until the government turns its attention from a health crisis to property theft, and then we'll slither off into the shadows." And it's difficult to credit the veiled accusation that Marilynn is in some kind of well in the face of what is actually in — and not in — the NEL "collection."

    The response to Victoria is fundamentally ignorant. Yes, there is in fact law stating that libraries cannot buy physical copies of the books and then scan them to make them available as e-books. The problem for you is that the law isn't a direct, plain, there-isn't-any-precondition-or-built-in-loophole directive with a single, simple, irrefutable, no-exceptions-of-any-kind citation. Under the US Copyright Act (and the Berne Convention), libraries can only digitize their collections for certain limited purposes and under certain limited conditions, and even then cannot transfer what they've digitized to others (notwithstanding the misinterpretations of the Google Books opinions, which explicitly separated the "we can digitize to make a search tool" from "and then make the digitized text freely available").

    And as far as "non-discriminatory in its outreach" is concerned, take a look at what is in the IA's collection. It is not… outreach. It is not quite overtly an effort to exploit the labor and creativity of people who cannot fight back because they don't have the resources. The resemblance to Brixton in the late 1970s and early 1980s is troubling: Outside agitators and the National Front trampling the independent shopowners in the name of their version of Manifest Destiny, and setting back that community for decades (economic development there is STILL the source of comedy routines, and it's worth considering the "starving artist and author" problem in that light). This time, it's not the shopowners being trampled, but the craftspeople creating what's on display there… which means that the victims don't make great news photos.

    In the meantime, Anonymous at 12:40am, I expect that you'll flounce off proclaiming your rhetorical victory over the artistic bourgeoisie, achieved largely by not engaging with their arguments, without ever considering that the artistic bourgeoisie are people too, with their own interests and needs and food bills. The fundamental problem with the IA/NEL approach is that it presumes that the paltry income earned from the "exclusive rights to respective writings" is just mad money, just pocket change for the idle rich, and that therefore cutting that stream off doesn't really hurt anyone. (Which, given the actual source of money behind the IA and NEL, shouldn't surprise anyone.)

  12. Marilynn: How many times should this be told that one book/ one reader rule has been done away with only as a temporary measure to help people during these days of global crisis? But perhaps you don't like the word global or non-English speaking people who according to you are neither educated nor can hold decent jobs. Perhaps one day you will come out of your well.

    Victoria: Is there a law that states that libraries cannot buy physical copies of the books and then scan them to make them available as e-books? Is it mandatory that they buy an e-book version of the book? In both it seems to me they are paying for the book. It seems to me that all your problems are because IA took some steps to counter a world-wide crisis. You do not like the fact that its reach is global.

    Anyway, it was nice to know the thoughts of both of you. Had no idea that certain people did not like the fact that IA was non-discriminatory in its out-reach. After that our debating the issue seems pointless. So good luck with your life. Stay healthy and happy.

  13. Anonymous 2. IA may be considered legal if you squint really hard and are delusional, but their National Emergency Library ticks every box for being a pirate site including the biggest illegal tell of all– not having the one-book/one-reader rule. Neither library buys the books. They have people donate the scans as a partial way to protect their legal rears, not that it does.

    Copyright is private property, and the copyright owner has the right to choose how it is sold or given away. What part of that don't YOU understand? No one has the right to take that away. Period. What you are spouting is just popular gibberish to make theft seem legitimate.

    And people who are highly literate in their language and English aren't well-educated and have decent jobs? Wow.

  14. The IA doesn't function like a normal library.

    – Libraries buy books, or licenses to them if they're ebooks, which means the author gets a royalty.
    – Libraries impose restrictions on borrowing: one person at a time (or in the case of ebook licenses, limited borrows at a time).
    – Libraries don't make new copies of books in their possession to lend out to the public.

    By contrast, the IA:

    – Acquires books by donation or discard. The IA does _not_ buy the books it scans. (Authors, therefore, get no royalties.)
    – In the case of the National Emergency Library, the IA has removed all restrictions on borrowing. Among other things, this seriously undercuts the legal theory that it uses to argue that its scanning and digitizing projects are fair use.
    – The IA uses the donated books to create new digital formats (in the case of at least one of these formats, riddled with OCR errors) without seeking or receiving permission from authors or publishers. For books that are currently in copyright, this is a violation of the law. For books that are currently commercially available, this directly competes with sales.

    In other words, the National Emergency Library is _exactly_ like a site where you can randomly download any number of books.
    If the IA were only acquiring and scanning out-of-copyright books (like Project Gutenberg), or "orphaned" books whose orphaned status they'd made a true good-faith effort to confirm, I wouldn't have nearly as much of a problem with it. Bringing old, out of print, or rare books back into circulation can be a noble goal–as long as it comports with copyright law.

    Where I have a problem with the IA and its "library" projects is that it is acquiring–for free–large numbers of books that are in copyright and commercially available, creating unpermissioned digital copies, and making them widely available to the public. For the reasons outlined above, this is illegal (not just in the USA, but in any country that's a signatory to the Berne Convention) and unethical. Cloaking it in high-flying language about the pandemic, or invoking global reach, or claiming that it's temporary (if it is; how many major "temporary" changes wind up becoming permanent?) doesn't change that. The IA's "library" projects violate international copyright law. Copyright law protects the creators without whose work the IA would have nothing to digitize. That's what this fight is about.

  15. Marilynn: What I can't understand is why you are comparing Open Library to pirate sites? It functions like a normal library. At a time you can only take a specific number of books, if the book is issued out, you have to wait for it till it is returned (and I have seen waiting lists that have more than 50 people on it which makes it more than two years of waiting if all those who borrow the book keep it for the stipulated period of two weeks), and it is very rare that the library has two copies of a book. This measure of letting people borrow books immediately without any waiting and also doing away with the one-user at a time approach is only an emergency measure which is ending soon. Also these are the books which Open Library has purchased (and therefore paid for it) and scanned. It is not like a site where you can randomly download a number of books. And further as I said earlier don't think that the e-library system is as developed in other parts of the world as in the US/ UK etc. Open Library is catering to all globally, it doesn't discriminate. Is it its global outreach that you can't digest?

    And thanks for letting me know that people outside the UK and US who can read English are highly educated. I didn't know that reading English was a pre-requisite for being highly educated.

  16. Anonymous the 2nd, here's the thing. Authors can share their work for free if they want to. Many share the first book in their series on services like Bookfunnel so people may want to buy the rest of their series. But they CHOOSE to share, and that's their right to do so. Places like IA take the books to give away without consulting the author, and they take every last book in an author's front and backlist. Where's the good in that for the author? Writing is an expensive proposition in time, money, and blood, sweat, and tears. Boosting the ego is nice, but paying the writing and life bills is a necessity. Professional writing is a job, not a charity.

    Plus, there are millions of LEGAL free books online in the US and worldwide. Sure, the newest novel by Betty Bestseller isn't there, but that's not a validation to take her book for free like a whiny toddler having a temper tantrum and saying it's your right to have it because I want it NOW. Wait for it at the local online or physical library like a grown up. Also, news alert. People in countries outside of the US and the UK who can read English are highly educated with excellent jobs so no excuse there.

    And, here's a secret by someone who has been fighting pirate sites for over twenty years. These sites say they are all about the freedom to read what you want when you want it and to heck with those pesky writers and publishers, but they are really all about the stolen money. I've seen pirate sites sold for millions of dollars, and, when authors and publishers manage to stop ad producers from funding a site, it magically goes away because it no longer funds the owners.

  17. Marilynn: Everybody is struggling, not merely the authors. And I for one believe that if my books provide succor to people during these testing times, then I'd have really accomplished something. Authors want their books to be read and not merely gather dust on the library shelves. If my book reaches a remote corner of the world through Open Library and is read and discussed, I'd be more than happy. Libraries exist because people often do not have money to buy shiny, new books. How does that become hijacking of copyright?

    Anonymous: Have you seen the database of the IA? Books that we'd never have a chance to read, perhaps would never have known about are now being made available worldwide because of IA. Old books which might have turned to dust or moth-eaten are being digitized and being made available, how can that not be preserving?

    Victoria: Why do you think only the USA is the world? The e-library system might be very strong in the US but Open Library caters to the entire world and in many parts of the world, e-library system is virtually non-existent. I am glad Open Library doesn't have such a self-centered approach. The one user at a time approach has only been done away till the 30th June. This was so that anybody already going crazy sitting at home and watching the graph of death rising can at least immerse themselves in the book of their choice and not be told that there was a huge waiting-list (there have been times when books have a waiting list of almost two years) and so the book is not available and make that person despair all the more. This measure is most appreciable. Will a few months make such a huge dent in the fortunes of these big-publishing houses? It's not that people are not buying books. Appalled that instead of appreciating the work being done by IA, people are condemning them.

  18. It's true that many people are cooped up in their homes–but their local libraries lend out a large catalog of ebooks. It's not like the IA is providing a unique service that can't be easily accessed elsewhere. So that's another of the IA's justifications for the National Emergency Library that doesn't quite fly.

    Also important: one of the key principles of the disputed legal theory of Controlled Digital Lending–which the IA and others created after the fact to justify their book scanning projects–is that borrowing is restricted to one user at a time per book. For the National Emergency Library, the IA has abandoned that principle, weakening its case (because if a key principle of your founding legal theory can be discarded at will, the theory starts to look pretextual).

  19. Anonymous, there is (literally) no apparent "preserving" of "books and other materials" going on. The works appearing in the "National Emergency Library" were scanned from existing materials… indicating that the existing materials are in good enough shape to be scanned. Looking at them further, the majority of the materials in the "National Emergency Library" (based on a statistical sample) has been in editions published since 1982, and therefore AFTER the acid/alkaline paper issue led to legitimate concerns about "fragile books that would deteriorate before their copyrights expired."

    All of which assumes that any of these books were in such small editions that they need preservation in the first place.

  20. Anonymous, IA is using this emergency as an excuse, and this lawsuit is what they wanted so they can try to push their anti-copyright agenda through the court system up to the Supreme Court. Giving away the entire front and back list of authors who are struggling just as much as everyone else and calling themselves heroes for doing it is a total jerk move.

    Meanwhile, spoiled babies who aren't willing to wait a few weeks for the latest Stephen King to be available legally through their library's paper and ebook distribution systems are throwing roses at these jerks.

    Legal free books online are everywhere. They aren't by Stephen King, but they are good reads. These people and you have no excuse to hijack other people's copyright.

  21. The Internet Archive should be applauded for what they are doing: preserving books and other materials and making them available world-wide. These emergency measures were only till the 30th of June, otherwise they issued books like physical libraries – with waiting lists and all. Instead of appreciating the fact that they are helping people cooped up in their homes, these publishers and authors are suing them. Terrible.

  22. Geoff: "cooperating" and "colluding" are completely different things. Here, these 4 businesses are *cooperating* because they have a common interest in a specific activity (here, a lawsuit that affects them all similarly). This is far more efficient and effective than pursuing 4 separate cases – and is specifically encouraged by the legal system, again for efficiency.

  23. Geoff: "… if Hachette, HarperCollins, John Wiley & Sons, and Penguin Random House can collude to file such a suit, what else might they be colluding about to the detriment of readers and authors?"

    Not likely. If major publishers colluded, they would have taken on Amazon a long time ago, back when Amazon was buying Audible, The Book Depository, and (I believe) Abebooks. When all that was happening, I shook my head in disbelief.

  24. Okay, I'll bite. While IA has been incredibly presumptive and this suit seems necessary, even valorous, if Hachette, HarperCollins, John Wiley & Sons, and Penguin Random House can collude to file such a suit, what else might they be colluding about to the detriment of readers and authors? Such as limiting what is considered "acceptable content" when it comes to public issues and private interests? Such as making the market for books revolve around the types of trite and trivial titles that they have come to pander? There are reasons why it's so hard to get published without invoking magic tropes, and there are reasons why they were chosen, the most obvious being that's how capitalism is organized.
