HathiTrust and Book Scanning: Copyright Is People

Recently, a consortium of university libraries called HathiTrust decided to make more than one hundred digitized books available as e-books to the universities’ communities because the books were “orphans,” works whose rightsholders could not be located after a diligent search.

Shortly thereafter, the Authors Guild filed a lawsuit against HathiTrust, and began to blog about how easy it was to find the authors of those books or their heirs, ridiculing HathiTrust’s process for designating orphan works (two of which, it was ultimately discovered, were actually still in print). After some defensive maneuvers, HathiTrust admitted that their process was faulty and announced they would re-think it, but intended to continue with their plan to release works they deemed to be “orphans.” Librarians across the Internet came out in support of HathiTrust, and reviled the Guild.

There is clearly a disconnect between HathiTrust and commercial writers, very handily illustrated by Duke University Scholarly Communications Officer Kevin Smith’s condescending open letter to J. R. Salamanca, author of The Lost Country, a novel originally published by Simon and Schuster in 1958, and made into the movie Wild in the Country, starring Elvis Presley, in 1961. The book was one of the “orphan works” included in the HathiTrust’s list. Within days, the Authors Guild had located Mr. Salamanca, and it wasn’t long before he was signed onto an amended complaint.

Librarians complained that this was just a publicity stunt, and it was. But it also represented much more, and it cuts to the heart of authors’ problem with what happened: it showed that librarians could regard their search for the rightsholder of a work as diligent when it was not even remotely so from an author’s point of view.

“Due diligence” has come up repeatedly as the minimum standard for a rightsholder search before a work could be declared an orphan, and many authors have conceded that it might be okay to allow limited use of a work if no rightsholders could be found after a truly diligent search. (See SFWA’s White Paper on Orphan Copyrights.) One of the expectations of a truly diligent search for rightsholders is that it would be conducted by someone who understands publishing and is able to search intelligently. This appears not to be the case with HathiTrust’s search, and is probably the underlying cause of the problems they encountered.

First, and most important, there’s evidence that HathiTrust was not searching for the authors themselves, but assumed that the publishers would be the rightsholders of these works. In response to an author comment that for a work published in 1958, the author would almost certainly be the only rightsholder because publishing contracts from the fifties don’t mention e-book rights, Kevin Smith comments that “The issue about who has the right to reproduce and distribute a digital copy of a work is usually a matter of interpreting the scope of the contractual assignment of those basic reproduction and distribution rights.” While this may be true in academic publishing, it’s certainly not the case in trade publishing, in which contracts explicitly lay out reproduction and distribution rights. The obvious conclusion is that HathiTrust didn’t find Mr. Salamanca because it wasn’t looking for him.

If true, this is very troubling. There is case law indicating that the author retains all rights that have not been explicitly assigned to a publisher (the Tasini case), and a preliminary finding by a judge that says that pre-electronic rights contracts do not give publishers e-book rights simply because the contracts specified book rights (Random House, Inc. v. Rosetta Books LLC). While the issue is by no means resolved (unfortunately, Rosetta Books settled with Random House, and the actual dispute wasn’t litigated, so we don’t know for sure how the case would have turned out), that’s surely enough case law to indicate that for out-of-print works, the author is likely to be the rightsholder that HathiTrust (and projects like it) should be looking for.

There’s another thing that points to the author being the rightsholder of out-of-print books, even if one concedes that the precedents above aren’t conclusive. Almost all commercial book contracts include a reversion clause that allows authors to reclaim all of their rights and terminate the contract after the book has gone out of print. Considering that a book such as The Lost Country has been out of print for at least four decades, wouldn’t it be reasonable to assume that a canny author like Mr. Salamanca had reverted his rights?

It would be extremely helpful if the folks at HathiTrust would reveal the steps they took to try to locate the rightsholder(s) of The Lost Country. What assumptions did they make? Whom did they talk to, if anyone? What references did they consult? Most important, what do they think a diligent search is? One thing is quite clear, and that is that a diligent search requires a diligent searcher. It can’t be automated. In many cases, it’s going to take time and effort and detective skills. And asking the publisher will only be one step in the process, which must involve looking at the publishing contract, and ultimately must include consulting with the author or his/her estate.

The good news is that there are many resources available to locate authors, both on the Internet and elsewhere. As the Authors Guild blog demonstrated, crowdsourcing works to some extent, but even a thoughtful Google search can produce a remarkable number of leads and, in some cases, an email address or phone number.

When the AG publicized the list of potential orphan works in HathiTrust’s hopper, one name jumped out at me: Fletcher Pratt. Although the book in question was one of his historical works, Pratt wrote some fine science fiction, both solo and in collaboration with SFWA Grand Master L. Sprague de Camp. I figured that would be a good test case, so I started investigating.

As it so happens, several years ago, SFWA initiated a project to locate contact information for the estates of deceased science fiction and fantasy writers. Bud Webster, who has deep roots in the SF pro and fan communities, heads up the project. Pratt died in 1956, but there was no contact information for Pratt’s literary estate in SFWA’s existing Estates Database. Bud put the word out among his contacts, and Orion CEO Malcolm Edwards came back with a name and address for Pratt’s heir, the daughter of Pratt’s wife Inga with her second husband. Case closed? Not quite…we’ve been unable to find a phone number or e-mail address that we could use to confirm her status, but the search is ongoing.

The moral of this story is twofold: 1) sources such as SFWA’s Estate Project should be consulted as part of any diligent search, and 2) copyright is people. Finding the heir(s) of an author can be very complicated, but that doesn’t mean a work is an orphan simply because several generations have passed since the author died.

That’s where the definition of diligence comes into play; if the trail peters out and the author or heirs can’t be found after a truly diligent search, then perhaps it’s time to declare the work an orphan. But a truly diligent search can’t economically be made for hundreds, if not thousands, of works; unless HathiTrust’s resources are very large, they are going to have to prioritize among their choices and be content with a slow, painstaking process. No doubt they would say that this defeats the whole point, which is based on Google’s notion that a thousand random books are worth more than a few carefully chosen ones. Books from academic publishers in which copyright was assigned to the publisher will undoubtedly be easier to trace. But if the orphan work concept is to have any validity at all for commercially published works, slow and painstaking is HathiTrust’s only choice.


  1. Wow! Anonymous was quick to hide behind Thomas Jefferson's stockings – but I don't remember reading a lot about his involvement with copyrights in his lifetime.

    So, in the interest of understanding what was said, I did a quick Google search of "Thomas Jefferson and copyright." An excellent article by Terry Hart at http://www.copyhype.com/2011/10/who-cares-what-jefferson-thought-about-copyright/ confirmed what I'd thought – Anonymous's quote was taken out of context.

    The article is worth the time to go read, especially if this reference to a Founding Father is important ammo for the copyright scofflaws. But, at the article's core, Jefferson was talking about patents, not copyrights. While the two have many similar qualities, they are not the same thing. It is difficult to protect an idea from being copied – in that respect Jefferson is absolutely right. The same does not hold true for protecting creative or artistic expression.

    Unfortunately, I doubt this will stop Anonymous from continuing to troll for more people to upset. He definitely comes across as one of those who think all information should be free, no matter what the cost. All your persuasive arguments won't sway him.

    I love the idea that Google and HathiTrust had – to make out-of-print books available to people who'd otherwise never see them, but there's a right and a wrong way to go about it. Project Gutenberg went to the trouble to do things the right way, and make sure works were in the public domain before digitizing them. Google ("Do No Evil") and HathiTrust ought to take a lesson from the moral example of the late Michael Hart and the wonderful organization he founded.

    By the way, I'll see your Thomas Jefferson, and raise you a James Madison quote about copyright and patents from the Federalist Papers #43:

    "The utility of this power will scarcely be questioned. The copyright of authors has been solemnly adjudged in Great Britain, to be a right of common law. The right to useful inventions seems with equal reason to belong to the inventors. The public good fully coincides in both cases with the claims of individuals."

  2. Another issue:

    Mass scanning projects may well be an attempt to overtake the copyright holder's voluntary republication.

    Everyone's becoming increasingly alert to the money that can be made from backlist books now that inexpensive e-book and print-on-demand editions are very easy and very affordable. Publishers are wanting to reissue out-of-print works. Authors are seeing the advantages of producing e-book and/or POD editions of their older titles, and posting these for sale at online venues such as Amazon. Non-writers are thinking about reissuing publications they inherited from older relatives.

    The income stream may not be as good as when the book was first published, but the book can sell till the end of the copyright term–in many cases, for decades–and it all adds up. Practically everyone would like to make a few thousand, or even a few hundred dollars, more a year with little effort or expense.

    And libraries are not thrilled about negotiating licenses with publishers, literary agents, or authors as to pay-per-view, or a set number of views, how many readers can "borrow" the book at a time, DRM, and other restrictions.

    But I suspect libraries are thinking, if they can just digitize a lot of copies before the copyright holder gets around to it, those books will be free to libraries forever. Meanwhile, the copyright holders will be unable to sell their books when everyone can easily download the same book for free from a library.

  3. Anonymous,

    Here's another thing non-writers fail to understand. Aside from writers who work for businesses writing software documentation and the like for hire, writers are not paid salaries.

    Nor are writers typically paid for the work all at once. They are paid in dribs and drabs. They get a modest upfront advance for a book, then they get dribs and drabs of royalties and licenses of subsidiary rights such as reprints and translations. It takes years, if ever, to recoup the investment of work and time. This is not a situation where a writer gets paid fairly once and then keeps screaming for more money.

    When I was a freelance journalist, I never recouped full payment for an article for my first sale. I wrote long articles that typically took me a week to research and write. This being before the Internet, I sold first North American serial rights, print only. Many months later, after that article was published, I sold second serial rights to every magazine I could, for dribs and drabs of $100, or $50. I was writing antiques articles and I had to go to all these regional print publications that did not compete with each other, even to sell second serial rights. Then I wrote adaptations of the articles for everyone I could, and I put some of the material in a book.

    After many such resales over a period of years, I was paid fairly for many of the articles.

    Some of the articles later appeared in electronic databases. I had them removed. I never sold or licensed any erights, and some compete with a book I put them in, which is still in print.

    Now it's a lot harder to resell articles because so many magazine publishers are demanding erights for no extra pay. They then post the article on the website, and that means absolutely zero chance of second rights sales for the article, because the Internet competes with every publication.

    So, I don't write magazine articles to get paid for them once. I don't write magazine articles at all. It's not worth my time.

    And if being a book writer were no longer worth my time, I would quit writing books. I'm an intelligent, well-educated person with other career options. For starters, I have been, and could be again, more handsomely paid to write software manuals for Silicon Valley companies. When I've done that, I've gotten a salary, plus health insurance and other benefits, plus pretty good subsidized meals in the company cafeteria.

    Wrap your head around this: Writers are not forced to produce work they are not paid fairly for, or any work at all. They can just do some form of corporate writing, or some other form of work entirely. For example, I'm also professionally trained in clothing design, and I've been a professional dancer and dance teacher.

    No writer has to slave just for the praise of readers. Let alone the open contempt exhibited by readers who don't want to pay, but assert they want the work anyway.

  4. Wow, a bunch of monopolists really want to defend their state-granted monopoly. Imagine that.

    Don't pretend that copyright is a natural right; it's a state granted monopoly.

    Copyright's days are numbered. Accept it, plan for it, and move on. If you want to write, great, keep writing – you'll be paid for new stories, but not, as now, for the old ones in perpetuity. Failing that, you get to learn about merchandising and personal branding.

    "He who receives an idea from me receives [it] without lessening [me], as he who lights his [candle] at mine receives light without darkening me." –Thomas Jefferson

  5. My point was that the post should have been prefaced with some background information. Writer Beware is usually quite good about that.

  6. @Frances Grimble,

    You are absolutely correct. A person's IP is theirs to do what they want with it. If a writer wants to lock a manuscript away and never let it see the light of day, or give it away for free, or offer leasing rights to the highest bidder, that's their prerogative.

    If you own the IP you are in control. No one can force you to do something with it you don't want to. And no one should be able to, ever.

  7. Marina,

    They should not just pay for producing the book, copyright holders should have full power to refuse to let them publish/produce it at all. BEFORE they produce it, not afterwards. Copyright holders don't want to get locked into a system where payment is fixed for them and they are forced to accept publication at that amount, or a system where they cannot refuse publication for any personal or professional reason they choose.

    Personally, I have no sympathy for readers who want books enough to clamor all over the net about their "rights" to freebies, but who don't want free books enough to take the trouble of going to a library to borrow the print book or fill out an interlibrary loan form for it. If they're inconvenienced because the book is not downloadable, tough. I'm a lot more inconvenienced if I can't pay my living or my business expenses.

  8. @jphbr: They are providing a great service to the end consumer, sure. But does that give them the right to steal? If they want to produce a book they should have to pay to lease the rights just like any other publisher.

    We don't ask city maintenance workers to work for free just because it benefits a lot of people. Or doctors. Or teachers. Or librarians. Or engineers. Or politicians. No one else is expected to get by on “exposure” and “that good feeling you get from a hard day's work.”

    Writers have to eat, too.

  9. To balance the rather limited scope of the article and the acrimonious comments it generates, I'd like to stress that HathiTrust is a great resource for scholars and people doing research in general.

  10. Non-writers usually fail to understand that copyright is a legal asset that can be transferred, assigned, and inherited. Just read any publishing contract to see how copyright is addressed.

    Any writer whose will does not specifically address what becomes of his copyrights is shortsighted.

    If I build a company, which I have by the way, that becomes an asset that I can pass to my heirs and which they can then monetize in various ways. If I produce a quantity of written works, I have the same rights to pass that along to my heirs. This concept is what has produced the current US copyright protection of the author's life plus 70 years.

    Just because an author/copyright owner is difficult to locate or might object is no reason to just put his life's work out there for free (or for the benefit of others).

  11. Anon,

    Writers, historically, have never had pension plans. They have, for the most part, written with love — for their subject, for their readers, and for their families.

    Most writers live from royalty statement to royalty statement. Even the ability to live that way is considered a luxury to most writers. While there are always notable standouts who are both prolific and profitable, most writers write with the knowledge that all they will be able to leave their children when they die is the value of their copyrights.

    In other words, you better believe that knowing our copyrights will be respected after we are dead and buried is one of the motivating factors in "if I can just hold out long enough to finish the book…"!

  12. Anonymous,

    It is common for authors to die leaving works unfinished, unpublished, partway through publication, etc. The copyright heirs have a financial incentive to get works finished, get them published or reprinted, get them marketed, get derivative works such as translations produced, sell movie rights, license the right to use "worlds" and characters an SF or other author has created, and other uses.

    This can be of as much benefit to the public as if the author had done all that. Think: J. R. R. Tolkien. I haven't added it up, but I think his heirs have produced, or aided in the production of, more work after he died than he did when alive.

    Besides, why shouldn't authors' children inherit their works? Do you object when someone who has built up a small business leaves it to his or her children?

  13. The "preservation" issues are also a red herring. For one thing, the Hathi Trust is using the Google project scans, many of which are blurry, incomplete, have images of human fingers on the pages, and other flaws. All the OCR is unproofed. This is hardly the book archive anyone wants for all time.

    Even more, there is far, far more chance that books printed on paper will be readable a couple of hundred years from now, than any software format. I think the libraries want to carve a niche for themselves getting endless grants to upgrade software formats.

    Another issue is the photos and illustrations in the books, copyrights to which may well be held by the illustrator rather than the author. Hathi does not seem to be considering any attempt to locate illustrators.

    And the Hathi Trust absolutely is condescending. They posted a list of roughly 150 "orphan works" as a test. The Authors Guild invited its members, and anyone else who wished, to locate the copyright holders. When this attempt proved successful in many cases, Hathi withdrew the list and announced that their method worked. However, they can hardly expect the general public to locate copyright holders when lists of thousands of books are published on a frequent basis. Remember, Google has scanned millions of books and all the scans are being collected in the Hathi Trust.

  14. Well, anon, perhaps because otherwise authors would spend their time creating wealth that could be passed on to their descendants.

  15. I've gotten notes from library associations about the horrible Guild and the danger to intellectual freedom… hogwash.

    I think there are some issues in copyright which may need revisiting just because the world has changed, but your intellectual property is yours, and if you choose to will it to your heirs, that's your right… and for the anon who's worried about what good that does, well, it encourages current, living writers to keep going, knowing that they, like other entrepreneurs, may leave their legacy to their heirs–just like any other productive businessperson.

    Melissa, thank you for pointing out that not all librarians are jumping on this bandwagon. I haven't had the opportunity to try the coffee cup thing just yet, but I may.

  16. A little lite information… I write and publish under more than a dozen different pseudonyms. As a result, a “diligent” search may or may not find me. In fact, it is highly likely that it will not.

    For this reason, those who wish to declare any of my works as ‘orphans’ and who do not search very hard (not very deeply) will likely decide that most of my works are ‘orphans’. They most certainly are not such.

    I find it interesting that so many people have taken it upon themselves to ‘digitize’ paper bound books. Of course this is done in the interest of preservation and to make that ‘book’ more readily available for consumption.

    It does, however, violate a number of Federal laws.

    First and foremost as I see it, there are a number of different ISBN numbers assigned and separate copyrights of any given work. One is for the Hardbound printed book, another for the Paperback, and yet a third for the electronic version. All of these versions have the same story or material within their ‘covers’.

    A copyright, by the way, includes excerpts in whole or in part.

    In digitizing either of the paper bound versions to create an electronic version (digitized) these well meaning people have, in essence, counterfeited either of those works into an electronic format in competition with the electronic format(s) already in existence.

    They have created their own digital version which either now has the incorrect ISBN number attached or they must have a new ISBN issued to cover their digitized version. Either way they are in violation of U.S. copyright law.

    Yes, undoubtedly there are ‘orphan’ works out there. But many of those 'orphaned' works have heirs who now hold the copyright or who have initiated a new copyright. The ownership of ALL of my works reverts to me or my heirs once any given book goes out of print.

    ‘Out of print’ for my purposes has been carefully defined to encompass a specific interval of time versus a specific number of prints sold within that period. I usually select less than one hundred and one copies sold in a period of six months following the first two years as my criteria. When that point is reached, the publishing rights and ownership of that specific format of the ‘book’ reverts to me or to my heirs. Please note, that it is possible for one of the three to revert while the other two are still ‘in print’ so to speak.

    At any rate, copying ANY of my works except for a short excerpt (less than two hundred words usually works well here) for use in literary critiques or for academia is expressly forbidden in my copyright statement. The ONLY way around this is to obtain my blessing (i.e., written authorization for something more than that aforementioned two hundred words). Purchasing one of each of the more than three formats (hardbound, paperback, e-book kindle, e-book nook, e-book pdf, e-book et cetera) will grant ONE use of my work in the format purchased.

    ANYTHING else is violation of my copyright or that of my publisher to whom I have assigned specific rights of publication.

    Sorry. But I intend to be VERY hard nosed about these potential thieves and counterfeiters.

    Harvey E. Seibert

  17. This is a brilliant article which I'll pass on to writers I know. As someone trying to contribute to SF as well, I appreciate the Fletcher Pratt reference. And I got steamed as soon as I read the Communications Officer's open letter. "Condescending" is definitely the right word!

  18. For Anonymous who wonders why the copyright continues after the writer's death:

    Because all your property doesn't suddenly belong to the first person who can grab it after you die, either. Including if you created that property–built that house with your own hands, knitted that sweater, crafted that cabinet. Your property goes to your heirs, whether or not it's valuable to anyone else: the diamonds and the silly little poem you or your child wrote at age eight are all part of your estate.

    Similarly with copyright. The copyrighted works still have value (or Google and HathiTrust wouldn't be after them) and your children or other heirs can (and often have) benefited from them, just as they'd benefit from inheriting rental real estate. I, for instance, have a disabled kid, now an adult. Why should Google, or HathiTrust, or you (to be specific, hiding-behind-anonymous though you are) benefit from my copyright being voided, when he–my son–is in need? Why should anyone but he get the royalties still coming in from my books, or be able to diminish those royalties by making the book available free to libraries?

    If you're going to void copyright at death, you might as well void *all* property rights at death. Why not, after all? Kick the heirs out of the house, let it be taken over by a corporation for their own profit (Google) or a non-profit 'for the public good' and all its contents shared with whomever, but certainly not your heirs.

    It's not as if copyright were the only property that can be "orphaned" in the HathiTrust or Google sense when someone dies. It happens with real property and personal property, too. Sometimes it takes months or even years to find the heirs (hence the existence of firms that do the searching). But that doesn't mean the property has no lawful owner and anyone else who wants can take it, use it, live in it.

    And, as this post clearly demonstrated, live authors of books still in print have had their rights violated…their books were never orphaned. Some of my books–all in print, all available in bookstores and on Amazon.com–were illegally digitized by Google. Not orphans. Not even out of print, let alone out of copyright. Had Google or the library who provided the books done even a cursory search on my name, they would have found that I–and my website with my contact information–come up at the top. But they didn't search; they didn't ask; they simply stole.

    So consider if there were a company, or a nonprofit, as interested in *your* property. Your shoes? There are people without shoes who need shoes. What if they could legally take them any time you weren't wearing them? Especially if they could claim that not only were you not wearing them, but they weren't worth that much any more, and you hadn't been renting them out lately. (The claim that books not selling well or at all must be "orphaned.") Your other clothes, on the same logic? Your computer, your phone, your furniture, appliances…what if you had to not only buy or make what you used, but you had to register ownership with an official bureau of ownership?

  19. I wasn't aware of the case law mentioned here, and I'm glad to hear it. Thanks for this well-reasoned and important article. I'll be bookmarking this for when I need fodder for debates!

  20. Intellectual property rights and copyrights are a total bugaboo! My mom was a songwriter – words and music – and I know she registered her songs. I tried to find the rights when she died, and the Library of Congress wanted hundreds of dollars to find the documents she filed.

    As a writer, I protect what I produce, and when a writer has passed, as Stephen King must some day, his property should not be in the public domain. His wife and kids and grandkids should get royalties from the work.

    Would a TV show in syndication stop paying just because one of the principals died? NO! The royalties keep going, and the heirs are paid. So, it's not any different with the written word.

    There should likely be a recommendation and a "formal process" for searching for owners of rights. As of now, it's a free-for-all, with finding someone or their heirs up to interpretation. A recommendation by the courts as to what makes up a "diligent search" would be really helpful.

  21. @Anon–because, like the rest of my property, I have the right to will my rights to anyone I please. No one should have the ability to come and take the things I've built for my family just because I'm not alive to stop them.

  22. And again I ask you: why does the copyright on an author's work need to be extended after said author has passed on? He has no more incentive to write something new, and copyright is supposed to be "To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries." Where in that does it say anything about any heirs?

  23. I hope more authors or their heirs speak out about this. If someone owns the rights to the work, no one else has the legal right to reproduce that work just because they feel like it. Not only is it flat out illegal, it's also unethical. It's stealing, plain and simple.

  24. Not all librarians agreed with HathiTrust. Personally I think the whole project and the attitudes of the people running it smack of arrogance.

    When all of this first broke, one of my colleagues was ranting about how ridiculous the Authors Guild was being, and that it was only right that out-of-print books be made available for users. So I stole her favorite coffee mug off the ref desk when she went on break. After all, when I asked around it didn't belong to anyone who was there. Strangely enough, the "orphan" argument didn't work when it was her stuff being appropriated.

    Funny that.

  25. Why yes. Let us make sure that those dead authors have their copyrights preserved so that they have incentive to write more books.

  26. Ah, yes, of course exploiting orphan works will be much more profitable if you get authors and other concerned people to do all the work for you.
