Authors Guild Alert on “Text to Speech” Function of Kindle 2

Just got this alert from the Authors Guild:

On Monday, Amazon CEO Jeff Bezos unveiled Amazon’s Kindle 2 e-book reading device at the Morgan Library in New York. Most of the changes from the first version of the Kindle are incremental improvements: the new Kindle is lighter and thinner, for example, and Amazon eliminated the scroll wheel. One update, however, is wholly new: Amazon has added a “Text to Speech” function that reads the e-book aloud through the use of special software.

This presents a significant challenge to the publishing industry. Audiobooks surpassed $1 billion in sales in 2007; e-book sales are just a small fraction of that. While the audio quality of the Kindle 2, judging from Amazon’s promotional materials, is best described as serviceable, it’s far better than the text-to-speech audio of just a few years ago. We expect this software to improve rapidly.

We’re studying this matter closely and will report back to you. In the meantime, we recommend that if you haven’t yet granted your e-book rights to backlist or other titles, this isn’t the time to start. If you have a new book contract and are negotiating your e-book rights, make sure Amazon’s use of those rights is part of the dialog. Publishers certainly could contractually prohibit Amazon from adding audio functionality to its e-books without authorization, and Amazon could comply by adding a software tag that would prohibit its machine from creating an audio version of a book unless Amazon has acquired the appropriate rights. Until this issue is worked out, Amazon may be undermining your audio market as it exploits your e-books.

Read the full Alert here.


  1. Very interesting reading, Victoria. I stand corrected. Apparently the technology does exist. I must be searching the wrong markets.

    That said, I now can see exactly why authors should be concerned.

  2. That's nice. Unfortunately for the visually impaired, the rest of us get to use library services for free. Don't get me wrong, I can see your point. However, having never considered the plight of the visually impaired in this respect before I can now see the other side of the argument as well.

    I honestly do not believe for a moment that a computer generated text to audio feature will in any way infringe upon the audio book market. It's a whole different world. When someone pays for an audio book they're getting so much more than the contents of the book itself. They are paying for the quality of the recording. They are paying for the rich voice of the narrator. It's not simply about convenience.

    I just don't believe that this computer generated feature is going to harm that market. I don't know of anyone who likes to listen to audio books who would trade out for this "free" feature instead. The only ones I can see benefiting from this feature (truly benefiting) are the visually impaired.

    Why should a visually impaired person have to pay a fee to use a library service? Why should he/she have to hire someone to read for them (or be at the mercy of friends & family for this simple pleasure)?

    I would think that an author advocating the use of this free feature would rack up a lot of "brownie" points with the public.

    Quite frankly, the only authors I can see being potentially "harmed" by this free service are those with a large enough following to warrant an audio translation to begin with. I mean no disrespect to other authors, but, unless you are in high enough demand to warrant an audio version via your publisher I wouldn't think that it's going to have any real impact on your career.

    Please understand that I am only expressing an opinion & mean no disrespect to any of the authors here.

  3. Sorry, I should have worded that better. Bookshare is free for students with qualifying disabilities; others with qualifying disabilities pay a $50 annual fee. But it works like a library.

  4. And LK Hunsaker nicely encapsulated my point as well.

    I’m not really sure why I’m being spoken to like I’m some kind of vicious moron simply for thinking that perhaps, if Amazon is adding bells and whistles to my work from which they will profit, they should give me a cut of that additional profit, and that perhaps in several years’ time this will be a much bigger concern. It’s got nothing to do with not wanting the visually impaired to be able to enjoy my books as well. All I know is, I and a lot of other people would probably be quite willing to put up with some loss of audio quality if it’s free.

    Oh, and btw…a commenter on another site posted a link to Bookshare, a service for the visually impaired which converts text to speech; it’s free for students and others with “qualifying disabilities.” Authors can send their books, and Bookshare will convert them and make them available. I think it’s a fantastic service and am already discussing it with my publishers.

  5. Maybe the Guild is overreacting and maybe they’re getting rightly concerned about Amazon’s attempted dominance of the book industry, first with, then buying AuthorHouse and insisting all POD books go through them and now this. My question is … what next?

    They already pay authors next to nothing when books are sold through their site. They make more on my books than I do when they sell them. Why should they get this extra feature to help them make more money when the authors don’t stand to make enough to be worth allowing those rights?

    There’s a bigger issue than the quality of the speech. By all means, allow the extra function for visually impaired buyers, but the Guild (of which I am not a member) had better be watching Amazon.

  6. …A little late putting in my two cents worth (been out of pocket for a little while).

    Personally, I'm not particularly worried about this Kindle feature. The voice command feature on my cell phone mispronounces at least 80% of the names in my contacts list. My GPS is constantly mispronouncing street & city names. I know there are those out there who fear the day when a computer generated voice can actually mimic the human voice well enough to make voice actors a thing of the past; I just don't happen to be one of them.

    I've been eyeing this feature in its various forms for over 15 years & honestly can't say that there has been any commercially viable improvement.

    Sure, it MIGHT happen, but I, for one, seriously doubt that it will within the next 15 years.

    The visually impaired, much like those who like to listen to audiobooks while driving, painting, plumbing, etc., prefer to hear something read well. Reading for "pleasure" is just that. If it's not a pleasure to listen to then, odds are, it won't be a detriment to the "real" audio book market.

  7. It wouldn’t take an AI capable of emotional thought to put emotions into reading a book. I am pretty sure that you could encode the emotional information into a book in much the same way that HTML encodes the page formatting into text. However, if you’ve deliberately encoded that into an e-book, do you have an e-book or an audiobook? It’s hard to say where you draw the line. If you think about it, the alphabet was originally a means of encoding audio information.

    The way I see it, a feature that makes e-books more useful would be good for authors: It would make e-books sell better, raise their status, and that should mean that authors would get more money and more credit for them. Personally, I think e-books are too stuck in their book metaphor, at least for nonfiction. They don’t really make the best use of the medium they are in and many of them instead manage to add artificial limitations. Imagine an e-book on car repair that asks you questions about what’s going on to narrow down why your car won’t start or a wiring diagram where certain wires go dark if you pull a virtual fuse.

    But right now when I try to access a set of e-books on car repair through one well known website of online car repair e-books, I often get wiring diagrams that come up on my computer screen sideways, or cases where the actual chart in a book is on fold-out pages but the site splits it awkwardly across several pictures that it won’t let you pull up at once. A mechanic can turn the paper book sideways or fold out the chart and expand it to see the whole thing at once, but the e-book doesn’t allow that.

    I think e-books have a lot of potential, and encoded sound is one of them… but right now too often they’ve kept the limits of a book and added the limits of a computer instead of using the computer to get around the limits of a book.

  8. Yes, Stacia, because when computers become able to READ books with the appropriate emotional context, they will be able to WRITE books with the appropriate emotional context and we authors will have a lot more to worry about than if we’re getting our dollar per copy of audiobooks being sold.

  9. Yes, Ann. Perhaps it is. Does it automatically follow, then, that it will continue to be so for the next five or ten or twenty years?

  10. December/Stacia wrote:
    Sure, maybe I lose a little something in the audio. But I haven’t had to buy anything extra either.

    Have you ever listened to an e-book using the “text to speech” function of Microsoft Reader? It ranges from dull to laughable, mostly dull. You don’t lose “a little”; you lose a lot. It’s dreadful. I can’t imagine listening to it for hours, let alone more than five minutes. Maybe the technology has improved since then, but I doubt it. The only reason visually impaired people rely on it is because they have fewer choices.

    Right now, people who want to listen to a book while driving or painting or packing or whatever can get them for free from their library. They can get public domain audiobooks for free from Project Gutenberg. They can even download them for free from their local library. All without paying for the audiobook. I can’t imagine the typical reader rejecting all those options and listening to the world’s dullest computerized voice slowly intoning its way through a book.

    Also, why is the Authors Guild just noticing this feature? Microsoft Reader could do this years ago, and as far as I know, they never addressed it. Does this mean they didn’t know it existed? Or did the AG think e-books didn’t “matter” until the Kindle came along?

    Anne Marble (who can’t get her !@#$ password to work)

  11. Yes, it’s all very well and good to say the audiobook version is better; of course it is. But if I’m taking a drive and want to keep reading my book, or cooking dinner, or painting the bathroom, or cleaning (all of which are wonderful times to listen to audiobooks; I spent an afternoon once painting and listening to Maeve Binchy’s Tara Road and the hours flew by), I have two choices. I can go buy the audiobook and keep it in my car, or I can NOT spend the extra money and simply use the text-to-speech function on my Kindle.

    Sure, maybe I lose a little something in the audio. But I haven’t had to buy anything extra either.

    The idea that *only* the visually impaired are going to listen to books on the Kindle is kind of short-sighted, I think. I would certainly do it. Heck, I’m moving in a couple of months; I can’t think of a pleasanter way to spend my unpacking hours than listening to a book at the same time.

  12. I agree with Barbara Doran and others that mention that having a program convert words to sound is not the same as having an actor perform the work. The only way I could see that happening is if someone placed complex cues throughout the text to tell the voice when to modify itself to sound angry, sad, conflicted (with all the subtypes of conflict) and so forth.

    Actors do so much more than sound nice. Regardless of the solutions attempted to create a variety of pauses, pitch and so forth, only AI will replace an actor, and not just any AI. It would have to be AI with an ability to interpret the page and communicate empathetically as well as perform nicely for the audience. When AI has that level of depth, we’ll be competing with another intelligent life form on this planet and this stink over the Kindle won’t apply.

    The only reason to fuss would be if Kindle got more money for audio-enabled versions, in which case some money should flow to the author (even if indirectly.) If it’s going to be just a feature, well, I think it’s added value, like a nice hardcover edition, rather than a true ‘rights grab.’

  13. Nareshe, I’m not sure if you or your partner are familiar with On the Go Books – It’s basically Netflix but for audiobooks. I used to have an account when I had a boring job and could wear headphones. Listening to music 10 hours a day got tiring, but I made my way through a ton of audiobooks.

    Their selection was average, IMO, when I had an account 4 years ago. I can only hope it’s improved since then. The turnaround is a little slower than Netflix because they don’t have as many drop boxes. But, for the price, it might be worth it if there are books your partner is interested in reading but doesn’t really want to own.

  14. @Anonymous – If reading a book aloud were a valuable commodity to everyone then I could see the Guild’s concern. This function is only truly of use to those for whom the written word is an issue. Most people who read books want to read them, not listen to them. Especially not in a mechanical monotone.

    Audiobooks add value to the text by using readers who actively perform it. Kindle’s function does not and cannot. All joking aside, this is not going to be a threat to audiobooks any time soon.

    It certainly isn’t a threat to authors.

  15. Gaiman’s argument is that if you buy the book, you have the right to read it aloud, or have someone read it to you, etc.

    That’s private behavior.

    Amazon wants to sell the reading-aloud in the public marketplace.

    That’s not private behavior. That’s commercial use. An additional commercial use, for which the author should be paid additionally.

    Isn’t it?

  16. I have to agree with Neil Gaiman’s comments on the matter. I’ve listened to him read his own books aloud and, in my opinion, he’s a brilliant narrator. I can’t imagine a computer ever coming close to the quality of a professional audio book. If I want to listen to a book rather than read the text, I would go to Audible, not my Kindle.

  17. It occurs to me that, should Artificial Intelligence reach a point where a computer is able to ‘fool’ the listener into thinking there’s another sentient being at the other end, that said sentience is very likely to want payment as well.

    In which case A.I.’s will just join the work force and have to earn their bytes the way flesh and blood folk do.

  18. If a computer-generated voice were at all marketable then audiobook publishers would have switched to it a long time ago. It’s certainly cheaper than paying a human reader. I’m betting that the only people who are going to use this are the visually impaired and maybe the occasional person who doesn’t want to listen to music or talk radio while they’re working.

  19. If computer voices are ever able to do a better job at reading fiction than professional humans are, then more than just authors are going to be in trouble.

    The voice actor puts a lot more into the job than just turning text into speech. There’s an interpretation there, decisions about what the text MEANS.

    The day will come, someday, when machines can cross the “uncanny valley” (look it up) and when it happens, this will be the least of our problems.

  20. As I posted on John Scalzi’s blog earlier this week:

    I agree that text-to-speech and audiobooks are two very different experiences: one is a delivery method and the other is, well, a derivative work.

    It does seem short-sighted (heh) of Amazon not to make the interface a talking one. My partner is blind and is always trying to find new ways to “read” books; I’m sure he’d buy the Kindle in a heartbeat if it were properly accessible. He has text-to-speech software whose voice he’s used to, which doesn’t always work with e-books (as mentioned above). You’d think that the arriving e-book revolution would open up limitless avenues of reading for those who are visually impaired or otherwise in need of alternative formats, but that doesn’t seem to be the case, at least not yet.

    (Further notes on reading for blind people: My partner may not be typical, but he doesn’t like using Braille or listening to actual audiobooks. He likes text-to-speech software because he can speed it up much faster than actual talking speed and thus consume a book at something closer to reading speed.)

  21. I read about this on at the beginning of the week. They were taking the mick out of the guy, Paul Aiken, who brought this up:

    and they’ve since made a second post about it.

    I personally think he’s got half a point about it. Computer voices may sound flat now but we don’t know what they will sound like five years from now – as a couple of others have pointed out. They could link it to the Microsoft music programme in development and you could get backing music as your Kindle reads to you.

    Off topic but Techdirt also discussed the Google book scanning issue again and concluded that it wasn’t as good a deal as some people think it was. Don’t know if you blogged about this Victoria (but I think you did) but it could be worth looking at again.

  22. anyone who’s listened to a REAL audiobook can tell you there’s a world of difference between having a text read to you by a computer voice and the text being performed by a skilled cast including sound effects, emotional speaking, and actual presentation of the material.

    just saying.

  23. I think the Authors Guild is shortsighted on this. I explain my reasoning here: But in a nutshell, I think text to speech will increase book sales rather than hurt them. There is a huge potential market out there, the reading impaired (not just blind and dyslexic – there are other reading disabilities as well) who would buy an ebook knowing they can read along with the speech. With that feature I will strongly consider a Kindle for my daughter. Without it, we won't be buying any ebooks (or Kindles).

  24. Just dittoing Anon 10:40 and Jean. Speaking as someone who started their career in ebooks, I’m tired of ebooks being treated as “not real books” and therefore not worthy of defense.

    While I totally get the point that the technology as it stands today is no substitute for real audiobooks, at some point, probably pretty soon, it will be. I’m glad the Authors Guild, at least, is looking ahead and paying attention to where these things may go (unlike, say, the RWA, who prefer to bury their heads in the sand and pretend those dirty old ebooks don’t exist.)

    To me it’s not about disabling access for the visually impaired; I am all for any technology that will allow people with lower visual capabilities to continue to read books, or to enjoy them as audiobooks. It’s simply a matter of finding a way to make sure the author is paid for those rights *before* the issue hits critical mass and the text reader functions sound as natural and good as an actual person sitting and reading to you.

    Why does the author have to give up the rights and lose out? Why can’t Amazon give up a percentage of its profits for every Kindle book sold with the audio function enabled, and give that money to the author? Why can’t there be two Kindle versions, one audio-enabled and one not, at the same price (nobody is interested in making visually impaired people pay more; I’m certainly not), but on one version Amazon makes, I don’t know, 5% of the sale price and on the other they make 4.5%? Since they’re the ones “taking” the rights, shouldn’t they freaking pay for it? Why is it always on the writers’ shoulders to give up the right to be paid for our work?

  25. Yup, this is a massive case of overreaction. There’s a difference between protecting the rights of the authors, and blindly grasping for more power when there’s something out there that someone could possibly profit from.

    In short, just because the technology allows for something doesn’t mean you immediately have to start thinking of ways to profit from that. A lot of great technologies have been brought down when people started inventing ways to fleece every last penny from people using those technologies.

    (A provocative example I mentioned somewhere recently: Would they also stomp on people’s right to burn books? I mean, if it’s based on the content of the books, and can’t be done without the entirety of the accompanying text at hand rather than just fair-use proportions – as a result such form of expression would be clearly a derivative work under copyright law, no? Protecting rights! Ka-ching!!!)

    It’s also definitely a case of trampling on consumer rights. The moment you start thinking what people are (privately) doing with the media product you sell, you’re probably going way too far already.

    How about the authors just provide the texts, the customers buy them, and the engineers think how to deliver it electronically for the best satisfaction of all parties? It used to be so simple. Just sell the texts.

  26. As I read this, the idea is to head off the audio impact when text-to-speech quality improves to a level to rival professionally created audio books. Certainly, an Audible (for example) product read by a real human, usually with a fine voice quality, is superior to a machine-made audio product which is likely to equate to Adobe Acrobat Reader’s already available text-to-speech synthesizer.

    I think the idea is that if this goes unchallenged now, it will be harder to fight when the capability rivals the better quality audio products currently available, and harder to defend an author’s right to be paid for them.

    Since this same concern doesn’t appear to have been raised for the Adobe Acrobat Reader capability, I can only surmise Amazon’s Kindle is perceived as a real threat.

    And, yes, a software tag inserted by Amazon would be a simple way to protect the rights, but there’s bound to be someone out there who finds a way to break the protection. Is it worth it?

    Interesting item for discussion and consideration.

  27. Well, but here’s the thing. What happens when the robotic voice improves to the point at which it is as esthetically acceptable as human speech? Because it will. And when it does, it will be too late to say, Hey, wait a minute.

    To me, the point is that Amazon is using content that authors created in a way that adds value to Amazon’s product, without sharing the added revenue with the authors who made the whole thing possible in the first place.

    That one will of course make exceptions for visually impaired readers goes without saying. But that Amazon can take whatever it wants just because it can should not go without saying…or so it seems to me.

  28. Just to add my two quatloos’ worth of opinion… I have been working in the computer industry for almost forty years and have seen the evolution of society from that perspective. This dust-up over the Kindle text-to-speech is bothersome to me both as a geek and as a writer. From the technical perspective, it smacks of stifling, censorship and discrimination (against people with a visual handicap). From a writer’s perspective, it limits the range of people (again, people with a visual handicap) that can read my work. Remember, this is NOT audio book format. It is the printed word converted to a different medium.

    Cases have already been made that a ruling against the Kindle could have far-reaching impacts: for example, would it be illegal to read your child a story? Methinks some people have way too much time on their hands.

  29. My partner is legally blind, and she absolutely loves her Kindle. She can make the text large enough for her to read–and it’s allowed her access to a huge number of books that aren’t out in audiobook. (Anyone who hasn’t been dependent on audiobooks as their only source of books probably isn’t aware of the hideously limited selection of audiobooks out there, especially for new fantasy by lesser-known authors.)

    Unfortunately, the new Kindle won’t read menus aloud. So while a blind person might be able to access the ebooks…they won’t be able to use the menus.

    This is especially frustrating for my family, as my partner recently had a detached retina in her “good” eye, which led us to run into all sorts of trouble with her computer. It would have been really nice if her Kindle could have done text-to-speech for her, since a lot of the stuff she’s used to doing with her computer was either more complicated/annoying, or downright impossible.

    She prefers audiobooks when she can get them–anyone would, I think. But audiobooks are expensive (for good reason, I hasten to add! Not quibbling about the price of a professionally read book, especially since I’ve heard some of the Library for the Blind stuff) and avid readers can quickly run through the available selection. Using the library helps, but libraries can’t get what doesn’t exist.

  30. This isn’t the first time text to speech has been an issue with e-books. Microsoft Reader can read e-books aloud, too. Most large publishers have blocked this “feature” from working on their secure (DRM) Microsoft Reader titles. Every time I open a secure Microsoft Reader title, I have to click a warning that explains text to speech functionality has been disabled on that book. It is annoying.

    Publishers feared this “feature” would compete with sales of audio books. Anyone who has ever tried the “text to speech” feature will know this is insane. An automated voice slowly intoning its way through a book is not competition for an audio book read by a real person. Try an SF or fantasy novel. Microsoft Reader thinks that treecat should be pronounced treekit. Heck, it can’t even pronounce “shouldn’t.”

    When publishers first disabled this feature, there was controversy because blocking the feature prevented the visually impaired from being able to use the e-books they paid for legally. I don’t think those issues were ever addressed, and the publishers came across looking overly paranoid. Now maybe it’s time for the Authors Guild to do the same.

  31. It is kind of an interesting question, but from a computer science perspective instead of a lawyer’s, I kind of doubt it qualifies as audio recording rights since it is an on the fly reading device. Besides, even if it doesn’t invent comical mispronunciations, I don’t think it is going to be at all comparable to what a competent reader can do. I don’t see this as any different from how you could potentially copy text from a PDF and paste it into a text to speech program on a PC.

  32. As a producer of audio content, I am completely unafraid of this function.

    If it’s a “rights violation” for a computer to read the book to you, is it a rights violation for a parent to read a bedtime story to his child?

    This is not a public performance. The Guild is overreacting.

  33. Another point to think about is that the Kindle is a device that actually restricts the rights of readers. You can lend or resell a physical book, but you can do neither with a Kindle.

    Most writers, I think, are readers, too. That being so, they should think seriously about supporting something that benefits publishers at the expense of readers.

  34. I see the troll is back. I won’t bother deleting its comment; if you spew silly crap, you deserve the ridicule you will likely receive.

    I’m not yet completely sure what I think of this issue, but on balance I think I agree with those who feel the Authors Guild is over-reacting, and point out that a computer-generated voice isn’t equivalent to an audio production.

  35. The general hatred of e-books and anything non-traditional on this forum is yet another example of overreacting. Now we have to make text books available to the blind only as braille! Come into the 21st century, you people who run this forum. For weeks now you’ve been resisting everything that’s not a traditional printed book as a ‘scam’. You let your so-called ‘faithful followers’ use the word and then give them a chastising that is hardly a slap on the wrist and say that it wasn’t you who used the term. That’s like Stalin trying to say that he was innocent, that the KGB did all the crimes. Disgraceful. You would rather see the blind go without a book than break your traditional mode. Let’s face it. Books (i.e. paper surrounded by a cover) are not on the way out, not by a long shot. But they will have to put up with some competition from other forms of media. AND NOT ALL OF THESE FORMS ARE SCAMS!!! GET THAT INTO YOUR HEADS. This so-called ‘alert’ is the overreaction of people who are running scared.

    The published authors on this forum who love to tell us that anything other than their old paper book is not a real book have to revise that opinion. Look at it scientifically instead of having the same old ALCs and nightmusics calling us ‘Einstein’ in their snobby down the nose way. Wake up, the future is here, Isaac Newton!

  36. It would be amusing to hear the Kindle’s version of some of Patrick O’Brian’s works, sprinkled as they are with Latin, Spanish, French and nautical terms such as “fo’c’s’le”. I suspect it would sound painful at times.

  37. I have to add my general agreement. If the function meant that the Kindle would create a .WAV file that could then be passed to other people I’d worry. As it is, this is not much different from using a magnifying glass to read.

  38. A lot of the new technology has already had its wrist slapped for inaccessibility for the blind; the ubiquity of touch screens is putting phones, MP3 players and some eReaders out of reach of the blind. I thoroughly support an attempt to rectify that! It will never compare with a decent audio book, but it’s a nod towards disabled users.

  39. Anonymous Coward seen in this thread.

    What “rights” are being given away here? The audio book rights are for a professionally produced piece, not for a computer generated voice to read out the text in a flat monotone.

  40. Tell you what, you give your rights away and I’ll sell mine, and we’ll see where we all end up, okay? This is a rights grab, pure and simple.

  41. I concur. The Guild is overreacting. The people utilizing this technology would most likely be the visually impaired.

    Particularly if the prices of ebooks and audiobooks are somewhat similar, those who are audiobibliophiles (Hey Martha, I just made up a word) like myself would much rather listen to a real person reading than a computerized voice.

  42. I think they’re overreacting as well. The Kindle text to speech is the same that comes on any computer…it’s an accessibility tool for the blind and hard of seeing.

  43. Neil Gaiman, John Scalzi, and at least one other author whose name escapes me at the moment have all said that they think the Authors Guild is overreacting on this point.
