Artificial Intelligence and Copyright: SFWA’s Comments to the US Copyright Office

Image Header: Neon image of a human brain embedded in a computer motherboard. Credit: vchal at Shutterstock.com https://www.shutterstock.com/

In August, the US Copyright Office issued a Notice of Inquiry (NOI) seeking public comment on copyright and artificial intelligence. The NOI is part of the Copyright Office’s AI Initiative, which seeks to assess whether copyright-related legislation or regulation is needed in response to the rapid development and deployment of generative AI.

The NOI seeks factual information and views on a number of copyright issues raised by recent advances in generative AI. These issues include the use of copyrighted works to train AI models, the appropriate levels of transparency and disclosure with respect to the use of copyrighted works, the legal status of AI-generated outputs, and the appropriate treatment of AI-generated outputs that mimic personal attributes of human artists.

Below is the Science Fiction and Fantasy Writers Association’s response to the NOI. It details a variety of harms reported by novelists and short fiction writers as a result of generative AI (including the training of AI systems on vast amounts of in-copyright work without permission), and offers four specific suggestions for protecting creators and their work.

Although the response is specific to SFWA and writers of speculative fiction, these are not issues confined to genre: they confront all writers, in all markets, at every stage of their careers.



The Science Fiction and Fantasy Writers Association (SFWA), formerly Science Fiction and Fantasy Writers of America, is a 501(c)(3) nonprofit organization whose mission is, in part, to support, defend, and advocate for writers of science fiction, fantasy and related genres. Formed in 1965, SFWA currently has over 2,500 commercially published writers in those genres across various types of media. Its membership includes writers of both stand-alone works and short fiction published in anthologies, magazines, and in other media. SFWA is not a subsidiary of any other entity. SFWA has no subsidiaries or other ownership interest in any other organization that may be affected by the Copyright Office’s policies on AI. 

It is in that capacity that we write this letter in response to the Copyright Office’s call for comment on issues raised by artificial intelligence systems. As creative writers who have long had an eye on the future, we are no strangers to the concept of artificial intelligence; indeed, the work of our members is frequently mentioned by the people who over the years have made progress in that field. We have long anticipated these developments and have thought deeply about their promise and pitfalls. With this in mind, it is with much regret that we cannot yet speak in favor of using AI technology in the business of creating art.

The current crop of artificial intelligence systems owes a great debt to the work of creative human beings. Vast amounts of copyrighted creative work, collected and processed without regard to the moral and legal rights of its creators, have been copied into and used by these systems that appear to produce new creative work. These systems would not exist without the work of creative people, and certainly would not be capable of some of their more startling successes. However, the researchers who have developed them have not paid due attention to this debt. Everyone else involved in the creation of these systems has been compensated for their contributions—the manufacturers of the hardware on which they run, the utility companies that generate their electrical power, the owners of their data centers and offices, and of course the researchers themselves. Even where free and open source software is used, it is used according to the licenses under which the software is distributed as a reflection of the legal rights of the programmers. Creative workers alone are expected to provide the fruits of their labor for free, without even the courtesy of being asked for permission. Our rights are treated as a mere externality.

Perhaps, then, creative workers uniquely benefit from the existence of these artificial intelligence systems? Unfortunately, to date the opposite has been the case: SFWA has thus far seen mainly harm to the business of writing and publishing science fiction and fantasy as a result of the release of AI systems. 

For example, short fiction in our genres has long been recognized as a wellspring of the ideas that drive our work as well as inspiring works in film, games, and television. Writers in our genres rely on a thriving and accessible landscape, which includes online and paper magazines. Part of the success of these publications depends on an open submission process, in which writers may submit their stories without a prior business relationship. This has frequently served as a critical opportunity for new and marginalized authors to have their voices heard.

Over the last year, these venues, particularly the ones that pay higher rates for stories, have been inundated with AI-written stories. The editors uniformly report that these submissions are poorly conceived and written, far from being publishable, but the sheer volume materially interferes with the running of these magazines. Once submission systems are flooded with such content, it takes longer to read and reject a submission than it took someone to have an AI produce it in the first place. Every submitted work must be opened and considered to verify that the writers for whom the system was originally designed are not missed or forgotten.

This amounts to a denial-of-service attack against a critical part of our community: at best, authors wait longer for the attention of a more-hurried editor; at worst, we fear that valued markets may be forced to close their doors to unsolicited submissions. As more and more users of AI hear about potential get-rich-quick submission schemes, this problem will only grow worse. Significantly, the harm to our marketplace will fall hardest on new and marginalized authors.

Likewise, novelists have reported to SFWA that their work has been increasingly crowded out in online marketplaces such as Amazon, where there is a well-known issue of large numbers of AI-generated book-length works for sale. In this case the market itself becomes clogged with AI-written books that are posted in the hopes of playing Amazon’s algorithms to lure readers into careless or adventurous spending, leaving those readers less able to find real writers’ work and less willing to take a future chance. Again, it is new and marginalized authors without name recognition who will suffer most from the significant rise in noise in the marketplace, but even established authors are seeing AI-generated counterfeit works appear under their names.

In addition, we have received reports of AI systems trained on copyrighted material being induced to produce material that would be a violation of the original creators’ rights to license derivative works. The background and setting of science fiction and fantasy books are uniquely copyrightable and suited to derivative works that may not even copy the characters, but only the world in which the story takes place. It is common for science fiction and fantasy authors to license these worlds to other authors or publishers as derivative works. AIs have been documented being used without permission to produce putative prequels or sequels to copyrighted works using these derivative elements, for example, or to provide quotes from books. 

AI does not uniquely enable such violations, of course; the potential harm is that what can be done on purpose can be done by accident. It seems entirely plausible for someone to generate text from an AI for use in their own creative work that violates the rights of another author. They would then publish the resulting work or seek copyright registration without ever understanding that their work infringes on the rights of others. Should this become common, the courts would likely become increasingly full of litigation to protect original intellectual property, at considerable cost and chilling effect.

Clearly, we are only experiencing the beginning of a potentially devastating process for both creators and consumers alike.

Even if, at some point in the future as AI becomes more sophisticated, these ersatz books and stories become acceptable to human readers, that will be wholly because of the copyrighted material that these systems have copied and ingested. If they have any value at all, it will be due to the work of creators that has been taken without permission or compensation, and turned to a use that harms the ability of those creators to ply their trade. This state of affairs cannot stand.

The Copyright Office is rightly concerned with the details of the registration process and how AI-generated writing should be treated by the law. We agree wholeheartedly that AI-generated works should be uncopyrightable and the Office should refuse to register such works. To promote the progress of science and the useful arts, that line must be held.

We are forced to conclude, however, in the light of the harms listed above, that existing policy is not sufficient. Lack of copyrightability has not prevented marketplaces from being flooded with counterfeit fiction. Treating AI-generated text as legally on par with public domain work ignores the possibility that it in fact inadvertently contains infringing elements placed there by a machine seeking only to respond to a prompt and not concerned with rights.

It must also be considered that AI systems may be trained in other countries with different laws regarding the ingestion of creators’ work. Care must be taken to craft regulations and laws that protect rightsholders in the US, no matter where the AI itself is trained. Training outside the US must not become a loophole allowing the violation of rights in the US.

Remedies will not be easy. However, we offer four suggestions that we consider necessary to protecting creators and preserving their future ability to create.

  • We recommend a disclosure requirement, to protect readers and potential business partners by explicitly identifying AI-created contributions representing at least a de minimis creative expression. Supporters of human authorship want to know what it is they’re buying, and markets have a right to not be fooled into publishing work which the putative authors have limited or no legal right to license.
  • The copyright registration of textual works that include a mix of AI-generated and human-authored expression should indicate how much of each there is. There is no easy way to distinguish whether a work is 95% human-generated and 5% created by AI, or vice versa. The inclusion of purpose-created AI work thus differs from the otherwise-analogous inclusion of public domain work, which can in principle be researched and identified. We therefore ask that the Copyright Office include in registration a disclosure statement from the registrant estimating how much of the work is created by AI.
  • We recommend requiring proof of license or legal right to all training data before any AI-generated work based on that data can be incorporated into a copyright registration. Just as it is considered necessary for commercial software to track the licenses of all libraries used in its creation and verify the right to put those libraries to such use, it should be necessary as well for the training data of an artificial intelligence to be tracked such that it can be proved that all incorporated work is used under license by consent of the author. Work incorporating AI without that clear provenance should be treated as presumptively infringing on the copyright of unknown authors.
  • In the event that a collective license for AI textual works may be established, a number of factors need to be taken into account before accepting such a license, many of which are often overlooked. We believe that such a system should be voluntary, opt-in, and should cover any professional authors whose work has been ingested by the AI system. Collective licenses that do not receive explicit permissions from the authors of covered work should be treated as no license at all for these purposes, and AI-generated work based on them should be treated as infringing on the copyright of authors who did not voluntarily opt in.

As writers of science fiction and fantasy, we are confident that the boundless ingenuity of the researchers who brought these AI systems to the world will find ways to do their work that fully respect our rights.

SFWA looks forward to the opportunity to provide input on whatever additional subjects may arise during the course of this study.

9 Comments

  1. If my writing was being used in a classroom to train young human minds, I believe it would have been obtained through authorized means: buying copies of my novels, buying a subscription to the magazines where my stories appear, or otherwise supporting the medium of delivery. It is my understanding that AI will do none of this. It will scour the internet for text and use it to train itself. Thus I would not be remunerated for its use of my property.

    1. No argument with the premise, but a few observations and perhaps a question or two.

      1) used books – there’s a similar argument to be made in the case of used books. Someone other than the author is profiting from the used book trade. The counterargument would be that the author has already been compensated for their work through the original publication venue, and now we’re dealing with a secondary market. For the sake of argument, do you feel authors are due a cut from the proceeds of the secondary market?

      2) libraries – here, we deal with a dilution of potential profits. Sure, the author has made a number of sales to the institutions (usually at a discount), but that number is small compared to the hypothetical potential market if every reader had purchased the books, magazines, or other medium. Some argue that’s essentially the equivalent of advertising costs, getting one’s work out there to be discovered. I don’t agree because libraries don’t buy all published works; only the ones for which there’s already a demand (and sometimes mandated by a school’s curriculum). Essentially, if you’re an unknown author, the chances of libraries buying your books are practically zero. How do you view library purchases of authors’ works for distribution to a non-paying market?

      3) schools – I don’t know if your works are being used in schools to train young and susceptible minds (congratulations if that’s the case), but if they are, please refer to #1 and #2 as likely sources for those books (at least they were in the previous century when I attended school).

      Side note: consider how forcing readers to read someone’s books may actually work to sour said readers to the work in question . . . I know it did me.

      All this goes to the larger question . . . given the breadth and scope of the mining by large language models, and say we could agree on some amount of compensation for the authors, what would you imagine the compensation calculation to be?

      So, for example, pick any amount of gross or net profit made by these companies, would you say a given author (or writer) is due a flat percentage? Or maybe an amount based on the percentage of their works compared to the totality of the information mined? Some other metric?

      Let me float a likely real-life answer . . . pick a number for a settlement (a billion dollars), subtract the lawyer’s cut (say, 30%), then divide the remainder by the number of authors (published), and unpublished writers, bloggers, etc. etc. whose words were scanned by the algorithm, and I suspect the resulting cut would amount to not even being worth the time of making a claim (my experience from class-action suits) for anyone other than someone very famous or with a large investment in the original product.

      Ah, you say, but it would teach the companies a lesson and force them to change their ways . . . perhaps, in a fictional world, but I don’t see any evidence of it in the real world.

      I could be wrong . . . especially if I’m actually an AI.

      Even then, I would still make the following claim: none of this content has been generated by or with the help of AI.

  2. Another area of contention, but in that case — and without knowing anything about the case other than as presented by you — I think LoC is way off-base and wrong in ruling to censure the product because of the tool.

    The thing is, almost everything is derivative. I publish lots of posts with photos I’ve taken, modified, made into paintings, etc., and I also post images I generate via various AI Art Generators. Those are my images generated from prompts I give to the AI using a tool I pay for, and for which I have rights of publication.

    I don’t see why those should be treated differently. Most of the artists who are complaining do so over the use of their source material, but they don’t have an argument if I want to employ their style for creating my own works.

    As far as I know, you can’t copyright styles or ideas, so I’d be interested in hearing the LoC argument.

    1. I fully agree. I have a clothing designer who does futuristic designs using generative AI. She always says created by her name and assisted by Midjourney. I use either Midjourney or Nightcafe to generate art for my LinkedIn posts. I always let people know they are created by one or the other and my brain. Should I choose to use the artwork for one of my children’s books I would note in the credits the illustrations are by me assisted by whichever one I was using. I don’t think that’s wrong.

  3. What about the use of AI-generated artwork? There was a young lady who wrote a graphic novel and obtained a copyright. She mailed a copy of her book to the Library of Congress and they revoked her copyright because she used AI to generate her cover and her art. They were not disputing content, they were disputing the art.

  4. Interesting topic.

    Personally, I don’t believe there’s any way to rein this in, and the suggested rules — as with most rules — are likely to be followed only by honest individuals.

    Besides, considering that few people already pay attention to copyrighted material, and that bringing suit against someone is costly and judgments are limited to damages (for example, zero in the case someone steals any of my blog-published fiction content), copyright infringement law will — as is currently the case — only be useful to someone with deep pockets suing over content that achieved a measure of success.

    Given traditionally published books these days require agents just to be considered, AI content will likely flood the self-published arena, but that already offers modest reward, and even then, only for a few.

    Side note: I await the day scammers will approach self-published authors of AI-generated tripe with offers of traditional publishing and movie contracts procured for a fee . . . and won’t that be fun to watch!

    Given my understanding of how things work, I don’t see AI-produced work having any advantage over new writers . . . neither will get a break unless they network, and there, AI is at a definite disadvantage (until merged with some type of soft or hard robot).

    I agree that the short fiction market will suffer the most, but, again, I don’t see how the coming flood of AI-generated submissions can be stemmed.

    In fact, I see a business opportunity for someone actually asking for AI content, likely becoming a genre unto its own.

    But, what do I know? Perhaps the copyright office can formulate an effective policy. Still, I think it will be difficult to enforce.
