The artists suing AI image generators for copyright infringement suffered a setback this week when a federal district judge in California dismissed most of the claims they brought against Stability AI, Midjourney and DeviantArt. In a 28-page order, Judge William Orrick called the plaintiffs’ complaint “defective in numerous respects,” and noted that two of the three named artist-plaintiffs, Karla Ortiz and Kelly McKernan, had never registered their work with the U.S. Copyright Office and are therefore precluded from bringing an infringement claim in federal court in the first place.
The ruling is not necessarily fatal to the putative class action case, however, as the judge granted plaintiffs leave to file an amended complaint to remedy the defects, at points even suggesting ways that some of the problems with the initial pleading might be fixed.
The ruling does, however, illustrate the difficulty creators and copyright owners face in trying to hammer the design and function of generative AI systems into a legal framework premised on a very different technological architecture and operational paradigm.
To highlight just two examples from Judge Orrick’s ruling:
The court distinguishes between the work of Stability AI, which had a hand in directing the copying and assembling of presumably copyrighted images into the LAION (Large-Scale Artificial Intelligence Open Network) dataset used to train Stable Diffusion and could plausibly sustain a charge of direct infringement, and the subsequent use of the Stable Diffusion software engine by DeviantArt and Midjourney to power their apps.
But Judge Orrick is skeptical of plaintiffs’ claim that Stable Diffusion retains “compressed” copies of the images that DeviantArt and Midjourney are distributing through their apps.
He writes:
Turning to the first theory of direct copyright infringement and the plausibility of plaintiffs’ assertion that Stable Diffusion contains “compressed copies” of the Training Images and DeviantArt’s DreamUp product utilizes those compressed copies, DeviantArt is correct that the Complaint is unclear. As noted above, the Complaint repeatedly alleges that Stable Diffusion contains compressed copies of registered works. But the Complaint also describes the diffusion practice as follows:
Because a trained diffusion model can produce a copy of any of its Training Images — which could number in the billions — the diffusion model can be considered an alternative way of storing a copy of those images. In essence, it’s similar to having a directory on your computer of billions of JPEG image files. But the diffusion model uses statistical and mathematical methods to store these images in an even more efficient and compressed manner.
Plaintiffs will be required to amend to clarify [sic] their theory with respect to compressed copies of Training Images and to state facts in support of how Stable Diffusion – a program that is open source, at least in part – operates with respect to the Training Images…
Depending on the facts alleged on amendment, DeviantArt (and Midjourney) may make a more targeted attack on the direct infringement contentions. It is unclear, for example, if Stable Diffusion contains only algorithms and instructions that can be applied to the creation of images that include only a few elements of a copyrighted Training Image, whether DeviantArt or Midjourney can be liable for direct infringement by offering their clients use of the Stable Diffusion “library” through their own apps and websites.
The questions Judge Orrick raises go very much to the heart of claims heard frequently from copyright owners about generative AI systems: that they make and retain actual copies — compressed or otherwise — of protected works, in the same manner that a computer makes and retains copies of documents it accesses, or as a tape recorder makes an archivable hard copy of a musical work. But as Orrick notes, the plaintiffs themselves struggle to make clear how they believe that process actually works.
A second problem the ruling identifies, also concerning a claim heard often from rights owners, is the contention that generative AI models necessarily produce derivative works, as defined in the Copyright Act, that are based on protected works copied in their training.
Referring to a line of cases plaintiffs cite to support their derivative works claim, Orrick writes:
Plaintiffs rely on that line of cases and point to their allegation that all elements of plaintiff Anderson’s copyrighted works (and the copyrighted works of all others in the purported class) were copied wholesale as Training Images and therefore the Output Images are necessarily derivative. See Compl. ¶ 95 (“Every output image from the system is derived exclusively from the latent images, which are copies of copyrighted images. For these reasons, every hybrid image is necessarily a derivative work.”).
A problem for plaintiffs is that unlike in Range Road – observed wholesale copying and performing – the theory regarding compressed copies and DeviantArt’s copying need to be clarified and adequately supported by plausible facts.
The other problem for plaintiffs is that it is simply not plausible that every Training Image used to train Stable Diffusion was copyrighted (as opposed to copyrightable), or that all DeviantArt users’ Output Images rely upon (theoretically) copyrighted Training Images, and therefore all Output images are derivative images.
Even if that clarity is provided and even if plaintiffs narrow their allegations to limit them to Output Images that draw upon Training Images based upon copyrighted images, I am not convinced that copyright claims based on a derivative theory can survive absent “substantial similarity” type allegations.
It’s unfortunate for all concerned that the plaintiffs in this case (and their lawyers) have framed their arguments so poorly. It can perhaps be attributed, at least in part, to the panicked haste with which artists and rights owners are racing to the courthouse in the face of the explosive development of generative AI technology. But as the lawyers say, bad cases make bad law. And this is a bad case.
More fundamentally, though, it’s a case that illustrates the challenge of driving the square peg of AI into the round hole of copyright. Copyright law has, for centuries, managed to adjust to and accommodate new technologies. It arose, in the Statute of Anne in 1709, in response to the technology of the printing press. And it has been extended and refined with the introduction of each new such technology without deviating from its original operational premises.
The VCR clearly made human-perceptible copies of TV programs; the only question was whether those copies should be considered infringing. Television itself plainly performed and distributed copyrighted programming; the only question was who should pay whom for the rights to do so and how.
Here, the plaintiffs make claims of infringement of three of the five basic rights reserved exclusively to authors by the Copyright Act: the rights of reproduction, distribution and the preparation of derivative works. But the court found none of those claims persuasive based on the facts alleged so far.
Whatever Stable Diffusion is doing, it isn’t making, retaining or distributing copies of all 5 billion or more images in the LAION dataset, no matter how they might be compressed. And to treat what it does as comparable or analogous to those things would be to commit the sort of category error that serves neither artists nor technology nor the law.
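A rough back-of-envelope calculation makes the point concrete. Using approximate, commonly cited figures (a Stable Diffusion v1 checkpoint of roughly 4 GB, a LAION-5B training set of roughly 5 billion images, and a modest 100 KB JPEG per image; all three numbers are assumptions for illustration, not facts from the ruling), the model would have less than one byte of capacity per training image:

```python
# Back-of-envelope: could Stable Diffusion plausibly "store" compressed
# copies of its training images? All figures are rough public
# approximations used for illustration only.

CHECKPOINT_BYTES = 4 * 1024**3    # ~4 GB: approximate Stable Diffusion v1 checkpoint size
TRAINING_IMAGES = 5_000_000_000   # ~5 billion images in the LAION-5B dataset
JPEG_BYTES = 100 * 1024           # ~100 KB: a modest JPEG of one training image

# Capacity available per image if the model were a compressed archive
bytes_per_image = CHECKPOINT_BYTES / TRAINING_IMAGES

# Compression ratio that "compressed copies" of every image would imply
implied_ratio = JPEG_BYTES / bytes_per_image

print(f"Capacity per training image: {bytes_per_image:.2f} bytes")
print(f"Implied compression vs. a 100 KB JPEG: {implied_ratio:,.0f}x")
```

Under these assumptions the model has under one byte per training image, implying a compression ratio on the order of 100,000:1 against even a heavily compressed JPEG, which is far beyond what any lossless or near-lossless image compression achieves and is one reason the “compressed copies” theory is hard to sustain as a literal description of the technology.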