In Part 1 of our ongoing series on Artificial Intelligence (AI) and Ownership, we reviewed how various governments and courts deal with ownership of AI-generated work.
In Part 2 of the series, we examined novel legal issues arising at the intersection of AI, IP, competition and privacy laws, including how celebrities are increasingly asserting their privacy and publicity rights in light of AI-generated deepfakes.
In this, Part 3 of our ongoing series, we analyse the increasing number of copyright cases that have been brought against various AI companies over the last few years.
SCRAPE, LICENSE, STEAL
We know from Part 2 of our series that Large Language Models (LLMs) are trained using existing work, in which copyright often subsists, and that companies such as Stability AI, Meta, and OpenAI have been sued for using copyrighted material to train their AI models without permission.
What started as a trickle of litigation is now a flood. On 24 June 2024, Sony, Universal, and Warner Music filed lawsuits against Suno and Udio, two companies that allow users to create AI-generated music, in the federal district courts for the District of Massachusetts and the Southern District of New York respectively. The lawsuits contend that these companies have used copyrighted sound recordings without authorization to train their systems. Notably, the music produced by the AI often closely resembles the original songs and voices, posing a threat to the value of the original recordings.
In response, Udio defended their model’s training process by comparing it to how students study music, suggesting a learning rather than copying mechanism. Meanwhile, Suno argued that their technology creates entirely new outputs, prioritizing originality over mere imitation. The labels are demanding up to $150,000 in damages for each instance of copyright infringement.
On 13 June 2024, Forbes accused Perplexity of copyright infringement for using its content without attribution. The letter from Forbes demands removal of misleading articles, reimbursement for advertising revenue, and assurances against future use of Forbes’ intellectual property. Perplexity’s CEO acknowledged the issue and pledged improvements. Forbes awaits a response within 10 days, reserving the right to pursue further action if needed.
AI COPYRIGHT LITIGATION: STATE OF DISUNION
In this piece, we take a closer look at the facts and issues involved in numerous lawsuits, similar to the Sony and Forbes cases, that are currently being heard before various U.S. courts.
Most of these cases are still in early stages of hearing, and opinions or decisions have only been handed down in a handful of cases, including:
- Thomson Reuters v. Ross Intelligence[1]
Ross Intelligence wanted to create a legal AI that generates quotes from judicial opinions and hired LegalEase to prepare the training data. LegalEase had a limited license from Thomson Reuters to use Westlaw and a service agreement that barred transferring Westlaw’s data to others.
Thomson Reuters suspected that the training data contained Westlaw’s copyrighted headnotes and alleged both copyright infringement and tortious interference with LegalEase’s contract with Westlaw. Ross claimed the defences of fair use and pre-emption. Subsequently, both parties sought summary judgment on their claims.
In the District Court of Delaware, a Circuit Judge presided over the case and opined that infringement depends on factors such as: valid copyright ownership, actual copying, and substantial similarity. Although evidence showed actual copying, the jury was left to decide the infringement claim due to factual discrepancies.
The defence of fair use claimed by Ross was discussed and examined based on four factors:
- Purpose and character of the use: This factor suggests that if commercial use outweighs transformativeness, the use is not fair. The Judge opined that a jury should examine the transformativeness in this case.
- Nature of the copyrighted work: This factor suggests that copying of factual work is more readily acceptable. The Judge expressed that since judicial opinions and headnotes serve an informational purpose, this factor points towards fair use; however, due to disputed facts, the jury should determine the same.
- Amount and substantiality of the portion copied: This factor focuses on whether the portion copied is the essence of the original work. The Judge opined that since the working of the AI is a disputed fact, the jury needs to determine this factor.
- Effect on the market: This factor examines the potential of the copied work to substitute the original work. The Judge opined that while both parties compete in the same field, the jury should determine whether Ross can substitute Westlaw and whether training AI on copyrighted material is justified when it is for public benefit.
Tortious interference was claimed on grounds that Ross induced LegalEase to breach its contracts by: using Westlaw for a competing product, scraping content, and sharing passwords. The first claim was pre-empted by the Copyright Act, while the other claims are to be decided by the jury on whether Ross knew about the contract, intentionally caused the breach, and had justification for its actions.[2]
- Sarah Andersen et al v. Stability AI et al[3]
This case involves a class action lawsuit against Stability AI, Midjourney, and DeviantArt (collectively the defendants). The allegations are that Stability AI’s software, Stable Diffusion, utilized by the defendants to train AI models, collected and saved billions of copyrighted images belonging to the plaintiffs/authors without their consent.
Part 2 of our series discussed the claims against which Stability AI and Midjourney filed motions to dismiss and DeviantArt filed a special motion to strike under the California Code of Civil Procedure.[4]
The defendants argued that the infringement and DMCA claims lacked specificity on the material used. They claimed the work was transformative and did not violate publicity rights, and contested the unfair competition and deception claims on the ground that they were pre-empted by the Copyright Act.
The Judge dismissed the plaintiffs’ claims for lack of clarity as to how each company had violated their copyrights, altered or removed their copyright management information (CMI), or infringed their rights of publicity, and gave them an opportunity to amend all their claims.
- Main Sequence, Ltd. et al v. Dudesy, LLC et al[5]
Plaintiffs, Main Sequence, Ltd., Jerold Hamza as executor for the Estate of George Carlin, and Jerold Hamza individually, initiated a legal action for copyright infringement and violation of Carlin’s right of publicity against Dudesy, LLC, Will Sasso, Chad Kultgen, and others. They claim ownership of all intellectual property rights related to George Carlin, including his albums and comedy specials. Jerold Hamza, Carlin’s longtime manager, manages Main Sequence, Ltd.
The Plaintiffs argue that the Defendants unlawfully used Carlin’s works to create an AI-generated “George Carlin Special,” distributed via the Dudesy podcast and website. This show, mimicking Carlin’s voice and style, allegedly infringes on their exclusive copyright and right of publicity, damaging Carlin’s reputation and causing financial harm. The Defendants monetized this content through advertisements and promotions.
Defendants claim their creation is similar to a human impressionist’s work, but the Plaintiffs counter that using AI to replicate Carlin’s work without permission is fundamentally different. They argue this technological manipulation unlawfully appropriates Carlin’s identity and diminishes the value of his genuine creations.
Following negotiations, the parties agreed to a consent judgment and permanent injunction. This agreement permanently restrains the Defendants from using Carlin’s image, voice, or likeness without written approval from the Plaintiffs and from broadcasting the Dudesy Special. The judgment also clarifies that while third-party uploads of the Special violate the injunction, the Defendants would not be held liable.
- Tremblay v. OpenAI [6]
The plaintiffs are authors who filed a class action suit for unauthorized use of their copyrighted material, alleging infringement, unfair competition, DMCA violation, and unjust enrichment. OpenAI, the defendant, filed a motion to dismiss all claims except direct infringement.
The judge dismissed the vicarious infringement claim due to insufficient evidence, the DMCA claim for want of proof of CMI removal, and the unjust enrichment claim for lacking evidence of benefits obtained through fraud, mistake, or coercion. Most claims were dismissed with an opportunity to amend, but the unfair competition claim survived in the plaintiffs’ favour since OpenAI profited commercially.
Subsequently, the court ordered consolidation of the case with two similar suits (Silverman et al v. OpenAI and Chabon et al v. OpenAI) and the consolidated case is pending in the court.
- Kadrey v. Meta along with Chabon v. Meta
The plaintiffs are authors who sued Meta for using their copyrighted material without permission to train AI. They alleged that Meta’s language models were infringing derivative works[7] and violated the DMCA by removing or altering CMI. Other claims such as unjust enrichment, unfair competition, and negligence were also made.
The court consolidated another similar case by Chabon and dismissed the infringement claim due to insufficient evidence for the derivative work claims. It also dismissed the claims of unfair competition, unjust enrichment, and negligence because they were pre-empted by the Copyright Act. The DMCA violation claim was also dismissed as there was no evidence that Meta distributed copies of the books.
The court granted permission to amend all the claims, except the claim for negligence which was completely dismissed.
- Doe v. GitHub
This case pertains to the complaint by programmers[8] against GitHub, Microsoft, and OpenAI as discussed in Part 2 of our series. The complaint was initially dismissed due to lack of clarity on harm; an amended complaint has since been filed, providing proof of injury and including additional affected individuals, J. Does 3, 4, and 5.
The Court assessed damages and remedies for all the J. Does. It dismissed the DMCA violation claims due to lack of proof regarding identical copies. Other claims such as unfair competition, unjust enrichment, negligence, and intentional and negligent interference with prospective economic relations were rejected due to pre-emption. Regarding relief, it decided that J. Does 1, 2, and 5 can seek injunctive relief and damages, while J. Does 3 and 4 can only seek injunctive relief.
While the court granted an opportunity to amend the DMCA claims, the other claims were dismissed with prejudice.
PENDING LITIGATION
There are also several ongoing cases where a complaint has been filed but the matter hasn’t reached hearing:
- Recently, eight popular newspaper publishers sued OpenAI and Microsoft for using copyrighted material without permission to train their AI models.[9] They alleged that ChatGPT scraped their online content without compensating them and replicated their work verbatim. OpenAI has not filed a legal reply but expressed its intent to support news organizations and improve reader experiences. Microsoft, on the other hand, has not commented on the issue.
- In February 2024, The Intercept Media, and Raw Story Media together with AlterNet Media, separately filed cases alleging that ChatGPT omitted copyright credits in its responses, indicating that the material used to train its models lacked the required copyright management information (CMI). On 29 April, OpenAI moved to dismiss the claims, citing jurisdictional issues and insufficient evidence of CMI removal.
- The New York Times sued OpenAI for unauthorised use of its copyrighted material. OpenAI responded to the suit through a blog post claiming the defence of fair use. While it subsequently sought to dismiss the claims, a decision in the case remains pending.
- The Authors Guild along with 17 authors sued OpenAI for unauthorised usage of copyrighted work which caused monetary losses.[10] Following that, author Julian Sancton filed a complaint and a motion to link their case with this case. Complaints by Alter and Basbanes, with similar claims, were consolidated. OpenAI has moved to dismiss the case citing fair use.
- Political figure Mike Huckabee and various authors have filed a lawsuit against Meta, EleutherAI, and Bloomberg (the defendants) for copyright infringement, unjust enrichment, and negligence arising from unauthorized use of copyrighted material in AI training. The defendants have claimed the defence of fair use in their motion to dismiss the claim.
- Getty Images filed a suit against Stability AI for copying more than 12 million images from Getty’s website.[11] While Stability AI filed a motion to dismiss the claims, the motion was dismissed without prejudice.
CONCLUSION
The increasing flow of copyright lawsuits appears to have persuaded AI companies that discretion is the better part of valour, and brought them to the negotiating table to seek licences for the material they need to train their datasets, instead of scraping it first and letting their lawyers deal with the mess later.
On 22 May 2024, News Corp, the owner of publications like the Wall Street Journal, announced its partnership with OpenAI. This multi-year agreement grants OpenAI access to News Corp’s vast archive of articles. This data will be used to train OpenAI’s models and potentially be incorporated into responses to user queries, with the ultimate goal of ensuring AI responses are grounded in reliable news sources and uphold journalistic integrity. On 22 February 2024, Google and Reddit entered a significant partnership wherein Reddit supplies Google with real-time content via its data API. This collaboration enables Google to efficiently access Reddit’s vast content corpus, enhancing its capability to feature Reddit content innovatively across its platforms. On 11 July 2023, Shutterstock announced its partnership with OpenAI. This six-year agreement grants OpenAI access to Shutterstock’s vast library of creative assets for training its AI models, ultimately benefiting brands and marketing companies with cutting-edge creative tools.
At any rate, these ongoing copyright lawsuits will, when resolved, essentially establish the rules of the game in the Age of AI: as to whether an AI model can violate a person’s publicity rights, as to whether scraping constitutes fair use, as to the limits of the Text Data Mining exception, and many other issues.
[1]No. 1:20-cv-613-SB filed on 6th May 2020
[2]Section 106 mentions author’s rights in reproduction, distribution for sale, public performance and display, and derivative works.
[3]https://fingfx.thomsonreuters.com/gfx/legaldocs/byprrngynpe/AI%20COPYRIGHT%20LAWSUIT%20mtdruling.pdf
[4]https://www.roninlegal.in/post/ai-ownership-ip-competition-law-and-publicity-rights-in-a-changing-world-part-2
[5] https://www.thewrap.com/wp-content/uploads/2024/01/Carlin-_-Kultgen.pdf
[6]https://fingfx.thomsonreuters.com/gfx/legaldocs/zgvokjoyevd/frankel-aidmca–tremblayruling.pdf
[7]A derivative work is a new piece of work that is built upon one or more existing works and can be transformed or adapted in different forms.
[8]Unidentified persons J.Doe 1 and 2
[9]Complaint: MNG Complaint (FINAL for filing 4-30-2024) (thomsonreuters.com)
[10]Authors Guild v. Open AI: https://storage.courtlistener.com/recap/gov.uscourts.nysd.606655/gov.uscourts.nysd.606655.1.0_1.pdf
[11] Getty v. Stability AI: Complaint
Authors: Shantanu Mukherjee, Jyothsna Kishore, Alan Baiju, and Kalpana Nailwal