Writers accuse OpenAI of copyright infringement; court allows lawsuit to proceed

On Monday, October 27, US District Court Judge Sidney H. Stein ruled that the American artificial intelligence company OpenAI must face a class action lawsuit filed by authors alleging copyright infringement. The court rejected OpenAI's motion to dismiss the lawsuit.

The lawsuit, initiated by multiple authors on June 13, accuses OpenAI and its major funder Microsoft of “blatantly and harmfully infringing on their copyrights”.

According to the lawsuit, the defendants copied the authors' works and fed them into their “large language models” (LLMs), algorithms designed to generate human-like text in response to user prompts and queries. The lawsuit claims that these algorithms form the core of the defendants’ vast commercial enterprise, and that at the core of the algorithms lies large-scale, systematic theft.

OpenAI asked the court to dismiss the authors’ copyright infringement claims.

In the ruling on October 27, Judge Stein sided with the authors, rejecting the defendants’ motion and stating that the authors’ allegations “at the very least satisfy the elements of a preliminary infringement claim regarding some output content of ChatGPT”.

The judge wrote that to train ChatGPT, OpenAI utilized textual data sets that include copyrighted works by the plaintiffs. Upon receiving prompts, ChatGPT could produce accurate summaries of the plaintiffs’ books.

In a court filing, OpenAI argued that the plaintiffs had failed to plausibly demonstrate “substantial similarity” between their works and ChatGPT's output, because the complaint did not include any examples of infringing ChatGPT output.

OpenAI contended that summarizing book content does not constitute infringement. For instance, summarizing the ending of Mary Roberts Rinehart’s novel “The Door” as “the butler is the killer” should not be considered infringement.

However, Judge Stein rejected OpenAI’s arguments, stating that the complaint had sufficiently alleged that OpenAI had access to the plaintiffs' works and that ChatGPT’s output was based on those works, which satisfies the requirements for “actual copying”.

Judge Stein then discussed ChatGPT’s summary of George R. R. Martin’s “A Game of Thrones”, from the “A Song of Ice and Fire” series. The summary described the book’s setting, prologue, main plot points, and ending.

He pointed out that “it is not difficult to discern that this detailed summary bears substantial similarity to Martin’s original work, as it recapitulates the original’s plot, characters, and themes, especially when conveying the overall tone and atmosphere of the original work.”

Martin is one of the plaintiffs in this case, with other plaintiffs including authors John Grisham and David Baldacci.

The Epoch Times reached out to OpenAI for comment but had not received a response by the time of publication.

According to an article published on September 5 by the law firm Banner Witcoff, the AI company Anthropic (maker of the Claude AI model) agreed last month, before the court’s ruling, to pay $1.5 billion to settle copyright infringement claims.

The article states that Anthropic was accused of using approximately 500,000 copyrighted works to train its AI. The court had previously ruled that using pirated copies to train AI does not constitute “fair use”.

In a statement on September 25, the Authors Guild, which is also a plaintiff in the OpenAI lawsuit, welcomed the preliminary approval of the settlement with Anthropic.

The guild stated, “This settlement marks a significant milestone in writers’ fight against AI companies’ infringement. It sends a clear message to the AI industry that infringing on writers’ rights will come at a high price and will undoubtedly push AI companies to obtain necessary books through legitimate licensing.”

“This case is particularly significant as it serves as an example of how class action lawsuits can be an effective way to pursue substantial damages for widespread copyright infringement.”

Beyond the cases against OpenAI and Anthropic, authors have brought similar legal actions against several other tech companies over the use of their works to train AI models.

On October 15, two authors filed a lawsuit against the cloud services company Salesforce, alleging that the company “illegally used hundreds of thousands of copyrighted books” to develop its XGen series of large language models.

In another lawsuit, thirteen authors brought similar infringement claims against Meta; that case concluded in Meta’s favor in June.

The court rejected the plaintiffs’ claim that Meta had reproduced their works to create products that could flood the market with similar works, stating that the authors “failed to provide evidence of how Meta’s model’s current or anticipated output content could weaken the market for their own works”.