New York Times Files Lawsuit Against OpenAI and Microsoft for Copyright Infringement

By Matt McGregor

The New York Times has filed a copyright infringement lawsuit alleging that Microsoft and OpenAI used its content as the foundation for building their large language models (LLMs).

The lawsuit, filed in the U.S. District Court for the Southern District of New York, alleges that, to construct their generative artificial intelligence programs, the defendants copied “millions of The Times’s copyrighted news articles, in-depth investigations, opinion pieces, reviews, how-to guides, and more.”

“While Defendants engaged in wide-scale copying from many sources, they gave Times content particular emphasis when building their LLMs—revealing a preference that recognizes the value of those works,” the lawsuit states. “Through Microsoft’s Bing Chat (recently rebranded as “Copilot”) and OpenAI’s ChatGPT, Defendants seek to free-ride on The Times’s massive investment in its journalism by using it to build substitutive products without permission or payment.”

The lawsuit states that the defendants’ illegal use of The Times’s content puts its work as a reliable, independent news source at risk while violating U.S. copyright law, which the defendants “refused to recognize.”

“The Constitution and the Copyright Act recognize the critical importance of giving creators exclusive rights over their works,” the lawsuit states. “Since our nation’s founding, strong copyright protection has empowered those who gather and report news to secure the fruits of their labor and investment. Copyright law protects The Times’s expressive, original journalism, including, but not limited to, its millions of articles that have registered copyrights.”

According to the lawsuit, there are numerous examples of how the AI programs copy The Times’s content verbatim, in addition to attributing incorrect information to the media source.

“Using the valuable intellectual property of others in these ways without paying for it has been extremely lucrative for Defendants,” the lawsuit states. “Microsoft’s deployment of Times-trained LLMs throughout its product line helped boost its market capitalization by a trillion dollars in the past year alone.”

OpenAI’s release of ChatGPT has driven its valuation to as high as $90 billion, the lawsuit says.

“Defendants’ GenAI business interests are deeply intertwined, with Microsoft recently highlighting that its use of OpenAI’s ‘best-in-class frontier models’ has generated customers—including ‘leading AI startups’—for Microsoft’s Azure AI product,” the lawsuit states.

Screens displaying the logos of Microsoft and ChatGPT, a conversational artificial intelligence application software developed by OpenAI. (Lionel Bonaventure/AFP via Getty Images)

No Resolution

The Times filed the lawsuit on Dec. 27 after it failed to reach a resolution with the defendants, who said their use of its content was protected by “fair use.”

“For months, The Times has attempted to reach a negotiated agreement with Defendants, in accordance with its history of working productively with large technology platforms to permit the use of its content in new digital products (including the news products developed by Google, Meta, and Apple),” the lawsuit states.

The negotiations were meant to ensure that The Times kept control over its intellectual property rights while assisting in the development of artificial intelligence in “a responsible way,” the lawsuit states.

Artificial intelligence programs have become a threat to journalism, the lawsuit says.

“If The Times and its peers cannot control the use of their content, their ability to monetize that content will be harmed,” the lawsuit states. “With less revenue, news organizations will have fewer journalists able to dedicate time and resources to important, in-depth stories, which creates a risk that those stories will go untold. Less journalism will be produced, and the cost to society will be enormous.”

A sign for The New York Times hangs above the entrance to its building in New York, on May 6, 2021. (Mark Lennihan/AP Photo)

From Nonprofit to For-Profit

According to the lawsuit, OpenAI was founded in 2015 as a nonprofit, but after three years, it dropped its nonprofit status and became a “multi-billion-dollar for-profit business built in large part on the unlicensed exploitation of copyrighted works belonging to The Times and others.”

OpenAI then abandoned its earlier commitment to keeping its research and development open to the public, the lawsuit states.

OpenAI defended the secrecy as a means to protect its designs from other companies; however, the lawsuit argues that the secrecy is instead meant to hide the content OpenAI is copying.

After its release in November 2022, OpenAI’s ChatGPT “became a household name,” the lawsuit states, garnering 100 million users within three months and generating $80 million a month for the company.

The lawsuit alleges that Microsoft has been “intensely involved” in developing and commercializing OpenAI’s programs.

“Microsoft is the sole cloud computing provider for OpenAI,” the lawsuit states. “Microsoft and OpenAI collaborated to design the supercomputing systems powered by Microsoft’s cloud computer platform Azure, which were used to train all OpenAI’s GPT models after GPT-1.”

The lawsuit alleges that Microsoft and OpenAI “acted jointly” in copying massive amounts of The Times’s content to train the AI programs so that they could imitate the media outlet’s writing.

“The Times invests enormous resources in creating its content to inform its readers, who in turn purchase subscriptions or engage with The Times’s websites and mobile applications in other ways that generate revenue,” the lawsuit states. “Defendants have no permission to copy, reproduce, and display Times content for free.”

More Lawsuits to Come

According to the BakerHostetler law firm, there has been “a flurry of copyright litigations” since the rise of AI, with 10 lawsuits filed so far and more expected.

“Generative AI raises challenging (and sometimes existential) questions about copyright protection, liability and enforcement,” the firm said. “Content creators, generative AI developers and end users are monitoring how these issues play out in courts and trying to adapt their own conduct to minimize risk without unnecessarily forgoing the benefits of this technology.”

The cases will carry “far-reaching consequences” for how AI will be used in the future, the firm said.

“Whether courts find that using copyrighted works to train large language models is fair use will impact the finetuning of existing LLMs and the creation and use of more specialized machine learning (ML) models,” the firm said. “Some of the plaintiffs’ theories raise questions of copyright liability based on using generative AI, not just creating it.”
