New York Times sues OpenAI, Microsoft for using articles to train AI
“For months, The Times has attempted to reach a negotiated agreement,” the Times’s lawyers said in the lawsuit. “These negotiations have not led to a resolution.”
Spokespeople for OpenAI and Microsoft did not immediately respond to requests for comment.
The “large language models” (LLMs) behind AI tools such as ChatGPT work by ingesting huge amounts of text scraped from the internet, learning the connections between words and concepts, and then developing the ability to predict what word to say next in a sentence, allowing them to mimic human speech and writing. OpenAI, Microsoft and Google have refused to reveal what goes into their newest models, but previous LLMs have been shown to include large amounts of content from news organizations and catalogues of books.
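The "predict what word to say next" idea can be illustrated with a toy sketch. This is not how ChatGPT or any production LLM is built: those systems use large neural networks trained on vast amounts of text, whereas the example below only counts which word most often follows another in a tiny sample sentence. The function names and the sample text are hypothetical, chosen purely for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follow_counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        follow_counts[current_word][next_word] += 1
    return follow_counts


def predict_next_word(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical sample text standing in for "huge amounts of text scraped from the internet".
corpus = "the cat sat on the mat and the cat slept on the sofa"
model = train_bigram_model(corpus)

# "the" is followed by "cat" twice, and by "mat" and "sofa" once each.
print(predict_next_word(model, "the"))
```

Even at this scale, the sketch shows why the training text matters: the model can only ever suggest words it has seen follow one another in its source material.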
The tech companies have steadfastly said that the use of information scraped from the internet to train their AI algorithms falls under “fair use” — a concept in copyright law that allows people to use the work of others if it is substantially changed.
The Times’s lawsuit, however, includes multiple examples of OpenAI’s GPT-4 model reproducing New York Times articles word for word. More artists, authors, musicians, filmmakers and other creative professionals are also pushing back, saying that wealthy tech companies are using creators’ output to build tools that in some ways are already undermining the creators’ work.
Legal experts have said that plaintiffs will have stronger cases of copyright infringement if they can show that AI tools are directly reproducing copyrighted works, rather than paraphrasing the information from them.
Some of these plaintiffs, including blockbuster writers such as George R.R. Martin, Jodi Picoult, Jonathan Franzen and George Saunders, have sued OpenAI. Since August, at least 583 news organizations, including the Times, The Washington Post and Reuters, have installed blockers on their websites to prevent tech companies from scraping their articles. But it’s likely that their online catalogues, going back decades, have already been used to create AI tools.
Meanwhile, OpenAI has been negotiating deals with news organizations over the past year to pay them for content. In July, it signed a deal with the Associated Press for access to its archive of news articles. But in October, a spokesperson for OpenAI said that the company’s practices do not violate copyright laws and that the deals it was negotiating would be intended only for accessing content that it couldn’t get online or to show links or full sections of articles in ChatGPT.