OpenAI destroyed a trove of books used to train AI models

Newly unsealed documents in the class-action lawsuit brought by the Authors Guild against OpenAI show the startup deleted two huge datasets, named “books1” and “books2,” that had been used to train its GPT-3 artificial-intelligence model. Lawyers for the Authors Guild said in court filings that the datasets probably contained “more than 100,000 published books.”

Source: OpenAI destroyed a trove of books used to train AI models. The employees who collected the data are gone.
