OpenAI destroyed a trove of books used to train AI models

Newly unsealed documents in the class-action lawsuit brought by the Authors Guild against OpenAI show the startup deleted two huge datasets, named “books1” and “books2,” that had been used to train its GPT-3 artificial-intelligence model. Lawyers for the Authors Guild said in court filings that the datasets probably contained “more than 100,000 published books.”

Source: OpenAI destroyed a trove of books used to train AI models. The employees who collected the data are gone.
