Millions of stories published by outlets including The New York Times and The New York Daily News have been found during three weeks of searching OpenAI’s training dataset. The news publishers are currently trawling through the data to find instances of their copyrighted work being used to train OpenAI’s models, but they say the tech company should be forced to provide the information itself.
Source: ‘Millions’ of NYT and NY Daily News stories taken by OpenAI for training data