Wikipedia is attempting to dissuade artificial intelligence developers from scraping the platform by releasing a dataset that’s specifically optimized for training AI models. The Wikimedia Foundation announced on Wednesday that it had partnered with Kaggle — a Google-owned data science community platform that hosts machine learning data — to publish a beta dataset of “structured Wikipedia content in English and French.”
Source: Wikipedia is giving AI developers its data to fend off bot scrapers