Wikipedia is giving AI developers its data to fend off bot scrapers

Wikipedia is attempting to dissuade artificial intelligence developers from scraping the platform by releasing a dataset that’s specifically optimized for training AI models. The Wikimedia Foundation announced on Wednesday that it had partnered with Kaggle — a Google-owned data science community platform that hosts machine learning data — to publish a beta dataset of “structured Wikipedia content in English and French.”

Source: Wikipedia is giving AI developers its data to fend off bot scrapers
