Leading AI companies rely more on content from premium publishers to train their large language models (LLMs) than they publicly admit, according to new research from executives at Ziff Davis. While AI firms generally do not say exactly what data they use for training, executives from Ziff Davis say their analysis of publicly available datasets makes it clear that AI firms rely disproportionately on commercial publishers of news and media websites to train their LLMs.
Source: Ziff Davis study says AI firms rely on publisher data to train models