Tech companies are turning to ‘synthetic data’ to train AI models – but there’s a hidden cost

A primary concern is that AI models can “collapse” when they rely too much on synthetic data. This means they start generating so many “hallucinations” (responses that contain false information) and decline so much in quality and performance that they become unusable. For example, AI models already struggle to spell some words correctly. If their mistake-riddled output is used to train other models, then those models too are bound to replicate the errors.

Source: Tech companies are turning to ‘synthetic data’ to train AI models – but there’s a hidden cost
