Researchers from Carnegie Mellon, Stanford, Harvard, and Princeton are challenging one of AI development's accepted core beliefs: that more pre-training data always means better performance. As reported by HPCwire, a new paper discusses the concept of "catastrophic overtraining," whereby extended pre-training can harm a model's performance after fine-tuning.