The problem of ‘model collapse’: how a lack of human data limits AI progress

The use of computer-generated data to train artificial intelligence models risks accelerating their collapse into nonsensical results, according to new research that highlights looming challenges to the emerging technology.

Leading AI companies, including OpenAI and Microsoft, have tested the use of “synthetic” data, meaning information generated by AI systems that is then used to train large language models (LLMs), as they reach the limits of human-made material that can improve the cutting-edge technology.

Research published in Nature on Wednesday suggests the use of such data could lead to the rapid degradation of AI models. One trial using synthetic input text about medieval architecture descended into a discussion of jackrabbits after fewer than 10 generations of output.

