Why language models collapse when trained on recursively generated text, 2024

Lecheng Wang, Xianjie Shi, Ge Li, Jia Li, Yihong Dong, Xuanming Zhang, Wenpin Jiao, Hong Mei · 2024 · arXiv 2412.14872

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

PASTA: A Paraphrasing And Self-Training Approach for Knowledge Updating in LLMs

cs.CL · 2026-06-27 · unverdicted · novelty 5.0

PASTA combines data augmentation and a self-learning DPO process to integrate new factual knowledge from news articles into LLMs, raising accuracy from 0.02 to 0.82 on post-cutoff questions while preserving general capabilities.

citing papers explorer

Showing 1 of 1 citing paper.

PASTA: A Paraphrasing And Self-Training Approach for Knowledge Updating in LLMs

cs.CL · 2026-06-27 · unverdicted · none · ref 23

