In 30-step recursive LLM loops, append-mode persistent escape from source basins reaches 50% near 400 tokens under full history but plateaus below 50% under tail-clip memory policy, while replace-mode switching largely reflects state reset.
Self-consuming generative models go mad
10 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts
UNVERDICTED 10representative citing papers
By mid-2025 roughly 35% of new websites are AI-generated or AI-assisted, correlating with lower semantic diversity and higher positive sentiment but showing no significant drop in factual accuracy or stylistic diversity.
Self-training restructures language by amplifying surface markers and collapsing deep syntax according to structural depth rather than frequency, as evidenced by correlations across multiple models and a human fine-tuning control.
Alice v1 is an open video model that surpasses its teacher and closed-source systems like Veo3 and Sora2 in quality while running 7x faster through specialized distillation.
LLM translations introduce model-specific statistically significant emotional fingerprints that limit preservation of author voice, with post-editing providing partial alignment to human norms.
Closed-loop multi-LLM systems exhibit robust semantic collapse across model families and interventions, consistent with intrinsic properties of autoregressive generation.
Filter Babel explores a future of AI-personalized private experiences that may erode common ground in communication while supporting individual identity and selfhood.
A theoretical framework for parameter estimation in inverse problems shows inversion does not necessarily improve accuracy per the data processing inequality and reveals a vulnerability in domain generalization via the Double Meaning Theorem.
Position paper warns that model collapse in self-consuming multilingual LLM training loops risks flattening linguistic diversity and cultural nuance.
Position paper claiming that distributed training across massive edge devices can overcome data depletion and centralized compute monopolies in LLM scaling.
citing papers explorer
-
On Inverse Problems, Parameter Estimation, and Domain Generalization
A theoretical framework for parameter estimation in inverse problems shows inversion does not necessarily improve accuracy per the data processing inequality and reveals a vulnerability in domain generalization via the Double Meaning Theorem.
-
Losing our Tail, Again: (Un)Natural Selection & Multilingual LLMs
Position paper warns that model collapse in self-consuming multilingual LLM training loops risks flattening linguistic diversity and cultural nuance.
-
Will LLMs Scaling Hit the Wall? Breaking Barriers via Distributed Resources on Massive Edge Devices
Position paper claiming that distributed training across massive edge devices can overcome data depletion and centralized compute monopolies in LLM scaling.