Self-consuming generative models go mad
9 Pith papers cite this work. Polarity classification is still indexing.
citing papers explorer
- Perturbation Dose Responses in Recursive LLM Loops: Raw Switching, Stochastic Floors, and Persistent Escape under Append, Replace, and Dialog Updates
  In 30-step recursive LLM loops, append-mode persistent escape from source basins reaches 50% near 400 tokens under full history but plateaus below 50% under tail-clip memory policy, while replace-mode switching largely reflects state reset.
- The Impact of AI-Generated Text on the Internet
  By mid-2025 roughly 35% of new websites are AI-generated or AI-assisted, correlating with lower semantic diversity and higher positive sentiment but showing no significant drop in factual accuracy or stylistic diversity.
- Self-Training Doesn't Flatten Language -- It Restructures It: Surface Markers Amplify While Deep Syntax Dies
  Self-training restructures language by amplifying surface markers and collapsing deep syntax according to structural depth rather than frequency, as evidenced by correlations across multiple models and a human fine-tuning control.
- Alice v1: Distillation-Enhanced Video Generation Surpassing Closed-Source Models
  Alice v1 is an open video model that surpasses its teacher and closed-source systems like Veo3 and Sora2 in quality while running 7x faster through specialized distillation.
- Multi-LLM Systems Exhibit Robust Semantic Collapse
  Closed-loop multi-LLM systems exhibit robust semantic collapse across model families and interventions, consistent with intrinsic properties of autoregressive generation.
- Filter Babel: The Challenge of Synthetic Media to Authenticity and Common Ground in AI-Mediated Communication
  Filter Babel explores a future of AI-personalized private experiences that may erode common ground in communication while supporting individual identity and selfhood.
- On Inverse Problems, Parameter Estimation, and Domain Generalization
  A theoretical framework for parameter estimation in inverse problems shows inversion does not necessarily improve accuracy per the data processing inequality and reveals a vulnerability in domain generalization via the Double Meaning Theorem.
- Losing our Tail, Again: (Un)Natural Selection & Multilingual LLMs
  Position paper warns that model collapse in self-consuming multilingual LLM training loops risks flattening linguistic diversity and cultural nuance.
- Will LLMs Scaling Hit the Wall? Breaking Barriers via Distributed Resources on Massive Edge Devices
  Position paper claiming that distributed training across massive edge devices can overcome data depletion and centralized compute monopolies in LLM scaling.