Model Collapse as Cultural Evolution

Dongxin Guo; Jikun Wu; Siu Ming Yiu

arXiv: 2605.23054 · v1 · pith:SWA2B4RFnew · submitted 2026-05-21 · 💻 cs.CL · cs.AI · cs.LG

Pith reviewed 2026-05-25 05:28 UTC · model grok-4.3

classification 💻 cs.CL · cs.AI · cs.LG

keywords model collapse · iterated learning · cultural evolution · compositionality · self-training · large language models · linguistic structure

The pith

Model collapse arises from the compression-communication tradeoff in iterated learning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper applies iterated learning theory from cultural evolution to explain model collapse in large language models. It derives five predictions about how linguistic structure changes during self-training and tests them on LLaMA-2-7B and Mistral-7B over 10 generations in English, German, and Turkish. The key result is that compositionality follows a non-monotonic path under unfiltered self-training, rising and then falling, even when the seed data are maximally regular. The gain is sustained only under task-grounded filtering, not random filtering. The experiments confirm all five predictions with large effect sizes, and LLM regularization gradients closely match human behavioral data from cultural evolution studies. This reframes collapse as a cultural transmission phenomenon with implications for self-training pipeline design.

Core claim

Model collapse is the outcome of iterated learning across model generations, in which compression favors simpler structures while communication demands expressive ones. Under unfiltered self-training this produces a non-monotonic compositionality trajectory (rising, then falling); the compositionality gain is sustained only under task-grounded filtering.

What carries the argument

The compression-communication tradeoff in iterated learning theory, which predicts that repeated transmission selects for compressible yet still communicative language structures.

If this is right

Compositionality rises initially then falls under unfiltered self-training of LLMs.
Task-grounded filtering sustains the rise while random filtering does not.
LLM regularization gradients match human behavioral data from cultural evolution studies (R² = 0.94).
All five predictions from iterated learning theory hold with large effect sizes.
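
The R² value in the list above summarizes how much of the variance in one set of regularization gradients is explained by the other. A minimal sketch of that computation follows, assuming a simple linear fit between paired per-item values. The paper's exact fitting procedure is not described on this page, so the function name `r_squared` and the example numbers are hypothetical.

```python
from typing import Sequence


def r_squared(human: Sequence[float], model: Sequence[float]) -> float:
    """Squared Pearson correlation, which equals R^2 for a simple linear fit.

    Assumption: `human` and `model` are paired measurements of the same items.
    """
    if len(human) != len(model) or len(human) < 2:
        raise ValueError("need two equal-length series with at least two points")
    mean_h = sum(human) / len(human)
    mean_m = sum(model) / len(model)
    # Sums of cross-products and squared deviations around the means.
    s_hm = sum((h - mean_h) * (m - mean_m) for h, m in zip(human, model))
    s_hh = sum((h - mean_h) ** 2 for h in human)
    s_mm = sum((m - mean_m) ** 2 for m in model)
    if s_hh == 0 or s_mm == 0:
        raise ValueError("a constant series has no defined correlation")
    return (s_hm ** 2) / (s_hh * s_mm)


# Made-up illustrative values, not data from the paper.
print(r_squared([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0]))
```

A value near 1 means the two gradients are close to linearly related; it does not by itself show that the same mechanism produced them.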

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same tradeoff may govern degradation patterns in other iterative training processes beyond language models.
Pipeline designs could deliberately incorporate cultural evolution principles to slow collapse.
Other predictions from iterated learning theory could be tested directly in LLM self-training loops.

Load-bearing premise

The non-monotonic compositionality trajectory is caused by the compression-communication tradeoff rather than alternative mechanisms such as distributional shifts or architecture effects.

What would settle it

A monotonic decline in compositionality during unfiltered self-training, or compositionality sustained as well by random filtering as by task-grounded filtering, would falsify the iterated learning account.
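
The trajectory shapes at stake in this test can be told apart mechanically. The sketch below labels a per-generation compositionality series as rise-then-fall, monotonic decline, or neither. The function name, the labels, and the tolerance parameter `tol` are illustrative assumptions, not the paper's analysis procedure.

```python
from typing import Sequence


def classify_trajectory(sigma: Sequence[float], tol: float = 0.0) -> str:
    """Label a series of per-generation compositionality scores.

    `tol` (hypothetical) is the smallest change treated as a real change.
    """
    if len(sigma) < 3:
        raise ValueError("need at least three generations")
    diffs = [b - a for a, b in zip(sigma, sigma[1:])]
    # Monotonic decline: no step goes up, and the series ends below its start.
    if all(d <= tol for d in diffs) and sigma[-1] < sigma[0] - tol:
        return "monotonic_decline"
    # Rise-then-fall: an interior peak that exceeds both endpoints.
    peak = max(range(len(sigma)), key=lambda i: sigma[i])
    interior = 0 < peak < len(sigma) - 1
    if interior and sigma[peak] - sigma[0] > tol and sigma[peak] - sigma[-1] > tol:
        return "rise_then_fall"
    return "other"


# Made-up illustrative series, not data from the paper.
print(classify_trajectory([0.5, 0.6, 0.7, 0.6, 0.4]))
print(classify_trajectory([0.7, 0.6, 0.5, 0.4]))
```

In practice the scores vary across seeds, so a nonzero `tol` or a statistical test on the peak would be needed before treating a small bump as a genuine rise.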

Figures

Figures reproduced from arXiv: 2605.23054 by Dongxin Guo, Jikun Wu, Siu Ming Yiu.

**Figure 1.** Self-training pipeline. Key design choice: at every generation n, fine-tuning starts from the same base model M0 (blue, frozen across the experiment), not from the previous fine-tuned model Mn−1. The transmission chain runs through the data (D0→D1→ . . . →D10; dashed feedback on the left), not through the model parameters. This isolates data-induced degradation from parameter drift and mirrors the iterate…
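
The design choice in the Figure 1 caption (restart from M0 every generation, pass only data forward) can be illustrated with a toy chain. In this sketch a "model" is just a categorical distribution over tokens, and `fine_tune`, `generate`, `run_chain`, and `prior_weight` are hypothetical stand-ins rather than the authors' code, which fine-tunes LLMs.

```python
import random
from collections import Counter
from typing import Dict, List, Sequence


def fine_tune(base: Dict[str, float], data: Sequence[str],
              prior_weight: float = 1.0) -> Dict[str, float]:
    """Toy fine-tuning: blend the frozen base distribution with data counts.

    Assumes `base` sums to 1 and every token in `data` is a key of `base`.
    """
    counts = Counter(data)
    total = len(data) + prior_weight
    return {tok: (counts[tok] + prior_weight * p) / total for tok, p in base.items()}


def generate(model: Dict[str, float], n: int, rng: random.Random) -> List[str]:
    """Sample n tokens from the toy model."""
    tokens = list(model)
    return rng.choices(tokens, weights=[model[t] for t in tokens], k=n)


def run_chain(base: Dict[str, float], seed_data: Sequence[str],
              generations: int = 10, samples: int = 200,
              seed: int = 0) -> List[Dict[str, float]]:
    """Run the data chain D0 -> D1 -> ... while the base model stays frozen."""
    rng = random.Random(seed)
    data = list(seed_data)
    history = []
    for _ in range(generations):
        model = fine_tune(base, data)         # always restart from M0
        data = generate(model, samples, rng)  # only the data carries over
        history.append(model)
    return history


history = run_chain({"a": 0.5, "b": 0.3, "c": 0.2}, ["a", "b", "a", "c"])
print(len(history))
```

Because `fine_tune` never receives the previous generation's model, any drift across generations in this toy comes from the data alone, which is the isolation the caption describes.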

**Figure 2.** Compositional systematicity (σ) across generations (LLaMA-2). Four conditions test P4. The non-monotonic trajectory persists with regularized seed data (orange dashed), ruling out the noise-removal alternative. Random filtering (gray dotted) fails to sustain compositionality; only quality filtering (blue) succeeds. Shaded bands: ±1 SD across 5 seeds.

**Figure 3.** Construction diversity entropy (normalized…

**Figure 4.** Frequency-dependent regularization gradient: …

Original abstract

Model collapse, the progressive degradation of LLMs trained on their own outputs, has been characterized statistically but lacks a linguistic explanation for which structures degrade, in what order, and why. We show that iterated learning theory from cultural evolution fills this gap. We derive five falsifiable predictions, distinguish those uniquely discriminative for the theory from confirmatory ones, and test them by self-training LLaMA-2-7B and Mistral-7B over 10 generations in English, German, and Turkish. The critical discriminative finding: compositionality follows a non-monotonic trajectory (initially rising, then falling) under unfiltered self-training. This signature persists with maximally regular seed data (ruling out noise removal) and is sustained only by task-grounded filtering, not random filtering, providing the first LLM-scale evidence for the compression-communication tradeoff. All predictions are confirmed with large effect sizes (Hedges' $g > 1.6$; $\mathrm{BF}_{10} > 100$), and LLM regularization gradients closely match human behavioral data ($R^2 = 0.94$). These results reframe model collapse as a cultural transmission phenomenon and yield concrete principles for self-training pipeline design.
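
The effect sizes in the abstract are reported as Hedges' g, a standardized mean difference with a small-sample bias correction. The sketch below uses the common approximation J = 1 − 3/(4·df − 1). Which groups the paper compares, and whether it uses this approximation or the exact correction, is not stated on this page, so the inputs here are made-up numbers.

```python
import math
from statistics import mean, variance
from typing import Sequence


def hedges_g(a: Sequence[float], b: Sequence[float]) -> float:
    """Bias-corrected standardized mean difference between groups a and b."""
    n_a, n_b = len(a), len(b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    df = n_a + n_b - 2
    # Pooled standard deviation from the two sample variances.
    pooled_sd = math.sqrt(((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df)
    if pooled_sd == 0:
        raise ValueError("pooled standard deviation is zero")
    d = (mean(a) - mean(b)) / pooled_sd   # Cohen's d
    correction = 1 - 3 / (4 * df - 1)     # approximate small-sample factor J
    return d * correction


# Made-up illustrative groups, not data from the paper.
print(hedges_g([2, 3, 4, 5, 6], [1, 2, 3, 4, 5]))
```

With only a handful of observations per group the correction is noticeable: for two groups of five it shrinks Cohen's d by roughly 10%.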

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The non-monotonic compositionality pattern under self-training is a clear empirical observation, but the link to the compression-communication tradeoff is not yet isolated from simpler distributional effects.

The paper's main point is that compositionality in LLM outputs rises then falls across generations of unfiltered self-training, and this matches a prediction from iterated learning theory. They test this on LLaMA-2-7B and Mistral-7B in three languages, report large effect sizes, and show the pattern survives regular seed data while depending on task-grounded filtering rather than random filtering. They also note a close match to human behavioral data with R² = 0.94. That non-monotonic signature and the multi-model, multi-language setup are the concrete new pieces beyond earlier statistical descriptions of collapse.

Referee Report

2 major / 1 minor

Summary. The manuscript argues that model collapse in LLMs can be explained via iterated learning theory from cultural evolution. It derives five falsifiable predictions from the compression-communication tradeoff, tests them by self-training LLaMA-2-7B and Mistral-7B over 10 generations in English, German, and Turkish, and identifies a non-monotonic compositionality trajectory (initial rise, then fall) under unfiltered self-training as the key discriminative result. The pattern is claimed to persist with maximally regular seed data, and the compositionality gain is claimed to be sustained only under task-grounded filtering, not random filtering. All predictions are reported as confirmed at large effect sizes (Hedges' g > 1.6, BF10 > 100), with LLM gradients matching human data at R² = 0.94.

Significance. If the results hold, the work supplies the first LLM-scale test of iterated learning predictions and reframes model collapse as a cultural transmission process rather than purely statistical degradation. The derivation of discriminative vs. confirmatory predictions and the quantitative match to human behavioral data are notable strengths that could guide both theory and practical self-training design.

major comments (2)

[Abstract and experimental controls section] The claim that the non-monotonic compositionality trajectory specifically evidences the compression-communication tradeoff is load-bearing, yet the reported controls (maximally regular seeds ruling out noise removal; task-grounded vs. random filtering) do not isolate iterative distributional shifts from the hypothesized mechanism. A baseline that applies equivalent distributional change without iteration would be required to rule out alternative accounts such as regularization from self-generated data alone.
[Results section on filtering] The assertion that only task-grounded filtering sustains the non-monotonic signature (while random filtering does not) is presented as support for the theory, but without explicit criteria or equations defining 'task-grounded' filtering, it is unclear whether this manipulation specifically targets the communication pressure rather than other properties of the data distribution.

minor comments (1)

[Abstract] The abstract reports aggregate statistics (Hedges' g, BF10, R²) but does not reference the specific tables or figures containing the per-language or per-model breakdowns.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. The two major comments raise important questions about experimental controls and definitional clarity. We respond to each below and indicate where revisions will be made.

Referee: [Abstract and experimental controls section] The claim that the non-monotonic compositionality trajectory specifically evidences the compression-communication tradeoff is load-bearing, yet the reported controls (maximally regular seeds ruling out noise removal; task-grounded vs. random filtering) do not isolate iterative distributional shifts from the hypothesized mechanism. A baseline that applies equivalent distributional change without iteration would be required to rule out alternative accounts such as regularization from self-generated data alone.

Authors: We agree that a non-iterative baseline applying matched distributional shifts would provide stronger isolation of the iterative mechanism. Our current controls (maximally regular seeds and the contrast between task-grounded versus random filtering) address several alternative explanations, including noise removal and generic regularization from synthetic data. However, they do not fully rule out non-iterative distributional effects. In the revision we will add an explicit limitations paragraph acknowledging this gap and outlining how such a baseline could be implemented in future work. We maintain that the non-monotonic signature under iterated self-training remains a distinctive prediction of the theory, but we will not claim the existing design fully isolates the mechanism.
Revision: partial

Referee: [Results section on filtering] The assertion that only task-grounded filtering sustains the non-monotonic signature (while random filtering does not) is presented as support for the theory, but without explicit criteria or equations defining 'task-grounded' filtering, it is unclear whether this manipulation specifically targets the communication pressure rather than other properties of the data distribution.

Authors: We accept this criticism. The manuscript currently describes task-grounded filtering at a high level without formal criteria or equations. In the revised Methods section we will supply explicit definitions, including the mathematical formulation used to select data that preserves communicative utility (e.g., task performance thresholds and semantic coherence metrics), and we will clarify how these criteria operationalize the communication pressure in the compression-communication tradeoff.
Revision: yes
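
The contrast the referee asks to see defined can be sketched as two filters that keep the same fraction of samples but differ in how they choose them. This is one possible reading only: the manuscript's actual criteria are not given on this page, and `task_score`, `keep_fraction`, and both function names are hypothetical.

```python
import random
from typing import Callable, List, Sequence


def task_grounded_filter(samples: Sequence[str],
                         task_score: Callable[[str], float],
                         keep_fraction: float) -> List[str]:
    """Keep the top-scoring fraction of samples under a task-based score."""
    k = max(1, round(len(samples) * keep_fraction))
    return sorted(samples, key=task_score, reverse=True)[:k]


def random_filter(samples: Sequence[str], keep_fraction: float,
                  rng: random.Random) -> List[str]:
    """Keep the same number of samples, chosen uniformly at random."""
    k = max(1, round(len(samples) * keep_fraction))
    return rng.sample(list(samples), k)


# Toy stand-in: string length plays the role of a task score.
pool = ["a", "bb", "ccc", "dddd", "eeeee", "f", "gg", "hhh", "iiii", "jjjjj"]
print(task_grounded_filter(pool, task_score=len, keep_fraction=0.3))
print(len(random_filter(pool, keep_fraction=0.3, rng=random.Random(0))))
```

Matching the retention rate across the two filters matters for the comparison: otherwise a difference in outcomes could reflect dataset size rather than the selection criterion.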

Circularity Check

0 steps flagged

No circularity: predictions derived from external iterated learning theory and tested on independent LLM runs

The paper states it derives five falsifiable predictions from iterated learning theory in cultural evolution (an external body of work) and tests them via new self-training experiments on LLaMA-2-7B and Mistral-7B. The non-monotonic compositionality result is presented as an empirical outcome of those runs, with controls for seed regularity and filtering type, plus a match to human behavioral data (R^2=0.94). No quoted step shows a prediction reducing to a fitted parameter, self-definition, or load-bearing self-citation chain; the derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that iterated learning theory transfers directly to LLM self-training; no free parameters or invented entities are introduced in the abstract.

axioms (1)

Domain assumption: Iterated learning theory from cultural evolution applies to LLM self-training and generates falsifiable predictions about linguistic structure degradation.
Invoked to derive the five predictions and interpret the non-monotonic compositionality result.

pith-pipeline@v0.9.0 · 5741 in / 1290 out tokens · 27667 ms · 2026-05-25T05:28:29.248549+00:00 · methodology

Reference graph

Works this paper leans on

80 extracted references · 80 canonical work pages · 4 internal anchors

[1]

Henry Brighton and Simon Kirby , title =. Artif. Life , volume =. 2006 , burl =. doi:10.1162/ARTL.2006.12.2.229 , timestamp =

work page doi:10.1162/artl.2006.12.2.229 2006
[2]

2019 , issn =

The cognitive roots of regularization in language , journal =. 2019 , issn =. doi:https://doi.org/10.1016/j.cognition.2018.12.002 , url =

work page doi:10.1016/j.cognition.2018.12.002 2019
[3]

Anderson and Yarin Gal , title =

Ilia Shumailov and Zakhar Shumaylov and Yiren Zhao and Nicolas Papernot and Ross J. Anderson and Yarin Gal , title =. Nature , volume =. 2024 , burl =. doi:10.1038/S41586-024-07566-Y , timestamp =

work page doi:10.1038/s41586-024-07566-y 2024
[4]

Self-Consuming Generative Models Go

Sina Alemohammad and Josue Casco. Self-Consuming Generative Models Go. The Twelfth International Conference on Learning Representations,. 2024 , burl =

work page 2024
[5]

A Tale of Tails: Model Collapse as a Change of Scaling Laws , booktitle =

Elvis Dohmatob and Yunzhen Feng and Pu Yang and Fran. A Tale of Tails: Model Collapse as a Change of Scaling Laws , booktitle =. 2024 , burl =

work page 2024
[6]

Model Collapse Demystified: The Case of Regression , booktitle =

Elvis Dohmatob and Yunzhen Feng and Julia Kempe , editor =. Model Collapse Demystified: The Case of Regression , booktitle =. 2024 , burl =

work page 2024
[7]

The Thirteenth International Conference on Learning Representations,

Elvis Dohmatob and Yunzhen Feng and Arjun Subramonian and Julia Kempe , title =. The Thirteenth International Conference on Learning Representations,. 2025 , burl =

work page 2025
[8]

Data Feedback Loops: Model-driven Amplification of Dataset Biases , booktitle =

Rohan Taori and Tatsunori Hashimoto , editor =. Data Feedback Loops: Model-driven Amplification of Dataset Biases , booktitle =. 2023 , burl =

work page 2023
[9]

arXiv preprint , volume =

Martin Briesch and Dominik Sobania and Franz Rothlauf , title =. arXiv preprint , volume =. 2023 , burl =. doi:10.48550/ARXIV.2311.16822 , beprinttype =

work page doi:10.48550/arxiv.2311.16822 2023
[10]

Beyond Model Collapse: Scaling Up with Synthesized Data Requires Verification , booktitle =

Yunzhen Feng and Elvis Dohmatob and Pu Yang and Fran. Beyond Model Collapse: Scaling Up with Synthesized Data Requires Verification , booktitle =. 2025 , burl =

work page 2025
[11]

Self-Correcting Self-Consuming Loops for Generative Model Training , booktitle =

Nate Gillman and Michael Freeman and Daksh Aggarwal and Chia. Self-Correcting Self-Consuming Loops for Generative Model Training , booktitle =. 2024 , burl =

work page 2024
[12]

Aaron Clauset and Cosma Rohilla Shalizi and Mark E. J. Newman , title =. 2009 , burl =. doi:10.1137/070710111 , timestamp =

work page doi:10.1137/070710111 2009
[13]

The Pitfalls of Simplicity Bias in Neural Networks , booktitle =

Harshay Shah and Kaustav Tamuly and Aditi Raghunathan and Prateek Jain and Praneeth Netrapalli , editor =. The Pitfalls of Simplicity Bias in Neural Networks , booktitle =. 2020 , burl =

work page 2020
[14]

Compositionality and Generalization In Emergent Languages , booktitle =

Rahma Chaabouni and Eugene Kharitonov and Diane Bouchacourt and Emmanuel Dupoux and Marco Baroni , editor =. Compositionality and Generalization In Emergent Languages , booktitle =. 2020 , burl =. doi:10.18653/V1/2020.ACL-MAIN.407 , timestamp =

work page doi:10.18653/v1/2020.acl-main.407 2020
[15]

15th European Signal Processing Conference,

Olivier Roy and Martin Vetterli , title =. 15th European Signal Processing Conference,. 2007 , burl =

work page 2007
[16]

Lost in the Middle: How Language Models Use Long Contexts

R. Thomas McCoy and Robert Frank and Tal Linzen , title =. Trans. Assoc. Comput. Linguistics , volume =. 2020 , burl =. doi:10.1162/TACL\_A\_00304 , timestamp =

work page internal anchor Pith review doi:10.1162/tacl 2020
[17]

Mission: Impossible Language Models , booktitle =

Julie Kallini and Isabel Papadimitriou and Richard Futrell and Kyle Mahowald and Christopher Potts , editor =. Mission: Impossible Language Models , booktitle =. 2024 , burl =. doi:10.18653/V1/2024.ACL-LONG.787 , timestamp =

work page doi:10.18653/v1/2024.acl-long.787 2024
[18]

Algebraic structures in natural language , pages=

What artificial neural networks can tell us about human language acquisition , author=. Algebraic structures in natural language , pages=. 2022 , publisher=

work page 2022
[19]

LLaMA: Open and Efficient Foundation Language Models

Hugo Touvron and Thibaut Lavril and Gautier Izacard and Xavier Martinet and Marie. LLaMA: Open and Efficient Foundation Language Models , journal =. 2023 , burl =. doi:10.48550/ARXIV.2302.13971 , beprinttype =

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2302.13971 2023
[20]

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing,

Najoung Kim and Tal Linzen , editor =. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing,. 2020 , burl =. doi:10.18653/V1/2020.EMNLP-MAIN.731 , timestamp =

work page doi:10.18653/v1/2020.emnlp-main.731 2020
[21]

Lake and Marco Baroni , title =

Brenden M. Lake and Marco Baroni , title =. Nat. , volume =. 2023 , burl =. doi:10.1038/S41586-023-06668-3 , timestamp =

work page doi:10.1038/s41586-023-06668-3 2023
[22]

Dieuwke Hupkes and Verna Dankers and Mathijs Mul and Elia Bruni , title =. J. Artif. Intell. Res. , volume =. 2020 , burl =. doi:10.1613/JAIR.1.11674 , timestamp =

work page doi:10.1613/jair.1.11674 2020
[23]

Measuring Compositional Generalization: A Comprehensive Method on Realistic Data , year =

Daniel Keysers and Nathanael Schärli and Nathan Scales and Hylke Buisman and Daniel Furrer and Sergii Kashubin and Nikola Momchev and Danila Sinopalnikov and Lukasz Stafiniak and Tibor Tihon and Dmitry Tsarkov and Xiao Wang and Marc van Zee and Olivier Bousquet , booktitle =. Measuring Compositional Generalization: A Comprehensive Method on Realistic Data...

work page
[24]

5th International Conference on Learning Representations,

Angeliki Lazaridou and Alexander Peysakhovich and Marco Baroni , title =. 5th International Conference on Learning Representations,. 2017 , burl =

work page 2017
[25]

arXiv preprint , volume =

Angeliki Lazaridou and Marco Baroni , title =. arXiv preprint , volume =. 2020 , burl =

work page 2020
[26]

Cohen and Simon Kirby , title =

Yi Ren and Shangmin Guo and Matthieu Labeau and Shay B. Cohen and Simon Kirby , title =. 8th International Conference on Learning Representations,. 2020 , burl =

work page 2020
[27]

Goodman , editor =

Jesse Mu and Noah D. Goodman , editor =. Emergent Communication of Generalizations , booktitle =. 2021 , burl =

work page 2021
[28]

8th International Conference on Learning Representations,

Ari Holtzman and Jan Buys and Li Du and Maxwell Forbes and Yejin Choi , title =. 8th International Conference on Learning Representations,. 2020 , burl =

work page 2020
[29]

Constitutional AI: Harmlessness from AI Feedback

Yuntao Bai and Saurav Kadavath and Sandipan Kundu and Amanda Askell and Jackson Kernion and Andy Jones and Anna Chen and Anna Goldie and Azalia Mirhoseini and Cameron McKinnon and Carol Chen and Catherine Olsson and Christopher Olah and Danny Hernandez and Dawn Drain and Deep Ganguli and Dustin Li and Eli Tran. Constitutional. arXiv preprint , volume =. 2...

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2212.08073 2022
[30]

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models , booktitle =

Zixiang Chen and Yihe Deng and Huizhuo Yuan and Kaixuan Ji and Quanquan Gu , editor =. Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models , booktitle =. 2024 , burl =

work page 2024
[31]

Scaling Instruction-Finetuned Language Models , journal =

Hyung Won Chung and Le Hou and Shayne Longpre and Barret Zoph and Yi Tay and William Fedus and Yunxuan Li and Xuezhi Wang and Mostafa Dehghani and Siddhartha Brahma and Albert Webson and Shixiang Shane Gu and Zhuyun Dai and Mirac Suzgun and Xinyun Chen and Aakanksha Chowdhery and Alex Castro. Scaling Instruction-Finetuned Language Models , journal =. 2024...

work page 2024
[32]

Griffiths and Michael L

Thomas L. Griffiths and Michael L. Kalish , title =. Cogn. Sci. , volume =. 2007 , burl =. doi:10.1080/15326900701326576 , timestamp =

work page doi:10.1080/15326900701326576 2007
[33]

and Griffiths, Thomas L

Kalish, Michael L. and Griffiths, Thomas L. and Lewandowsky, Stephan , year =. Iterated learning: Intergenerational knowledge transmission reveals inductive biases , volume =. Psychonomic Bulletin & Review , publisher =. doi:10.3758/bf03194066 , number =

work page doi:10.3758/bf03194066
[34]

Iterated learning and the evolution of language , volume =

Kirby, Simon and Griffiths, Tom and Smith, Kenny , year =. Iterated learning and the evolution of language , volume =. doi:10.1016/j.conb.2014.07.014 , journal =

work page doi:10.1016/j.conb.2014.07.014 2014
[35]

Compression and communication in the cultural evolution of linguistic structure , volume =

Kirby, Simon and Tamariz, Monica and Cornish, Hannah and Smith, Kenny , year =. Compression and communication in the cultural evolution of linguistic structure , volume =. doi:10.1016/j.cognition.2015.03.016 , journal =

work page doi:10.1016/j.cognition.2015.03.016 2015
[36]

and Chater, Nick , year =

Christiansen, Morten H. and Chater, Nick , year =. Language as Shaped by the Brain , burl =. doi:10.7551/mitpress/9780262034319.003.0002 , booktitle =

work page doi:10.7551/mitpress/9780262034319.003.0002
[37]

and Chater, Nick , year =

Christiansen, Morten H. and Chater, Nick , year =. The Now-or-Never bottleneck: A fundamental constraint on language , volume =. doi:10.1017/s0140525x1500031x , journal =

work page doi:10.1017/s0140525x1500031x
[38]

Eliminating unpredictable variation through iterated learning , volume =

Smith, Kenny and Wonnacott, Elizabeth , year =. Eliminating unpredictable variation through iterated learning , volume =. Cognition , publisher =. doi:10.1016/j.cognition.2010.06.004 , number =

work page doi:10.1016/j.cognition.2010.06.004 2010
[39]

and Newport, Elissa L

Hudson Kam, Carla L. and Newport, Elissa L. , year =. Regularizing Unpredictable Variation: The Roles of Adult and Child Learners in Language Formation and Change , volume =. Language Learning and Development , publisher =. doi:10.1080/15475441.2005.9684215 , number =

work page doi:10.1080/15475441.2005.9684215 2005
[40]

and Newport, Elissa L

Hudson Kam, Carla L. and Newport, Elissa L. , year =. Getting it right by getting it wrong: When learners change languages , volume =. Cognitive Psychology , publisher =. doi:10.1016/j.cogpsych.2009.01.001 , number =

work page doi:10.1016/j.cogpsych.2009.01.001 2009
[41]

The Evolution of Language: Proceedings of the 11th International Conference (

Morgan, Emily and Levy, Roger , title =. The Evolution of Language: Proceedings of the 11th International Conference (. 2016 , editor =

work page 2016
[42]

Language learners privilege structured meaning over surface frequency , volume =

Culbertson, Jennifer and Adger, David , year =. Language learners privilege structured meaning over surface frequency , volume =. Proceedings of the National Academy of Sciences , publisher =. doi:10.1073/pnas.1320525111 , number =

work page doi:10.1073/pnas.1320525111
[43]

How Bad is Training on Synthetic Data?

Mohamed El Amine Seddik and Suei. How Bad is Training on Synthetic Data?. arXiv preprint , volume =. 2024 , burl =. doi:10.48550/ARXIV.2404.05090 , beprinttype =

work page doi:10.48550/arxiv.2404.05090 2024
[44]

Sutherland , editor =

Yi Ren and Shangmin Guo and Linlu Qiu and Bailin Wang and Danica J. Sutherland , editor =. Bias Amplification in Language Model Evolution: An Iterated Learning Perspective , booktitle =. 2024 , burl =

work page 2024
[45]

Cultural evolution in populations of Large Language Models , journal =

Perez, Jérémy and Léger, Corentin and Ovando-Tellez, Marcela and Foulon, Chris and Dussauld, Joan and Oudeyer, Pierre-Yves and Moulin-Frier, Clément , keywords =. Cultural evolution in populations of Large Language Models , journal =. 2024 , copyright =

work page 2024
[46]

Deep neural networks and humans both benefit from compositional language structure , volume =

Galke, Lukas and Ram, Yoav and Raviv, Limor , year =. Deep neural networks and humans both benefit from compositional language structure , volume =. Nature Communications , publisher =. doi:10.1038/s41467-024-55158-1 , number =

work page doi:10.1038/s41467-024-55158-1
[47]

Emogen: Emotional image content generation with text-to-image diffusion models,

Chenhao Zheng and Jieyu Zhang and Aniruddha Kembhavi and Ranjay Krishna , title =. 2024 , burl =. doi:10.1109/CVPR52733.2024.01308 , timestamp =

work page doi:10.1109/cvpr52733.2024.01308 2024
[48]

Scale-Free Networks: Complex Webs in Nature and Technology

Goldberg, Adele , year =. Constructions at Work: The Nature of Generalization in Language , isbn =. doi:10.1093/acprof:oso/9780199268511.001.0001 , publisher =

work page doi:10.1093/acprof:oso/9780199268511.001.0001
[49]

Language, Usage and Cognition , isbn =

Bybee, Joan , year =. Language, Usage and Cognition , isbn =. doi:10.1017/cbo9780511750526 , publisher =

work page doi:10.1017/cbo9780511750526
[50]

Ivanova and Idan Asher Blank and Nancy Kanwisher and Joshua B

Kyle Mahowald and Anna A. Ivanova and Idan Asher Blank and Nancy Kanwisher and Joshua B. Tenenbaum and Evelina Fedorenko , title =. arXiv preprint , volume =. 2023 , burl =. doi:10.48550/ARXIV.2301.06627 , beprinttype =

work page doi:10.48550/arxiv.2301.06627 2023
[51]

, year =

Piantadosi, Steven T. , year =. Zipf’s word frequency law in natural language: A critical review and future directions , volume =. Psychonomic Bulletin & Review , publisher =. doi:10.3758/s13423-014-0585-6 , number =

work page doi:10.3758/s13423-014-0585-6
[52]

Proceedings of the National Academy of Sciences , year =

Shain, Cory and Meister, Clara and Pimentel, Tiago and Cotterell, Ryan and Levy, Roger , title =. Proceedings of the National Academy of Sciences , year =

work page
[53]

, year =

Hedges, Larry V. , year =. Distribution Theory for Glass’s Estimator of Effect Size and Related Estimators , volume =. Journal of Educational Statistics , publisher =. doi:10.2307/1164588 , number =

work page doi:10.2307/1164588
[54]

Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs , volume =

Lakens, Daniël , year =. Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs , volume =. doi:10.3389/fpsyg.2013.00863 , journal =

work page doi:10.3389/fpsyg.2013.00863 2013
[55]

Minors of a Class of Riordan Arrays Related to Weighted Partial Motzkin Paths

Hesterberg, Tim C. , year =. What Teachers Should Know About the Bootstrap: Resampling in the Undergraduate Statistics Curriculum , volume =. The American Statistician , publisher =. doi:10.1080/00031305.2015.1089789 , number =

work page internal anchor Pith review Pith/arXiv arXiv doi:10.1080/00031305.2015.1089789 2015
[56]

Thomas and Yao, Shunyu and Friedman, Dan and Hardy, Mathew D

McCoy, R. Thomas and Yao, Shunyu and Friedman, Dan and Hardy, Mathew D. and Griffiths, Thomas L. , year =. Embers of autoregression show how large language models are shaped by the problem they are trained to solve , volume =. Proceedings of the National Academy of Sciences , publisher =. doi:10.1073/pnas.2322420121 , number =

work page doi:10.1073/pnas.2322420121
[57]

Language Models Learn Rare Phenomena from Less Rare Phenomena: The Case of the Missing AANNs , journal =

Misra, Kanishka and Mahowald, Kyle , keywords =. Language Models Learn Rare Phenomena from Less Rare Phenomena: The Case of the Missing AANNs , journal =. 2024 , copyright =

work page 2024
[58]

Language Structure Is Partly Determined by Social Structure , volume =

Lupyan, Gary and Dale, Rick , editor =. Language Structure Is Partly Determined by Social Structure , volume =. PLoS ONE , publisher =. 2010 , month =. doi:10.1371/journal.pone.0008559 , number =

work page doi:10.1371/journal.pone.0008559 2010
[59]

Larger communities create more systematic languages , volume =

Raviv, Limor and Meyer, Antje and Lev-Ari, Shiri , year =. Larger communities create more systematic languages , volume =. Proceedings of the Royal Society B: Biological Sciences , publisher =. doi:10.1098/rspb.2019.1262 , number =

work page doi:10.1098/rspb.2019.1262 2019
[60]

Jiang, Albert Q. and Sablayrolles, Alexandre and Mensch, Arthur and Bamford, Chris and Chaplot, Devendra Singh and Casas, Diego de las and Bressand, Florian and Lengyel, Gianna and Lample, Guillaume and Saulnier, Lucile and Lavaud, Lélio Renard and Lachaux, Marie-Anne and Stock, Pierre and Scao, Teven Le and Lavril, Thibaut and Wang, Thomas and Lacroix, T...

work page 2023
[61]

Physics of Language Models: Part 1, Learning Hierarchical Language Structures , journal =

Zeyuan Allen. Physics of Language Models: Part 1, Learning Hierarchical Language Structures , journal =. 2025 , burl =

work page 2025
[62]

Locally Typical Sampling , volume =

Meister, Clara and Pimentel, Tiago and Wiher, Gian and Cotterell, Ryan , year =. Locally Typical Sampling , volume =. doi:10.1162/tacl_a_00536 , journal =

work page doi:10.1162/tacl_a_00536
[63]

Griffiths and Mark Johnson , title =

Sharon Goldwater and Thomas L. Griffiths and Mark Johnson , title =. J. Mach. Learn. Res. , volume =. 2011 , burl =. doi:10.5555/1953048.2021075 , timestamp =

work page doi:10.5555/1953048.2021075 2011
[64]

Proceedings of the National Academy of Sciences , volume =

Simon Kirby and Hannah Cornish and Kenny Smith , title =. Proceedings of the National Academy of Sciences , volume =. 2008 , doi =

work page 2008
[65]

2009 , issn =

The evolution of frequency distributions: Relating regularization to inductive biases through iterated learning , journal =. 2009 , issn =. doi:https://doi.org/10.1016/j.cognition.2009.02.012 , url =

work page doi:10.1016/j.cognition.2009.02.012 2009
[66]

Beckner, Clay and Pierrehumbert, Janet B. and Hay, Jennifer. Journal of Language Evolution. 2017. doi:10.1093/jole/lzx001
[67]

Culbertson, Jennifer. 2010
[68]

Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data. First Conference on Language Modeling
[69]

Guo, Yanzhu and Shang, Guokan and Vazirgiannis, Michalis and Clavel, Chloé. The Curious Decline of Linguistic Diversity: Training Language Models on Synthetic Text. Findings of the Association for Computational Linguistics: NAACL 2024. 2024. doi:10.18653/v1/2024.findings-naacl.228
[70]

Rylan Schaeffer and Joshua Kazdan and Alvan Caleb Arulandu and Sanmi Koyejo. arXiv preprint. 2025. doi:10.48550/arxiv.2503.03150
[71]

Ilia Shumailov and Zakhar Shumaylov and Yiren Zhao and Nicolas Papernot and Ross Anderson and Yarin Gal. Nature
[72]

Kouwenhoven, Tom and Peeperkorn, Max and Verhoef, Tessa. Searching for Structure: Investigating Emergent Communication with Large Language Models. Proceedings of the 31st International Conference on Computational Linguistics. 2025
[73]

Tom Kouwenhoven and Max Peeperkorn and Roy De Kleijn and Tessa Verhoef. Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence. 2025. doi:10.24963/ijcai.2025/1144
[74]

Lukas Galke and Yoav Ram and Limor Raviv. arXiv preprint. 2023. doi:10.48550/arxiv.2302.12239
[75]

Eliciting the Priors of Large Language Models using Iterated In-Context Learning. arXiv preprint. 2024
[76]

McCoy, R. Thomas and Griffiths, Thomas L. Nature Communications. doi:10.1038/s41467-025-59957-y
[77]

Nikolaus, Mitja and Maes, Juliette and Fourtassi, Abdellah. Proceedings of the 43rd Annual Meeting of the Cognitive Science Society
[78]

Brenden M. Lake and Marco Baroni. Generalization without Systematicity: On the Compositional Skills of Sequence-to-Sequence Recurrent Networks. 2018
[79]

Gutierrez-Vasques, Ximena and Mijangos, Victor. Entropy. doi:10.3390/e22010048
[80]

Weissweiler, Leonie and B. UCxn: Typologically Informed Annotation of Constructions Atop Universal Dependencies. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). 2024
