Do Language Models Know What Not to Say? Causal Evidence for Statistical Preemption in LLMs
Pith reviewed 2026-05-25 05:31 UTC · model grok-4.3
The pith
Language models acquire knowledge of unacceptable sentences through statistical competition between alternative forms.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Neural language models acquire negative linguistic knowledge through distributional competition, the core mechanism posited by Construction Grammar. Across four experiments, LLM surprisal patterns correlate with human acceptability judgments at r = 0.79, are driven by competing-form frequency rather than overall verb frequency, scale as a power law with model size, and respond causally to a controlled fine-tuning intervention that manipulates competing-form frequencies while reverse-direction controls rule out frequency-sensitivity confounds.
What carries the argument
Statistical preemption, the process in which higher frequency of a conventional form reduces acceptance of unattested structural alternatives.
If this is right
- LLM surprisal matches human acceptability judgments on dative, causative, and locative constructions.
- The effect is carried by competing-form frequency, confirmed by non-circular partial correlations.
- Preemption sensitivity increases with model size according to a power law.
- Manipulating competing-form frequencies in fine-tuning shifts preemption behavior in the predicted direction.
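The power-law claim in the list above can be illustrated with a log-log least-squares fit. This is a minimal sketch, not the paper's fitting procedure: the function name `fit_power_law` and the toy values are hypothetical, and the toy data are generated from an exact power law rather than taken from the paper.

```python
import numpy as np

def fit_power_law(sizes, sensitivities):
    """Fit sensitivity = a * size**b by least squares in log-log space."""
    log_n = np.log(np.asarray(sizes, dtype=float))
    log_s = np.log(np.asarray(sensitivities, dtype=float))
    # A degree-1 polyfit returns (slope, intercept); the slope is the exponent b.
    b, log_a = np.polyfit(log_n, log_s, 1)
    return float(np.exp(log_a)), float(b)

# Hypothetical model sizes and sensitivities (assumed, not reported results).
toy_sizes = [1e8, 1e9, 1e10]
toy_sens = [0.5 * n ** 0.1 for n in toy_sizes]
a, b = fit_power_law(toy_sizes, toy_sens)
print(f"prefactor a = {a:.3f}, exponent b = {b:.3f}")
```

A straight line in log-log space is what a power law predicts, so systematic curvature in the residuals of such a fit would be one sign that the power-law description is only approximate.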
Where Pith is reading between the lines
- If the mechanism holds, deliberate curation of training data frequencies could be used to strengthen or weaken particular linguistic constraints.
- The power-law scaling suggests that further increases in model size will continue to sharpen distinctions between conventional and preempted forms.
- The same frequency-competition logic could be tested on other domains where models must learn implicit negative constraints, such as factual consistency.
Load-bearing premise
The controlled fine-tuning intervention with reverse-direction controls isolates the effect of competing-form frequency on preemption behavior without introducing unrelated changes to model representations.
What would settle it
A fine-tuning run that raises the frequency of competing forms yet produces no corresponding drop in the model's acceptance of the preempted alternatives would falsify the causal claim.
Original abstract
How do learners acquire knowledge of what is unacceptable without negative evidence? Construction Grammar proposes statistical preemption: exposure to a conventional form (e.g., "donated the books to the library") preempts structurally possible but unattested alternatives ("*donated the library the books"). We present a computational study that, for the first time, directly dissociates statistical preemption from the competing entrenchment hypothesis in large language models within a single converging design. Across four experiments spanning 120 English verb-construction pairings (dative, causative, locative), we show that (1) LLM surprisal patterns correlate strongly with human acceptability judgments ($r = 0.79$), validated against three independent behavioral datasets; (2) these patterns are driven by competing-form frequency rather than overall verb frequency, confirmed by non-circular partial correlations; (3) preemption sensitivity scales as a power law with model size; and (4) a controlled fine-tuning intervention causally demonstrates that manipulating competing-form frequencies shifts preemption behavior in the predicted direction, with reverse-direction controls ruling out frequency-sensitivity confounds. These results provide converging evidence that neural language models acquire negative linguistic knowledge through distributional competition, the core mechanism posited by Construction Grammar.
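The surprisal comparison described in the abstract can be sketched as follows. This is an illustration under stated assumptions: the per-token probabilities are hypothetical stand-ins for language-model outputs, the helper names `surprisal_bits` and `pearson_r` are not from the paper, and the paper's exact scoring (for example, its critical region or normalization) is not specified here.

```python
import math

def surprisal_bits(token_probs):
    """Total sentence surprisal in bits, given per-token probabilities."""
    return -sum(math.log2(p) for p in token_probs)

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical token probabilities for a conventional and a preempted form.
conventional = [0.5, 0.25, 0.5]
preempted = [0.5, 0.125, 0.25]

# A positive gap means the model finds the preempted form more surprising.
gap = surprisal_bits(preempted) - surprisal_bits(conventional)
print(gap)
```

Per-item gaps of this kind could then be passed to `pearson_r` together with human acceptability ratings to obtain an item-level correlation.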
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that neural language models acquire negative linguistic knowledge (unacceptability of certain constructions) through statistical preemption driven by distributional competition between forms, as posited by Construction Grammar. This is supported by four experiments on 120 English verb-construction pairings (dative, causative, locative) showing: (1) strong correlations between LLM surprisal and human acceptability judgments (r=0.79) across three behavioral datasets; (2) these patterns driven by competing-form frequency via non-circular partial correlations rather than overall verb frequency; (3) power-law scaling of preemption sensitivity with model size; and (4) causal evidence from a controlled fine-tuning intervention that shifts preemption behavior when manipulating competing-form frequencies, with reverse-direction controls.
Significance. If the results hold, the work supplies the first direct, converging dissociation of statistical preemption from entrenchment within LLMs, offering computational evidence for a core Construction Grammar mechanism. Notable strengths include validation against multiple independent human datasets, explicit non-circular partial correlations, power-law scaling analysis, and a causal intervention design with reverse controls.
major comments (1)
- [Experiment 4] Experiment 4 (fine-tuning intervention): The central causal claim rests on the finding that increasing frequency of one competing form shifts acceptability away from the alternative, with reverse-direction controls said to rule out generic frequency confounds. However, the description provides no post-intervention probes (e.g., on unrelated verbs, non-alternating constructions, or lexical frequency effects) to confirm that the manipulation did not produce global changes in frequency sensitivity or token representations; without such checks, the observed preemption shift is not fully diagnostic of the targeted distributional competition mechanism.
minor comments (2)
- The abstract states that patterns are 'validated against three independent behavioral datasets,' but the manuscript would benefit from a brief table or section listing the datasets, their sizes, and any selection criteria to allow readers to assess potential stimulus overlap.
- Power-law scaling is reported for preemption sensitivity with model size; including the exact model sizes tested, the fitted exponent, and confidence intervals in the main text or a supplementary table would improve reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive comment on Experiment 4. We address it directly below and will revise the manuscript accordingly.
Referee: [Experiment 4] Experiment 4 (fine-tuning intervention): The central causal claim rests on the finding that increasing frequency of one competing form shifts acceptability away from the alternative, with reverse-direction controls said to rule out generic frequency confounds. However, the description provides no post-intervention probes (e.g., on unrelated verbs, non-alternating constructions, or lexical frequency effects) to confirm that the manipulation did not produce global changes in frequency sensitivity or token representations; without such checks, the observed preemption shift is not fully diagnostic of the targeted distributional competition mechanism.
Authors: We agree that post-intervention probes would further strengthen the diagnosticity of the causal claim. The reverse-direction controls already demonstrate directional specificity: increasing the frequency of one form decreases acceptability of its competitor (and vice versa), which would be unlikely under a generic increase in frequency sensitivity. Nevertheless, to address the concern directly, the revised manuscript will include additional analyses evaluating the fine-tuned models on unrelated verb-construction pairs and non-alternating constructions. These controls will verify that the intervention does not produce broad changes in frequency sensitivity or token representations outside the targeted preemption pairs. revision: yes
Circularity Check
No significant circularity detected
The paper grounds its central claim in external human acceptability judgments (r=0.79 across three independent datasets), non-circular partial correlations that explicitly separate competing-form frequency from overall verb frequency, power-law scaling with model size, and a controlled fine-tuning intervention with reverse-direction controls. No derivation step reduces by construction to its inputs, no self-citation is load-bearing for the uniqueness of the mechanism, and no ansatz or renaming is smuggled in. The design is self-contained against external benchmarks and explicitly rules out the main confounds it addresses.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption LLM surprisal serves as a valid proxy for human acceptability judgments
- domain assumption Competing-form frequency can be isolated from overall verb frequency as the driver of preemption
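The second assumption, that competing-form frequency can be separated from overall verb frequency, is the kind of question a partial correlation addresses. The sketch below residualizes both variables on the control before correlating them. It is a generic illustration, not the paper's exact analysis; the function names and the toy arrays are hypothetical.

```python
import numpy as np

def residualize(y, z):
    """Residuals of y after least-squares regression on z plus an intercept."""
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    design = np.column_stack([np.ones_like(z), z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef

def partial_corr(x, y, z):
    """Correlation between x and y with the linear effect of z removed."""
    return float(np.corrcoef(residualize(x, z), residualize(y, z))[0, 1])

# Toy data (assumed): both variables track verb frequency, but their
# remaining variation is opposed.
verb_freq = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
noise = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
competing_freq = verb_freq + noise
rating = 2.0 * verb_freq - noise

raw = float(np.corrcoef(competing_freq, rating)[0, 1])
pc = partial_corr(competing_freq, rating, verb_freq)
print(raw, pc)
```

In this toy case the raw correlation is positive while the partial correlation is negative, which shows how a shared dependence on verb frequency can mask the relationship of interest.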
Reference graph
Works this paper leans on
-
[1]
Grammaticality, acceptability, and probability: A probabilistic view of linguistic knowledge. Cogn. Sci., 41(5):1202–1241. Beth Levin. 1993. English verb classes and alternations: a preliminary investigation. University of Chicago Press. Roger Levy. 2008. Expectation-based syntactic comprehension. Cognition, 106(3):1126–1177. Bai Li, Zining Zhu, Guillaume ...
-
[2]
Assessing the ability of LSTMs to learn syntax-sensitive dependencies. Trans. Assoc. Comput. Linguistics, 4:521–535. Kyle Mahowald, Anna A. Ivanova, Idan Asher Blank, Nancy Kanwisher, Joshua B. Tenenbaum, and Evelina Fedorenko. 2023. Dissociating language and thought in large language models: a cognitive perspective. arXiv preprint, arXiv:2301.06627. Reb...
-
[3]
A systematic framework for generating novel experimental hypotheses from language models
Strong prediction: Language model surprisal explains multiple N400 effects. Neurobiology of Language, 5(1):107–135. Kanishka Misra and Najoung Kim. 2024. Generating novel experimental hypotheses from language models: A case study on cross-dative generalization. arXiv preprint, arXiv:2408.05086. Kanishka Misra and Kyle Mahowald. 2024. Language models l...
-
[4]
The learnability of abstract syntactic principles. Cognition, 118(3):306–338. Steven Pinker. 1989. Learnability and Cognition: The Acquisition of Argument Structure. Learning, Development, and Conceptual Change. MIT Press, Cambridge, MA. Anna Samara, Elizabeth Wonnacott, Gaurav Saxena, Ramya Maitreyee, Judit Fazekas, and Ben Ambridge
-
[5]
Learners restrict their linguistic generalizations using preemption but not entrenchment: Evidence from artificial-language-learning studies with adults and children. Psychological Review, 132(1):1–17. Rylan Schaeffer, Brando Miranda, and Sanmi Koyejo
-
[6]
Are emergent abilities of large language models a mirage? In Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, 2023. Wesley Scivetti and Nathan Schneider. 2025. Construction identification and disambiguation using BERT: A case st...
-
[7]
Construction grammar provides unique insight into neural language models. arXiv preprint, arXiv:2302.02178. Ethan Gotlieb Wilcox, Tiago Pimentel, Clara Meister, Ryan Cotterell, and Roger P. Levy. 2023. Testing the predictions of surprisal theory in 11 languages. Trans. Assoc. Comput. Linguistics, 11:1451–1470. Isabell Winkler, Madlen Glauer, Tilmann Bets...
-
[8]
She donated the paintings to the museum. / *She donated the museum the paintings.
-
[9]
The professor donated his collection to the university. / *The professor donated the university his collection.
-
[10]
My neighbor donated her old clothes to the shelter. / *My neighbor donated the shelter her old clothes.
-
[11]
The company donated computers to the school. / *The company donated the school computers.
-
[12]
His family donated their savings to the foundation. / *His family donated the foundation their savings.
Example for give (None, dative):
-
[13]
She gave the flowers to the teacher. / She gave the teacher the flowers.
-
[14]
The professor gave his notes to the student. / The professor gave the student his notes.
-
[15]
My neighbor gave her keys to the friend. / My neighbor gave the friend her keys.
-
[16]
The company gave a bonus to the employee. / The company gave the employee a bonus.
-
[17]
His family gave the money to the charity. / His family gave the charity the money.
C Full Results for All Models
D Regression Diagnostics
The mixed-effects model from Experiment 2 includes random intercepts and random slopes for PREEMPT by model identity.
D.1 Robustness: Low-Collinearity Subset
Re-estimating with only verbs where |r(PREEMPT, ENTRENCH)| < 0....
-
[18]
($r = 0.77$ with DAIS for LLaMA-2 7B) and critical-region surprisal ($r = 0.75$).
D.3 Additional Circularity Controls
We report three supplementary controls addressing corpus-model circularity. Control 1: Raw frequency baseline. Replacing $\mathrm{PREEMPT}(v)$ with raw co-occurrence $f(v, \mathrm{Cx}_{\mathrm{conv}})$ yields $R^2 = 0.41$ (vs. $R^2 = 0.68$; $\Delta\mathrm{AIC} = 34.2$, $p < .001$). Control 2: N-gram...
-
[19]
Sentence selection. All sentences containing a lemmatized form of each target verb are extracted using spaCy's en_core_web_trf model.
-
[20]
Dependency parsing and pattern matching. Each candidate sentence is parsed; construction-specific templates (below) are applied to assign one of: conv, unconv, or reject (ambiguous/non-matching).
-
[21]
Three-layer filtering. (a) POS-tag agreement check: matrix verb must carry verbal POS; (b) dependency-pattern strict match; (c) whitelist of construction-defining preposition lemmas (e.g., to/for for prepositional datives; onto/into vs. with for content- vs. container-locatives).
-
[22]
Aggregation. Per-verb counts are summed across the corpus to produce $f(v, \mathrm{Cx}_{\mathrm{conv}})$ and $f(v, \mathrm{Cx}_{\mathrm{unconv}})$.
G.2 Construction-Specific Templates
Dative. Prepositional dative (PD): V + dobj (theme) + prep [to/for] + pobj (recipient/beneficiary). Double-object
Model        Dative (S / W / N)     Causative (S / W / N)   Locative (S / W / N)
GPT-2 124M   1.53 / 0.74 / 0.29     1.32 / 0.63 / 0.25      1.01 / 0.51 / 0.20
GPT-2 3...
-
[23]
Boilerplate filtering. Sentences from Common Crawl boilerplate (cookie notices, navigation text, repeated headers) were detected via Dolma's quality filter and removed before parsing.
-
[24]
Length filtering. Sentences shorter than 4 tokens or longer than 60 tokens were excluded (the former risk fragmented parses; the latter contain long-distance dependencies that the parser handles poorly).
-
[25]
POS consistency. Sentences in which the target verb's tag conflicted with the lemma's expected tag (e.g., drive tagged as NN rather than VB) were rejected.
-
[26]
Parser-confidence threshold. For each candidate construction match, we required the relevant dependency edge to have a parser confidence (as estimated by ensembling 5 parses with stochastic dropout) above 0.75. Low-confidence matches were rejected. The combined effect of these filters is to reduce the candidate sentence pool by approximately 30–45%,...
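The length and parser-confidence filters described above can be sketched as a single predicate. This is a minimal illustration that takes only the thresholds from the text (4 to 60 tokens, confidence above 0.75); the function name `keep_sentence` and its arguments are hypothetical, and the confidence value is assumed to be computed elsewhere.

```python
def keep_sentence(tokens, edge_confidence,
                  min_len=4, max_len=60, min_conf=0.75):
    """Return True if a candidate passes the length and confidence filters.

    tokens: list of tokens in the sentence.
    edge_confidence: parser confidence for the relevant dependency edge
        (assumed to be supplied by an upstream step).
    """
    # Exclude sentences shorter than 4 tokens or longer than 60 tokens.
    if len(tokens) < min_len or len(tokens) > max_len:
        return False
    # Require confidence strictly above the threshold.
    return edge_confidence > min_conf

print(keep_sentence("She donated the books to the library".split(), 0.9))
```

The strict inequality mirrors the wording "above 0.75"; whether a match at exactly 0.75 was kept in the original pipeline is not stated.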
-
[27]
provides ready-made data for German; WALS (Dryer and Haspelmath, 2013) and Grambank (Skirgård et al., 2023) features can guide language selection.
N.4 Prioritized Language Sample
Based on typological diversity and resource availability, we recommend initial testing on: Turkish (agglutinative), Mandarin (isolating), German (fusional, V2), Finnish (aggl...
-
[28]
Log-odds: $\mathrm{PREEMPT}_{\log}(v) = \log \frac{f(v, \mathrm{Cx}_{\mathrm{conv}}) + 1}{f(v, \mathrm{Cx}_{\mathrm{unconv}}) + 1}$. This yields $r = 0.77$ with DAIS (vs. $r = 0.79$ for our Laplace-smoothed proportion), indicating that the specific functional form matters little.
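The log-odds variant can be computed directly from the two per-verb counts. This is a minimal sketch: the function name `preempt_log_odds` is hypothetical, the counts are made up for illustration, and the natural logarithm is an assumption since the text does not state the base.

```python
import math

def preempt_log_odds(f_conv, f_unconv):
    """Add-one smoothed log-odds of conventional vs. unconventional counts.

    f_conv: corpus count of the verb in the conventional construction.
    f_unconv: corpus count of the verb in the unconventional construction.
    Natural log is assumed; the source only writes "log".
    """
    return math.log((f_conv + 1) / (f_unconv + 1))

# Hypothetical counts: a verb seen 99 times in the conventional form
# and never in the unconventional one.
score = preempt_log_odds(99, 0)
print(score)
```

The add-one terms keep the score finite when either count is zero, and equal counts give a score of zero.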
-
[29]
Conditional probability: $\mathrm{PREEMPT}_{\mathrm{cond}}(v) = \frac{f(v, \mathrm{Cx}_{\mathrm{conv}})}{f(v)}$, omitting the unconventional form entirely. This yields $r = 0.74$, slightly lower, suggesting that the ratio formulation (which captures relative competition) is more informative than the simple proportion.
Q Detailed Comparison with Yao et al. (2025)
Yao et al. (2025) conducted a controlled-rear...
-
[30]
Preemption–entrenchment dissociation: Yao et al. did not test whether the observed effects are driven by competing-form frequency (preemption) or overall verb frequency (entrenchment). Our Experiment 2 provides this dissociation.
-
[31]
Human behavioral ground truth: Yao et al. did not correlate model behavior with human acceptability data. Our item-level correlations with DAIS, R&G, and T&G provide independent validation.
-
[32]
Non-circular validation: Both our study and Yao et al.'s involve corpus-model comparisons. We address the resulting circularity concern with non-circular partial correlations against human data (§5.4).
-
[33]
Reverse-direction control: Our Experiment 4 includes a reverse-direction condition that Yao et al.'s design does not, addressing the tautology concern about frequency manipulation in frequency-sensitive models.
-
[34]
Multi-construction scope: Yao et al. focused exclusively on the dative alternation; we extend to causative and locative constructions.