Do Language Models Know What Not to Say? Causal Evidence for Statistical Preemption in LLMs

Dongxin Guo; Jikun Wu; Siu Ming Yiu

arXiv: 2605.23039 · v1 · pith:BLOHP6G6new · submitted 2026-05-21 · 💻 cs.CL · cs.AI · cs.LG

Pith reviewed 2026-05-25 05:31 UTC · model grok-4.3

classification 💻 cs.CL · cs.AI · cs.LG

keywords statistical preemption · construction grammar · large language models · negative linguistic knowledge · acceptability judgments · fine-tuning intervention · dative constructions

The pith

Language models acquire knowledge of unacceptable sentences through statistical competition between alternative forms.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Construction Grammar claims that learners acquire negative linguistic knowledge via statistical preemption: hearing a conventional form blocks structurally similar but unattested alternatives on the basis of relative frequency. The paper tests this claim in large language models by running four experiments on 120 English verb-construction pairs that directly separate preemption from overall frequency effects. Model surprisal tracks human acceptability judgments, depends on competing-form frequency, grows as a power law with scale, and shifts causally when fine-tuning alters those frequencies. A sympathetic reader cares because the results supply a concrete computational account of how negative constraints can emerge from positive data alone.

Core claim

Neural language models acquire negative linguistic knowledge through distributional competition, the core mechanism posited by Construction Grammar. Across four experiments, LLM surprisal patterns correlate with human acceptability judgments at r = 0.79, are driven by competing-form frequency rather than overall verb frequency, scale as a power law with model size, and respond causally to a controlled fine-tuning intervention that manipulates competing-form frequencies while reverse-direction controls rule out frequency-sensitivity confounds.

What carries the argument

Statistical preemption, the process in which higher frequency of a conventional form reduces acceptance of unattested structural alternatives.

If this is right

LLM surprisal matches human acceptability judgments on dative, causative, and locative constructions.
The effect is carried by competing-form frequency, confirmed by non-circular partial correlations.
Preemption sensitivity increases with model size according to a power law.
Manipulating competing-form frequencies in fine-tuning shifts preemption behavior in the predicted direction.
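
The partial-correlation logic behind the second point can be illustrated with a minimal sketch. It assumes a plain residual-based partial correlation on synthetic data; the names `preempt`, `log_verb_freq`, and `delta_s` and the data-generating process are hypothetical, and the paper's own non-circular procedure may differ.

```python
import numpy as np

def residualize(y, controls):
    """Residuals of y after least-squares regression on controls plus an intercept."""
    X = np.column_stack([np.ones(len(y)), controls])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def partial_corr(x, y, controls):
    """Pearson correlation of x and y after partialling `controls` out of both."""
    rx = residualize(np.asarray(x, float), np.asarray(controls, float))
    ry = residualize(np.asarray(y, float), np.asarray(controls, float))
    return float(np.corrcoef(rx, ry)[0, 1])

# Synthetic data for illustration only (NOT the paper's data).
rng = np.random.default_rng(0)
log_verb_freq = rng.normal(size=200)   # stand-in for overall verb frequency
preempt = rng.normal(size=200)         # stand-in for a competing-form measure
# The surprisal gap is generated from `preempt` alone, plus noise.
delta_s = 2.0 * preempt + rng.normal(scale=0.5, size=200)

r_preempt = partial_corr(preempt, delta_s, log_verb_freq)
r_freq = partial_corr(log_verb_freq, delta_s, preempt)
print(f"partial r (preempt | freq) = {r_preempt:.2f}")
print(f"partial r (freq | preempt) = {r_freq:.2f}")
```

In this toy setup the first partial correlation stays high while the second falls near zero, which is the qualitative pattern a preemption account predicts and an entrenchment account does not.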

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

If the mechanism holds, deliberate curation of training data frequencies could be used to strengthen or weaken particular linguistic constraints.
The power-law scaling suggests that further increases in model size will continue to sharpen distinctions between conventional and preempted forms.
The same frequency-competition logic could be tested on other domains where models must learn implicit negative constraints, such as factual consistency.
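
The power-law scaling claim can be made concrete with a short sketch of how such a fit is commonly done: ordinary least squares in log-log space. The model sizes, coefficient, and exponent below are hypothetical values chosen for illustration; the paper's fitted exponent is not reported on this page.

```python
import numpy as np

def fit_power_law(sizes, values):
    """Fit values ~ a * sizes**b by least squares on log-transformed data."""
    log_n = np.log(np.asarray(sizes, float))
    log_y = np.log(np.asarray(values, float))
    b, log_a = np.polyfit(log_n, log_y, 1)  # slope is the exponent b
    return float(np.exp(log_a)), float(b)

# Hypothetical parameter counts and sensitivities generated from a known law.
sizes = np.array([1.24e8, 3.55e8, 1.5e9, 7e9])
values = 0.05 * sizes ** 0.15
a, b = fit_power_law(sizes, values)
print(f"a = {a:.3f}, b = {b:.3f}")
```

A log-log fit is a convenience, not the only option; with noisy measurements, a nonlinear fit with bootstrap confidence intervals on the exponent may be preferable.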

Load-bearing premise

The controlled fine-tuning intervention with reverse-direction controls isolates the effect of competing-form frequency on preemption behavior without introducing unrelated changes to model representations.

What would settle it

A fine-tuning run that raises the frequency of competing forms yet produces no corresponding drop in the model's acceptance of the preempted alternatives would falsify the causal claim.

Figures

Figures reproduced from arXiv: 2605.23039 by Dongxin Guo, Jikun Wu, Siu Ming Yiu.

**Figure 2.** Preemption–entrenchment dissociation (LLaMA-2 7B). +Competing verbs (blue, filled) show a strong relationship between PREEMPT(v) and ∆S; –Competing verbs (red, open) cluster near zero regardless of frequency. This dissociation is the key result: verb restrictions track the frequency of competing conventional forms, not overall verb frequency.

**Figure 3.** Scaling of preemption sensitivity. Blue line: …

**Figure 4.** Causal intervention effects across 5 random …

read the original abstract

How do learners acquire knowledge of what is unacceptable without negative evidence? Construction Grammar proposes statistical preemption: exposure to a conventional form (e.g., "donated the books to the library") preempts structurally possible but unattested alternatives ("*donated the library the books"). We present a computational study that, for the first time, directly dissociates statistical preemption from the competing entrenchment hypothesis in large language models within a single converging design. Across four experiments spanning 120 English verb-construction pairings (dative, causative, locative), we show that (1) LLM surprisal patterns correlate strongly with human acceptability judgments ($r = 0.79$), validated against three independent behavioral datasets; (2) these patterns are driven by competing-form frequency rather than overall verb frequency, confirmed by non-circular partial correlations; (3) preemption sensitivity scales as a power law with model size; and (4) a controlled fine-tuning intervention causally demonstrates that manipulating competing-form frequencies shifts preemption behavior in the predicted direction, with reverse-direction controls ruling out frequency-sensitivity confounds. These results provide converging evidence that neural language models acquire negative linguistic knowledge through distributional competition, the core mechanism posited by Construction Grammar.
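
As a rough illustration of the surprisal measure the abstract relies on, the sketch below computes a surprisal gap between an unconventional and a conventional variant. It assumes sentence-level surprisal summed over tokens in bits; the per-token probabilities are made up, and the function names are hypothetical. The paper may define its measure differently (for example, over a critical region only).

```python
import math

def surprisal_bits(token_probs):
    """Total surprisal in bits: the sum of -log2(p) over a sentence's tokens."""
    return sum(-math.log2(p) for p in token_probs)

def delta_s(unconv_probs, conv_probs):
    """Surprisal of the unconventional form minus that of the conventional form."""
    return surprisal_bits(unconv_probs) - surprisal_bits(conv_probs)

# Made-up per-token probabilities standing in for a language model's output.
conv = [0.5, 0.25, 0.5]      # e.g. "donated the books to the library"
unconv = [0.5, 0.125, 0.25]  # e.g. "*donated the library the books"
gap = delta_s(unconv, conv)
print(gap)  # positive when the unconventional form is more surprising
```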

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper shows LLMs track negative knowledge via competing-form frequencies with decent correlations and a causal fine-tuning step, but the intervention's specificity is the main open question.

The core finding is that LLM surprisal on dative, causative, and locative alternations lines up with human acceptability judgments across three datasets, that competing-form frequency drives the pattern more than raw verb frequency, that this sensitivity scales with model size, and that fine-tuning one form shifts behavior away from its competitor while reverse controls are meant to block generic frequency effects. That last piece is the main novelty: a single design that tries to separate statistical preemption from entrenchment and then tests it causally inside the same models.

The correlations and partial-correlation controls are straightforward and the scaling result is clean enough to be useful. The fine-tuning experiment is the part that actually moves the claim from correlational to interventional. The abstract is explicit that the reverse-direction runs are there to rule out broad frequency sensitivity, which directly addresses the stress-test worry. Still, without seeing the post-intervention probes on unrelated verbs or non-alternating constructions, it is hard to be fully confident that the shift is narrow rather than a side effect of changed frequency weighting overall. The partial correlations are described as non-circular, which helps, but the strength of the preemption claim rests on how tightly those controls actually isolate the mechanism.

This is the kind of work that matters for people who want computational tests of construction-grammar ideas or who study what LLMs actually learn about ungrammaticality. It is not a finished story on acquisition, but the design is careful enough and the question is live enough that it should go to referees rather than get desk-rejected. The main revision points would be fuller reporting of the fine-tuning checks and any remaining variance not captured by the competing-form measure.

Referee Report

1 major / 2 minor

Summary. The paper claims that neural language models acquire negative linguistic knowledge (unacceptability of certain constructions) through statistical preemption driven by distributional competition between forms, as posited by Construction Grammar. This is supported by four experiments on 120 English verb-construction pairings (dative, causative, locative) showing: (1) strong correlations between LLM surprisal and human acceptability judgments (r=0.79) across three behavioral datasets; (2) these patterns driven by competing-form frequency via non-circular partial correlations rather than overall verb frequency; (3) power-law scaling of preemption sensitivity with model size; and (4) causal evidence from a controlled fine-tuning intervention that shifts preemption behavior when manipulating competing-form frequencies, with reverse-direction controls.

Significance. If the results hold, the work supplies the first direct, converging dissociation of statistical preemption from entrenchment within LLMs, offering computational evidence for a core Construction Grammar mechanism. Notable strengths include validation against multiple independent human datasets, explicit non-circular partial correlations, power-law scaling analysis, and a causal intervention design with reverse controls.

major comments (1)

[Experiment 4] Experiment 4 (fine-tuning intervention): The central causal claim rests on the finding that increasing frequency of one competing form shifts acceptability away from the alternative, with reverse-direction controls said to rule out generic frequency confounds. However, the description provides no post-intervention probes (e.g., on unrelated verbs, non-alternating constructions, or lexical frequency effects) to confirm that the manipulation did not produce global changes in frequency sensitivity or token representations; without such checks, the observed preemption shift is not fully diagnostic of the targeted distributional competition mechanism.

minor comments (2)

The abstract states that patterns are 'validated against three independent behavioral datasets,' but the manuscript would benefit from a brief table or section listing the datasets, their sizes, and any selection criteria to allow readers to assess potential stimulus overlap.
Power-law scaling is reported for preemption sensitivity with model size; including the exact model sizes tested, the fitted exponent, and confidence intervals in the main text or a supplementary table would improve reproducibility.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive comment on Experiment 4. We address it directly below and will revise the manuscript accordingly.

Referee: [Experiment 4] Experiment 4 (fine-tuning intervention): The central causal claim rests on the finding that increasing frequency of one competing form shifts acceptability away from the alternative, with reverse-direction controls said to rule out generic frequency confounds. However, the description provides no post-intervention probes (e.g., on unrelated verbs, non-alternating constructions, or lexical frequency effects) to confirm that the manipulation did not produce global changes in frequency sensitivity or token representations; without such checks, the observed preemption shift is not fully diagnostic of the targeted distributional competition mechanism.

Authors: We agree that post-intervention probes would further strengthen the diagnosticity of the causal claim. The reverse-direction controls already demonstrate directional specificity: increasing the frequency of one form decreases acceptability of its competitor (and vice versa), which would be unlikely under a generic increase in frequency sensitivity. Nevertheless, to address the concern directly, the revised manuscript will include additional analyses evaluating the fine-tuned models on unrelated verb-construction pairs and non-alternating constructions. These controls will verify that the intervention does not produce broad changes in frequency sensitivity or token representations outside the targeted preemption pairs.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

The paper grounds its central claim in external human acceptability judgments (r=0.79 across three independent datasets), non-circular partial correlations that explicitly separate competing-form frequency from overall verb frequency, power-law scaling with model size, and a controlled fine-tuning intervention with reverse-direction controls. No derivation step reduces by construction to its inputs, no self-citation is load-bearing for the uniqueness of the mechanism, and no ansatz or renaming is smuggled in. The design is self-contained against external benchmarks and explicitly rules out the main confounds it addresses.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper is an empirical computational study; it relies on standard domain assumptions rather than new mathematical derivations or postulated entities.

axioms (2)

Domain assumption: LLM surprisal serves as a valid proxy for human acceptability judgments.
Invoked to validate model patterns against three behavioral datasets.
Domain assumption: Competing-form frequency can be isolated from overall verb frequency as the driver of preemption.
Central to the partial-correlation and fine-tuning claims.

pith-pipeline@v0.9.0 · 5757 in / 1307 out tokens · 29315 ms · 2026-05-25T05:31:18.645597+00:00 · methodology

Reference graph

Works this paper leans on

34 extracted references · 34 canonical work pages · 1 internal anchor

[1]

Sci., 41(5):1202–1241

Grammaticality, acceptability, and probability: A probabilistic view of linguistic knowledge. Cogn. Sci., 41(5):1202–1241. Beth Levin. 1993. English verb classes and alternations: a preliminary investigation. University of Chicago Press. Roger Levy. 2008. Expectation-based syntactic comprehension. Cognition, 106(3):1126–1177. Bai Li, Zining Zhu, Guillaume ...

work page arXiv 1993
[2]

Assessing the ability of LSTMs to learn syntax-sensitive dependencies. Trans. Assoc. Comput. Linguistics, 4:521–535. Kyle Mahowald, Anna A. Ivanova, Idan Asher Blank, Nancy Kanwisher, Joshua B. Tenenbaum, and Evelina Fedorenko. 2023. Dissociating language and thought in large language models: a cognitive perspective. arXiv preprint, arXiv.2301.06627. Reb...

work page arXiv 2023
[3]

A systematic framework for generating novel experimental hypotheses from language models

Strong prediction: Language model surprisal explains multiple N400 effects. Neurobiology of Language, 5(1):107–135. Kanishka Misra and Najoung Kim. 2024. Generating novel experimental hypotheses from language models: A case study on cross-dative generalization. arXiv preprint, arXiv./2408.05086. Kanishka Misra and Kyle Mahowald. 2024. Language models l...

work page internal anchor Pith review Pith/arXiv arXiv 2024
[4]

Cognition, 118(3):306–338

The learnability of abstract syntactic principles. Cognition, 118(3):306–338. Steven Pinker. 1989. Learnability and Cognition: The Acquisition of Argument Structure. Learning, Development, and Conceptual Change. MIT Press, Cambridge, MA. Anna Samara, Elizabeth Wonnacott, Gaurav Saxena, Ramya Maitreyee, Judit Fazekas, and Ben Ambridge

work page 1989
[5]

Rylan Schaeffer, Brando Miranda, and Sanmi Koyejo

Learners restrict their linguistic generalizations using preemption but not entrenchment: Evidence from artificial-language-learning studies with adults and children. Psychological Review, 132(1):1–17. Rylan Schaeffer, Brando Miranda, and Sanmi Koyejo

work page
[6]

Wesley Scivetti and Nathan Schneider

Are emergent abilities of large language models a mirage? In Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, 2023. Wesley Scivetti and Nathan Schneider. 2025. Construction identification and disambiguation using BERT: A case st...

work page arXiv 2023
[7]

Ethan Gotlieb Wilcox, Tiago Pimentel, Clara Meister, Ryan Cotterell, and Roger P

Construction grammar provides unique insight into neural language models. arXiv preprint, arXiv.2302.02178. Ethan Gotlieb Wilcox, Tiago Pimentel, Clara Meister, Ryan Cotterell, and Roger P. Levy. 2023. Testing the predictions of surprisal theory in 11 languages. Trans. Assoc. Comput. Linguistics, 11:1451–1470. Isabell Winkler, Madlen Glauer, Tilmann Bets...

work page arXiv 2023
[8]

/ *She donated the museum the paintings

She donated the paintings to the museum. / *She donated the museum the paintings

work page
[9]

/ *The professor donated the university his collection

The professor donated his collection to the university. / *The professor donated the university his collection

work page
[10]

/ *My neighbor donated the shelter her old clothes

My neighbor donated her old clothes to the shelter. / *My neighbor donated the shelter her old clothes

work page
[11]

/ *The company donated the school computers

The company donated computers to the school. / *The company donated the school computers

work page
[12]

/ *His family donated the foundation their savings

His family donated their savings to the foundation. / *His family donated the foundation their savings. Example for give (None, dative):

work page
[13]

/ She gave the teacher the flowers

She gave the flowers to the teacher. / She gave the teacher the flowers

work page
[14]

/ The professor gave the student his notes

The professor gave his notes to the student. / The professor gave the student his notes

work page
[15]

/ My neighbor gave the friend her keys

My neighbor gave her keys to the friend. / My neighbor gave the friend her keys

work page
[16]

/ The company gave the employee a bonus

The company gave a bonus to the employee. / The company gave the employee a bonus

work page
[17]

/ His family gave the charity the money

His family gave the money to the charity. / His family gave the charity the money. C Full Results for All Models. D Regression Diagnostics. The mixed-effects model from Experiment 2 includes random intercepts and random slopes for PREEMPT by model identity. D.1 Robustness: Low-Collinearity Subset. Re-estimating with only verbs where |r(PREEMPT,ENTRENCH)|<0....

work page
[18]

($r = 0.77$ with DAIS for LLaMA-2 7B) and critical-region surprisal ($r = 0.75$). D.3 Additional Circularity Controls. We report three supplementary controls addressing corpus-model circularity. Control 1: Raw frequency baseline. Replacing PREEMPT(v) with raw co-occurrence $f(v, Cx_{\mathrm{conv}})$ yields $R^2 = 0.41$ (vs. $R^2 = 0.68$; $\Delta\mathrm{AIC} = 34.2$, $p < .001$). Control 2: N-gram...

work page 1993
[19]

Sentence selection. All sentences containing a lemmatized form of each target verb are extracted using spaCy’s en_core_web_trf model

work page
[20]

Dependency parsing and pattern matching. Each candidate sentence is parsed; construction-specific templates (below) are applied to assign one of: conv, unconv, or reject (ambiguous/non-matching)

work page
[21]

container-locatives)

Three-layer filtering. (a) POS-tag agreement check: matrix verb must carry verbal POS; (b) dependency-pattern strict match; (c) whitelist of construction-defining preposition lemmas (e.g., to/for for prepositional datives; onto/into vs. with for content- vs. container-locatives)

work page
[22]

She donated the books to the library,

Aggregation. Per-verb counts are summed across the corpus to produce $f(v, Cx_{\mathrm{conv}})$ and $f(v, Cx_{\mathrm{unconv}})$. G.2 Construction-Specific Templates. Dative. Prepositional dative (PD): V + dobj (theme) + prep [to/for] + pobj (recipient/beneficiary). Double-object ... Table fragment (columns S, W, N under each of Dative, Causative, Locative): GPT-2 124M: Dative 1.53, 0.74, 0.29; Causative 1.32, 0.63, 0.25; Locative 1.01, 0.51, 0.20. GPT-2 3...

work page
[23]

Boilerplate filtering. Sentences from Common Crawl boilerplate (cookie notices, navigation text, repeated headers) were detected via Dolma’s quality filter and removed before parsing

work page
[24]

Length filtering. Sentences shorter than 4 tokens or longer than 60 tokens were excluded (the former risk fragmented parses; the latter risk long-distance dependencies that the parser handles poorly)
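
The length filter described here is simple enough to sketch directly. The version below assumes whitespace tokenization purely for illustration (the pipeline described in these excerpts uses spaCy), and the function name `keep_by_length` is hypothetical.

```python
def keep_by_length(tokens, min_len=4, max_len=60):
    """Keep a sentence only if its token count lies in [min_len, max_len]."""
    return min_len <= len(tokens) <= max_len

# Whitespace tokenization is an assumption made for this sketch only.
sentences = [
    "Yes .".split(),                                   # too short: 2 tokens
    "She donated the books to the library .".split(),  # kept: 8 tokens
    ("word " * 61).split(),                            # too long: 61 tokens
]
kept = [s for s in sentences if keep_by_length(s)]
print(len(kept))
```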

work page
[25]

POS consistency. Sentences in which the target verb’s tag conflicted with the lemma’s expected tag (e.g., drive tagged as NN rather than VB) were rejected

work page
[26]

constructions

Parser-confidence threshold. For each candidate construction match, we required the relevant dependency edge to have a parser confidence (as estimated by ensembling 5 parses with stochastic dropout) above 0.75. Low-confidence matches were rejected. The combined effect of these filters is to reduce the candidate sentence pool by approximately 30–45%,...

work page 2020
[27]

provides ready-made data for German; WALS (Dryer and Haspelmath, 2013) and Grambank (Skirgård et al., 2023) features can guide language selection. N.4 Prioritized Language Sample. Based on typological diversity and resource availability, we recommend initial testing on: Turkish (agglutinative), Mandarin (isolating), German (fusional, V2), Finnish (aggl...

work page 2013
[28]

This yields r = 0.77 with DAIS (vs

Log-odds: $\mathrm{PREEMPT}_{\log}(v) = \log \frac{f(v, Cx_{\mathrm{conv}}) + 1}{f(v, Cx_{\mathrm{unconv}}) + 1}$. This yields $r = 0.77$ with DAIS (vs. $r = 0.79$ for our Laplace-smoothed proportion), indicating that the specific functional form matters little
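
The log-odds variant can be written out as a short sketch. The add-one log-odds follows the formula in the excerpt; the "Laplace-smoothed proportion" is not spelled out on this page, so the form used in `preempt_proportion` is an assumption, as are the function names, the example counts, and the choice of natural logarithm.

```python
import math

def preempt_log_odds(f_conv, f_unconv):
    """Add-one smoothed log-odds of conventional vs. unconventional counts."""
    return math.log((f_conv + 1) / (f_unconv + 1))

def preempt_proportion(f_conv, f_unconv):
    """ASSUMED Laplace-smoothed proportion; the source does not give its exact form."""
    return (f_conv + 1) / (f_conv + f_unconv + 2)

# Hypothetical corpus counts for one verb.
f_conv, f_unconv = 999, 9
print(preempt_log_odds(f_conv, f_unconv))    # equals log(100)
print(preempt_proportion(f_conv, f_unconv))  # close to 1 when the conventional form dominates
```

Both scores rise as the conventional form comes to dominate its competitor, which is consistent with the excerpt's remark that the specific functional form matters little.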

work page
[29]

This yields r = 0.74, slightly lower, suggesting that the ratio formulation (which captures relative competition) is more informative than the simple proportion

Conditional probability: $\mathrm{PREEMPT}_{\mathrm{cond}}(v) = \frac{f(v, Cx_{\mathrm{conv}})}{f(v)}$, omitting the unconventional form entirely. This yields $r = 0.74$, slightly lower, suggesting that the ratio formulation (which captures relative competition) is more informative than the simple proportion. Q Detailed Comparison with Yao et al. (2025). Yao et al. (2025) conducted a controlled-rear...

work page 2025
[30]

did not test whether the observed effects are driven by competing-form frequency (preemption) or overall verb frequency (entrenchment)

Preemption–entrenchment dissociation: Yao et al. did not test whether the observed effects are driven by competing-form frequency (preemption) or overall verb frequency (entrenchment). Our Experiment 2 provides this dissociation

work page
[31]

did not correlate model behavior with human acceptability data

Human behavioral ground truth: Yao et al. did not correlate model behavior with human acceptability data. Our item-level correlations with DAIS, R&G, and T&G provide independent validation

work page
[32]

We address the resulting circularity concern with non-circular partial correlations against human data (§5.4)

Non-circular validation: Both our study and Yao et al.’s involve corpus-model comparisons. We address the resulting circularity concern with non-circular partial correlations against human data (§5.4)

work page
[33]

Reverse-direction control: Our Experiment 4 includes a reverse-direction condition that Yao et al.’s design does not, addressing the tautology concern about frequency manipulation in frequency-sensitive models

work page
[34]

focused exclusively on the dative alternation; we extend to causative and locative constructions

Multi-construction scope: Yao et al. focused exclusively on the dative alternation; we extend to causative and locative constructions

work page
