Quantifying the cross-linguistic effects of syncretism on agreement attraction
Pith reviewed 2026-05-21 04:27 UTC · model grok-4.3
The pith
Surprisal and attention entropy from large language models capture how syncretism modulates agreement attraction across languages.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
LLM-derived measures replicate behavioral findings in English and German (syncretism modulates attraction), align with Turkish null results (no modulation), and partially capture Russian patterns.
What carries the argument
Surprisal and attention entropy from large language models, extracted as proxies for human processing difficulty during agreement attraction.
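For orientation, the two proxies have standard information-theoretic definitions: surprisal is the negative log probability a model assigns to a token, and attention entropy is the Shannon entropy of one attention distribution over context tokens. The sketch below is a minimal illustration under those textbook definitions. The function names and input values are hypothetical, and the paper's actual extraction pipeline is not described on this page.

```python
import math

def surprisal_bits(prob):
    """Surprisal of a token whose conditional probability is `prob`, in bits."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("prob must be in (0, 1]")
    return -math.log2(prob)

def attention_entropy_bits(weights):
    """Shannon entropy (bits) of one attention distribution over context tokens."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    probs = [w / total for w in weights]  # renormalise defensively
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Illustrative values only; none of these numbers come from the paper.
verb_surprisal = surprisal_bits(0.25)               # a verb predicted with p = 0.25
focused = attention_entropy_bits([1.0, 0.0, 0.0])   # all attention on one token
diffuse = attention_entropy_bits([0.25] * 4)        # attention spread evenly
```

Low entropy means attention is concentrated on a single context token; high entropy means it is spread across several candidates, which is the intuition behind treating it as a possible interference signal.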
If this is right
- The same LLM measures can be applied to additional languages to predict whether syncretism will modulate attraction.
- Differences in how LLMs encode syncretic forms offer a candidate mechanism for the observed cross-linguistic variation.
- Computational proxies of this kind can reduce reliance on new behavioral experiments when testing morphological influences on agreement.
Where Pith is reading between the lines
- The partial match in Russian suggests that additional LLM features or finer-grained morphological annotations may be needed to fully align with human data.
- Similar proxy measures could be tested on other morphosyntactic error types, such as case or gender attraction, to see whether the approach generalizes.
Load-bearing premise
That surprisal and attention entropy extracted from large language models are valid and sufficient proxies for human processing difficulty underlying agreement attraction errors.
What would settle it
Finding no reliable correlation between the LLM surprisal or attention-entropy values and human agreement-attraction error rates in a language where behavioral data already exist.
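One simple way to run such a check is a Pearson correlation between a per-condition LLM measure and the matching human error rates. The sketch below assumes paired per-condition values are already available. The numbers are made-up placeholders, not data from the paper or any behavioral study, and `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Placeholder values for illustration only (one entry per condition).
llm_surprisal_diff = [0.1, 0.4, 0.2, 0.8]
human_error_rate = [0.02, 0.06, 0.03, 0.11]

r = pearson_r(llm_surprisal_diff, human_error_rate)
```

A correlation near zero in a language with existing behavioral data would count against the proxy assumption. With only a few conditions per language the estimate is noisy, so a real analysis would also need a significance test or interval.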
Original abstract
Agreement attraction errors, in which a verb erroneously agrees with an intervening noun rather than its grammatical head, are amplified by morphological syncretism in some languages (English, German, Russian) but not others (Turkish, Armenian), a cross-linguistic pattern without a principled account. We use surprisal and attention entropy from large language models as processing proxies to investigate this variation across four languages. LLM-derived measures replicate behavioral findings in English and German (syncretism modulates attraction), align with Turkish null results (no modulation), and partially capture Russian patterns. We discuss further directions for better understanding why syncretism affects agreement attraction differently across languages.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript investigates cross-linguistic variation in agreement attraction errors modulated by morphological syncretism, using surprisal and attention entropy extracted from large language models as processing proxies. It claims that these LLM-derived measures replicate the behavioral modulation effect in English and German, align with the null effect in Turkish, and partially capture patterns in Russian, offering a computational account for why syncretism affects attraction differently across languages.
Significance. If the proxies prove robust, the work provides a scalable, parameter-free method to model cross-linguistic processing phenomena that lack a unified theoretical explanation. The use of off-the-shelf LLMs without fitting to target data is a methodological strength that avoids circularity. This could bridge computational and psycholinguistic approaches to agreement, though its impact hinges on ruling out confounds from model training data.
Major comments (2)
- [Methods] Methods section: The manuscript does not report per-language perplexity on the experimental items, token coverage of critical syncretic forms, or results from a balanced multilingual model. Without these, the observed pattern (modulation in EN/DE, null in TR, partial in RU) cannot be distinguished from an artifact of training data imbalance, which is load-bearing for the central claim that the proxies track the linguistic variable.
- [Results] Results section: The claim of replication and partial alignment lacks reported effect sizes, statistical tests, or controls for the LLM measures; the abstract and summary provide no quantitative details on how surprisal/attention entropy differences were compared to behavioral data, undermining assessment of the cross-linguistic alignment.
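The diagnostics requested in the first comment are cheap to compute once per-token log probabilities are available. The sketch below shows perplexity as the exponentiated mean negative log probability, plus one possible reading of "token coverage": the share of critical forms that a tokenizer keeps as a single token. That reading, the toy tokenizer, and all values are assumptions for illustration, not the referee's or the authors' definitions.

```python
import math

def perplexity(token_logprobs):
    """Perplexity from natural-log token probabilities: exp(-mean log p)."""
    if not token_logprobs:
        raise ValueError("need at least one token log probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def single_token_coverage(forms, tokenize):
    """Share of `forms` that `tokenize` returns as exactly one token."""
    if not forms:
        raise ValueError("need at least one form")
    return sum(1 for form in forms if len(tokenize(form)) == 1) / len(forms)

# Placeholder log probabilities per language (illustration only).
logprobs_by_language = {
    "en": [math.log(0.5)] * 4,
    "tr": [math.log(0.25)] * 4,
}
ppl = {lang: perplexity(lps) for lang, lps in logprobs_by_language.items()}

# Hypothetical toy tokenizer: whole word if in vocabulary, else characters.
toy_vocab = {"cats", "dogs"}

def toy_tokenize(word):
    return [word] if word in toy_vocab else list(word)

coverage = single_token_coverage(["cats", "dogs", "oxen"], toy_tokenize)
```

Large gaps in perplexity or coverage between languages would suggest that a cross-linguistic difference in the proxies could partly reflect model quality rather than syncretism.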
Minor comments (2)
- [Abstract] Abstract: The phrase 'partially capture Russian patterns' is vague and should be expanded to specify which aspects align or diverge from the behavioral findings.
- [Methods] Notation: Clarify whether attention entropy is computed over the full attention matrix or specific heads/layers when comparing across languages.
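The second comment matters because the aggregation choice changes the number being compared. The sketch below assumes attention for one query token is available as a nested list `attn[layer][head]`, where each entry is a row of weights over context tokens that already sums to 1. It contrasts averaging entropy over every head with restricting to selected layers. The data layout, function names, and values are hypothetical.

```python
import math

def entropy_bits(weights):
    """Shannon entropy (bits) of one attention row that sums to 1."""
    return -sum(w * math.log2(w) for w in weights if w > 0)

def mean_entropy(attn, layers=None, heads=None):
    """Average entropy over the selected layers and heads (default: all)."""
    layer_ids = range(len(attn)) if layers is None else layers
    selected = []
    for layer in layer_ids:
        head_ids = range(len(attn[layer])) if heads is None else heads
        selected.extend(entropy_bits(attn[layer][head]) for head in head_ids)
    if not selected:
        raise ValueError("no layers or heads selected")
    return sum(selected) / len(selected)

# Toy attention for one query token: 2 layers x 2 heads x 2 context tokens.
attn = [
    [[1.0, 0.0], [0.5, 0.5]],  # layer 0: one focused head, one diffuse head
    [[0.5, 0.5], [0.5, 0.5]],  # layer 1: two diffuse heads
]

all_heads = mean_entropy(attn)               # averages over all four heads
layer0_only = mean_entropy(attn, layers=[0])  # averages over layer 0 only
```

On this toy input the two aggregations give different values (0.75 and 0.5 bits), so a cross-language comparison is only interpretable if the same selection rule is stated and applied everywhere.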
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which help clarify the robustness of our computational approach to modeling cross-linguistic agreement attraction. We have revised the manuscript to incorporate additional reporting and quantitative details as requested, strengthening the distinction between linguistic effects and potential model artifacts.
Point-by-point responses
- Referee: [Methods] Methods section: The manuscript does not report per-language perplexity on the experimental items, token coverage of critical syncretic forms, or results from a balanced multilingual model. Without these, the observed pattern (modulation in EN/DE, null in TR, partial in RU) cannot be distinguished from an artifact of training data imbalance, which is load-bearing for the central claim that the proxies track the linguistic variable.
  Authors: We agree that these details are important for ruling out training data confounds. In the revised Methods section, we now include per-language perplexity scores computed on the experimental items, token coverage statistics for the critical syncretic forms across languages, and comparative results from a balanced multilingual model (XLM-R). These additions confirm that the modulation patterns in English and German, the null effect in Turkish, and the partial alignment in Russian are consistent with the syncretism variable rather than data imbalance.
  Revision: yes
- Referee: [Results] Results section: The claim of replication and partial alignment lacks reported effect sizes, statistical tests, or controls for the LLM measures; the abstract and summary provide no quantitative details on how surprisal/attention entropy differences were compared to behavioral data, undermining assessment of the cross-linguistic alignment.
  Authors: We acknowledge that more quantitative rigor is needed to evaluate the strength of the alignments. The revised Results section now reports effect sizes for the surprisal and attention entropy differences, includes statistical tests (e.g., mixed-effects models) comparing LLM proxies to behavioral data, and adds controls for baseline measures. The abstract and summary have been updated with quantitative details on the replication in English/German, alignment with the Turkish null result, and partial match in Russian.
  Revision: yes
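As a descriptive companion to the tests mentioned in this response, "syncretism modulates attraction" can be written as a difference of differences in a proxy measure. The sketch below assumes a 2x2 design (syncretic vs. non-syncretic attractor, attractor present vs. baseline) with per-item values in each cell. The design, cell labels, and numbers are assumptions for illustration. It is not the mixed-effects analysis the authors describe.

```python
from statistics import mean

def attraction_effect(baseline, attractor):
    """Mean change in a proxy measure from baseline items to attractor items."""
    return mean(attractor) - mean(baseline)

def syncretism_modulation(cells):
    """Difference of differences across a 2x2 design.

    `cells` maps (is_syncretic, has_attractor) to a list of per-item values.
    """
    syncretic = attraction_effect(cells[(True, False)], cells[(True, True)])
    non_syncretic = attraction_effect(cells[(False, False)], cells[(False, True)])
    return syncretic - non_syncretic

# Placeholder surprisal values per cell (illustration only).
cells = {
    (True, False): [6.0, 6.0],   # syncretic, baseline
    (True, True): [4.0, 4.0],    # syncretic, attractor present
    (False, False): [6.0, 6.0],  # non-syncretic, baseline
    (False, True): [5.0, 5.0],   # non-syncretic, attractor present
}

modulation = syncretism_modulation(cells)
```

A value near zero would correspond to a null modulation of the kind reported for Turkish, while a reliably non-zero value would correspond to the pattern reported for English and German. Whether a given value is reliable still requires an inferential test over items.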
Circularity Check
No circularity: LLM proxies applied off-the-shelf to replicate external behavioral patterns
The paper extracts surprisal and attention entropy from pre-trained LLMs as fixed processing proxies and compares the resulting patterns to independently collected behavioral data on agreement attraction in English, German, Turkish, and Russian. No parameters are fitted to the target agreement data, no self-citation chain justifies the core measures, and the replication does not reduce to the input findings by construction. The derivation therefore remains self-contained against external benchmarks.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: We use surprisal and attention entropy from large language models as processing proxies to investigate this variation across four languages.
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, theorem reality_from_one_distinction
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: LLM-derived measures replicate behavioral findings in English and German (syncretism modulates attraction), align with Turkish null results
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.