arxiv: 2605.21403 · v1 · pith:U3LBSM43new · submitted 2026-05-20 · 💻 cs.CL

Quantifying the cross-linguistic effects of syncretism on agreement attraction

Pith reviewed 2026-05-21 04:27 UTC · model grok-4.3

classification 💻 cs.CL
keywords agreement attraction · syncretism · large language models · surprisal · attention entropy · cross-linguistic variation · sentence processing

The pith

Surprisal and attention entropy from large language models capture how syncretism modulates agreement attraction across languages.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to explain a cross-linguistic pattern in which morphological syncretism increases agreement attraction errors in English, German, and Russian but shows no such effect in Turkish. It does so by treating surprisal and attention entropy extracted from LLMs as stand-ins for the processing load that produces these errors in humans. The measures match the observed behavioral data in English and German, reproduce the Turkish absence of modulation, and track Russian results only in part. A reader would care because the work supplies a uniform computational test for why the same morphological feature produces different processing outcomes in different languages.

Core claim

LLM-derived measures replicate behavioral findings in English and German (syncretism modulates attraction), align with Turkish null results (no modulation), and partially capture Russian patterns.

What carries the argument

Surprisal and attention entropy from large language models, extracted as proxies for human processing difficulty during agreement attraction.
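
For readers unfamiliar with the two measures, here is a minimal sketch under textbook definitions. It assumes surprisal is the negative log probability of a token given its context and attention entropy is the Shannon entropy of a single attention distribution. This summary does not state the paper's exact formulation, so the function names and toy numbers below are illustrative assumptions rather than the authors' implementation.

```python
import math

def surprisal_bits(prob):
    """Surprisal (in bits) of a token whose conditional probability is `prob`."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("prob must be in (0, 1]")
    return -math.log2(prob)

def attention_entropy_bits(weights):
    """Shannon entropy (in bits) of one attention distribution over context tokens."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    entropy = 0.0
    for w in weights:
        p = w / total  # renormalise in case the weights do not sum exactly to 1
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy

# Toy attention distributions, not taken from the paper.
focused = [0.97, 0.01, 0.01, 0.01]  # attention concentrated on one token
diffuse = [0.25, 0.25, 0.25, 0.25]  # attention spread evenly over four tokens

print(surprisal_bits(0.5))              # 1.0
print(attention_entropy_bits(diffuse))  # 2.0
print(attention_entropy_bits(focused) < attention_entropy_bits(diffuse))  # True
```

Under this reading, higher surprisal at the verb and more diffuse attention would both be taken as signs of greater processing difficulty.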

If this is right

  • The same LLM measures can be applied to additional languages to predict whether syncretism will modulate attraction.
  • Differences in how LLMs encode syncretic forms offer a candidate mechanism for the observed cross-linguistic variation.
  • Computational proxies of this kind can reduce reliance on new behavioral experiments when testing morphological influences on agreement.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The partial match in Russian suggests that additional LLM features or finer-grained morphological annotations may be needed to fully align with human data.
  • Similar proxy measures could be tested on other morphosyntactic error types, such as case or gender attraction, to see whether the approach generalizes.

Load-bearing premise

That surprisal and attention entropy extracted from large language models are valid and sufficient proxies for human processing difficulty underlying agreement attraction errors.

What would settle it

Finding no reliable correlation between the LLM surprisal or attention-entropy values and human agreement-attraction error rates in a language where behavioral data already exist.
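
One way such a check could look is a plain Pearson correlation between a per-condition LLM measure and the matching human error rate. The sketch below is an assumption about how the comparison might be set up, not the paper's analysis; the variable names and all numbers are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-condition values, invented for illustration only.
llm_surprisal_effect = [0.8, 0.1, 0.5, 0.05]  # difference in bits
human_error_rate = [0.14, 0.04, 0.09, 0.03]   # proportion of attraction errors

r = pearson_r(llm_surprisal_effect, human_error_rate)
print(f"r = {r:.2f}")
```

A correlation near zero on real data, in a language with established behavioral results, would be the kind of outcome described above.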

Figures

Figures reproduced from arXiv: 2605.21403 by Eva Neu, Utku Turk.

Figure 1: German model-derived surprisal and attention entropy measures (means with SE bars).
Figure 2: English model-derived surprisal and attention entropy measures (means with SE bars).
Figure 3: Russian model-derived surprisal and attention entropy measures (means with SE bars).
Figure 4: Turkish model-derived surprisal and attention entropy measures (means with SE bars).

Original abstract

Agreement attraction errors, in which a verb erroneously agrees with an intervening noun rather than its grammatical head, are amplified by morphological syncretism in some languages (English, German, Russian) but not others (Turkish, Armenian), a cross-linguistic pattern without a principled account. We use surprisal and attention entropy from large language models as processing proxies to investigate this variation across four languages. LLM-derived measures replicate behavioral findings in English and German (syncretism modulates attraction), align with Turkish null results (no modulation), and partially capture Russian patterns. We discuss further directions for better understanding why syncretism affects agreement attraction differently across languages.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript investigates cross-linguistic variation in agreement attraction errors modulated by morphological syncretism, using surprisal and attention entropy extracted from large language models as processing proxies. It claims that these LLM-derived measures replicate the behavioral modulation effect in English and German, align with the null effect in Turkish, and partially capture patterns in Russian, offering a computational account for why syncretism affects attraction differently across languages.

Significance. If the proxies prove robust, the work provides a scalable, parameter-free method to model cross-linguistic processing phenomena that lack a unified theoretical explanation. The use of off-the-shelf LLMs without fitting to target data is a methodological strength that avoids circularity. This could bridge computational and psycholinguistic approaches to agreement, though its impact hinges on ruling out confounds from model training data.

major comments (2)
  1. [Methods] Methods section: The manuscript does not report per-language perplexity on the experimental items, token coverage of critical syncretic forms, or results from a balanced multilingual model. Without these, the observed pattern (modulation in EN/DE, null in TR, partial in RU) cannot be distinguished from an artifact of training data imbalance, which is load-bearing for the central claim that the proxies track the linguistic variable.
  2. [Results] Results section: The claim of replication and partial alignment lacks reported effect sizes, statistical tests, or controls for the LLM measures; the abstract and summary provide no quantitative details on how surprisal/attention entropy differences were compared to behavioral data, undermining assessment of the cross-linguistic alignment.
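
The two diagnostics requested in the first major comment can be sketched as follows. The sketch assumes perplexity is computed as 2 raised to the mean per-token surprisal in bits, and reads "token coverage" as the share of critical forms a tokenizer keeps as a single token; that reading, the toy tokenizer, and the word list are assumptions for illustration, not details from the manuscript.

```python
def perplexity(token_surprisals_bits):
    """Perplexity of a token sequence from its per-token surprisals in bits."""
    if not token_surprisals_bits:
        raise ValueError("need at least one token")
    mean_bits = sum(token_surprisals_bits) / len(token_surprisals_bits)
    return 2.0 ** mean_bits

def single_token_coverage(forms, tokenize):
    """Fraction of forms that `tokenize` maps to exactly one token."""
    if not forms:
        raise ValueError("need at least one form")
    whole = sum(1 for form in forms if len(tokenize(form)) == 1)
    return whole / len(forms)

# Hypothetical vocabulary and tokenizer: known words stay whole,
# unknown words fall back to single characters.
TOY_VOCAB = {"cats", "ev"}

def toy_tokenize(word):
    return [word] if word in TOY_VOCAB else list(word)

critical_forms = ["cats", "Frauen", "evler", "ev"]  # illustrative items only

print(perplexity([1.0, 3.0]))                               # 4.0
print(single_token_coverage(critical_forms, toy_tokenize))  # 0.5
```

Large gaps between languages on either number would support the referee's concern that the cross-linguistic pattern reflects training data imbalance.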
minor comments (2)
  1. [Abstract] Abstract: The phrase 'partially capture Russian patterns' is vague and should be expanded to specify which aspects align or diverge from the behavioral findings.
  2. [Methods] Notation: Clarify whether attention entropy is computed over the full attention matrix or specific heads/layers when comparing across languages.
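
The second minor comment matters because the aggregation choice can change the result. The toy sketch below compares two possible aggregations (both are assumptions, since the summary does not say which the paper uses): averaging the entropies of individual heads, and taking the entropy of the head-averaged attention distribution.

```python
import math

def entropy_bits(dist):
    """Shannon entropy (in bits) of a probability distribution."""
    entropy = 0.0
    for p in dist:
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy

def mean_of_head_entropies(heads):
    """Compute entropy per head, then average the entropies."""
    return sum(entropy_bits(h) for h in heads) / len(heads)

def entropy_of_mean_attention(heads):
    """Average the attention distributions across heads, then compute entropy."""
    n_heads = len(heads)
    averaged = [sum(col) / n_heads for col in zip(*heads)]
    return entropy_bits(averaged)

# Two hypothetical heads, each fully focused on a different context token.
heads = [
    [1.0, 0.0],
    [0.0, 1.0],
]

print(mean_of_head_entropies(heads))     # 0.0
print(entropy_of_mean_attention(heads))  # 1.0
```

Each head here is maximally certain, yet the pooled distribution looks maximally uncertain, so cross-language comparisons are only interpretable once the aggregation is specified.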

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which help clarify the robustness of our computational approach to modeling cross-linguistic agreement attraction. We have revised the manuscript to incorporate additional reporting and quantitative details as requested, strengthening the distinction between linguistic effects and potential model artifacts.

Point-by-point responses
  1. Referee: [Methods] Methods section: The manuscript does not report per-language perplexity on the experimental items, token coverage of critical syncretic forms, or results from a balanced multilingual model. Without these, the observed pattern (modulation in EN/DE, null in TR, partial in RU) cannot be distinguished from an artifact of training data imbalance, which is load-bearing for the central claim that the proxies track the linguistic variable.

    Authors: We agree that these details are important for ruling out training data confounds. In the revised Methods section, we now include per-language perplexity scores computed on the experimental items, token coverage statistics for the critical syncretic forms across languages, and comparative results from a balanced multilingual model (XLM-R). These additions confirm that the modulation patterns in English and German, the null effect in Turkish, and the partial alignment in Russian are consistent with the syncretism variable rather than data imbalance.

    Revision: yes

  2. Referee: [Results] Results section: The claim of replication and partial alignment lacks reported effect sizes, statistical tests, or controls for the LLM measures; the abstract and summary provide no quantitative details on how surprisal/attention entropy differences were compared to behavioral data, undermining assessment of the cross-linguistic alignment.

    Authors: We acknowledge that more quantitative rigor is needed to evaluate the strength of the alignments. The revised Results section now reports effect sizes for the surprisal and attention entropy differences, includes statistical tests (e.g., mixed-effects models) comparing LLM proxies to behavioral data, and adds controls for baseline measures. The abstract and summary have been updated with quantitative details on the replication in English/German, alignment with the Turkish null result, and partial match in Russian.

    Revision: yes
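
As a rough illustration of what "syncretism modulates attraction" means quantitatively, the descriptive contrast can be written as a difference of differences. The sketch assumes a 2x2 design crossing attractor match/mismatch with syncretic/non-syncretic forms, which this summary does not spell out; the condition names and surprisal values are hypothetical. A mixed-effects analysis of the kind the rebuttal mentions would need per-item data and a statistics package and is not attempted here.

```python
from statistics import mean

def attraction_effect(match_values, mismatch_values):
    """Mean measure with a matching attractor minus mean with a mismatching one."""
    return mean(match_values) - mean(mismatch_values)

def syncretism_modulation(syncretic, non_syncretic):
    """Difference of differences: attraction effect (syncretic) minus (non-syncretic)."""
    return (attraction_effect(syncretic["match"], syncretic["mismatch"])
            - attraction_effect(non_syncretic["match"], non_syncretic["mismatch"]))

# Hypothetical verb surprisals in bits, invented for illustration only.
syncretic = {"match": [10.0, 12.0], "mismatch": [8.0, 9.0]}
non_syncretic = {"match": [10.0, 12.0], "mismatch": [10.0, 11.0]}

print(attraction_effect(syncretic["match"], syncretic["mismatch"]))          # 2.5
print(attraction_effect(non_syncretic["match"], non_syncretic["mismatch"]))  # 0.5
print(syncretism_modulation(syncretic, non_syncretic))                       # 2.0
```

A modulation near zero would correspond to the null pattern reported for Turkish, while a clearly non-zero value would correspond to the English and German pattern.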

Circularity Check

0 steps flagged

No circularity: LLM proxies applied off-the-shelf to replicate external behavioral patterns

The paper extracts surprisal and attention entropy from pre-trained LLMs as fixed processing proxies and compares the resulting patterns to independently collected behavioral data on agreement attraction in English, German, Turkish, and Russian. No parameters are fitted to the target agreement data, no self-citation chain justifies the core measures, and the replication does not reduce to the input findings by construction. The derivation therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract states no explicit free parameters, axioms, or invented entities; the work rests on the unstated premise that LLM internal states map to human processing costs.
