
arxiv: 2606.04340 · v1 · pith:DAAZEOW6new · submitted 2026-06-03 · 💻 cs.CL

Noisy memory encoding explains negative polarity illusions

Pith reviewed 2026-06-28 06:57 UTC · model grok-4.3

classification 💻 cs.CL
keywords negative polarity illusions · lossy context surprisal · determiner memory · acceptability judgments · sentence processing · resource-rational comprehension · working memory limitations

The pith

Imperfect memory of determiners allows people to accept ungrammatical negative polarity sentences.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that negative polarity illusions occur because readers encode complex sentences with noisy memory for determiners. This noise lets them mentally swap similar determiners between main and embedded clause subjects, licensing the word 'ever' in otherwise unlicensed positions. Experiments using new determiner pairs such as 'many' and 'few' produced stronger illusion effects than the classic 'no' case, even without time pressure. The results are presented as evidence that language comprehension involves resource-rational reconstruction from lossy input rather than perfect encoding.

Core claim

The lossy context surprisal theory explains negative polarity illusions because people have poor memory representations of determiners in main-clause and embedded-clause subjects and can entertain a determiner exchange that licenses 'ever'. More similar determiners trigger stronger illusion effects, as shown by acceptability judgments where a sentence with 'Many authors that few critics recommended have ever received...' produced a much stronger effect than the canonical version.

What carries the argument

Lossy context surprisal: readers maintain imperfect encodings of the sentence context and rationally reconstruct the most probable licensing configuration from their noisy memory of the determiners.
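
To make the mechanism concrete, here is a minimal Python sketch of reconstruction from noisy determiner memory. It is not the authors' model, nor the Hahn et al. (2022) implementation: the linear link from similarity to exchange probability, the `max_swap` cap, and the two probabilities of 'ever' are all hypothetical placeholders, chosen only to show the direction of the predicted effect.

```python
from math import log2

# All numbers below are hypothetical placeholders, not values from the paper.

def swap_probability(similarity: float, max_swap: float = 0.5) -> float:
    """Assumed linear map from determiner similarity in [0, 1] to the
    probability that the reconstructed context has the determiners exchanged."""
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must lie in [0, 1]")
    return max_swap * similarity

def p_ever(licensed: bool) -> float:
    """Assumed probability of 'ever' given whether the context licenses it."""
    return 0.20 if licensed else 0.001

def lossy_surprisal_of_ever(similarity: float) -> float:
    """Surprisal (in bits) of 'ever', marginalizing over reconstructed contexts."""
    p_swap = swap_probability(similarity)
    # Veridical context: the licenser sits in the embedded clause, so 'ever'
    # is unlicensed. Exchanged context: the licenser sits on the main-clause
    # subject, so 'ever' is licensed.
    p = (1.0 - p_swap) * p_ever(False) + p_swap * p_ever(True)
    return -log2(p)

if __name__ == "__main__":
    for sim in (0.1, 0.5, 0.9):
        print(f"similarity={sim:.1f}  surprisal={lossy_surprisal_of_ever(sim):.2f} bits")
```

Under these assumed numbers, a highly similar determiner pair lowers the surprisal of 'ever' relative to a dissimilar pair, which is the qualitative pattern the account predicts. The magnitudes carry no empirical meaning.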

If this is right

  • Stronger illusions arise when the two determiners are more similar in meaning or form.
  • The illusion persists even when participants are not under time pressure.
  • Acceptability judgments reflect rational reconstruction from noisy memory rather than strict grammatical licensing.
  • The same noisy-encoding process should produce parallel illusions in other constructions that depend on distant licensers.
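
The similarity in the first bullet needs an operational measure. The caption of the paper's Figure 1 mentions an arithmetic mean of cosine similarities across three embeddings; the sketch below shows that computation in plain Python. The vectors are made-up two-dimensional placeholders, and `mean_cosine` and `toy_spaces` are hypothetical names, since this page does not say which embeddings were used.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

def mean_cosine(word_a, word_b, spaces):
    """Arithmetic mean of the two words' cosine similarity across embedding spaces."""
    sims = [cosine(space[word_a], space[word_b]) for space in spaces]
    return sum(sims) / len(sims)

# Placeholder vectors for illustration only; these are not real embeddings.
toy_spaces = [
    {"few": [1.0, 0.0], "many": [1.0, 0.0]},  # same direction
    {"few": [1.0, 0.0], "many": [0.0, 1.0]},  # orthogonal
    {"few": [1.0, 1.0], "many": [1.0, 0.0]},  # 45 degrees apart
]

print(round(mean_cosine("few", "many", toy_spaces), 3))  # 0.569
```

Averaging across several embedding spaces is one way to keep the similarity score from depending on the quirks of a single model; whether that was the authors' motivation is not stated here.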

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The account predicts that increasing memory load on the two subject positions should increase illusion rates.
  • Similar effects may appear in other languages whose negative polarity items are licensed by specific determiners.
  • The findings suggest that standard acceptability tasks may systematically underestimate grammatical knowledge when memory noise is high.

Load-bearing premise

The acceptability differences are produced by similarity-based determiner exchange in memory rather than by uncontrolled properties of the new sentence materials or by the judgment task itself.

What would settle it

A controlled replication in which illusion strength remains equal across similar and dissimilar determiner pairs after matching all other lexical and structural properties of the materials.

Figures

Figures reproduced from arXiv: 2606.04340 by Edward Gibson, Yuhan Zhang.

Figure 1: Arithmetic mean of the cosine similarities across three embeddings.
Figure 2: Acceptability judgment ratings across six experiments.
Figure 3: Descriptive correlation between cosine similarity of determiner pairs and poste...
Original abstract

A sentence like "The authors that no critics recommended have ever received acknowledgment for a best-selling novel" is sometimes rated as acceptable even though, strictly speaking, it is ungrammatical because the negative polarity word "ever" is not licensed where it is. This behavioral effect is sometimes called a "negative polarity illusion". Here we propose that the lossy context surprisal theory of Hahn et al. (2022) -- whereby people have an imperfect encoding of complex sentences -- might explain this effect. We hypothesize that people have poor memory representation of the determiners in the main-clause and embedded-clause subjects and could entertain a determiner exchange that licenses ever. We propose that more similar determiners in those positions would trigger stronger illusion effects. Acceptability judgment tasks with six novel determiner pairs (e.g., "few" and "many", "few" and "most") support our proposal, showing, specifically, that a novel sentence, "Many authors that few critics recommended have ever received acknowledgment for a best-selling novel", triggered a much stronger illusion than the canonical one even without time pressure. These results offer further support for the suggestion that human language processing is imperfect and resource-rational: in face of working memory limitations, humans rationally reconstruct what is most likely from noisy linguistic input to facilitate downstream processing.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims that negative polarity illusions (e.g., acceptability of 'The authors that no critics recommended have ever...') arise because lossy context surprisal produces imperfect memory encodings of determiners in main- and embedded-clause subjects, allowing rational reconstruction via a determiner exchange that licenses the NPI. It tests a derived prediction that more similar determiner pairs produce stronger illusions and reports that acceptability judgments with six novel pairs (including 'many'/'few') show a much stronger illusion for 'Many authors that few critics recommended have ever...' than the canonical case, even without time pressure.

Significance. If the results survive controls for lexical and semantic confounds, the work would extend the lossy context surprisal framework of Hahn et al. (2022) to a new empirical domain and supply additional evidence that human sentence processing is resource-rational under noisy memory. The paper tests a derived prediction rather than refitting parameters to illusion data.

major comments (2)
  1. [Abstract] The claim that acceptability differences are caused by determiner similarity in memory (enabling the exchange mechanism) is load-bearing for the central hypothesis, yet the novel sentence sets are not reported to have been matched on lexical frequency, semantic features, or plausibility norms. Without such controls, or a quantitative similarity metric derived from the Hahn et al. model, alternative explanations for the rating differences cannot be ruled out.
  2. [Abstract] The reported support for the proposal rests on the specific finding that the 'many/few' pair produced a much stronger illusion, but no details are supplied on participant numbers, statistical tests, effect sizes, or any analysis of lexical confounds. This makes it impossible to assess whether the data actually isolate the memory-encoding pathway.
minor comments (1)
  1. [Abstract] The phrasing 'even without time pressure' could be clarified by briefly noting how it relates to prior time-pressure manipulations in the NPI-illusion literature.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful and constructive comments. We address each major comment below.

Point-by-point responses
  1. Referee: [Abstract] The claim that acceptability differences are caused by determiner similarity in memory (enabling the exchange mechanism) is load-bearing for the central hypothesis, yet the novel sentence sets are not reported to have been matched on lexical frequency, semantic features, or plausibility norms. Without such controls, or a quantitative similarity metric derived from the Hahn et al. model, alternative explanations for the rating differences cannot be ruled out.

    Authors: We agree that stimulus controls are critical for isolating the proposed memory-encoding mechanism. The Methods section describes selection of determiner pairs for semantic similarity and use of corpus frequency norms to match lexical frequency, with sentences constructed for comparable naturalness. However, we did not collect new plausibility norms or compute a quantitative similarity metric from the Hahn et al. model. We will revise the manuscript to add a quantitative similarity analysis and report any additional lexical/semantic controls, and we will update the abstract to reference these controls.

    Revision: yes

  2. Referee: [Abstract] The reported support for the proposal rests on the specific finding that the 'many/few' pair produced a much stronger illusion, but no details are supplied on participant numbers, statistical tests, effect sizes, or any analysis of lexical confounds. This makes it impossible to assess whether the data actually isolate the memory-encoding pathway.

    Authors: The abstract is concise and therefore omits methodological specifics. The Experiments and Results sections of the full manuscript report participant numbers, the statistical models (including mixed-effects regressions), effect sizes, and explicit discussion of lexical confounds. We will revise the abstract to include a brief statement summarizing the statistical evidence and controls so that the key claims can be evaluated from the abstract alone.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity; external theory tested via new experiments

Full rationale

The paper adopts the lossy context surprisal theory directly from Hahn et al. (2022) as an external premise and generates a novel hypothesis that determiner similarity in subject positions will modulate illusion strength. This hypothesis is evaluated through new acceptability judgment experiments using six novel determiner pairs rather than by fitting any parameters to the target illusion data. No equations, self-definitions, or load-bearing self-citations appear in the derivation; the central claim remains independent of the present paper's own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The account rests on the imported lossy context surprisal theory and on the untested assumption that determiner confusability is the dominant factor in the new materials.

axioms (1)
  • Domain assumption: the lossy context surprisal theory of Hahn et al. (2022) supplies the memory-encoding mechanism.
    Invoked in the abstract as the explanatory framework without re-derivation.

pith-pipeline@v0.9.1-grok · 5754 in / 1194 out tokens · 13337 ms · 2026-06-28T06:57:14.047716+00:00 · methodology


Reference graph

Works this paper leans on

160 extracted references · 63 canonical work pages · 3 internal anchors

  1. Using Illusions to Track the Emergence of Visual Perception. Annual Review of Vision Science, 2024.
  2. The representation of perceived angular size in human primary visual cortex. Nature Neuroscience, 2006.
  3. The psychology of visual illusion. 2013.
  4. On Measurement and Quantification: The Case of "Most" and "More Than Half". Language, 2016.
  5. Experimental investigations of ambiguity: the case of most. Natural Language Semantics, 2015.
  6. The meaning of ‘most’: Semantics, numerosity and psychology. Mind & Language, 2009.
  7. The semantics of many, much, few, and little. Language and Linguistics Compass, 2018.
  8. On the grammar and processing of proportional quantifiers: most versus more than half. Natural Language Semantics, 2009.
  9. Working memory mechanism in proportional quantifier verification. Journal of Psycholinguistic Research, 2014.
  10. Most intelligent people are accurate and some fast people are intelligent.: Intelligence, working memory, and semantic processing of quantifiers from a computational perspective. Intelligence, 2013.
  11. QR Out of a Tensed Clause: Evidence from Antecedent-Contained Deletion. Ratio, 2015.
  12. Logical form: From GB to minimalism. 1995.
  13. Quantifier scope and syntactic islands. Papers from the... Regional Meeting. Chicago Ling. Soc. Chicago, Ill.
  14. Logical form: Its structure and derivation. 1985.
  15. Semantics in generative grammar.
  16. Integration of visual and linguistic information in spoken language comprehension. Science, 1995.
  17. Achieving incremental semantic interpretation through contextual representation. Cognition, 1999.
  18. Negative polarity and grammatical representation. Linguistics and Philosophy, 1987.
  19. Only, emotive factive verbs, and the dual nature of polarity dependency. Language, 2006.
  20. The atomic components of thought. 1998.
  21. Revisiting the NPI Illusion Effect: Exploring the Influence of Distance and Licensor Types. Talk presented at the 2024 Linguistic Society of America Annual Meeting.
  22. The syntactic domain of anaphora. 1976.
  23. “Few” or “many”? An adaptation level theory account for flexibility in quantifier processing. Frontiers in Psychology, 2020.
  24. The interplay of computational complexity and memory load during quantifier verification. Language, Cognition and Neuroscience, 2024.
  25. Computational complexity explains neural differences in quantifier verification. Cognition, 2022.
  26. Probing the mental representation of quantifiers. Cognition, 2018.
  27. Many quantifiers. Proceedings of ESCOL.
  28. Negative and positive polarity items. Semantics - Sentence and Information Structure.
  29. Investigating a neural language model’s replicability of psycholinguistic experiments: A case study of NPI licensing. Frontiers in Psychology, 2023.
  30. Assessing the role of experimental evidence for interface judgment: Licensing of negative polarity items, scalar readings, and focus. Frontiers in Psychology, 2018.
  31. Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing, 2017.
  32. Package ‘lsmeans’. The American Statistician.
  33. Generating random correlation matrices based on vines and extended onion method. Journal of Multivariate Analysis, 2009.
  34. Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 2013.
  35. brms: An R package for Bayesian multilevel models using Stan. Journal of Statistical Software.
  36. Ordinal regression models in psychology: A tutorial. Advances in Methods and Practices in Psychological Science, 2019.
  37. A noisy-channel approach to depth-charge illusions. Cognition, 2023.
  38. Polarity sensitivity as (non) veridical dependency. 1998.
  39. Polarity Sensitivity as Inherent Scope Relations. 1979.
  40. NPI licensing, Strawson entailment, and context dependency. Journal of Semantics, 1999.
  41. Negation and polarity. The Routledge Handbook of Semantics, 2015.
  42. A resource-rational model of human processing of recursive linguistic structure. Proceedings of the National Academy of Sciences, 2022.
  43. R: A Language and Environment for Statistical Computing. 2026.
  44. Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems.
  45. minicons: Enabling flexible behavioral and representational analyses of transformer language models. arXiv preprint arXiv:2203.13112.
  46. OPT: Open pre-trained transformer language models. arXiv preprint arXiv:2205.01068.
  47. Memory limitations and structural forgetting: The perception of complex ungrammatical sentences as grammatical. Language and Cognitive Processes, 1999.
  48. Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert... Language Models are Few-Shot Learners. 2020. arXiv:2005.14165.
  49. Neural language models as psycholinguistic subjects: Representations of syntactic state. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers).
  50. Attention is all you need. Advances in Neural Information Processing Systems.
  51. Can we enhance working memory? Bias and effectiveness in cognitive training studies. Psychonomic Bulletin & Review, 2024.
  52. Toward a connectionist model of recursion in human linguistic performance. Cognitive Science, 1999.
  53. The (non) necessity of recursion in natural language processing. Proceedings of the Annual Meeting of the Cognitive Science Society.
  54. A usage-based approach to recursion in sentence processing. Language Learning, 2009.
  55. The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behavioral and Brain Sciences, 2001.
  56. Finitary models of language users. Handbook of Mathematical Psychology.
  57. The faculty of language: what is it, who has it, and how did it evolve? Science, 2002.
  58. Aspects of the Theory of Syntax. 1965.
  59. Yair Lakretz, Théo Desbordes, Dieuwke Hupkes, Stanislas Dehaene. Can Transformers Process Recursive Nested Constructions, Like Humans? Proceedings of the 29th International Conference on Computational Linguistics, 2022.
  60. Working memory: Theories, models, and controversies. Annual Review of Psychology, 2012.
  61. The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 1956.
  62. Grammars, parsers, and memory limitations. Language and Cognitive Processes, 1986.
  63. Constraints on multiple center-embedding of clauses. Journal of Linguistics, 2007.
  64. Interference in short-term memory: ... Journal of Psycholinguistic Research, 1996. doi:10.1007/BF01708421.
  65. Active assignment of quantifier scope guides language processing. 2023.
  66. Positive polarity items: an illusion of ungrammaticality. Language, Cognition and Neuroscience, 2025.
  67. Peter C. Gordon, Randall Hendrick, Marcus Johnson. Memory interference during language processing. Journal of Experimental Psychology: Learning, Memory, and Cognition.
  68. Consequences of the... Cognitive Science, 2005.
  69. Linguistic complexity: locality of syntactic dependencies. Cognition, 1998. doi:10.1016/S0010-0277(98)00034-1.
  70. Florencia Reali, Morten H. Christiansen. Processing of relative clauses is made easier by frequency of occurrence. Journal of Memory and Language.
  71. Working... Perspectives on Psychological Science: A Journal of the Association for Psychological Science, 2016. doi:10.1177/1745691616635612.
  72. OpenAI, Aaron Hurst, Adam Lerer, Adam P. Goucher, Adam Perelman, Aditya Ramesh, Aidan Clark, A. J. Ostrow, Akila Welihinda, Alan Hayes, Alec Radford, Aleksander Mądry, Alex Baker-Whitcomb, Alex Beutel, Alex Borzunov, Alex Carney, Alex Chow, Alex Kirillov, Alex Nichol, Alex Paino... GPT-4o System Card.
  73. Thomas Wolf and others. Transformers: State-of-the-Art Natural Language Processing. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 2020.
  74. OpenAI, Sandhini Agarwal, Lama Ahmad, Jason Ai, Sam Altman, Andy Applebaum, Edwin Arbus, Rahul K. Arora, Yu Bai, Bowen Baker, Haiming Bao, Boaz Barak, Ally Bennett, Tyler Bertao, Nivedita Brett, Eugene Brevdo, Greg Brockman, Sebastien Bubeck, Che Chang, Kai Chen, Mark Chen...
  75. Free recall of self-embedded English sentences. Information and Control, 1964. doi:10.1016/S0019-9958(64)90310-9.
  76. Working... Proceedings of the AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/aaai.v38i9.28868.
  77. Eunjin Hong, Sumin Cho, Juae Kim. Exploring... Proceedings of the 14th..., 2025.
  78. Chunhui Zhang, Yiren Jian, Zhongyu Ouyang, Soroush Vosoughi. Working... Proceedings of the 2024..., 2024. doi:10.18653/v1/2024.emnlp-main.938.
  79. Working memory, attention control, and the... Journal of Experimental Psychology: Learning, Memory, and Cognition, 2007. doi:10.1037/0278-7393.33.3.615.
  80. Shunyu Yao, Binghui Peng, Christos Papadimitriou, Karthik Narasimhan. Self-... Proceedings of the 59th... doi:10.18653/v1/2021.acl-long.292.

Showing first 80 references.