Rethinking the Idiomaticity Decomposability Hypothesis: Evidence from Distributional Learning

Aline Villavicencio; Atsuki Yamaguchi; Felix Gers; Golzar Atefi; Maggie Mi; Nafise Sadat Moosavi

arXiv: 2606.03817 · v1 · pith:5VAQBZXPnew · submitted 2026-06-02 · 💻 cs.CL

Maggie Mi, Golzar Atefi, Atsuki Yamaguchi, Felix Gers, Aline Villavicencio, Nafise Sadat Moosavi

Pith reviewed 2026-06-28 10:02 UTC · model grok-4.3

keywords idioms · decomposability · distributional learning · syntactic flexibility · language models · pretraining · surprisal · frequency

The pith

Contextualised language models show idiom decomposability correlates weakly with human judgments and negatively with syntactic flexibility.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests the traditional view that idiom decomposability predicts syntactic flexibility by treating contextualised language models as controlled distributional learners from text. It introduces a model-internal decomposability measure that only weakly matches human ratings and instead shows a small negative relationship with how flexibly idioms appear in syntax. Tracking representations across pretraining reveals that idiom stabilisation depends on surprisal, decomposability, and frequency together, not frequency in isolation, with decomposability exerting the strongest effect that grows with training. A sympathetic reader would care because the results shift emphasis toward usage-based accounts that stress cumulative distributional experience over fixed semantic properties of the idiom itself.

Core claim

Using contextualised language models as controlled distributional learners, a model-internal measure of idiom decomposability correlates weakly with human judgments and shows a small but consistent negative relationship with syntactic flexibility. Pretraining analyses show that stabilisation of idiom representations in models is not explained by frequency alone. Instead, surprisal, decomposability, and frequency all contribute, with decomposability showing the strongest training-dependent effect.

What carries the argument

Model-internal measure of decomposability derived from contextualised language models during pretraining.

If this is right

Syntactic flexibility of idioms cannot be attributed primarily to decomposability.
Stabilisation of idiom representations during pretraining depends jointly on surprisal, decomposability, and frequency.
Decomposability exerts its largest influence on representation stability as training data volume increases.
Human judgments of decomposability align only weakly with what models extract from text distributions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Usage-based accounts that foreground predictability and cumulative exposure may offer a stronger account of idiom behaviour than decomposability alone.
Model-derived measures could be tested as proxies for tracking human learning trajectories in controlled psycholinguistic experiments.
The same pretraining analysis approach could be extended to other classes of multiword expressions to check whether decomposability effects generalise.

Load-bearing premise

That contextualised language models function as valid controlled distributional learners whose internal measures of decomposability and learning dynamics can be directly compared to human idiom processing.

What would settle it

A replication in which the same model-internal decomposability measure produced a strong positive correlation with human judgments and a positive relationship with syntactic flexibility would falsify the reported weak and negative relationships.
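
A check of this kind comes down to the sign and size of a correlation. The sketch below is a minimal illustration, not the paper's analysis: it assumes Spearman's rank correlation as the statistic (the abstract does not name one), and the lists `model_score`, `human_rating`, and `flexibility` hold hypothetical placeholder values, not data from the paper.

```python
from statistics import mean


def rank(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(rank(x), rank(y))


if __name__ == "__main__":
    # Hypothetical per-idiom values, for illustration only.
    model_score = [0.62, 0.48, 0.71, 0.55, 0.40]
    human_rating = [3.1, 4.0, 2.5, 3.6, 4.4]
    flexibility = [0.20, 0.35, 0.10, 0.30, 0.25]
    print("rho(model, human):", spearman(model_score, human_rating))
    print("rho(model, flexibility):", spearman(model_score, flexibility))
```

Under the falsification condition stated above, both values would need to come out clearly positive on the real data. A significance test and confidence interval would still be needed, and this sketch does not provide them.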

Figures

Figures reproduced from arXiv: 2606.03817 by Aline Villavicencio, Atsuki Yamaguchi, Felix Gers, Golzar Atefi, Maggie Mi, Nafise Sadat Moosavi.

**Figure 2.** Representation similarity over pretraining for OLMo-2 7B (top row) and OLMo-3 7B (bottom row), mea…

**Figure 3.** Correlation results for BERT-base Uncased

**Figure 4.** Correlation results for BERT-base Cased

**Figure 5.** Correlation results for BERT-large Uncased

**Figure 6.** Correlation results for BERT-large Cased

**Figure 7.** Correlation results for ModernBERT Base

**Figure 8.** Correlation results for ModernBERT Large

Original abstract

Idioms can be analysed in terms of their decomposability, the extent to which constituent meanings contribute to the figurative whole. Decomposability is thought to predict syntactic flexibility. Usage-based accounts instead attribute idiom behaviour to distributional experience, such as speaker familiarity and predictability. We examine these views using contextualised language models as controlled distributional learners. We propose a model-internal measure of decomposability and relate it to human ratings, syntactic flexibility, and predictability while tracking idiom learning during pretraining. Model-derived decomposability correlates weakly with human judgments and shows a small but consistent negative relationship with syntactic flexibility. Pretraining analyses show that stabilisation of idiom representations in models is not explained by frequency alone. Instead, surprisal, decomposability, and frequency all contribute, with decomposability showing the strongest training-dependent effect.
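
Surprisal, the predictability measure named in the abstract, has a standard information-theoretic definition: the negative log probability of a token given its context. The sketch below shows that definition only. How the paper aggregates surprisal over an idiom is not stated in the abstract, so the per-token averaging in `mean_idiom_surprisal` is an assumption, and both function names are hypothetical.

```python
import math


def surprisal_bits(probability):
    """Surprisal in bits, -log2(p): less probable tokens score higher."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(probability)


def mean_idiom_surprisal(token_probs):
    """Average surprisal over the tokens of one idiom occurrence.

    token_probs holds the model's probability for each idiom token in
    context. Averaging is an assumption made for this sketch.
    """
    if not token_probs:
        raise ValueError("token_probs must not be empty")
    return sum(surprisal_bits(p) for p in token_probs) / len(token_probs)
```

A token the model assigns probability 0.5 carries one bit of surprisal, and a token it is certain of carries zero, so highly predictable idioms sit at the low end of this scale.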

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper reports weak model-human decomposability correlation plus pretraining effects from multiple factors, but the model proxy for human judgments is the shaky part.

The main result here is that a model-derived decomposability score lines up only weakly with human ratings and shows a small negative tie to syntactic flexibility. Pretraining runs indicate that idiom representation stabilization depends on surprisal, decomposability, and frequency together, with decomposability having the largest training-dependent role rather than frequency alone.

What the work does is apply standard LM probing techniques to an old linguistic contrast between decomposability accounts and usage-based ones. Tracking how representations change across pretraining steps is a reasonable way to bring distributional evidence into the picture, and the claim that multiple factors matter is consistent with the data they describe.

The soft spot is the assumption that the internal model measure captures the same decomposability humans are rating. The weak correlation already hints that the two may not align closely, which undercuts using the model score to argue against decomposability-centric views. If the score is picking up embedding composition quirks or training-objective artifacts instead, then the negative flexibility link and the pretraining rankings could be model-specific rather than general evidence. The abstract gives no details on how the score is built, what statistical controls are used, or how data exclusions were handled, so it is hard to judge whether the reported patterns hold up.

This is aimed at computational linguists and psycholinguists who already work with LMs on figurative language. It deserves a serious referee if the methods turn out to be clearly specified and reproducible, because it supplies concrete numbers on a theoretical question even though the central mapping from model to human remains open to doubt.

Referee Report

3 major / 2 minor

Summary. The paper claims that contextualized language models serve as controlled distributional learners to test the idiomaticity decomposability hypothesis. It introduces a model-internal decomposability measure that correlates only weakly with human judgments and shows a small negative relationship with syntactic flexibility. Pretraining trajectory analyses indicate that stabilization of idiom representations is not explained by frequency alone; instead surprisal, decomposability, and frequency all contribute, with decomposability exhibiting the strongest training-dependent effect.

Significance. If the mapping from model measure to human decomposability holds, the work supplies computational evidence favoring multi-factor usage-based accounts of idiom processing over purely decomposability-based accounts. The pretraining dynamics analysis is a clear strength, as it tracks representational change over training steps rather than relying on static snapshots.

major comments (3)

§3 (Methods): The model-internal decomposability measure is defined from contextualized representations, yet the manuscript supplies no explicit formula, layer selection, or distance metric. Without this, it is impossible to determine whether the measure isolates constituent contributions in a manner comparable to human decomposability ratings, which is load-bearing for the central claim that the weak correlation challenges the decomposability hypothesis.
§5 (Pretraining analyses): The assertion that decomposability shows the strongest training-dependent effect is based on unspecified regression models. No coefficients, interaction terms, or model-comparison statistics are reported for the joint contributions of surprisal, decomposability, and frequency, so the relative strength of the decomposability effect cannot be verified.
§4 (Correlation and flexibility results): The reported negative relationship between model decomposability and syntactic flexibility is presented as evidence against decomposability accounts, but the weak correlation with human ratings (implied to be in the r < 0.3 range) raises the possibility that the dissociation is an artifact of the model's embedding geometry rather than a general property of idioms.
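
To make the §5 request concrete, one generic form a joint model with training-step interactions could take is written out below. This is an illustrative sketch, not the paper's specification: every symbol is an assumption introduced here, including the per-idiom random intercept.

```latex
% Assumed notation (not from the paper):
%   y_{it}  representation stability of idiom i at training step t
%   s_i, d_i, f_i  surprisal, decomposability, and (log) frequency of idiom i
%   u_i     random intercept for idiom i;  \varepsilon_{it}  residual
\begin{aligned}
y_{it} ={}& \beta_0 + \beta_1 s_i + \beta_2 d_i + \beta_3 f_i + \beta_4 t \\
          & + \beta_5 (s_i \times t) + \beta_6 (d_i \times t) + \beta_7 (f_i \times t)
            + u_i + \varepsilon_{it}
\end{aligned}
```

In a model of this shape, the claim that decomposability has the strongest training-dependent effect would correspond to the interaction coefficient \beta_6 being larger in magnitude than \beta_5 and \beta_7 once the predictors are standardised. That comparison is what the requested coefficient tables would let a reader check.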

minor comments (2)

Abstract: Mentions model choice, measure construction, and statistical controls only at a high level; adding one sentence on each would improve readability.
Notation (throughout): The decomposability score would benefit from an explicit equation or pseudocode block to aid reproducibility.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive review. We address each major comment below, indicating where revisions will be made to improve clarity and verifiability while defending the core claims on the basis of the existing analyses.

Referee: §3 (Methods): The model-internal decomposability measure is defined from contextualized representations, yet the manuscript supplies no explicit formula, layer selection, or distance metric. Without this, it is impossible to determine whether the measure isolates constituent contributions in a manner comparable to human decomposability ratings, which is load-bearing for the central claim that the weak correlation challenges the decomposability hypothesis.

Authors: We agree that the Methods section would benefit from greater explicitness. The decomposability measure is operationalized as the cosine similarity between the contextualized idiom embedding and the element-wise sum of its constituent embeddings (extracted from the same context), using the final transformer layer. In the revised manuscript we will insert the precise mathematical definition, confirm layer selection, and state the distance metric to permit direct replication and comparison with human ratings.

revision: yes

Referee: §5 (Pretraining analyses): The assertion that decomposability shows the strongest training-dependent effect is based on unspecified regression models. No coefficients, interaction terms, or model-comparison statistics are reported for the joint contributions of surprisal, decomposability, and frequency, so the relative strength of the decomposability effect cannot be verified.

Authors: The pretraining analyses employ linear mixed-effects models with representation stability as the outcome and main effects plus interactions of surprisal, decomposability, and frequency with training step as predictors. We will add the full model equations, coefficient tables (with standard errors and t-values), and model-comparison statistics (AIC, BIC, and likelihood-ratio tests) in the revised §5 so that the relative magnitude of the decomposability-by-step interaction can be directly evaluated.

revision: yes

Referee: §4 (Correlation and flexibility results): The reported negative relationship between model decomposability and syntactic flexibility is presented as evidence against decomposability accounts, but the weak correlation with human ratings (implied to be in the r < 0.3 range) raises the possibility that the dissociation is an artifact of the model's embedding geometry rather than a general property of idioms.

Authors: The weak correlation with human ratings is itself a substantive result indicating that distributional decomposability diverges from introspective judgments. The negative association with syntactic flexibility survives controls for frequency and is observed across multiple model families. We will expand the discussion to explicitly consider embedding-geometry artifacts and will add a supplementary analysis contrasting contextual versus static embeddings; however, we maintain that the pattern is not reducible to geometry alone given the training-dynamic evidence.

revision: partial
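
The measure described in the first simulated response (cosine similarity between the idiom embedding and the element-wise sum of its constituent embeddings) can be sketched in a few lines. This follows the simulated rebuttal's wording, which may not match the paper's actual definition. The function names are hypothetical, and the sketch assumes the embeddings have already been extracted as plain lists of floats; how the idiom span is pooled into a single vector is not specified on this page.

```python
import math


def elementwise_sum(vectors):
    """Sum a list of equal-length vectors component by component."""
    return [sum(components) for components in zip(*vectors)]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v, in [-1, 1]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def decomposability_score(idiom_vec, constituent_vecs):
    """Similarity between the idiom vector and the summed constituent vectors.

    idiom_vec: one embedding for the whole idiom (pooling is an assumption).
    constituent_vecs: one embedding per constituent word, same context.
    """
    return cosine_similarity(idiom_vec, elementwise_sum(constituent_vecs))
```

A score near 1 means the idiom vector points the same way as the sum of its parts, which this operationalisation reads as high decomposability. Because cosine ignores vector length, the score is sensitive to the anisotropy of the embedding space, which is one form the embedding-geometry concern in the §4 comment could take.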

Circularity Check

0 steps flagged

No circularity: empirical correlations from proposed model measure

The paper proposes a model-internal decomposability measure and reports its empirical correlations with human ratings, syntactic flexibility, and pretraining dynamics. No equations, fitted parameters, or derivations are shown that reduce any prediction to the same inputs by construction. The central results are observational comparisons rather than self-definitional or self-citation load-bearing steps. The derivation chain is self-contained against external human judgments and does not invoke uniqueness theorems or ansatzes from prior self-work.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are described in the abstract. The model-internal decomposability measure is introduced but its construction details are absent.
