Not-So-Strange Love: Language Models and Generative Linguistic Theories are More Compatible than They Appear
Pith reviewed 2026-05-12 02:10 UTC · model grok-4.3
The pith
Language models can instantiate formal generative linguistic theories in addition to usage-based ones.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
LMs can also instantiate theories based on formal structures - the types of theories seen in the generative tradition. This argument expands the space of theories that can be tested with LMs, potentially enabling reconciliations between usage-based and generative accounts.
What carries the argument
The capacity of language models to embody formal generative structures through their learned representations and behaviors.
If this is right
- This expands the space of theories that can be tested with LMs.
- It potentially enables reconciliations between usage-based and generative accounts.
- LMs can serve as a testing ground for formal linguistic theories.
Where Pith is reading between the lines
- Researchers could look for specific formal properties in trained language models to support this view.
- The approach might encourage more integrated models that draw from both theoretical traditions.
- It implies that linguistic theory testing can be more inclusive of computational methods.
Load-bearing premise
The observed success and behavior of LMs can be interpreted as instantiating formal generative theories without additional evidence or specific mechanisms.
What would settle it
A study showing that language models fail to bear out any prediction distinctive to generative theories, that is, any prediction that goes beyond what usage-based models already predict.
Original abstract
Futrell and Mahowald (2025) frame the success of neural language models (LMs) as supporting gradient, usage-based linguistic theories. I argue that LMs can also instantiate theories based on formal structures - the types of theories seen in the generative tradition. This argument expands the space of theories that can be tested with LMs, potentially enabling reconciliations between usage-based and generative accounts.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript argues that neural language models (LMs) can instantiate formal generative linguistic theories (e.g., those based on hierarchical structures and rules from the generative tradition), in addition to supporting the gradient, usage-based theories emphasized by Futrell and Mahowald (2025). This compatibility is said to expand the space of theories that can be tested with LMs and potentially enable reconciliations between usage-based and generative accounts.
Significance. If developed with concrete mechanisms, this perspective could meaningfully broaden how LM behaviors are interpreted in linguistics, allowing formal generative hypotheses to be tested empirically via model performance and training dynamics. It would challenge the current framing of LM success as exclusively favoring usage-based theories and open integrative research avenues.
major comments (1)
- Abstract: The central claim that LMs 'can also instantiate theories based on formal structures' is asserted without any derivation, mapping from LM components (e.g., attention heads or embeddings) to generative theory elements (e.g., phrase structure or transformations), specific examples, or cited empirical results. This absence is load-bearing, as the manuscript supplies no evidence or mechanism to show instantiation, leaving the argument without grounding.
Simulated Author's Rebuttal
We thank the referee for their constructive feedback on our manuscript. We address the major comment below and indicate the revisions we will make to better ground the central claim.
- Referee: Abstract: The central claim that LMs 'can also instantiate theories based on formal structures' is asserted without any derivation, mapping from LM components (e.g., attention heads or embeddings) to generative theory elements (e.g., phrase structure or transformations), specific examples, or cited empirical results. This absence is load-bearing, as the manuscript supplies no evidence or mechanism to show instantiation, leaving the argument without grounding.
Authors: We acknowledge that the abstract presents the claim at a high level of generality without explicit mappings, derivations, or concrete examples. The manuscript is a concise position paper whose core argument is that LMs are in principle capable of instantiating formal generative structures because their training objectives and architectures enable them to acquire hierarchical and rule-like representations of language. We agree this requires more explicit grounding to be persuasive. In revision we will expand the abstract with a brief clause indicating the basis for the claim (LMs' demonstrated capacity to encode syntactic hierarchies) and add a short section to the main text that sketches potential correspondences, such as how self-attention layers can implement operations akin to Merge or movement, while citing relevant probing studies that link LM internals to generative constructs.
Revision: yes
Circularity Check
No significant circularity
The paper consists of a short conceptual argument in its abstract: it cites Futrell and Mahowald (2025) for the usage-based framing and then states that LMs can also instantiate formal generative theories, thereby expanding the space of testable theories. No equations, parameters, predictions, or derivations are present. The single external citation is not a self-citation and does not serve as a load-bearing premise that reduces to the paper's own inputs. The central claim is an interpretive assertion rather than a chain that collapses by definition or construction to its starting assumptions.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Neural language models can instantiate linguistic theories based on formal structures.
Reference graph
Works this paper leans on
- [1] Boleda, G. (2025). LLMs as a synthesis between symbolic and distributed approaches to language. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 9365–9379, Suzhou, China. Association for Computational Linguistics.
- [2] Bybee, J. L. and Hopper, P. J. (2001). Frequency and the emergence of linguistic structure. John Benjamins Publishing Company.
- [3] Chomsky, N. (1993). A minimalist program for linguistic theory. In The View from Building 20, pages 1–52. MIT Press.
- [4] Futrell, R. and Mahowald, K. (2025). How linguistics learned to stop worrying and love the language models. Behavioral and Brain Sciences, pages 1–98.
- [5]
- [6] Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. W.H. Freeman.
- [7] McCoy, R. T., Grant, E., Smolensky, P., Griffiths, T. L., and Linzen, T. (2020). Universal linguistic inductive biases via meta-learning. Proceedings of the 42nd Annual Conference of the Cognitive Science Society, pages 737–743.
- [8] McCoy, R. T. and Griffiths, T. L. (2025). Modeling rapid language learning by distilling Bayesian priors into artificial neural networks. Nature Communications, 16(1):4676.
- [9] Prince, A. and Smolensky, P. (1993/2004). Optimality theory: Constraint interaction in generative grammar. Wiley.
- [10] Smolensky, P. and Legendre, G. (2006). The Harmonic Mind: From Neural Computation to Optimality-Theoretic Grammar. MIT Press.
- [11] Smolensky, P., McCoy, R., Fernandez, R., Goldrick, M., and Gao, J. (2022). Neurocompositional computing: From the central paradox of cognition to a new generation of AI systems. AI Magazine, 43(3):308–322.
- [12] Yedetore, A., Linzen, T., Frank, R., and McCoy, R. T. (2023). How poor is the stimulus? Evaluating hierarchical generalization in neural networks trained on child-directed speech. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 9370–9393, Toronto, Canada. Association for Computation...