Transformers on synthetic grammar acquire abstract global statistical knowledge first, then local dependencies, showing initial over-generalizations that are later constrained.
Neural reality of argument structure constructions
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.CL 2years
2026 2verdicts
UNVERDICTED 2representative citing papers
Larger LLMs reproduce constructional productivity via entrenchment in coercion cases with nonce words but fail to use statistical preemption to avoid overgeneralizing semantically plausible but unobserved patterns.
citing papers explorer
-
Developmental approach reveals the statistical learning of Neural Language Models: Transformers generalize from the most abstract statistical patterns
Transformers on synthetic grammar acquire abstract global statistical knowledge first, then local dependencies, showing initial over-generalizations that are later constrained.
-
Linguistic Productivity in Large Language Models: Models Coerce, but do not Preempt
Larger LLMs reproduce constructional productivity via entrenchment in coercion cases with nonce words but fail to use statistical preemption to avoid overgeneralizing semantically plausible but unobserved patterns.