Pre-pretraining on MP-STRUCT matches k-Shuffle Dyck baselines in efficiency while adding human-like resistance to implausible languages and challenges the need for C-RASP definability in effective PPT languages.
Colorless Green Recurrent Networks Dream Hierarchically
3 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CL 3representative citing papers
Varying the number of simultaneous parses in RNNGs increases predicted garden-path effects but does not fully reconcile LM surprisal with human reading times.
Different types of syntactic agreement recruit overlapping units within LLMs, indicating that agreement forms a meaningful functional category across English, Russian, Chinese, and structurally similar languages.
citing papers explorer
-
Language Acquisition Device in Large Language Models
Pre-pretraining on MP-STRUCT matches k-Shuffle Dyck baselines in efficiency while adding human-like resistance to implausible languages and challenges the need for C-RASP definability in effective PPT languages.
-
Why are language models less surprised than humans? Testing the Parse Multiplicity Mismatch Hypothesis
Varying the number of simultaneous parses in RNNGs increases predicted garden-path effects but does not fully reconcile LM surprisal with human reading times.
-
Different types of syntactic agreement recruit the same units within large language models
Different types of syntactic agreement recruit overlapping units within LLMs, indicating that agreement forms a meaningful functional category across English, Russian, Chinese, and structurally similar languages.