Assessing BERT's Syntactic Abilities
I assess the extent to which the recently introduced BERT model captures English syntactic phenomena, using (1) naturally-occurring subject-verb agreement stimuli; (2) "colorless green ideas" subject-verb agreement stimuli, in which content words in natural sentences are randomly replaced with words sharing the same part-of-speech and inflection; and (3) manually crafted stimuli for subject-verb agreement and reflexive anaphora phenomena. The BERT model performs remarkably well on all cases.
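The substitution behind stimulus type (2) can be illustrated with a minimal sketch. It assumes a sentence that is already tagged with part-of-speech labels encoding inflection, and it uses a toy lexicon; the names `LEXICON` and `colorless_variant`, the tag set, and the word lists are hypothetical and are not taken from the paper.

```python
import random

# Hypothetical toy lexicon keyed by tags that encode part of speech and
# inflection. The tags and words are illustrative assumptions only.
LEXICON = {
    "NNS": ["ideas", "tables", "rivers", "doctors"],
    "NN": ["idea", "table", "river", "doctor"],
    "JJ": ["green", "loud", "narrow", "ancient"],
    "VBP": ["sleep", "argue", "melt", "travel"],
    "VBZ": ["sleeps", "argues", "melts", "travels"],
}


def colorless_variant(tagged_sentence, rng):
    """Replace each content word with a random word carrying the same tag.

    Words whose tag is absent from LEXICON (function words here) are kept
    unchanged, so the syntactic skeleton and number marking are preserved.
    """
    out = []
    for word, tag in tagged_sentence:
        candidates = LEXICON.get(tag)
        out.append(rng.choice(candidates) if candidates else word)
    return out


sentence = [
    ("the", "DT"), ("doctors", "NNS"), ("near", "IN"),
    ("the", "DT"), ("river", "NN"), ("travel", "VBP"),
]
variant = colorless_variant(sentence, random.Random(0))
print(" ".join(variant))
```

Because every replacement shares the tag of the word it replaces, the plural subject still requires a plural verb, while the sentence as a whole is likely to be semantically odd. That is what makes such stimuli a test of syntax rather than of lexical plausibility.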
Forward citations
Cited by 4 Pith papers
- Collocational bootstrapping: A hypothesis about the learning of subject-verb agreement in humans and neural networks
  Collocational bootstrapping via co-occurrence regularities enables neural networks to learn subject-verb agreement robustly when input variability matches child-directed speech, indicating it as a viable acquisition strategy.
- PIQA: Reasoning about Physical Commonsense in Natural Language
  PIQA is a new benchmark showing that current AI models achieve 77% on physical commonsense questions versus humans at 95%.
- Linguistic Productivity in Large Language Models: Models Coerce, but do not Preempt
  Larger LLMs reproduce constructional productivity via entrenchment in coercion cases with nonce words but fail to use statistical preemption to avoid overgeneralizing semantically plausible but unobserved patterns.
- Trait-space Monitoring for Emergent Misalignment During Supervised Finetuning
  Trait-space drift monitoring detects emergent misalignment checkpoints in 7-9B LLMs with 2.2% FNR, 2.9% FPR and 0.99 AUROC, outperforming PCA and SAE baselines.