Assessing BERT's Syntactic Abilities

Yoav Goldberg

arxiv: 1901.05287 · v1 · pith:TLW3VC5Qnew · submitted 2019-01-16 · 💻 cs.CL

classification 💻 cs.CL

keywords agreement, BERT, stimuli, subject-verb, model, phenomena, syntactic, words

I assess the extent to which the recently introduced BERT model captures English syntactic phenomena, using (1) naturally occurring subject-verb agreement stimuli; (2) "colorless green ideas" subject-verb agreement stimuli, in which content words in natural sentences are randomly replaced with words sharing the same part of speech and inflection; and (3) manually crafted stimuli for subject-verb agreement and reflexive anaphora phenomena. The BERT model performs remarkably well in all cases.
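
As a rough illustration of the "colorless green ideas" construction in (2), the sketch below swaps each content word for a random word carrying the same part-of-speech and inflection tag. The toy lexicon, the Penn Treebank-style tags, the example sentence, and the function name `colorless_variant` are all assumptions made for illustration; none of them come from the paper, whose stimuli are built from natural sentences.

```python
import random

# Hypothetical toy lexicon keyed by a combined POS + inflection tag.
# These word lists are illustrative only and are not from the paper.
LEXICON = {
    "NNS": ["ideas", "rivers", "engines", "poems"],
    "NN": ["idea", "river", "engine", "poem"],
    "JJ": ["green", "quiet", "narrow", "hollow"],
    "VBP": ["sleep", "argue", "float", "wander"],
    "VBZ": ["sleeps", "argues", "floats", "wanders"],
    "RB": ["furiously", "slowly", "oddly", "rarely"],
}


def colorless_variant(tagged, rng):
    """Replace each content word with a random word that has the same tag.

    Words whose tag is absent from LEXICON (function words here) are kept,
    so the syntactic skeleton and the number marking on nouns and verbs
    are preserved while the lexical content is scrambled.
    """
    out = []
    for word, tag in tagged:
        candidates = LEXICON.get(tag)
        out.append(rng.choice(candidates) if candidates else word)
    return out


# Assumed example: a plural subject, an intervening singular noun, a plural verb.
sentence = [
    ("the", "DT"), ("dogs", "NNS"), ("near", "IN"), ("the", "DT"),
    ("old", "JJ"), ("barn", "NN"), ("bark", "VBP"), ("loudly", "RB"),
]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
variant = colorless_variant(sentence, rng)
print(" ".join(variant))
```

Because every substitute shares its tag with the word it replaces, the subject and verb still agree in number, but the sentence no longer offers semantic or collocational cues that a model could use in place of syntax.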

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Collocational bootstrapping: A hypothesis about the learning of subject-verb agreement in humans and neural networks
cs.CL 2026-05 unverdicted novelty 6.0

Collocational bootstrapping via co-occurrence regularities enables neural networks to learn subject-verb agreement robustly when input variability matches child-directed speech, indicating it as a viable acquisition strategy.

PIQA: Reasoning about Physical Commonsense in Natural Language
cs.CL 2019-11 accept novelty 6.0

PIQA is a new benchmark showing that current AI models achieve 77% on physical commonsense questions versus humans at 95%.

Linguistic Productivity in Large Language Models: Models Coerce, but do not Preempt
cs.CL 2026-06 unverdicted novelty 5.0

Larger LLMs reproduce constructional productivity via entrenchment in coercion cases with nonce words but fail to use statistical preemption to avoid overgeneralizing semantically plausible but unobserved patterns.

Trait-space Monitoring for Emergent Misalignment During Supervised Finetuning
cs.LG 2026-05 unverdicted novelty 5.0

Trait-space drift monitoring detects emergent misalignment checkpoints in 7-9B LLMs with 2.2% FNR, 2.9% FPR and 0.99 AUROC, outperforming PCA and SAE baselines.