Gender bias and factual gender knowledge are severely entangled in language model circuits and neurons, making neuron ablation an unreliable method for debiasing.
Computational Linguistics , pages=
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.CL 2years
2026 2representative citing papers
Injecting 1% synthetic data targeting specific constructions during pre-training of GPT-2 Small boosts performance on 8 of 9 weakest BLiMP paradigms (e.g., only_npi_scope from 20.9% to 69.4%), while aggregate performance holds or improves, with one resistant case.
citing papers explorer
-
Heterogeneity in Formal Linguistic Competence of Language Models: Is Data the Real Bottleneck?
Injecting 1% synthetic data targeting specific constructions during pre-training of GPT-2 Small boosts performance on 8 of 9 weakest BLiMP paradigms (e.g., only_npi_scope from 20.9% to 69.4%), while aggregate performance holds or improves, with one resistant case.