Understanding and Mitigating Language Confusion in LLMs
3 Pith papers cite this work.
fields: cs.CL (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
citing papers explorer
- First-Token Broadcasters: Mechanistic Origins of Language Identity and Distributed Robustness in Transformers
  Introduces LIHA ablation to locate first-token broadcaster heads and provides causal evidence that instruction tuning localizes language identity circuits to early layers in transformers.
- Challenges and Recommendations for LLMs-as-a-Judge in Multilingual Settings and Low-Resource Languages
  Meta-analysis of 33 ACL papers shows inconsistent LLM-as-a-Judge results, overtrust, and single-model reliance in multilingual/low-resource settings, with recommendations for better practice.
- When English Isn't the Best Teacher: Source Language Effects in Cross-Lingual In-Context Learning
  Broad empirical evaluation finds that fine-tuning heuristics for source-language choice in cross-lingual transfer do not hold reliably under in-context learning.