Evaluation across 1.1 million instances shows sycophancy rates spike in low-resource languages, remain topic-agnostic, and correlate with tokenizer fertility.
Systematic inequalities in language technology performance across the world’s languages
5 Pith papers cite this work.
fields
cs.CL 5
representative citing papers
A framework combining TOPPing source selection with the VACAI-Bowl dual-branch model yields a 54.62% average improvement in dependency parsing across 10 low-resource varieties.
A Bayesian framework decomposes mLLM variance, showing that language features explain 79-92% of language identity variance and that the relative dominance of model identity versus benchmark-model interactions differs between understanding and reasoning tasks.
Translating unsafe inputs to low-resource languages jailbreaks GPT-4 at rates on par with or exceeding state-of-the-art attacks.
N-gram draft models give larger and more consistent speed-ups for multilingual speculative decoding than fine-tuned neural drafts, despite lower acceptance rates, across translation and story generation.
citing papers explorer
- Low-Resource Languages Jailbreak GPT-4