Salamandra technical report
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CL (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
A new 30B open LLM trained with curriculum learning and upsampling outperforms other multilingual models on European languages, especially low-resource ones, with up to 10x fewer linguistic errors in human evaluations.
citing papers explorer
- XL-SafetyBench: A Country-Grounded Cross-Cultural Benchmark for LLM Safety and Cultural Sensitivity
  XL-SafetyBench is a new cross-cultural benchmark showing that frontier LLMs decouple jailbreak robustness from cultural sensitivity, while local models trade off attack success against neutral-safe rates in a near-linear pattern that indicates generation failure rather than alignment.
- TildeOpen LLM: Leveraging Curriculum Learning to Achieve Equitable Language Representation
  A new 30B open LLM trained with curriculum learning and upsampling outperforms other multilingual models on European languages, especially low-resource ones, with up to 10x fewer linguistic errors in human evaluations.