Proceedings of the AAAI Conference on Artificial Intelligence

Adrian De Wynter, Ishaan Watts, Tua Wongsangaroonsri, Minghui Zhang, Noura Farra, Nektar Ege Altıntoprak, Lena Baur, Samantha Claudet, Pavel Gajdušek, Qilong Gu, Anna Kaminska, Tomasz Kaminski, Ruby Kuo, Akiko Kyuba, Jongho Lee, Kartik Math · 2025 · DOI 10.1609/aaai.v39i27.35011

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Multilingual Refusal Alignment for Safer Large Language Models

cs.CL · 2026-04-24 · conditional · novelty 5.0

English-only safety alignment fails to transfer cross-lingually, while multilingual DPO training on the new RefusEU dataset improves safety across 12 European languages without degrading Global MMLU performance.

A Survey of Toxicity Detection and Mitigation Strategies for Multilingual Language Models

cs.CL · 2026-06-24 · unverdicted · novelty 1.0

A survey that catalogs threat models, detection approaches, and mitigation strategies for toxicity in multilingual LLMs while identifying challenges such as uneven language coverage and culturally variable harm definitions.

citing papers explorer

Multilingual Refusal Alignment for Safer Large Language Models

cs.CL · 2026-04-24 · conditional · none · ref 47

English-only safety alignment fails to transfer cross-lingually, while multilingual DPO training on the new RefusEU dataset improves safety across 12 European languages without degrading Global MMLU performance.