TukaBench: A Culturally Grounded Jailbreak Benchmark for African Languages

2026 · cs.CL · arXiv 2606.01322

1 Pith paper cites this work. Polarity classification is still indexing.



abstract

Safety evaluation of Large Language Models (LLMs) remains heavily English-centric, leaving Low-Resource Languages (LRLs), particularly African ones, critically underexplored. We introduce TukaBench, a jailbreak benchmark for seven African languages that extends JailbreakBench (JBB) beyond direct translation through four settings: human translation of JBB prompts; English adaptation to African contexts followed by human translation; human-curated prompts validated through interactions with GPT-5.2; and code-switched prompts combining English and African languages. Together, these settings isolate the effects of language, cultural grounding, and prompt evasiveness on model safety. Across closed and open models, prompting in African languages reduces refusal relative to English, with culturally adapted prompts leading to the least refusal. The evaluation also surfaces two structural limitations: model comprehension failures and reduced LLM-as-a-judge reliability in LRLs. To capture the first, we introduce a Deflection label alongside Refused and Jailbroken; to assess the second, we validate outputs with human annotations, showing that judge-human agreement drops in lower-resource languages and less commonly supported scripts.
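
The abstract does not say how refusal rates or judge-human agreement are computed. As a minimal sketch only, the snippet below shows one common way to do both over the three outcome labels named above, assuming Cohen's kappa as the agreement measure. The function names and the toy label lists are hypothetical illustrations, not data or code from the paper.

```python
from collections import Counter

# The three outcome labels named in the abstract.
LABELS = ("Refused", "Jailbroken", "Deflection")


def label_rates(labels):
    """Return the share of responses assigned to each outcome label."""
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    return {name: counts[name] / len(labels) for name in LABELS}


def cohen_kappa(judge, human):
    """Chance-corrected agreement between two label sequences.

    Cohen's kappa is an assumption here; the abstract does not name
    the agreement metric the authors use.
    """
    if len(judge) != len(human) or not judge:
        raise ValueError("label lists must be non-empty and equal in length")
    n = len(judge)
    observed = sum(j == h for j, h in zip(judge, human)) / n
    judge_counts, human_counts = Counter(judge), Counter(human)
    categories = set(judge) | set(human)
    expected = sum(judge_counts[c] * human_counts[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy labels for six responses in one language (not real results).
judge = ["Refused", "Refused", "Jailbroken", "Deflection", "Jailbroken", "Refused"]
human = ["Refused", "Jailbroken", "Jailbroken", "Deflection", "Jailbroken", "Deflection"]

rates = label_rates(judge)
kappa = cohen_kappa(judge, human)
print(rates)
print(round(kappa, 2))
```

Computing kappa separately for each language and script would make a drop in judge reliability visible as a lower score, which is the kind of comparison the abstract describes.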

representative citing papers

Challenges and Recommendations for LLMs-as-a-Judge in Multilingual Settings and Low-Resource Languages

cs.CL · 2026-07-02 · unverdicted · novelty 5.0

Meta-analysis of 33 ACL papers shows inconsistent LLM-as-a-Judge results, overtrust, and single-model reliance in multilingual/low-resource settings, with recommendations for better practice.

