AfroBench: How Good are Large Language Models on African Languages?
2 Pith papers cite this work.
Fields: cs.CL (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers
- DEPART: DEcomposing PARiTy across Multilingual LLMs
  A Bayesian framework decomposes mLLM variance, showing language features explain 79-92% of language identity variance and that model identity vs. benchmark-model interactions dominate differently for understanding versus reasoning tasks.
- Sample-Size Scaling of the African Languages NLI Evaluation
  Scaling NLI performance with sample size in African languages is language-dependent and frequently non-monotonic, with saturation or declines observed in some cases.