Large Language Models Only Pass Primary School Exams in Indonesia: A Comprehensive Test on IndoMMLU

Koto, Fajri, Aisyah, Nurul, Li, Haonan, Baldwin, Timothy · 2023 · DOI 10.18653/v1/2023.emnlp-main.760

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Evaluating Non-English Developer Support in Machine Learning for Software Engineering

cs.SE · 2026-05-07 · unverdicted · novelty 7.0

Code LLMs generate substantially worse comments outside English, and no tested automatic metric or LLM judge reliably matches human assessment of those outputs.

citing papers explorer

Showing 1 of 1 citing paper.

Evaluating Non-English Developer Support in Machine Learning for Software Engineering

cs.SE · 2026-05-07 · unverdicted · none · ref 49

Code LLMs generate substantially worse comments outside English, and no tested automatic metric or LLM judge reliably matches human assessment of those outputs.
