XLGoBench: Detecting cross-lingual skill gaps with algorithmic tasks

Preethi Jyothi; Purvam Jain; Suvrat Raju; Vihari Piratla

arxiv: 2605.30788 · v1 · pith:EMYASEAGnew · submitted 2026-05-29 · 💻 cs.CL · cs.AI · cs.LG

keywords: cross-lingual, gaps, models, since, tasks, algorithmic, benchmark, task

We introduce a set of synthetic algorithmic tasks to detect cross-lingual gaps in the abilities of large language models. Our benchmark is commensurate across languages, since it requires models to perform the same underlying task in different languages; scalable, since each task can be generated at varying levels of complexity, allowing it to be adapted to models with different capabilities; quantifiable, since every task admits an objective notion of correctness; and transparent, since tasks are generated from simple templates that can be readily audited for translation errors. Because our benchmark focuses on algorithmic tasks, differential performance is a sufficient -- but not necessary -- indicator of cross-lingual gaps. Nevertheless, we show through extensive experiments that our benchmark exposes persistent cross-lingual gaps in multiple state-of-the-art models.
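The four properties named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the sorting task, the three templates and their translations, and the helper names `make_task`, `is_correct`, and `accuracy_gap` are hypothetical and are not taken from the paper or its benchmark.

```python
import random

# Hypothetical templates (not from the paper). Keeping them this short is
# what makes them easy to audit for translation errors.
TEMPLATES = {
    "en": "Sort the following numbers in ascending order: {nums}",
    "es": "Ordena los siguientes números en orden ascendente: {nums}",
    "fr": "Trie les nombres suivants par ordre croissant : {nums}",
}


def make_task(length, seed):
    """Build one task instance; `length` is the complexity knob (scalable)."""
    rng = random.Random(seed)
    nums = [rng.randint(0, 999) for _ in range(length)]
    listing = ", ".join(str(n) for n in nums)
    # The same underlying instance is rendered in every language (commensurate).
    prompts = {lang: t.format(nums=listing) for lang, t in TEMPLATES.items()}
    return prompts, sorted(nums)


def is_correct(response, answer):
    """Objective check (quantifiable): the response must be the sorted list."""
    try:
        parsed = [int(tok) for tok in response.replace(",", " ").split()]
    except ValueError:
        return False
    return parsed == answer


def accuracy_gap(results, reference="en"):
    """Accuracy of the reference language minus each language's accuracy.

    `results` maps a language code to a non-empty list of booleans.
    """
    acc = {lang: sum(r) / len(r) for lang, r in results.items()}
    return {lang: acc[reference] - a for lang, a in acc.items()}
```

In this sketch, a positive entry returned by `accuracy_gap` would indicate that a model solves the same instances less often in that language than in the reference language, which is the kind of differential performance the abstract treats as a sufficient indicator of a cross-lingual gap.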
