Cross-Lingual Consensus: Aligning Multilingual Cultural Knowledge via Multilingual Self-Consistency

Andrew Ivan Soegeng; Patrick Sutanto; Tan Sang Nguyen

arxiv: 2605.22137 · v2 · pith:BH7NXFJInew · submitted 2026-05-21 · 💻 cs.CL


Pith reviewed 2026-05-22 06:32 UTC · model grok-4.3

classification 💻 cs.CL

keywords multilingual LLMs · cultural alignment · self-consistency · cross-lingual transfer · self-supervised learning · language bias · cultural knowledge


The pith

LLMs hold cultural knowledge in local languages that can be surfaced and transferred to English using self-consistency across their own multilingual outputs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that large language models already store rich cultural knowledge inside their non-English language representations, yet this knowledge stays hidden when the same model is prompted in English. To surface and align it, the authors introduce a self-supervised loop that generates answers in multiple languages, selects the responses showing the strongest agreement across those languages, and then applies a self-critique step to rewrite the English version accordingly. Because the entire process uses only the model’s own generations, no external data or human labels are required. If the method works, it offers a practical route to reduce Western-centric bias in everyday English queries while preserving the model’s existing capabilities. The reported result is a 5.03 percent average gain on the BLEnD cultural benchmark for English prompts.

Core claim

By identifying the most reliable cultural responses through multilingual self-consistency and transferring them via a self-critique mechanism, the approach surfaces latent cultural knowledge already present in local-language representations and propagates it into weaker English outputs, yielding improved cultural alignment on the BLEnD benchmark.

What carries the argument

Multilingual self-consistency is the mechanism that carries the argument: it selects the cultural responses showing the highest agreement when the same query is posed in several languages, then uses self-critique to move that consensus into the English answer.
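The selection step can be illustrated with a minimal sketch. The page does not describe the paper's actual consistency metric, so this assumes the simplest possible stand-in: answers from each prompt language have already been mapped into one comparison language, and agreement is a plain majority vote over normalized strings. The names `normalize`, `select_consensus`, `needs_rewrite`, and the `threshold` parameter are hypothetical and not taken from the paper.

```python
from collections import Counter


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivially different strings match."""
    return " ".join(answer.lower().split())


def select_consensus(answers_by_lang: dict[str, str]) -> tuple[str, float]:
    """Return the most frequent normalized answer and its agreement rate.

    Assumption: every answer has already been mapped into one comparison
    language, so string equality is a usable proxy for agreement.
    """
    if not answers_by_lang:
        raise ValueError("need at least one answer")
    counts = Counter(normalize(a) for a in answers_by_lang.values())
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(answers_by_lang)


def needs_rewrite(english_answer: str, consensus: str,
                  agreement: float, threshold: float = 0.5) -> bool:
    """Flag the English answer for self-critique only when the consensus is
    strong enough (hypothetical threshold) and the English answer differs."""
    return agreement >= threshold and normalize(english_answer) != consensus


if __name__ == "__main__":
    # Toy answers keyed by prompt language; not data from the paper.
    toy = {"lang_a": "Rice", "lang_b": "rice", "lang_c": "noodles"}
    consensus, agreement = select_consensus(toy)
    print(consensus, agreement, needs_rewrite("bread", consensus, agreement))
```

A majority vote is only one possible reading of "strongest agreement"; an embedding-similarity or model-judged score would slot into `select_consensus` without changing the surrounding flow.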

If this is right

Cultural performance gaps between English and other languages shrink without any additional training data.
Models become more consistent across languages on topics where cultural framing differs.
Self-generated data alone suffices to reduce language-induced cultural bias.
The same alignment process can be repeated iteratively to further refine outputs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same consistency loop might reduce factual or reasoning biases that appear only in certain languages.
Future training runs could incorporate this internal alignment step instead of relying on curated multilingual datasets.
If the knowledge is truly latent rather than absent, scaling the method to more languages could produce broader equity gains.

Load-bearing premise

Large language models already contain rich, accurate cultural knowledge inside their local-language representations even when that knowledge is not retrieved by English prompts.

What would settle it

Apply the same self-consistency procedure to a model whose training data has been deliberately scrubbed of specific cultural facts in every language; if English performance on the corresponding BLEnD items shows no gain or declines, the premise that the knowledge is already embedded would be falsified.

Figures

Figures reproduced from arXiv: 2605.22137 by Andrew Ivan Soegeng, Patrick Sutanto, Tan Sang Nguyen.

Original abstract

Although Large Language Models (LLMs) demonstrate strong capabilities across various tasks, they exhibit significant performance discrepancies across languages. While prompting LLMs in English typically yields the highest general performance, it often induces a Western-centric bias, hindering the model's ability to accurately reflect diverse cultural knowledge. We hypothesize that LLMs already possess rich cultural knowledge embedded within local-language representations, but fail to retrieve it when prompted in English. To bridge this cross-lingual knowledge gap, we propose a novel self-supervised framework. Our method leverages multilingual self-consistency to identify the most reliable cultural responses across languages, combined with a self-critique mechanism to transfer this knowledge to the weaker language. Evaluations on the BLEnD benchmark demonstrate that our approach significantly improves cultural alignment, boosting performance on English queries by an average of 5.03%, while relying entirely on self-generated data. Ultimately, our work demonstrates that latent cultural knowledge can be successfully surfaced and propagated across languages, enabling more culturally equitable and consistent LLMs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper proposes a self-supervised framework called Cross-Lingual Consensus that uses multilingual self-consistency across languages to identify reliable cultural responses, combined with a self-critique mechanism to transfer this knowledge to English prompts in LLMs. It evaluates the approach on the BLEnD benchmark and reports an average 5.03% performance improvement on English queries, relying entirely on self-generated data to address Western-centric bias and cross-lingual knowledge gaps.

Significance. If the result holds and the consistency step is shown to recover accurate cultural knowledge rather than shared biases, the work would be significant for improving cultural alignment in multilingual LLMs without external labels or fine-tuning. It offers a practical, self-supervised path to mitigate language-specific performance discrepancies and could inform future efforts on equitable model behavior across languages.

major comments (3)

[Abstract] Abstract: the reported 5.03% average gain on English queries in BLEnD supplies no error bars, no baseline details, no description of the consistency metric, and no controls for prompt length or language-specific capabilities; this makes the central empirical claim difficult to assess.
[Abstract] Abstract and method description: the hypothesis that LLMs possess rich cultural knowledge in local-language representations but fail to retrieve it in English is load-bearing for the transfer step, yet no analysis is provided showing that multilingual self-consistency correlates with cultural accuracy rather than with correlated training-data biases across languages (e.g., Western-centric framing appearing consistently).
[Evaluation] Evaluation section: without details on how the most reliable response is selected via self-consistency or whether the gain persists after accounting for model priors, it remains unclear if the improvement is independent of the consistency metric itself.
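The first major comment asks for error bars on the reported average gain. One common way to produce them, sketched here under assumptions, is a paired percentile bootstrap over benchmark items. The per-item 0/1 correctness lists, the function name `bootstrap_gain_ci`, and its defaults are hypothetical; the page does not say how the paper computes its error bars.

```python
import random
from statistics import mean


def bootstrap_gain_ci(baseline, treated, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap for the mean paired difference treated - baseline.

    baseline, treated: per-item scores (for example 0/1 correctness) for the
    same items in the same order. Returns (point_estimate, lower, upper).
    """
    if len(baseline) != len(treated) or not baseline:
        raise ValueError("inputs must be non-empty and the same length")
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    diffs = [t - b for b, t in zip(baseline, treated)]
    n = len(diffs)
    # Resample items with replacement and record the mean gain each time.
    means = sorted(mean(rng.choices(diffs, k=n)) for _ in range(n_resamples))
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return mean(diffs), lower, upper


if __name__ == "__main__":
    # Toy correctness vectors; not results from the paper.
    before = [0, 0, 1, 1, 0, 1, 0, 1]
    after = [1, 0, 1, 1, 1, 1, 0, 1]
    print(bootstrap_gain_ci(before, after))
```

Resampling paired differences, rather than the two conditions independently, keeps item difficulty fixed across the comparison, which is usually what a per-item before/after evaluation calls for.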

minor comments (1)

[Abstract] The abstract would be clearer if it briefly listed the specific languages used in the multilingual consistency step.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and indicate where revisions will be incorporated to improve clarity and strengthen the empirical claims.


Referee: [Abstract] Abstract: the reported 5.03% average gain on English queries in BLEnD supplies no error bars, no baseline details, no description of the consistency metric, and no controls for prompt length or language-specific capabilities; this makes the central empirical claim difficult to assess.

Authors: We agree that the abstract, constrained by length, omits key details needed for immediate assessment. The full manuscript reports error bars in the main results table, defines the consistency metric in Section 3.2, and describes baselines plus controls for prompt length and language-specific performance in Sections 4 and 5. We will revise the abstract to include error bars, a concise description of the consistency metric, and explicit mention of the controls. revision: yes
Referee: [Abstract] Abstract and method description: the hypothesis that LLMs possess rich cultural knowledge in local-language representations but fail to retrieve it in English is load-bearing for the transfer step, yet no analysis is provided showing that multilingual self-consistency correlates with cultural accuracy rather than with correlated training-data biases across languages (e.g., Western-centric framing appearing consistently).

Authors: This is a substantive concern regarding the interpretation of our results. While BLEnD provides culturally diverse ground-truth answers against which accuracy is measured, the original submission did not include an explicit correlation between consistency scores and accuracy to distinguish cultural knowledge from shared biases. We will add this analysis in the revision, for example by computing and reporting the correlation between self-consistency scores and BLEnD accuracy across languages and examples. revision: yes
Referee: [Evaluation] Evaluation section: without details on how the most reliable response is selected via self-consistency or whether the gain persists after accounting for model priors, it remains unclear if the improvement is independent of the consistency metric itself.

Authors: Section 3.3 states that the response maximizing the multilingual consistency score (computed via cross-lingual agreement) is selected. We also include an ablation against monolingual consistency to account for language-specific priors. We will expand the evaluation section to provide a more explicit step-by-step description of the selection procedure and report additional results demonstrating that gains remain after controlling for model priors. revision: yes
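The analysis promised in the second response, a correlation between self-consistency scores and benchmark accuracy, could look roughly like the sketch below. It assumes one agreement score and one 0/1 correctness label per item; with a binary label, the Pearson coefficient is the point-biserial correlation. The function name `pearson` and the toy inputs are illustrative and not taken from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy values: per-item agreement rate and whether the consensus answer
    # matched the benchmark label. Not data from the paper.
    agreement = [0.2, 0.4, 0.6, 0.8, 1.0]
    correct = [0, 0, 1, 0, 1]
    print(pearson(agreement, correct))
```

A high coefficient alone would not rule out shared bias, since languages can agree on a wrong answer; splitting the same statistic by whether the consensus matches the English-prompt answer would speak more directly to the referee's concern.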

Circularity Check

0 steps flagged

No significant circularity detected


The paper proposes a self-supervised framework that applies multilingual self-consistency to model-generated responses across languages, followed by self-critique transfer to improve English cultural alignment. This is evaluated on the external BLEnD benchmark, reporting a 5.03% average gain on English queries. No equations, parameters, or self-citations are shown to reduce the reported improvement to the consistency metric by construction; the central claim remains empirically testable against an independent benchmark rather than tautological. The hypothesis that local-language representations contain richer cultural knowledge is stated but does not create definitional circularity in the derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The approach rests on the untested premise that cultural knowledge is already stored in non-English representations and can be surfaced by cross-lingual agreement; no free parameters or new entities are introduced in the abstract.

axioms (1)

domain assumption: LLMs possess rich cultural knowledge embedded within local-language representations but fail to retrieve it when prompted in English.
Stated explicitly in the abstract as the motivating hypothesis; the entire transfer mechanism depends on this being true.

pith-pipeline@v0.9.0 · 5704 in / 1268 out tokens · 34981 ms · 2026-05-22T06:32:25.400467+00:00 · methodology
