Cross-lingual Prompting: Improving Zero-shot Chain-of-Thought Reasoning across Languages
6 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CL (6)
verdicts: UNVERDICTED (6)
representative citing papers
Cross-lingual prompt exploration improves factual recall and consistency in LLMs across 17 languages more efficiently than native-language scaling.
Luar is a reinforcement learning method enabling reasoning language models to decide when to invoke English translation for improved multilingual reasoning.
COPSD improves mathematical reasoning in low-resource languages by having LLMs self-distill from their own high-resource English behavior via token-level divergence on rollouts with privileged crosslingual context.
Treating language as a latent variable via polyGRPO RL improves Qwen2.5-7B-Instruct by 6.72% on English reasoning benchmarks and 6.89% on multilingual ones, with cross-task gains on commonsense reasoning from math-only training.
Prompting and agent methods boost standard LLMs on financial QA by simulating long chain-of-thought reasoning, but reasoning LLMs already have this capability and show limited further gains. Multilingual alignment helps mainly by lengthening reasoning, with minimal benefit for reasoning models.
citing papers explorer
- When Does Language Matter? Multilingual Instructions Reveal Step-wise Language Sensitivity in Vision-Language-Action Models
  Translating LIBERO to ten languages shows VLA failures under multilingual instructions are driven by language-sensitive steps; a step-wise inference intervention improves performance.
- Cross-Lingual Exploration for Parametric Knowledge
  Cross-lingual prompt exploration improves factual recall and consistency in LLMs across 17 languages more efficiently than native-language scaling.
- Learning When to Translate for Multilingual Reasoning
  Luar is a reinforcement learning method enabling reasoning language models to decide when to invoke English translation for improved multilingual reasoning.
- Crosslingual On-Policy Self-Distillation for Multilingual Reasoning
  COPSD improves mathematical reasoning in low-resource languages by having LLMs self-distill from their own high-resource English behavior via token-level divergence on rollouts with privileged crosslingual context.
- Language as a Latent Variable for Reasoning Optimization
  Treating language as a latent variable via polyGRPO RL improves Qwen2.5-7B-Instruct by 6.72% on English reasoning benchmarks and 6.89% on multilingual ones, with cross-task gains on commonsense reasoning from math-only training.