LLMs rely on semantic cues for matrix-game equilibria but can acquire approximate computation via residual training on small instances, with a Lipschitz proof enabling transfer to larger anonymous games.
arXiv preprint arXiv:2510.15414 , year =
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 3verdicts
UNVERDICTED 3roles
background 1polarities
background 1representative citing papers
LLMs trained on Codenames via RLVR exhibit scale-dependent effects: the 8B model gains on creativity benchmarks while smaller models gain on reasoning benchmarks.
Agent-World autonomously synthesizes verifiable real-world tasks and uses continuous self-evolution to train 8B and 14B agents that outperform proprietary models on 23 benchmarks.
citing papers explorer
-
Equilibrium Residuals Expose Three Regimes of Matrix-Game Strategic Reasoning in Language Models
LLMs rely on semantic cues for matrix-game equilibria but can acquire approximate computation via residual training on small instances, with a Lipschitz proof enabling transfer to larger anonymous games.
-
Playing with Words, Improving with Rewards: Training Language Models for Creative Association
LLMs trained on Codenames via RLVR exhibit scale-dependent effects: the 8B model gains on creativity benchmarks while smaller models gain on reasoning benchmarks.
-
Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence
Agent-World autonomously synthesizes verifiable real-world tasks and uses continuous self-evolution to train 8B and 14B agents that outperform proprietary models on 23 benchmarks.