LLMs rely on semantic cues for matrix-game equilibria but can acquire approximate computation via residual training on small instances, with a Lipschitz proof enabling transfer to larger anonymous games.
Goodman , title =
3 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
MIRL uses mutual information to guide trajectory selection and provide separate rewards for visual perception in RLVR for VLMs, achieving 70.22% average accuracy with 25% fewer full trajectories.
A comprehensive survey of knowledge distillation for LLMs structured around algorithms, skill enhancement, and vertical applications, highlighting data augmentation as a key enabler.
citing papers explorer
-
Equilibrium Residuals Expose Three Regimes of Matrix-Game Strategic Reasoning in Language Models
LLMs rely on semantic cues for matrix-game equilibria but can acquire approximate computation via residual training on small instances, with a Lipschitz proof enabling transfer to larger anonymous games.
-
MIRL: Mutual Information-Guided Reinforcement Learning for Vision-Language Models
MIRL uses mutual information to guide trajectory selection and provide separate rewards for visual perception in RLVR for VLMs, achieving 70.22% average accuracy with 25% fewer full trajectories.
-
A Survey on Knowledge Distillation of Large Language Models
A comprehensive survey of knowledge distillation for LLMs structured around algorithms, skill enhancement, and vertical applications, highlighting data augmentation as a key enabler.