GENSTRAT generates fresh imperfect-information card games and a six-axis capability profile plus jaggedness metric to evaluate LLM strategic competence with resistance to saturation.
arXiv preprint arXiv:2505.07215
3 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (3)
verdicts: UNVERDICTED (3)
representative citing papers:
HEAL mitigates entropy collapse in few-shot RLVR by selectively adding general-domain data and aligning trajectory-level entropy dynamics, matching full-shot performance with 32 target samples.
Generalizable agents require environment scaling via diverse executable rule-sets, distinguished from trajectory and task scaling in a new taxonomy.
citing papers explorer
- GENSTRAT: Toward a Science of Strategic Reasoning in Large Language Models
  GENSTRAT generates fresh imperfect-information card games and a six-axis capability profile plus jaggedness metric to evaluate LLM strategic competence with resistance to saturation.
- HEALing Entropy Collapse: Enhancing Exploration in Few-Shot RLVR via Hybrid-Domain Entropy Dynamics Alignment
  HEAL mitigates entropy collapse in few-shot RLVR by selectively adding general-domain data and aligning trajectory-level entropy dynamics, matching full-shot performance with 32 target samples.
- Scalable Environments Drive Generalizable Agents
  Generalizable agents require environment scaling via diverse executable rule-sets, distinguished from trajectory and task scaling in a new taxonomy.