CODE-SHARP autonomously grows an archive of hierarchical reward programs via foundation models to train generalist RL agents that outperform baselines by up to 6x on long-horizon tasks in Craftax and XLand.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
CODE-SHARP: Continuous Open-ended Discovery and Evolution of Skills as Hierarchical Reward Programs