
Simulus: Combining Improvements in Sample-Efficient World Model Agents

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

World models (WMs) represent the frontier of sample-efficient reinforcement learning, but their complexity leaves many promising improvements unrealized due to the significant expertise and effort required to identify and integrate them. Inspired by Rainbow, which showed that individually known improvements to DQN complement each other and can be effectively combined, we take on this challenge and ask whether the same principle applies to world model agents. We introduce Simulus, a modular token-based WM agent that integrates: (1) a flexible tokenization framework supporting arbitrary combinations of observation and action modalities; (2) intrinsic motivation for epistemic uncertainty reduction; (3) prioritized world model replay; and (4) regression-as-classification for reward and return prediction. Simulus achieves state-of-the-art sample efficiency for planning-free WMs across three diverse benchmarks: visual Atari 100K, continuous-control DMC Proprioception 500K, and symbolic Craftax-1M. Notably, intrinsic motivation proves beneficial even under the tight interaction budgets of sample-efficient RL, despite the risk of wasting scarce interactions on task-irrelevant experience. Ablation studies reveal that each component contributes individually, and their combination yields synergistic gains. Our code and model weights are publicly available at https://github.com/leor-c/Simulus.
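Component (4), regression-as-classification, can be illustrated with a minimal sketch. The abstract does not specify the discretization Simulus uses, so the following assumes a two-hot encoding over evenly spaced bins, which is one common way to turn a scalar reward or return target into a classification target. The function names (`make_bins`, `two_hot`, `cross_entropy`, `decode`) and the bin range are hypothetical and chosen only for illustration; they are not taken from the Simulus code base.

```python
import math


def make_bins(low, high, num_bins):
    """Evenly spaced bin centers on [low, high] (range is an assumption)."""
    step = (high - low) / (num_bins - 1)
    return [low + i * step for i in range(num_bins)]


def two_hot(value, bins):
    """Encode a scalar as a distribution with mass on its two nearest bins."""
    value = min(max(value, bins[0]), bins[-1])  # clamp into the bin range
    target = [0.0] * len(bins)
    # Index of the first bin center that is >= value.
    hi = next(i for i, b in enumerate(bins) if b >= value)
    if bins[hi] == value:
        target[hi] = 1.0  # value sits exactly on a bin center
        return target
    lo = hi - 1
    w_hi = (value - bins[lo]) / (bins[hi] - bins[lo])
    target[lo] = 1.0 - w_hi
    target[hi] = w_hi
    return target


def cross_entropy(logits, target):
    """Classification loss between predicted logits and a soft target."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return -sum(t * (l - log_z) for l, t in zip(logits, target))


def decode(logits, bins):
    """Recover a scalar prediction as the expected bin center under softmax."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return sum(e / z * b for e, b in zip(exps, bins))


bins = make_bins(-2.0, 2.0, 5)      # hypothetical reward range
target = two_hot(0.25, bins)        # soft label for a reward of 0.25
loss = cross_entropy([0.0] * 5, target)
```

In this sketch the predictor outputs one logit per bin and is trained with cross-entropy against the two-hot target; the scalar estimate is read back as the expectation over bin centers, so the encoding is lossless for any value inside the bin range.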
