Stable retro: A maintained fork of OpenAI's gym-retro. https://github.com/Farama-Foundation/stable-retro, 2025
1 Pith paper cites this work.
Fields: cs.AI (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
IPR-1: Interactive Physical Reasoner
IPR uses world-model rollouts to reinforce a VLM policy via PhysCode on a 1000+ game benchmark, achieving robust physical reasoning that improves with experience and transfers zero-shot to unseen games while surpassing GPT-5.