The arcade learning environment: An evaluation platform for general agents. Journal of Artificial Intelligence Research, 47:253–279

5 Pith papers cite this work. Polarity classification is still indexing.

citation summary

years: 2026 (3), 2025 (2)

roles: background (2)

polarities: background (2)

representative citing papers

ANO: A Principled Approach to Robust Policy Optimization

cs.AI · 2026-05-04 · unverdicted · novelty 6.0

ANO derives a robust policy optimizer from geometric principles that replaces clipping with a smooth redescending gradient, showing better performance and stability than PPO, SPO, and GRPO in MuJoCo, Atari, and RLHF experiments.

citing papers explorer
