Reflex formalizes axial and bilateral reflection symmetries and adds symmetry regularization to PPO and SAC, reporting better performance and sample efficiency on Gym and DMC benchmarks.
Thus: $\pi^*(a \mid gs) = \frac{1}{|A^*(gs)|} = \frac{1}{|A^*(s)|} = \pi^*(g^{-1}a \mid s)$. (46) If $a \notin A^*(gs)$, then $g^{-1}a \notin A^*(s)$, so both sides of the equation equal zero
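The identity in Eq. (46) can be checked numerically on a toy problem. The sketch below is an illustration only and is not taken from the paper: the 1-D state space, the two actions, the clipping rule, and the names `g_state`, `g_action`, `optimal_actions`, and `pi_star` are all assumptions. The reflection g negates both state and action, so g is its own inverse, and the one-step greedy set toward the origin stands in for A*(s).

```python
from fractions import Fraction

# Hypothetical toy problem: positions on a line, unit moves left or right.
STATES = [-2, -1, 0, 1, 2]
ACTIONS = [-1, 1]


def g_state(s):
    """Reflection acting on states: s -> -s."""
    return -s


def g_action(a):
    """Reflection acting on actions: a -> -a (g is its own inverse)."""
    return -a


def optimal_actions(s):
    """Stand-in for A*(s): actions that bring the clipped next state closest to 0."""
    def next_distance(a):
        return abs(max(-2, min(2, s + a)))

    best = min(next_distance(a) for a in ACTIONS)
    return {a for a in ACTIONS if next_distance(a) == best}


def pi_star(a, s):
    """Uniform policy over the optimal set, zero elsewhere."""
    opt = optimal_actions(s)
    return Fraction(1, len(opt)) if a in opt else Fraction(0)


def check_equivariance():
    """Check pi*(a | g s) == pi*(g^{-1} a | s) for every state-action pair."""
    return all(
        pi_star(a, g_state(s)) == pi_star(g_action(a), s)
        for s in STATES
        for a in ACTIONS
    )
```

In this toy setting the check passes because the optimal set at the reflected state is the reflection of the optimal set at the original state, which is the property the excerpt relies on. At s = 0 both actions are optimal and each receives probability 1/2; at every other state a single action receives probability 1.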
1 citing paper (field: cs.LG; year: 2026):
Reflex: Reinforcement Learning with Reflection Symmetry Exploitation in State-Based Continuous Control