A separate regulator module adaptively scales actions in RL to reduce constraint violations while preserving exploration, yielding up to 126x fewer violations and over 10x higher returns on Safety Gym tasks.
CVaR-constrained policy optimization for safe reinforcement learning,
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.RO 1years
2025 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Constraint-Aware Reinforcement Learning via Adaptive Action Scaling
A separate regulator module adaptively scales actions in RL to reduce constraint violations while preserving exploration, yielding up to 126x fewer violations and over 10x higher returns on Safety Gym tasks.