Recovery RL: Safe reinforcement learning with learned recovery zones
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.RO (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Do We Really Need Immediate Resets? Rethinking Collision Handling for Efficient Robot Navigation
  MCB decouples per-collision termination from global resets in DRL navigation training, yielding faster early-stage success-rate gains in simulation and deployable policies on real robots.
- Constraint-Aware Reinforcement Learning via Adaptive Action Scaling
  A separate regulator module adaptively scales actions in RL to reduce constraint violations while preserving exploration, yielding up to 126x fewer violations and over 10x higher returns on Safety Gym tasks.