ALaM stabilizes state-wise multiplier networks in safe RL via quadratic penalties and supervised regression on dual targets, guaranteeing multiplier convergence and optimal constrained policies when combined with SAC.
Addressing function approximation error in actor-critic methods
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Augmented Lagrangian Multiplier Network for State-wise Safety in Reinforcement Learning
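The ALaM summary mentions quadratic penalties and supervised regression on dual targets. As a rough illustration only, and not the paper's implementation, the sketch below uses the textbook augmented-Lagrangian multiplier update, max(0, lambda + rho * g(s)), to build per-state targets and then takes one mean-squared-error regression step. The function names, the linear model standing in for a multiplier network, the convention that g(s) <= 0 means the constraint holds, and all numeric values are assumptions made for this example.

```python
def dual_targets(lam, violation, rho):
    """Textbook augmented-Lagrangian update, applied per state (assumed form)."""
    return [max(0.0, l + rho * g) for l, g in zip(lam, violation)]


def predict(weights, x):
    """Linear stand-in for a multiplier network (hypothetical)."""
    return sum(w * xi for w, xi in zip(weights, x))


def regress_step(weights, features, targets, lr):
    """One gradient step of mean-squared-error regression onto the targets."""
    n = len(features)
    grad = [0.0] * len(weights)
    for x, t in zip(features, targets):
        err = predict(weights, x) - t
        for j, xi in enumerate(x):
            grad[j] += 2.0 * err * xi / n
    return [w - lr * g for w, g in zip(weights, grad)]


# Hypothetical batch: two states with one-hot features.
features = [[1.0, 0.0], [0.0, 1.0]]
weights = [0.5, 0.5]          # current multiplier prediction is 0.5 in both states
violation = [0.2, -1.0]       # g(s) > 0: violated; g(s) < 0: satisfied with slack

lam = [predict(weights, x) for x in features]
targets = dual_targets(lam, violation, rho=1.0)
new_weights = regress_step(weights, features, targets, lr=0.5)
```

In this toy batch the multiplier target rises for the state that violates its constraint and is clipped to zero for the state that satisfies it, so the regression step moves the two predictions in opposite directions.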