ALaM stabilizes state-wise multiplier networks in safe RL via quadratic penalties and supervised regression on dual targets, guaranteeing multiplier convergence and optimal constrained policies when combined with SAC.
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.LG (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
- Augmented Lagrangian Multiplier Network for State-wise Safety in Reinforcement Learning