Introduces RAPCs and a contraction Bellman operator for cost-optimal policies that satisfy probabilistic reach-avoid specifications in stochastic MDPs, with almost-sure convergence to local optima.
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.LG (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers:
ALGD augments the Lagrangian to locally convexify the energy landscape in diffusion models, stabilizing safe RL training and generation without changing optimal policies.
Citing papers:
- Stochastic Minimum-Cost Reach-Avoid Reinforcement Learning
  Introduces RAPCs and a contraction Bellman operator for cost-optimal policies that satisfy probabilistic reach-avoid specifications in stochastic MDPs, with almost-sure convergence to local optima.
- How Does the Lagrangian Guide Safe Reinforcement Learning through Diffusion Models?
  ALGD augments the Lagrangian to locally convexify the energy landscape in diffusion models, stabilizing safe RL training and generation without changing optimal policies.