Differentiable relaxation of LTL automata via soft labeling enables gradient-based RL from formal specifications, with theoretical bounds on discrete-differentiable discrepancy and up to 2x returns on nonlinear tasks.
Principles of Model Checking
4 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 4representative citing papers
Secure planning against stealthy attacks on robot actuators in unknown stochastic environments is achieved by modeling the problem as a stochastic game and solving it with model-free RL to satisfy a combined LTL formula.
A model-free RL methodology is developed to maximize the probability of LTL satisfaction in unknown stochastic games when the derived DRA has a single Rabin pair, with a generalization providing lower bounds for multiple pairs.
A model-free RL algorithm uses LTL-derived rewards and path-dependent discounts to learn policies that maximize LTL satisfaction probability with convergence guarantees.
citing papers explorer
-
Accelerated Learning with Linear Temporal Logic using Differentiable Simulation
Differentiable relaxation of LTL automata via soft labeling enables gradient-based RL from formal specifications, with theoretical bounds on discrete-differentiable discrepancy and up to 2x returns on nonlinear tasks.
-
Secure Planning Against Stealthy Attacks via Model-Free Reinforcement Learning
Secure planning against stealthy attacks on robot actuators in unknown stochastic environments is achieved by modeling the problem as a stochastic game and solving it with model-free RL to satisfy a combined LTL formula.
-
Model-Free Reinforcement Learning for Stochastic Games with Linear Temporal Logic Objectives
A model-free RL methodology is developed to maximize the probability of LTL satisfaction in unknown stochastic games when the derived DRA has a single Rabin pair, with a generalization providing lower bounds for multiple pairs.
-
Control Synthesis from Linear Temporal Logic Specifications using Model-Free Reinforcement Learning
A model-free RL algorithm uses LTL-derived rewards and path-dependent discounts to learn policies that maximize LTL satisfaction probability with convergence guarantees.