Q-Learning for Robust Satisfaction of Signal Temporal Logic Specifications

Austin Jones; Calin Belta; Derya Aksaray; Mac Schwager; Zhaodan Kong

arxiv: 1609.07409 · v1 · pith:FCXJS6V2new · submitted 2016-09-23 · 💻 cs.SY

Derya Aksaray, Austin Jones, Zhaodan Kong, Mac Schwager, Calin Belta

keywords: q-learning, satisfaction, problems, degree, expected, logic, performance, policies

This paper addresses the problem of learning optimal policies for satisfying signal temporal logic (STL) specifications by agents with unknown stochastic dynamics. The system is modeled as a Markov decision process, in which the states represent partitions of a continuous space and the transition probabilities are unknown. We formulate two synthesis problems in which the desired STL specification is enforced by maximizing either the probability of satisfaction or the expected robustness degree, that is, a measure quantifying the quality of satisfaction. We argue that Q-learning is not directly applicable to these problems because, based on the quantitative semantics of STL, the probability of satisfaction and the expected robustness degree are not in the standard objective form of Q-learning. To resolve this issue, we propose an approximation of STL synthesis problems that can be solved via Q-learning, and we derive performance bounds for the policies obtained by the approximate approach. The performance of the proposed method is demonstrated via simulations.
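
To illustrate why the robustness degree does not fit the usual sum-of-rewards objective, the sketch below computes the standard quantitative semantics for two simple formulas over a finite, discretely sampled signal: "eventually x > c" (a maximum over time) and "always x > c" (a minimum over time). The function names, the sample signal, and the threshold are hypothetical and chosen only for illustration. The log-sum-exp soft maximum is included as one generic smooth surrogate for the maximum; the abstract does not state which approximation the paper uses, so this should not be read as the authors' method.

```python
import math

def robustness_eventually(signal, threshold):
    """Robustness of 'eventually x > threshold' over the whole signal:
    the largest margin x_t - threshold attained at any time step."""
    return max(x - threshold for x in signal)

def robustness_always(signal, threshold):
    """Robustness of 'always x > threshold' over the whole signal:
    the smallest margin x_t - threshold attained at any time step."""
    return min(x - threshold for x in signal)

def soft_max(values, beta):
    """Log-sum-exp surrogate for max(values), with sharpness beta > 0.
    It satisfies max(values) <= soft_max <= max(values) + log(n) / beta."""
    m = max(values)  # subtract the maximum first for numerical stability
    return m + math.log(sum(math.exp(beta * (v - m)) for v in values)) / beta

# Hypothetical one-dimensional signal and threshold.
signal = [0.2, 0.9, 0.4, 0.7]
threshold = 0.5
margins = [x - threshold for x in signal]

print(robustness_eventually(signal, threshold))  # positive: formula satisfied
print(robustness_always(signal, threshold))      # negative: formula violated
print(soft_max(margins, beta=10.0))              # close to the exact maximum
```

Both robustness values depend on an extremum over the whole trajectory rather than on a discounted sum of per-step rewards, which is the mismatch with the standard Q-learning objective that the abstract points to.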

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Runtime Monitoring of Perception-Based Autonomous Systems via Embedding Temporal Logic
cs.LG 2026-05 unverdicted novelty 7.0

Embedding Temporal Logic enables runtime monitoring of temporally extended perceptual behaviors by defining predicates via distances between observed and reference embeddings in learned spaces, with conformal calibrat...
Runtime Monitoring of Perception-Based Autonomous Systems via Embedding Temporal Logic
cs.LG 2026-05 unverdicted novelty 7.0

Embedding Temporal Logic (ETL) performs runtime monitoring directly in learned embedding spaces using distance-based predicates composed with temporal operators, supported by conformal calibration for reliable predica...