Secure Planning Against Stealthy Attacks via Model-Free Reinforcement Learning

Alper Kamil Bozkurt; Miroslav Pajic; Yu Wang

arXiv: 2011.01882 · v2 · submitted 2020-11-03 · cs.RO · cs.GT

Pith reviewed 2026-05-24 14:26 UTC · model grok-4.3

keywords secure planning · stealthy attacks · model-free reinforcement learning · linear temporal logic · stochastic game · robotic planning · unknown environment · actuator attacks

The pith

A combined LTL formula for task and stealthy-attack prevention in a stochastic game can be satisfied by model-free reinforcement learning without an environment model.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to plan robot tasks securely in unknown stochastic environments when an attacker can manipulate control signals but must remain undetected by an intrusion-detection system. It models the interaction as a stochastic game between the controller and the attacker. The objectives of both parties are captured together in one linear temporal logic formula. This combined specification is then satisfied by model-free reinforcement learning, which learns a policy directly from interaction without any model of the environment. A sympathetic reader would care because many real robotic systems operate in unmapped spaces where actuator attacks are possible.

Core claim

We consider the problem of security-aware planning in an unknown stochastic environment, in the presence of attacks on control signals of the robot. We model the attacker as an agent who has the full knowledge of the controller as well as the employed intrusion-detection system and who wants to prevent the controller from performing tasks while staying stealthy. We formulate the problem as a stochastic game between the attacker and the controller and present an approach to express the objective of such an agent and the controller as a combined linear temporal logic (LTL) formula. We then show that the planning problem, described formally as the problem of satisfying an LTL formula in a stoch

What carries the argument

Combined LTL formula that encodes both task completion and stealthy-attack prevention inside a stochastic game between controller and attacker, solved via model-free RL.
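
One way to read this in symbols, as a sketch rather than the paper's own notation: assume the game is zero-sum, let μ range over controller strategies and ν over attacker strategies, and let φ be the combined formula. All of these symbols are assumptions introduced here for illustration. The controller then looks for a strategy that maximizes the worst-case probability of satisfying φ:

```latex
\mu^{*} \in \arg\max_{\mu} \, \min_{\nu} \, \Pr\nolimits_{\mu,\nu}\bigl(\pi \models \varphi\bigr)
```

Here π denotes a random path of the game generated under the two strategies.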

If this is right

The planning problem can be solved without any model of the environment.
Model-free RL computes a policy that meets both task and security requirements expressed in the combined LTL formula.
The method applies to robotic systems facing actuator attacks that must remain stealthy.
The approach is evaluated on two robotic planning case studies.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same combined-LTL plus model-free-RL pattern could be tried for other attack types whose goals are expressible in temporal logic.
It suggests that adversarial control problems might be handled without explicit environment models when objectives fit inside an LTL formula.
Scalability to high-dimensional state spaces or continuous dynamics would require further testing beyond the two case studies.

Load-bearing premise

The objectives of the attacker and the controller can be expressed together in one LTL formula that model-free reinforcement learning can satisfy in a completely unknown environment.

What would settle it

Apply model-free RL to a simulated robotic task with actuator attacks and an intrusion-detection system, then check whether the learned policy satisfies the combined LTL formula while completing the task and blocking stealthy attacks.

Figures

Figures reproduced from arXiv: 2011.01882 by Alper Kamil Bozkurt, Miroslav Pajic, Yu Wang.

**Figure 1.** Surveillance scenario (from left to right): (a) The controller strategy from b to c and the cell labels; (b) The controller and attacker strategies from b to c before any anomaly occurs; (c) The controller and attacker strategies from b to c after one anomaly.

**Figure 2.** Task sequence scenario (from left to right): (a) The controller strategy from d to e and the cell labels; (b) The controller and attacker strategies from d to e right after an anomaly occurs; (c) The controller and attacker strategies from d to e right after an alarm.

Original abstract

We consider the problem of security-aware planning in an unknown stochastic environment, in the presence of attacks on control signals (i.e., actuators) of the robot. We model the attacker as an agent who has the full knowledge of the controller as well as the employed intrusion-detection system and who wants to prevent the controller from performing tasks while staying stealthy. We formulate the problem as a stochastic game between the attacker and the controller and present an approach to express the objective of such an agent and the controller as a combined linear temporal logic (LTL) formula. We then show that the planning problem, described formally as the problem of satisfying an LTL formula in a stochastic game, can be solved via model-free reinforcement learning when the environment is completely unknown. Finally, we illustrate and evaluate our methods on two robotic planning case studies.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper reduces planning against stealthy actuator attacks to satisfying a combined LTL formula in an unknown stochastic game and solves it with model-free RL, but the abstract supplies almost no technical detail on the construction or results.

The main point is that the authors model the controller and a knowledgeable but stealthy attacker as players in a stochastic game, encode both task goals and attack prevention in one LTL formula, and then claim model-free RL can produce a satisfying policy without any model of the environment. That reduction is the core contribution they advertise for robotic planning under attacks. The framing itself is reasonable: real robots face actuator tampering that is designed to evade an IDS, and treating the attacker as having full system knowledge but a stealth constraint captures a practical threat. The two case studies are at least a start toward showing the idea on concrete robots.

What is actually new is the specific combination of model-free RL with LTL objectives inside a stochastic game for this security setting; prior work on LTL and RL or on game-theoretic security exists separately, but the joint application here is not standard. The paper earns credit for keeping the environment unknown and for avoiding an explicit model, which matches many deployed systems.

The soft spots are straightforward. The abstract gives no description of how the combined LTL is built, what reward is placed on accepting states, which RL algorithm is used, or any quantitative outcomes from the case studies. Without those pieces it is impossible to check whether the game is properly turned into a solvable MDP or whether the learned policies actually complete tasks in the presence of stealthy attacks. Model-free methods for LTL can work once the product automaton and rewards are defined, but the paper must show the construction explicitly rather than assert it. The evaluations are mentioned only in passing, so we cannot tell if the approach scales or beats simpler baselines.

This is the kind of paper that belongs in a reading group focused on secure robotics or formal RL methods; a reader already working on LTL synthesis or adversarial planning would get the most out of the formulation. It deserves peer review because the problem is relevant and the high-level reduction is internally consistent, even though the current write-up leaves the implementation details unexamined.

Referee Report

2 major / 1 minor

Summary. The paper addresses security-aware planning for a robot in an unknown stochastic environment subject to stealthy attacks on its actuators. The attacker is modeled as having complete knowledge of the controller and intrusion-detection system and seeking to disrupt task completion while remaining undetected. The interaction is formulated as a stochastic game whose objectives (for both parties) are encoded as a single combined LTL formula; the resulting LTL-satisfaction problem on the unknown game is then solved by model-free reinforcement learning. The method is illustrated and evaluated on two robotic planning case studies.

Significance. If the reduction from the combined LTL formula to a model-free RL objective is shown to be sound and the learned policies are demonstrated to satisfy the specification with high probability, the work would provide a practical route to secure planning without an a-priori environment model. The explicit construction of a joint LTL formula that simultaneously encodes task satisfaction and stealth constraints is a clear technical contribution.

major comments (2)

[Abstract] Abstract: the central claim that 'the planning problem … can be solved via model-free reinforcement learning when the environment is completely unknown' is stated without any indication of the reward construction, the handling of the two-player game structure inside the RL loop, or convergence arguments; these details are load-bearing for the claim and must be supplied with explicit equations or algorithms.
[Abstract] The weakest assumption identified in the reader’s report (that a single LTL formula can be formed whose satisfaction corresponds to both task completion and stealthy-attack prevention, and that model-free RL can find a policy for it) is never discharged in the provided description; a concrete construction of the product automaton or the reward function on accepting states is required before the reduction can be accepted.

minor comments (1)

[Evaluation] The two case studies should report quantitative metrics (success rate, attack success rate, number of episodes) together with the exact LTL formulas employed so that the empirical support for the method can be assessed.
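
The metrics requested above could be computed from evaluation logs along these lines. This is a minimal sketch: the `Episode` record, its field names, and the definition of a successful stealthy attack (task blocked, no alarm raised) are assumptions made for illustration, not quantities defined in the paper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Episode:
    """One evaluation run. All field names are hypothetical."""
    task_completed: bool  # the controller finished its task
    alarm_raised: bool    # the intrusion-detection system fired
    attack_active: bool   # the attacker altered at least one control signal


def summarize(episodes: List[Episode]) -> Dict[str, float]:
    """Aggregate per-episode outcomes into the rates a referee would ask for."""
    n = len(episodes)
    if n == 0:
        raise ValueError("need at least one episode")
    attacked = [e for e in episodes if e.attack_active]
    # Assumed definition: a stealthy attack succeeds when it blocks the task
    # without triggering an alarm.
    stealthy_wins = sum(
        1 for e in attacked if not e.task_completed and not e.alarm_raised
    )
    return {
        "episodes": float(n),
        "task_success_rate": sum(e.task_completed for e in episodes) / n,
        "attack_success_rate": stealthy_wins / len(attacked) if attacked else 0.0,
    }
```

Reporting these rates next to the exact LTL formulas would let a reader judge how often the specification is met under attack.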

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive suggestions. The comments focus on making the abstract self-contained with respect to the technical reduction. We will revise the abstract to include brief but explicit indications of the LTL-to-reward construction and the handling of the game inside the RL procedure, while preserving the manuscript's existing technical sections that already contain the full derivations.

Referee: [Abstract] Abstract: the central claim that 'the planning problem … can be solved via model-free reinforcement learning when the environment is completely unknown' is stated without any indication of the reward construction, the handling of the two-player game structure inside the RL loop, or convergence arguments; these details are load-bearing for the claim and must be supplied with explicit equations or algorithms.

Authors: We agree that the abstract would be strengthened by indicating these elements. The full manuscript already supplies them: the combined LTL formula is constructed in Section III, the product automaton and the reward function (r = 1 on accepting states, discounted sum otherwise) appear in Section IV, the two-player structure is handled by treating the attacker as part of the environment in the model-free Q-learning update, and convergence follows from standard results on RL for LTL satisfaction under the assumption of sufficient exploration. To address the referee's point directly, we will add one sentence to the abstract that references the reward construction from the accepting states of the product automaton and notes that standard model-free RL is applied to the resulting zero-sum game. revision: yes
Referee: [Abstract] The weakest assumption identified in the reader’s report (that a single LTL formula can be formed whose satisfaction corresponds to both task completion and stealthy-attack prevention, and that model-free RL can find a policy for it) is never discharged in the provided description; a concrete construction of the product automaton or the reward function on accepting states is required before the reduction can be accepted.

Authors: The concrete construction is given in the body of the paper (Sections III and IV): the task LTL φ_task and the stealth LTL φ_stealth are conjoined into a single formula φ = φ_task ∧ φ_stealth; the product automaton is formed in the standard way; and the reward function assigns positive reward precisely on the accepting states of this automaton, turning LTL satisfaction into an RL objective. Because the referee correctly notes that the abstract itself does not discharge this, we will revise the abstract to include a short clause stating that the objectives are encoded as a single LTL formula whose satisfaction is reduced to a reward-maximization problem on the product automaton. revision: yes
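
For readers who want a concrete picture of a Q-learning update in a two-player setting, here is a minimal sketch of a generic minimax-style tabular update for a turn-based zero-sum game. It is not taken from the paper, and it differs from treating the attacker purely as part of the environment: the bootstrap value is a max when the controller moves next and a min when the attacker moves next. The function name, the `next_is_controller` flag, and the default step size and discount are all hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Iterable, Tuple

QTable = DefaultDict[Tuple[Hashable, Hashable], float]


def minimax_q_update(
    q: QTable,
    state: Hashable,
    action: Hashable,
    reward: float,
    next_state: Hashable,
    next_actions: Iterable[Hashable],
    next_is_controller: bool,
    alpha: float = 0.1,
    gamma: float = 0.99,
) -> float:
    """Apply one tabular update and return the new Q-value.

    The controller is assumed to maximize and the attacker to minimize,
    so the bootstrap term depends on who moves in `next_state`.
    """
    values = [q[(next_state, a)] for a in next_actions]
    if not values:
        bootstrap = 0.0  # no available moves: treat as terminal
    elif next_is_controller:
        bootstrap = max(values)
    else:
        bootstrap = min(values)
    target = reward + gamma * bootstrap
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


def new_q_table() -> QTable:
    """Q-table with all entries initialized to zero."""
    return defaultdict(float)
```

In this sketch the states would be product states (environment state paired with automaton state), and the reward would come from the accepting condition of the automaton.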

Circularity Check

0 steps flagged

No significant circularity; standard reduction to RL on LTL game

The derivation reduces the security planning problem to a stochastic game whose objectives are encoded as a single LTL formula; satisfaction of that formula in an unknown environment is then solved by model-free RL. This is a conventional product-automaton construction followed by reward shaping on accepting states, with no self-definitional loops, no fitted parameters renamed as predictions, and no load-bearing self-citations that close the argument. The approach is externally falsifiable via simulation on the two robotic case studies and does not rely on any uniqueness theorem or ansatz imported from the authors' prior work.
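
To make the phrase "product-automaton construction followed by reward shaping on accepting states" concrete, the sketch below tracks a hand-coded three-state automaton alongside the environment state and pays reward 1 whenever the automaton component is accepting. The toy specification (eventually reach `goal`, never see `alarm`), the label map, and the reward values are assumptions for illustration; the paper's actual formula, automaton type, and reward scheme may differ.

```python
from typing import Dict, FrozenSet, Hashable, Set, Tuple

Label = FrozenSet[str]


class Automaton:
    """Deterministic automaton over sets of atomic propositions (toy example).

    States: 0 = waiting for goal, 1 = goal seen and no alarm so far,
    2 = rejecting sink entered once an alarm has been observed.
    """

    def __init__(self, initial: int = 0, accepting: Set[int] = frozenset({1})):
        self.initial = initial
        self.accepting = set(accepting)

    def step(self, q: int, label: Label) -> int:
        if q == 2 or "alarm" in label:
            return 2  # an alarm can never be undone
        if q == 1 or "goal" in label:
            return 1  # goal has been reached at some point
        return 0


def product_step(
    automaton: Automaton,
    labels: Dict[Hashable, Label],
    q: int,
    next_env_state: Hashable,
) -> Tuple[Tuple[Hashable, int], float]:
    """Advance the automaton on the label of the next environment state.

    Returns the next product state and a reward of 1.0 if its automaton
    component is accepting, otherwise 0.0.
    """
    next_q = automaton.step(q, labels.get(next_env_state, frozenset()))
    reward = 1.0 if next_q in automaton.accepting else 0.0
    return (next_env_state, next_q), reward
```

A learner that only sees product states and these rewards never needs the environment's transition probabilities, which is the sense in which the approach is model-free.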

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The approach relies on standard assumptions in game theory and formal methods for robotics security.

axioms (2)

Domain assumption: The problem can be modeled as a stochastic game between controller and attacker. Central to the formulation in the abstract.
Domain assumption: Objectives can be expressed as a combined LTL formula. Used to formalize the planning problem.

Reference graph

Works this paper leans on

42 extracted references · 42 canonical work pages · 2 internal anchors

[1] Andrew J. Kerns, Daniel P. Shepard, Jahshan A. Bhatti, and Todd E. Humphreys. Unmanned aircraft capture and control via GPS spoofing. Journal of Field Robotics, 31(4):617–636, 2014.
[2] Mark L. Psiaki, Todd E. Humphreys, and Brian Stauffer. Attackers can spoof navigation signals without our knowledge. Here’s how to fight back GPS lies. IEEE Spectrum, 53(8):26–53, 2016.
[3] Cheolhyeon Kwon, Scott Yantek, and Inseok Hwang. Real-time safety assessment of unmanned aircraft systems against stealthy cyber attacks. Journal of Aerospace Information Systems, 13(1):27–45, 2016.
[4] Yasser Shoukry, Paul Martin, Paulo Tabuada, and Mani Srivastava. Non-invasive spoofing attacks for anti-lock braking systems. In International Conference on Cryptographic Hardware and Embedded Systems, pages 55–72. Springer, 2013.
[5] Gilles Barthe, Pedro R. D’Argenio, Bernd Finkbeiner, and Holger Hermanns. Facets of software doping. In International Symposium on Leveraging Applications of Formal Methods, pages 601–608. Springer, 2016.
[6] Abdullahi Chowdhury, Gour Karmakar, and Joarder Kamruzzaman. Survey of recent cyber security attacks on robotic systems and their mitigation approaches. In Cyber Law, Privacy, and Security: Concepts, Methodologies, Tools, and Applications, pages 1426–1441. IGI Global, 2019.
[7] Yilin Mo and Bruno Sinopoli. Secure control against replay attacks. In 2009 47th Annual Allerton Conference on Communication, Control, and Computing, pages 911–918, 2009.
[8] Roy S. Smith. Covert Misappropriation of Networked Control Systems: Presenting a Feedback Structure. IEEE Control Systems Magazine, 35(1):82–92, 2015.
[9] Andre Teixeira, Iman Shames, Henrik Sandberg, and Karl H. Johansson. Revealing stealthy attacks in control systems. In 2012 50th Annual Allerton Conference on Communication, Control, and Computing, pages 1806–1813, Monticello, IL, USA, October 2012. IEEE.
[10] Yilin Mo and Bruno Sinopoli. False data injection attacks in control systems. In First workshop on Secure Control Systems, pages 1–6, 2010.
[11] Cheolhyeon Kwon, Weiyi Liu, and Inseok Hwang. Analysis and design of stealthy cyber attacks on unmanned aerial systems. Journal of Aerospace Information Systems, 11(8):525–539, 2014.
[12] Ilija Jovanov and Miroslav Pajic. Relaxing integrity requirements for attack-resilient cyber-physical systems. IEEE Transactions on Automatic Control, 64(12):4843–4858, Dec 2019.
[13] Jiangnan Li, Jin Young Lee, Yingyuan Yang, Jinyuan Stella Sun, and Kevin Tomsovic. ConAML: Constrained Adversarial Machine Learning for Cyber-Physical Systems. arXiv:2003.05631 [cs], March 2020.
[14] Giulio Zizzo, Chris Hankin, Sergio Maffeis, and Kevin Jones. Adversarial Machine Learning Beyond the Image Domain. In Proceedings of the 56th Annual Design Automation Conference 2019, DAC ’19, pages 1–4, Las Vegas, NV, USA, June 2019. Association for Computing Machinery.
[15] Cheng Feng, Tingting Li, Zhanxing Zhu, and Deeph Chana. A deep learning-based framework for conducting stealthy attacks in industrial control systems. arXiv:1709.06397 [cs], September 2017.
[16] Lloyd S. Shapley. Stochastic games. Proceedings of the National Academy of Sciences, 39(10):1095–1100, 1953.
[17] Hadas Kress-Gazit, Georgios E. Fainekos, and George J. Pappas. Where’s Waldo? sensor-based temporal logic motion planning. In Proceedings 2007 IEEE International Conference on Robotics and Automation, pages 3116–3121. IEEE, 2007.
[18] Meng Guo, Karl H. Johansson, and Dimos V. Dimarogonas. Revising motion planning under linear temporal logic specifications in partially known workspaces. In 2013 IEEE International Conference on Robotics and Automation, pages 5025–5032. IEEE, 2013.
[19] Borzoo Bonakdarpour, Jyotirmoy V. Deshmukh, and Miroslav Pajic. Opportunities and challenges in monitoring cyber-physical systems security. In International Symposium on Leveraging Applications of Formal Methods, pages 9–18. Springer, 2018.
[20] Ramy Medhat, Borzoo Bonakdarpour, Deepak Kumar, and Sebastian Fischmeister. Runtime monitoring of cyber-physical systems under timing and memory constraints. ACM Transactions on Embedded Computing Systems (TECS), 14(4):1–29, 2015.
[21] Klaus Havelund and Grigore Roşu. Synthesizing monitors for safety properties. In International Conference on Tools and Algorithms for the Construction and Analysis of Systems, pages 342–356. Springer, 2002.
[22] Moonjoo Kim, M. Viswanathan, H. Ben-Abdallah, S. Kannan, I. Lee, and O. Sokolsky. Formally specified monitoring of temporal properties. In Proceedings of 11th Euromicro Conference on Real-Time Systems. Euromicro RTS’99, pages 114–122, June 1999.
[23] Alper Kamil Bozkurt, Yu Wang, Michael M. Zavlanos, and Miroslav Pajic. Model-free reinforcement learning for stochastic games with linear temporal logic objectives, 2020. arXiv:2010.01050 [cs.RO].
[24] Krishnendu Chatterjee and Thomas A. Henzinger. A survey of stochastic ω-regular games. Journal of Computer and System Sciences, 78(2):394–413, 2012. Games in Verification.
[25] Christel Baier and Joost-Pieter Katoen. Principles of Model Checking. MIT Press, Cambridge, MA, USA, 2008.
[26] Miroslav Pajic, James Weimer, Nicola Bezzo, Oleg Sokolsky, George J. Pappas, and Insup Lee. Design and implementation of attack-resilient cyberphysical systems: With a focus on attack-resilient state estimators. IEEE Control Systems Magazine, 37(2):66–81, April 2017.
[27] Miroslav Pajic, Insup Lee, and George J. Pappas. Attack-resilient state estimation for noisy dynamical systems. IEEE Transactions on Control of Network Systems, 4(1):82–92, March 2017.
[28] Nicola Bezzo, James Weimer, Miroslav Pajic, Oleg Sokolsky, George J. Pappas, and Insup Lee. Attack resilient state estimation for autonomous robotic systems. In 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 3692–3698, Sept 2014.
[29] Young Hwan Chang, Qie Hu, and Claire J. Tomlin. Secure estimation based Kalman filter for cyber–physical systems against sensor attacks. Automatica, 95:399–412, 2018.
[30] Mahmoud Elfar, Yu Wang, and Miroslav Pajic. Security-aware synthesis using delayed-action games. In Computer Aided Verification (CAV), pages 180–199. Springer International Publishing, 2019.
[31] Mahmoud Elfar, Haibei Zhu, Mary L. Cummings, and Miroslav Pajic. Security-aware synthesis of human-UAV protocols. In 2019 International Conference on Robotics and Automation (ICRA), pages 8011–8017, May 2019.
[32] Georgios E. Fainekos, Antoine Girard, Hadas Kress-Gazit, and George J. Pappas. Temporal logic motion planning for dynamic robots. Automatica, 45(2):343–352, February 2009.
[33] Hadas Kress-Gazit, Morteza Lahijanian, and Vasumathi Raman. Synthesis for Robots: Guarantees and Feedback for Robot Behavior. Annual Review of Control, Robotics, and Autonomous Systems, 1(1):211–236, 2018.
[34] Vuk Lesi, Ilija Jovanov, and Miroslav Pajic. Network scheduling for secure cyber-physical systems. In 2017 IEEE Real-Time Systems Symposium (RTSS), pages 45–55, Dec 2017.
[35] Monowar Hasan, Sibin Mohan, Rakesh B. Bobba, and Rodolfo Pellizzoni. Exploring opportunistic execution for integrating security into legacy hard real-time systems. In 2016 IEEE Real-Time Systems Symposium (RTSS), pages 123–134. IEEE, 2016.
[36] Alper Kamil Bozkurt, Yu Wang, Michael M. Zavlanos, and Miroslav Pajic. Control synthesis from linear temporal logic specifications using model-free reinforcement learning. In 2020 IEEE International Conference on Robotics and Automation (ICRA), pages 10349–10355. IEEE, 2020.
[37] Ernst Moritz Hahn, Mateo Perez, Sven Schewe, Fabio Somenzi, Ashutosh Trivedi, and Dominik Wojtczak. Omega-regular objectives in model-free reinforcement learning. In International Conference on Tools and Algorithms for the Construction and Analysis of Systems, pages 395–412. Springer, 2019.
[38] Ernst Moritz Hahn, Mateo Perez, Sven Schewe, Fabio Somenzi, Ashutosh Trivedi, and Dominik Wojtczak. Model-free reinforcement learning for stochastic parity games. In 31st International Conference on Concurrency Theory (CONCUR 2020). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2020.
[39] Rüdiger Ehlers. Generalized Rabin(1) synthesis with applications to robust system synthesis. In NASA Formal Methods Symposium, pages 101–115. Springer, 2011.
[40] Timo Latvala. Efficient model checking of safety properties. In International SPIN Workshop on Model Checking of Software, pages 74–88. Springer, 2003.
[41] CPSL@Duke. CSRL, 2020. https://gitlab.oit.duke.edu/cpsl/csrl
[42] Michael L Littman. Markov games as a framework for multi-agent reinforcement learning. In Machine learning proceedings 1994, pages 157–163. Elsevier, 1994.

[1] [1]

Kerns, Daniel P

Andrew J. Kerns, Daniel P. Shepard, Jahshan A. Bhatti, and Todd E. Humphreys. Unmanned aircraft capture and control via GPS spooﬁng. Journal of Field Robotics , 31(4):617–636, 2014

work page 2014

[2] [2]

Psiaki, Todd E

Mark L. Psiaki, Todd E. Humphreys, and Brian Stauffer. Attackers can spoof navigation signals without our knowledge. Here’s how to ﬁght back GPS lies. IEEE Spectrum, 53(8):26–53, 2016

work page 2016

[3] [3]

Real-time safety assessment of unmanned aircraft systems against stealthy cyber attacks

Cheolhyeon Kwon, Scott Yantek, and Inseok Hwang. Real-time safety assessment of unmanned aircraft systems against stealthy cyber attacks. Journal of Aerospace Information Systems, 13(1):27–45, 2016

work page 2016

[4] [4]

Non-invasive spooﬁng attacks for anti-lock braking systems

Yasser Shoukry, Paul Martin, Paulo Tabuada, and Mani Srivastava. Non-invasive spooﬁng attacks for anti-lock braking systems. In International Conference on Cryptographic Hardware and Embedded Systems, pages 55–72. Springer, 2013

work page 2013

[5] [5]

D’Argenio, Bernd Finkbeiner, and Holger Hermanns

Gilles Barthe, Pedro R. D’Argenio, Bernd Finkbeiner, and Holger Hermanns. Facets of software doping. In International Symposium on Leveraging Applications of Formal Methods, pages 601–608. Springer, 2016

work page 2016

[6] [6]

Survey of recent cyber security attacks on robotic systems and their mitigation approaches

Abdullahi Chowdhury, Gour Karmakar, and Joarder Kamruzzaman. Survey of recent cyber security attacks on robotic systems and their mitigation approaches. In Cyber Law, Privacy, and Security: Concepts, Methodologies, Tools, and Applications, pages 1426–1441. IGI Global, 2019

work page 2019

[7] [7]

Secure control against replay attacks

Yilin Mo and Bruno Sinopoli. Secure control against replay attacks. In 2009 47th Annual Allerton Conference on Communication, Control, and Computing, pages 911–918, 2009

work page 2009

[8] [8]

Roy S. Smith. Covert Misappropriation of Networked Control Systems: Presenting a Feedback Structure. IEEE Control Systems Magazine, 35(1):82–92, 2015

work page 2015

[9] [9]

Jo- hansson

Andre Teixeira, Iman Shames, Henrik Sandberg, and Karl H. Jo- hansson. Revealing stealthy attacks in control systems. In 2012 50th Annual Allerton Conference on Communication, Control, and Computing, pages 1806–1813, Monticello, IL, USA, October 2012. IEEE

work page 2012

[10] [10]

False data injection attacks in control systems

Yilin Mo and Bruno Sinopoli. False data injection attacks in control systems. In First workshop on Secure Control Systems , pages 1–6, 2010

work page 2010

[11] [11]

Analysis and design of stealthy cyber attacks on unmanned aerial systems

Cheolhyeon Kwon, Weiyi Liu, and Inseok Hwang. Analysis and design of stealthy cyber attacks on unmanned aerial systems. Journal of Aerospace Information Systems , 11(8):525–539, 2014

work page 2014

[12] [12]

Relaxing integrity requirements for attack-resilient cyber-physical systems

Ilija Jovanov and Miroslav Pajic. Relaxing integrity requirements for attack-resilient cyber-physical systems. IEEE Transactions on Automatic Control, 64(12):4843–4858, Dec 2019

work page 2019

[13] [13]

ConAML: Constrained Adversarial Machine Learning for Cyber-Physical Systems

Jiangnan Li, Jin Young Lee, Yingyuan Yang, Jinyuan Stella Sun, and Kevin Tomsovic. ConAML: Constrained Adversarial Machine Learning for Cyber-Physical Systems. arXiv:2003.05631 [cs], March 2020

work page arXiv 2003

[14] [14]

Adver- sarial Machine Learning Beyond the Image Domain

Giulio Zizzo, Chris Hankin, Sergio Maffeis, and Kevin Jones. Adver- sarial Machine Learning Beyond the Image Domain. In Proceedings of the 56th Annual Design Automation Conference 2019, DAC ’19, pages 1–4, Las Vegas, NV , USA, June 2019. Association for Computing Machinery

work page 2019

[15] [15]

A Deep Learning-based Framework for Conducting Stealthy Attacks in Industrial Control Systems

Cheng Feng, Tingting Li, Zhanxing Zhu, and Deeph Chana. A deep learning-based framework for conducting stealthy attacks in industrial control systems. arXiv:1709.06397 [cs], September 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[16] [16]

Lloyd S. Shapley. Stochastic games. Proceedings of the National Academy of Sciences , 39(10):1095–1100, 1953

work page 1953

[17] [17]

Fainekos, and George J

Hadas Kress-Gazit, Georgios E. Fainekos, and George J. Pappas. Where’s Waldo? sensor-based temporal logic motion planning. In Proceedings 2007 IEEE International Conference on Robotics and Automation, pages 3116–3121. IEEE, 2007

work page 2007

[18] [18]

Johansson, and Dimos V

Meng Guo, Karl H. Johansson, and Dimos V . Dimarogonas. Revising motion planning under linear temporal logic speciﬁcations in partially known workspaces. In 2013 IEEE International Conference on Robotics and Automation , pages 5025–5032. IEEE, 2013

work page 2013

[19] [19]

Deshmukh, and Miroslav Pajic

Borzoo Bonakdarpour, Jyotirmoy V . Deshmukh, and Miroslav Pajic. Opportunities and challenges in monitoring cyber-physical systems security. In International Symposium on Leveraging Applications of Formal Methods, pages 9–18. Springer, 2018

work page 2018

[20] Ramy Medhat, Borzoo Bonakdarpour, Deepak Kumar, and Sebastian Fischmeister. Runtime monitoring of cyber-physical systems under timing and memory constraints. ACM Transactions on Embedded Computing Systems (TECS), 14(4):1–29, 2015.

[21] Klaus Havelund and Grigore Roşu. Synthesizing monitors for safety properties. In International Conference on Tools and Algorithms for the Construction and Analysis of Systems, pages 342–356. Springer, 2002.

[22] Moonjoo Kim, M. Viswanathan, H. Ben-Abdallah, S. Kannan, I. Lee, and O. Sokolsky. Formally specified monitoring of temporal properties. In Proceedings of 11th Euromicro Conference on Real-Time Systems. Euromicro RTS’99, pages 114–122, June 1999.

[23] Alper Kamil Bozkurt, Yu Wang, Michael M. Zavlanos, and Miroslav Pajic. Model-free reinforcement learning for stochastic games with linear temporal logic objectives, 2020. arXiv:2010.01050 [cs.RO].

[24] Krishnendu Chatterjee and Thomas A. Henzinger. A survey of stochastic ω-regular games. Journal of Computer and System Sciences, 78(2):394–413, 2012. Games in Verification.

[25] Christel Baier and Joost-Pieter Katoen. Principles of Model Checking. MIT Press, Cambridge, MA, USA, 2008.

[26] Miroslav Pajic, James Weimer, Nicola Bezzo, Oleg Sokolsky, George J. Pappas, and Insup Lee. Design and implementation of attack-resilient cyberphysical systems: With a focus on attack-resilient state estimators. IEEE Control Systems Magazine, 37(2):66–81, April 2017.

[27] Miroslav Pajic, Insup Lee, and George J. Pappas. Attack-resilient state estimation for noisy dynamical systems. IEEE Transactions on Control of Network Systems, 4(1):82–92, March 2017.

[28] Nicola Bezzo, James Weimer, Miroslav Pajic, Oleg Sokolsky, George J. Pappas, and Insup Lee. Attack resilient state estimation for autonomous robotic systems. In 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 3692–3698, Sept 2014.

[29] Young Hwan Chang, Qie Hu, and Claire J. Tomlin. Secure estimation based Kalman filter for cyber–physical systems against sensor attacks. Automatica, 95:399–412, 2018.

[30] Mahmoud Elfar, Yu Wang, and Miroslav Pajic. Security-aware synthesis using delayed-action games. In Computer Aided Verification (CAV), pages 180–199. Springer International Publishing, 2019.

[31] Mahmoud Elfar, Haibei Zhu, Mary L. Cummings, and Miroslav Pajic. Security-aware synthesis of human-UAV protocols. In 2019 International Conference on Robotics and Automation (ICRA), pages 8011–8017, May 2019.

[32] Georgios E. Fainekos, Antoine Girard, Hadas Kress-Gazit, and George J. Pappas. Temporal logic motion planning for dynamic robots. Automatica, 45(2):343–352, February 2009.

[33] Hadas Kress-Gazit, Morteza Lahijanian, and Vasumathi Raman. Synthesis for Robots: Guarantees and Feedback for Robot Behavior. Annual Review of Control, Robotics, and Autonomous Systems, 1(1):211–236, 2018.

[34] Vuk Lesi, Ilija Jovanov, and Miroslav Pajic. Network scheduling for secure cyber-physical systems. In 2017 IEEE Real-Time Systems Symposium (RTSS), pages 45–55, Dec 2017.

[35] Monowar Hasan, Sibin Mohan, Rakesh B. Bobba, and Rodolfo Pellizzoni. Exploring opportunistic execution for integrating security into legacy hard real-time systems. In 2016 IEEE Real-Time Systems Symposium (RTSS), pages 123–134. IEEE, 2016.

[36] Alper Kamil Bozkurt, Yu Wang, Michael M. Zavlanos, and Miroslav Pajic. Control synthesis from linear temporal logic specifications using model-free reinforcement learning. In 2020 IEEE International Conference on Robotics and Automation (ICRA), pages 10349–10355. IEEE, 2020.

[37] Ernst Moritz Hahn, Mateo Perez, Sven Schewe, Fabio Somenzi, Ashutosh Trivedi, and Dominik Wojtczak. Omega-regular objectives in model-free reinforcement learning. In International Conference on Tools and Algorithms for the Construction and Analysis of Systems, pages 395–412. Springer, 2019.

[38] Ernst Moritz Hahn, Mateo Perez, Sven Schewe, Fabio Somenzi, Ashutosh Trivedi, and Dominik Wojtczak. Model-free reinforcement learning for stochastic parity games. In 31st International Conference on Concurrency Theory (CONCUR 2020). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2020.

[39] Rüdiger Ehlers. Generalized Rabin(1) synthesis with applications to robust system synthesis. In NASA Formal Methods Symposium, pages 101–115. Springer, 2011.

[40] Timo Latvala. Efficient model checking of safety properties. In International SPIN Workshop on Model Checking of Software, pages 74–88. Springer, 2003.

[41] CPSL@Duke. CSRL, 2020. https://gitlab.oit.duke.edu/cpsl/csrl

[42] Michael L. Littman. Markov games as a framework for multi-agent reinforcement learning. In Machine Learning Proceedings 1994, pages 157–163. Elsevier, 1994.