How Vulnerable Is My Learned Policy? Universal Adversarial Perturbation Attacks On Modern Behavior Cloning Policies

Akansha Kalra; Basavasagar Patil; Daniel S. Brown; Guanhong Tao

arXiv: 2502.03698 · v4 · submitted 2025-02-06 · cs.LG · cs.CR · cs.RO

Pith reviewed 2026-05-23 03:29 UTC · model grok-4.3

classification cs.LG · cs.CR · cs.RO

keywords adversarial attacks · behavior cloning · imitation learning · universal adversarial perturbations · black-box attacks · policy vulnerability · transfer attacks

The pith

Modern behavior cloning policies are highly vulnerable to universal adversarial perturbation attacks, including black-box transfers across algorithms.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper performs the first systematic study of adversarial attacks on a range of imitation learning algorithms used for behavior cloning. It evaluates methods including Vanilla Behavior Cloning, LSTM-GMM, Implicit Behavior Cloning, Diffusion Policy, and Vector-Quantized Behavior Transformer under white-box, grey-box, and black-box universal adversarial perturbations. Experiments show that these policies are highly vulnerable, with attacks transferring successfully even without direct access to the target model. A reader would care because learning from demonstrations is a common way to train AI agents, and undetected fragility could affect reliability in deployment settings.

Core claim

The central claim is that most existing imitation learning algorithms for behavior cloning are highly vulnerable to universal adversarial perturbations. This vulnerability appears in white-box settings where full model access is available as well as in black-box transfer attacks where perturbations crafted on one algorithm affect others. The study compares vulnerabilities across classic and recent methods and concludes that these algorithms share common weaknesses to such attacks.

What carries the argument

Universal adversarial perturbation attacks applied to the input observations of behavior cloning policies, tested in white-box, grey-box, and black-box transfer settings across multiple algorithms.
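To make the idea of a single, input-agnostic perturbation concrete, here is a minimal sketch in NumPy. It is not the authors' implementation: the toy `tanh` policy, the L-infinity budget `eps`, the deviation loss, the step size, and the synthetic observations are all assumptions chosen so the example runs without a simulator or a trained network. The function names (`policy`, `craft_uap`, `mean_deviation`) are hypothetical.

```python
import numpy as np

def policy(W, b, obs):
    """Hypothetical toy policy a = tanh(W o + b); stands in for a trained BC network."""
    return np.tanh(obs @ W.T + b)

def craft_uap(W, b, observations, eps, step=0.01, iters=200, seed=0):
    """Find ONE perturbation delta, shared by all observations, that increases
    the mean squared deviation of the policy's actions from its clean actions,
    subject to the assumed L-infinity budget ||delta||_inf <= eps."""
    rng = np.random.default_rng(seed)
    clean_actions = policy(W, b, observations)
    # Random start inside the eps-ball: at delta = 0 the gradient is exactly zero.
    delta = rng.uniform(-eps, eps, size=observations.shape[1])
    for _ in range(iters):
        actions = policy(W, b, observations + delta)
        # Analytic gradient of the deviation loss w.r.t. delta for the tanh policy.
        grad_pre = 2.0 * (actions - clean_actions) * (1.0 - actions ** 2)
        grad_delta = (grad_pre @ W).mean(axis=0)
        # Signed ascent step, then projection back onto the eps-ball.
        delta = np.clip(delta + step * np.sign(grad_delta), -eps, eps)
    return delta

def mean_deviation(W, b, observations, delta):
    """Mean squared action change caused by adding delta to every observation."""
    diff = policy(W, b, observations + delta) - policy(W, b, observations)
    return float((diff ** 2).sum(axis=1).mean())

# Synthetic setup with made-up sizes; none of these numbers come from the paper.
rng = np.random.default_rng(1)
W = rng.normal(size=(2, 6))               # 6-D observation, 2-D action
b = rng.normal(size=2)
observations = rng.normal(size=(256, 6))
eps = 0.1
delta = craft_uap(W, b, observations, eps)
random_delta = rng.uniform(-eps, eps, size=6)
print("crafted perturbation:", mean_deviation(W, b, observations, delta))
print("random perturbation: ", mean_deviation(W, b, observations, random_delta))
```

The same `delta` is added to every observation, which is what separates a universal perturbation from a per-input attack. In a white-box setting the gradient would come from the target policy itself; in a transfer setting one would craft `delta` on a surrogate and apply it unchanged to a different policy.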

If this is right

Vulnerability holds for both white-box and black-box attacks across a range of imitation learning algorithms.
Black-box transfer attacks succeed: perturbations crafted on one algorithm degrade others without access to the target model.
Current imitation learning methods share limitations that make them susceptible to input perturbations.
The findings point to the need for new approaches to improve robustness in behavior cloning policies.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the claim holds, adversarial robustness testing should become a standard part of evaluating any new behavior cloning method.
This raises the possibility that sensor noise or small environmental changes in real-world settings could act like these attacks and disrupt deployed policies.
The results connect to larger questions about how learned policies handle input variations that were not present in training demonstrations.

Load-bearing premise

The tested algorithms and attack setups are representative of modern behavior cloning policies and realistic threats in their intended deployment domains.

What would settle it

New experiments on additional imitation learning algorithms or in physical robot deployments that show low attack success rates would falsify the claim of widespread high vulnerability.

Figures

Figures reproduced from arXiv: 2502.03698 by Akansha Kalra, Basavasagar Patil, Daniel S. Brown, Guanhong Tao.

**Figure 1.** Environments used for crafting and evaluating Universal Adversarial Perturbation attacks to study adversarial robustness of modern behavior cloning algorithms. (a)-(c) are from RoboMimic [10] and (d) is from [11].

**Figure 2.** Task success rates under decreasing attack strength (ε) for different behavior cloning algorithms, demonstrating their sensitivity to even small adversarial inputs. The steep drop in performance of all BC algorithms except IBC, which suffers a minimal drop, emphasizes the lack of robustness across algorithms.

Original abstract

Learning from demonstrations is a popular approach to train AI models; however, their vulnerability to adversarial attacks remains underexplored. We present the first systematic study of adversarial attacks, across a range of both classic and recently proposed imitation learning algorithms, including Vanilla Behavior Cloning (Vanilla BC), LSTM-GMM, Implicit Behavior Cloning (IBC), Diffusion Policy (DP), and Vector-Quantized Behavior Transformer (VQ-BET). We study the vulnerability of these methods to both white-box, grey-box and black-box adversarial perturbations. Our experiments reveal that most existing methods are highly vulnerable to these attacks, including black-box transfer attacks that transfer across algorithms. To the best of our knowledge, we are the first to study and compare the vulnerabilities of different popular imitation learning algorithms to both white-box and black-box attacks. Our findings highlight the vulnerabilities of modern imitation learning algorithms, paving the way for future work in addressing such limitations. Videos and code are available at https://sites.google.com/view/uap-attacks-on-bc.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper finds most tested BC methods highly vulnerable to UAP attacks with black-box transfer across algorithms, but the narrow set of five methods and tasks limits how far the results generalize.

The main takeaway is that this work shows high vulnerability to universal adversarial perturbations across most of the five imitation learning methods tested, including black-box attacks that transfer between algorithms. The authors evaluate Vanilla BC, LSTM-GMM, IBC, Diffusion Policy, and VQ-BET under white-box, grey-box, and black-box conditions, and present this as the first systematic comparison of its kind. The experiments support the vulnerability claim on the tasks they ran. Releasing code and videos is a practical plus that lets others check the setups directly. The comparison across a mix of older and newer methods is the clearest new element here.

The soft spot is representativeness. These five algorithms may not cover the full spread of current behavior cloning work, especially transformer-based or flow-matching variants trained on large heterogeneous data. The tasks could also be simpler than typical deployment cases in observation dimensionality or horizon length, which would make the high attack success rates and transfer results less indicative of broader practice. Black-box transfer might stem from shared data distributions rather than a general weakness.

This paper is for researchers in imitation learning and robotics who focus on robustness or security. Readers looking at adversarial issues in learned controllers would get direct value from the empirical results. It deserves peer review because the comparison is new and the evidence is empirical rather than circular, even if the scope needs tightening in revision.

Referee Report

3 major / 2 minor

Summary. The paper conducts the first systematic empirical study of universal adversarial perturbation (UAP) attacks on behavior cloning policies. It evaluates five imitation learning algorithms—Vanilla BC, LSTM-GMM, IBC, Diffusion Policy (DP), and VQ-BET—under white-box, grey-box, and black-box attack settings, reporting that most are highly vulnerable with successful black-box transfer across algorithms. The work positions itself as the initial comparison of these vulnerabilities and releases code and videos.

Significance. If the experimental results hold under broader conditions, the findings would establish a concrete security limitation for deployed imitation learning systems, particularly in robotics and control where BC is common. The cross-algorithm black-box transfer result, if robust, would be especially notable as it suggests shared vulnerabilities rather than algorithm-specific weaknesses. The public release of code supports reproducibility and follow-on work on defenses.

major comments (3)

[§5] §5 (Experiments) and Table 2: The headline claim that 'most existing methods are highly vulnerable' rests on results from only five algorithms. No explicit justification or coverage argument is given for why Vanilla BC, LSTM-GMM, IBC, DP, and VQ-BET are representative of the current diversity of BC methods (e.g., newer transformer-based or flow-matching variants trained on large heterogeneous datasets). Without this, the transferability and vulnerability conclusions cannot be generalized beyond the tested suite.
[§4.3] §4.3 and §5.2: The black-box transfer attack protocol is described at a high level, but the manuscript does not report the precise success-rate thresholds, number of source-target pairs, or statistical significance tests used to declare 'transfer across algorithms.' This detail is load-bearing for the central empirical claim.
[§5.1] §5.1, Table 1: The environments and observation spaces used (e.g., dimensionality, horizon length, presence of safety constraints) are not compared against typical real-world BC deployment settings. If the chosen tasks are low-dimensional or lack the complexity of modern applications, the reported attack success rates may not indicate vulnerability under realistic threat models.
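The reporting requested in the second major comment (how many source-target pairs were evaluated, the per-pair rate, and some measure of uncertainty) could take a shape like the sketch below. It is an illustration only: the policy names, the rollout outcomes, and the choice of a 95% Wilson score interval are assumptions, not the paper's protocol or results.

```python
import math
from itertools import permutations

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial rate (z = 1.96 gives roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1.0 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

def transfer_table(outcomes):
    """Map each (source, target) pair to (task success rate, interval).

    `outcomes[(source, target)]` is a list of booleans: True when the target
    policy still completed the task under a perturbation crafted on `source`.
    """
    table = {}
    for pair, results in outcomes.items():
        trials, successes = len(results), sum(results)
        table[pair] = (successes / trials, wilson_interval(successes, trials))
    return table

# Placeholder data: 3 hypothetical policies, 20 rollouts per ordered pair.
names = ["policy_A", "policy_B", "policy_C"]
outcomes = {
    (src, tgt): [i % 4 == 0 for i in range(20)]
    for src, tgt in permutations(names, 2)
}
table = transfer_table(outcomes)
for (src, tgt), (rate, (low, high)) in sorted(table.items()):
    print(f"{src} -> {tgt}: success {rate:.2f}  [{low:.2f}, {high:.2f}]")
```

Reporting an interval per ordered pair makes it visible when a claimed transfer effect rests on too few rollouts to separate it from the clean success rate.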

minor comments (2)

[Introduction] The abstract states the study covers 'classic and recently proposed' algorithms, but the introduction does not cite the original papers for each of the five methods with publication years; adding these would improve context.
[Figure 3] The caption of Figure 3 (attack visualization) does not specify the perturbation magnitude (ε) or the exact policy being visualized; this reduces clarity for readers reproducing the results.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below with clarifications and indicate planned revisions to strengthen the manuscript.

Referee: [§5] §5 (Experiments) and Table 2: The headline claim that 'most existing methods are highly vulnerable' rests on results from only five algorithms. No explicit justification or coverage argument is given for why Vanilla BC, LSTM-GMM, IBC, DP, and VQ-BET are representative of the current diversity of BC methods (e.g., newer transformer-based or flow-matching variants trained on large heterogeneous datasets). Without this, the transferability and vulnerability conclusions cannot be generalized beyond the tested suite.

Authors: These five algorithms were deliberately selected to span classical (Vanilla BC, LSTM-GMM) and modern (IBC, Diffusion Policy, VQ-BET) approaches, covering the feedforward, recurrent, implicit, diffusion, and transformer-based paradigms that dominate recent BC literature. VQ-BET specifically addresses transformer-based methods. We will add an explicit justification subsection in §5 discussing selection criteria, prevalence in the field, and scope limitations (e.g., excluding certain flow-matching variants). This will support the claims without overgeneralization.
revision: yes

Referee: [§4.3] §4.3 and §5.2: The black-box transfer attack protocol is described at a high level, but the manuscript does not report the precise success-rate thresholds, number of source-target pairs, or statistical significance tests used to declare 'transfer across algorithms.' This detail is load-bearing for the central empirical claim.

Authors: The referee correctly notes that these implementation details are not fully reported in the current manuscript. We will revise §4.3 and §5.2 to explicitly state the success-rate thresholds, the number of source-target pairs evaluated, and any statistical significance tests used, ensuring the transfer results are fully reproducible and supported.
revision: yes

Referee: [§5.1] §5.1, Table 1: The environments and observation spaces used (e.g., dimensionality, horizon length, presence of safety constraints) are not compared against typical real-world BC deployment settings. If the chosen tasks are low-dimensional or lack the complexity of modern applications, the reported attack success rates may not indicate vulnerability under realistic threat models.

Authors: The environments were chosen as standard benchmarks from the source papers of each method to enable fair comparisons. We agree a direct mapping to real-world settings is absent. In the revision we will expand §5.1 with a discussion comparing observation dimensionality, horizon lengths, and constraints to typical robotic deployments, plus an explicit limitations paragraph on the gap to more complex real-world scenarios.
revision: partial

Circularity Check

0 steps flagged

No circularity: purely empirical evaluation of existing algorithms

The paper conducts an experimental comparison of adversarial vulnerability across five imitation learning methods (Vanilla BC, LSTM-GMM, IBC, DP, VQ-BET) using white-box, grey-box, and black-box attacks. It contains no derivation chain, no equations, no fitted parameters renamed as predictions, and no self-citations that bear the load of a central claim. Results are reported directly from the described experiments on the chosen tasks and algorithms without reduction to prior definitions or ansatzes. The representativeness concern raised by the skeptic is a question of external validity, not circularity per the enumerated patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

This empirical study applies existing adversarial attack techniques to imitation learning; it introduces no new free parameters, mathematical axioms beyond standard ML assumptions, or invented entities.

axioms (1)

domain assumption: Imitation learning policies can be subjected to gradient-based or transfer-based adversarial perturbations using techniques from supervised learning.
Required to apply universal adversarial perturbations to the listed behavior cloning methods.
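
For readers who want the assumption in symbols, one common way to write a universal perturbation objective and its projected signed-gradient update is given below. This is a generic formulation in the spirit of [14] and [37], not a statement of the paper's exact loss or norm: the L-infinity budget, the loss $\mathcal{L}$, the step size $\alpha$, and the notation ($\pi_\theta$ for the policy, $o_i$ for observations, $a_i$ for reference actions) are assumptions introduced here.

```latex
\delta^{\star} \;=\; \arg\max_{\|\delta\|_{\infty} \le \epsilon}\;
  \frac{1}{N} \sum_{i=1}^{N} \mathcal{L}\bigl(\pi_{\theta}(o_i + \delta),\, a_i\bigr),
\qquad
\delta_{k+1} \;=\; \operatorname{clip}_{[-\epsilon,\,\epsilon]}\!\Bigl(
  \delta_k + \alpha\, \operatorname{sign}\Bigl(
    \nabla_{\delta}\, \frac{1}{N} \sum_{i=1}^{N}
      \mathcal{L}\bigl(\pi_{\theta}(o_i + \delta_k),\, a_i\bigr)
  \Bigr)
\Bigr)
```

The single $\delta$ is shared across all $N$ observations, which is what makes the perturbation universal; a transfer attack computes the gradient on a surrogate policy and applies the resulting $\delta^{\star}$ to a different one.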

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Immune2V: Image Immunization Against Dual-Stream Image-to-Video Generation
cs.CV 2026-04 unverdicted novelty 7.0

Immune2V immunizes images against dual-stream I2V generation by enforcing temporally balanced latent divergence and aligning generative features to a precomputed collapse trajectory, yielding stronger persistent degra...

Reference graph

Works this paper leans on

37 extracted references · 37 canonical work pages · cited by 1 Pith paper · 4 internal anchors

[1] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. J. Goodfellow, and R. Fergus, “Intriguing properties of neural networks,” CoRR, vol. abs/1312.6199, 2013.
[2] I. J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and harnessing adversarial examples,” CoRR, vol. abs/1412.6572, 2014.
[3] N. Akhtar and A. Mian, “Threat of adversarial attacks on deep learning in computer vision: A survey,” IEEE Access, 2018.
[4] W. E. Zhang, Q. Z. Sheng, A. Alhazmi, and C. Li, “Adversarial attacks on deep-learning models in natural language processing: A survey,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 11, no. 3, pp. 1–41, 2020.
[5] A. Chakraborty, M. Alam, V. Dey, A. Chattopadhyay, and D. Mukhopadhyay, “A survey on adversarial attacks and defences,” CAAI Transactions on Intelligence Technology, 2021.
[6] G. Hall, A. Das, J. Quarles, and P. Rad, “Studying adversarial attacks on behavioral cloning dynamics,” in 2020 IEEE 32nd International Conference on Tools with Artificial Intelligence (ICTAI). IEEE, 2020, pp. 452–459.
[7] H. Wu, S. Yunas, S. Rowlands, W. Ruan, and J. Wahlström, “Adversarial driving: Attacking end-to-end autonomous driving,” in 2023 IEEE Intelligent Vehicles Symposium (IV). IEEE, 2023, pp. 1–7.
[8] A. Boloor, X. He, C. Gill, Y. Vorobeychik, and X. Zhang, “Simple physical adversarial examples against end-to-end autonomous driving models,” in 2019 IEEE International Conference on Embedded Software and Systems (ICESS). IEEE, 2019, pp. 1–7.
[9] Y. Chen, H. Xue, and Y. Chen, “Diffusion policy attacker: Crafting adversarial attacks for diffusion-based policies,” ArXiv, vol. abs/2405.19424, 2024.
[10] A. Mandlekar, D. Xu, J. Wong, S. Nasiriany, C. Wang, R. Kulkarni, L. Fei-Fei, S. Savarese, Y. Zhu, and R. Martín-Martín, “What matters in learning from offline human demonstrations for robot manipulation,” in Conference on Robot Learning, 2021.
[11] P. Florence, C. Lynch, A. Zeng, O. Ramirez, A. Wahid, L. Downs, A. Wong, J. Lee, I. Mordatch, and J. Tompson, “Implicit behavioral cloning,” Conference on Robot Learning (CoRL), 2021.
[12] C. Chi, S. Feng, Y. Du, Z. Xu, E. Cousineau, B. Burchfiel, and S. Song, “Diffusion policy: Visuomotor policy learning via action diffusion,” in Proceedings of Robotics: Science and Systems (RSS), 2023.
[13] S. Lee, Y. Wang, H. Etukuru, H. J. Kim, N. Muhammad, M. Shafiullah, and L. Pinto, “Behavior generation with latent actions,” ArXiv, vol. abs/2403.03181, 2024.
[14] S.-M. Moosavi-Dezfooli, A. Fawzi, O. Fawzi, and P. Frossard, “Universal adversarial perturbations,” 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[15] N. Carlini, F. Tramèr, K. D. Dvijotham, L. Rice, M. Sun, and J. Z. Kolter, “(Certified!!) adversarial robustness for free!” in The Eleventh International Conference on Learning Representations. OpenReview, 2023.
[16] Y. Jia, C. M. Poskitt, J. Sun, and S. Chattopadhyay, “Physical adversarial attack on a robotic arm,” IEEE Robotics and Automation Letters, vol. 7, no. 4, pp. 9334–9341, 2022.
[17] J. Z.-Y. He, D. S. Brown, Z. Erickson, and A. Dragan, “Quantifying assistive robustness via the natural-adversarial frontier,” in Conference on Robot Learning. PMLR, 2023, pp. 1865–1886.
[18] A. Zhan, S. Tiomkin, and P. Abbeel, “Preventing imitation learning with adversarial policy ensembles,” arXiv preprint arXiv:2002.01059, 2020.
[19] K. Zhao, H. Huang, M. Li, and Y. Wu, “Rethinking the intermediate features in adversarial attacks: Misleading robotic models via adversarial distillation,” arXiv preprint arXiv:2411.15222, 2024.
[20] K. Mo, W. Tang, J. Li, and X. Yuan, “Attacking deep reinforcement learning with decoupled adversarial policy,” IEEE Transactions on Dependable and Secure Computing, vol. 20, pp. 758–768, 2023.
[21] J. Sun, T. Zhang, X. Xie, L. Ma, Y. Zheng, K. Chen, and Y. Liu, “Stealthy and efficient adversarial attacks against deep reinforcement learning,” in AAAI Conference on Artificial Intelligence, 2020.
[22] A. Pattanaik, Z. Tang, S. Liu, G. Bommannan, and G. V. Chowdhary, “Robust deep reinforcement learning with adversarial attacks,” in Adaptive Agents and Multi-Agent Systems, 2017.
[23] Y.-C. Lin, Z.-W. Hong, Y.-H. Liao, M.-L. Shih, M.-Y. Liu, and M. Sun, “Tactics of adversarial attack on deep reinforcement learning agents,” in International Joint Conference on Artificial Intelligence, 2017.
[24] A. Gleave, M. Dennis, N. Kant, C. Wild, S. Levine, and S. J. Russell, “Adversarial policies: Attacking deep reinforcement learning,” ArXiv, vol. abs/1905.10615, 2019.
[25] H. Zhang, H. Chen, C. Xiao, B. Li, M. Liu, D. Boning, and C.-J. Hsieh, “Robust deep reinforcement learning against adversarial perturbations on state observations,” Advances in Neural Information Processing Systems, vol. 33, pp. 21024–21037, 2020.
[26] H. Zhang, H. Chen, D. Boning, and C.-J. Hsieh, “Robust reinforcement learning on state observations with learned optimal adversary,” in International Conference on Learning Representation (ICLR), 2021.
[27] J. Moos, K. Hansel, H. Abdulsamad, S. Stark, D. Clever, and J. Peters, “Robust reinforcement learning: A review of foundations and recent advances,” Machine Learning and Knowledge Extraction, 2022.
[28] S. Huang, N. Papernot, I. Goodfellow, Y. Duan, and P. Abbeel, “Adversarial attacks on neural network policies,” arXiv preprint arXiv:1702.02284, 2017.
[29] S. Casper, T. Killian, G. Kreiman, and D. Hadfield-Menell, “White-box adversarial policies in deep reinforcement learning,” arXiv preprint arXiv:2209.02167, 2022.
[30] X. Chen, W. Guo, G. Tao, X. Zhang, and D. Song, “Bird: generalizable backdoor detection and removal for deep reinforcement learning,” Advances in Neural Information Processing Systems, 2023.
[31] M. Bain and C. Sammut, “A framework for behavioural cloning,” in Machine Intelligence 15, 1995, pp. 103–129.
[32] F. Torabi, G. Warnell, and P. Stone, “Behavioral cloning from observation,” in Proceedings of the 27th International Joint Conference on Artificial Intelligence, 2018, pp. 4950–4957.
[33] S. Ross, G. Gordon, and D. Bagnell, “A reduction of imitation learning and structured prediction to no-regret online learning,” in Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011.
[34] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, pp. 1735–1780, 1997.
[35] G. J. McLachlan and D. Peel, Finite mixture models. John Wiley & Sons, 2000.
[36] J. Ho, A. Jain, and P. Abbeel, “Denoising diffusion probabilistic models,” ArXiv, vol. abs/2006.11239, 2020.
[37] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, “Towards deep learning models resistant to adversarial attacks,” in International Conference on Learning Representations, 2018.

[1] [1]

Intriguing properties of neural networks

C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. J. Goodfellow, and R. Fergus, “Intriguing properties of neural networks,” CoRR, vol. abs/1312.6199, 2013

work page internal anchor Pith review Pith/arXiv arXiv 2013

[2] [2]

Explaining and Harnessing Adversarial Examples

I. J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and harness- ing adversarial examples,” CoRR, vol. abs/1412.6572, 2014

work page internal anchor Pith review Pith/arXiv arXiv 2014

[3] [3]

Threat of adversarial attacks on deep learning in computer vision: A survey,

N. Akhtar and A. Mian, “Threat of adversarial attacks on deep learning in computer vision: A survey,” Ieee Access, 2018

work page 2018

[4] [4]

Adversarial attacks on deep-learning models in natural language processing: A survey,

W. E. Zhang, Q. Z. Sheng, A. Alhazmi, and C. Li, “Adversarial attacks on deep-learning models in natural language processing: A survey,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 11, no. 3, pp. 1–41, 2020

work page 2020

[5] [5]

A survey on adversarial attacks and defences,

A. Chakraborty, M. Alam, V . Dey, A. Chattopadhyay, and D. Mukhopadhyay, “A survey on adversarial attacks and defences,” CAAI Transactions on Intelligence Technology, 2021

work page 2021

[6] [6]

Studying adversarial attacks on behavioral cloning dynamics,

G. Hall, A. Das, J. Quarles, and P. Rad, “Studying adversarial attacks on behavioral cloning dynamics,” in 2020 IEEE 32nd International Conference on Tools with Artificial Intelligence (ICTAI). IEEE, 2020, pp. 452–459

work page 2020

[7] [7]

Adversar- ial driving: Attacking end-to-end autonomous driving,

H. Wu, S. Yunas, S. Rowlands, W. Ruan, and J. Wahlstr ¨om, “Adversar- ial driving: Attacking end-to-end autonomous driving,” in 2023 IEEE Intelligent Vehicles Symposium (IV). IEEE, 2023, pp. 1–7

work page 2023

[8] [8]

Simple physical adversarial examples against end-to-end autonomous driv- ing models,

A. Boloor, X. He, C. Gill, Y . V orobeychik, and X. Zhang, “Simple physical adversarial examples against end-to-end autonomous driv- ing models,” in 2019 IEEE International Conference on Embedded Software and Systems (ICESS). IEEE, 2019, pp. 1–7

work page 2019

[9] [9]

Diffusion policy attacker: Craft- ing adversarial attacks for diffusion-based policies,

Y . Chen, H. Xue, and Y . Chen, “Diffusion policy attacker: Craft- ing adversarial attacks for diffusion-based policies,” ArXiv, vol. abs/2405.19424, 2024

work page arXiv 2024

[10] [10]

What matters in learning from offline human demonstrations for robot manipulation,

A. Mandlekar, D. Xu, J. Wong, S. Nasiriany, C. Wang, R. Kulkarni, L. Fei-Fei, S. Savarese, Y . Zhu, and R. Mart’in-Mart’in, “What matters in learning from offline human demonstrations for robot manipulation,” in Conference on Robot Learning, 2021

work page 2021

[11] [11]

Implicit behavioral cloning,

P. Florence, C. Lynch, A. Zeng, O. Ramirez, A. Wahid, L. Downs, A. Wong, J. Lee, I. Mordatch, and J. Tompson, “Implicit behavioral cloning,” Conference on Robot Learning (CoRL), 2021

work page 2021

[12] [12]

Diffusion policy: Visuomotor policy learning via action diffusion,

C. Chi, S. Feng, Y . Du, Z. Xu, E. Cousineau, B. Burchfiel, and S. Song, “Diffusion policy: Visuomotor policy learning via action diffusion,” in Proceedings of Robotics: Science and Systems (RSS), 2023

work page 2023

[13] [13]

Behavior generation with latent actions

S. Lee, Y . Wang, H. Etukuru, H. J. Kim, N. Muhammad, M. Shafiullah, and L. Pinto, “Behavior generation with latent actions,” ArXiv, vol. abs/2403.03181, 2024

work page arXiv 2024

[14] [14]

Uni- versal adversarial perturbations,

S.-M. Moosavi-Dezfooli, A. Fawzi, O. Fawzi, and P. Frossard, “Uni- versal adversarial perturbations,” 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016

work page 2017

[15] [15]

(certified!!) adversarial robustness for free!

N. Carlini, F. Tram `er, K. D. Dvijotham, L. Rice, M. Sun, and J. Z. Kolter, “(certified!!) adversarial robustness for free!” in The Eleventh International Conference on Learning Representations. OpenReview, 2023

work page 2023

[16] [16]

Physical adversarial attack on a robotic arm,

Y . Jia, C. M. Poskitt, J. Sun, and S. Chattopadhyay, “Physical adversarial attack on a robotic arm,” IEEE Robotics and Automation Letters, vol. 7, no. 4, pp. 9334–9341, 2022

work page 2022

[17] [17]

Quantifying assistive robustness via the natural-adversarial frontier,

J. Z.-Y . He, D. S. Brown, Z. Erickson, and A. Dragan, “Quantifying assistive robustness via the natural-adversarial frontier,” in Conference on Robot Learning. PMLR, 2023, pp. 1865–1886

work page 2023

[18] [18]

Preventing imitation learning with adversarial policy ensembles,

A. Zhan, S. Tiomkin, and P. Abbeel, “Preventing imitation learning with adversarial policy ensembles,” arXiv preprint arXiv:2002.01059, 2020

work page arXiv 2002

[19] [19]

Rethinking the intermediate features in adversarial attacks: Misleading robotic models via adver- sarial distillation,

K. Zhao, H. Huang, M. Li, and Y . Wu, “Rethinking the intermediate features in adversarial attacks: Misleading robotic models via adver- sarial distillation,” arXiv preprint arXiv:2411.15222, 2024

work page arXiv 2024

[20] [20]

Attacking deep reinforcement learning with decoupled adversarial policy,

K. Mo, W. Tang, J. Li, and X. Yuan, “Attacking deep reinforcement learning with decoupled adversarial policy,” IEEE Transactions on Dependable and Secure Computing, vol. 20, pp. 758–768, 2023

work page 2023

[21] [21]

Stealthy and efficient adversarial attacks against deep reinforcement learning,

J. Sun, T. Zhang, X. Xie, L. Ma, Y . Zheng, K. Chen, and Y . Liu, “Stealthy and efficient adversarial attacks against deep reinforcement learning,” in AAAI Conference on Artificial Intelligence, 2020

work page 2020

[22] A. Pattanaik, Z. Tang, S. Liu, G. Bommannan, and G. V. Chowdhary, “Robust deep reinforcement learning with adversarial attacks,” in Adaptive Agents and Multi-Agent Systems, 2017

[23] Y.-C. Lin, Z.-W. Hong, Y.-H. Liao, M.-L. Shih, M.-Y. Liu, and M. Sun, “Tactics of adversarial attack on deep reinforcement learning agents,” in International Joint Conference on Artificial Intelligence, 2017

[24] A. Gleave, M. Dennis, N. Kant, C. Wild, S. Levine, and S. J. Russell, “Adversarial policies: Attacking deep reinforcement learning,” arXiv preprint arXiv:1905.10615, 2019

[25] H. Zhang, H. Chen, C. Xiao, B. Li, M. Liu, D. Boning, and C.-J. Hsieh, “Robust deep reinforcement learning against adversarial perturbations on state observations,” Advances in Neural Information Processing Systems, vol. 33, pp. 21 024–21 037, 2020

[26] H. Zhang, H. Chen, D. Boning, and C.-J. Hsieh, “Robust reinforcement learning on state observations with learned optimal adversary,” in International Conference on Learning Representations (ICLR), 2021

[27] J. Moos, K. Hansel, H. Abdulsamad, S. Stark, D. Clever, and J. Peters, “Robust reinforcement learning: A review of foundations and recent advances,” Machine Learning and Knowledge Extraction, 2022

[28] S. Huang, N. Papernot, I. Goodfellow, Y. Duan, and P. Abbeel, “Adversarial attacks on neural network policies,” arXiv preprint arXiv:1702.02284, 2017

[29] S. Casper, T. Killian, G. Kreiman, and D. Hadfield-Menell, “White-box adversarial policies in deep reinforcement learning,” arXiv preprint arXiv:2209.02167, 2022

[30] X. Chen, W. Guo, G. Tao, X. Zhang, and D. Song, “Bird: generalizable backdoor detection and removal for deep reinforcement learning,” Advances in Neural Information Processing Systems, 2023

[31] M. Bain and C. Sammut, “A framework for behavioural cloning,” in Machine Intelligence 15, 1995, pp. 103–129

[32] F. Torabi, G. Warnell, and P. Stone, “Behavioral cloning from observation,” in Proceedings of the 27th International Joint Conference on Artificial Intelligence, 2018, pp. 4950–4957

[33] S. Ross, G. Gordon, and D. Bagnell, “A reduction of imitation learning and structured prediction to no-regret online learning,” in Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011

[34] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, pp. 1735–1780, 1997

[35] G. J. McLachlan and D. Peel, Finite mixture models. John Wiley & Sons, 2000

[36] J. Ho, A. Jain, and P. Abbeel, “Denoising diffusion probabilistic models,” arXiv preprint arXiv:2006.11239, 2020

[37] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, “Towards deep learning models resistant to adversarial attacks,” in International Conference on Learning Representations, 2018