pith. sign in

arxiv: 2606.09499 · v1 · pith:6ARZJALOnew · submitted 2026-06-08 · 💻 cs.RO · cs.AI· cs.CR

Targeting World Models to Compromise Robot Learning Pipelines

Pith reviewed 2026-06-27 16:03 UTC · model grok-4.3

classification 💻 cs.RO cs.AIcs.CR
keywords data poisoningworld modelsrobot learningbackdoor attacksrobot policiessupply chain securityDRLVLA
0
0 comments X

The pith

World models can be attacked by hiding malicious prompts in safe robot datasets that only activate when the model generates training data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that world models, used to create synthetic training data for robots, open a new poisoning route in the learning pipeline. Attackers insert malicious prompts or altered dynamics into visibly safe teleoperated datasets. These elements stay dormant until the world model processes the data, at which point they produce dangerous trajectories. Downstream reinforcement learning policies then train on the generated unsafe data and become compromised, even though the original ground-truth data looked clean. This matters because it reveals a supply-chain vulnerability that traditional poisoning defenses, which check datasets directly, would miss.

Core claim

Novel attack methods inject malicious prompts or compromising transition dynamics into visibly safe teleoperated datasets which are only activated once fed through a world model as input. This results in the generation of synthetic, dangerous robot training trajectories and subsequently unsafe or compromised robot policies. The attacks succeed against both state-of-the-art action-conditioned and text-conditioned world models, producing a full end-to-end backdoor on a downstream DRL policy and a proof-of-concept for the VLA setting.

What carries the argument

The activation mechanism: injected malicious prompts or dynamics remain inactive in the raw dataset but trigger only when the world model uses that dataset to generate new trajectories.

If this is right

  • The same injection technique works on both action-conditioned and text-conditioned world models.
  • A full end-to-end backdoor is achieved on a downstream deep reinforcement learning policy.
  • A working proof-of-concept exists for the vision-language-action setting.
  • Current robot learning pipelines that rely on world models require new defenses focused on model internals rather than dataset inspection alone.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Any pipeline that treats world-model outputs as trusted synthetic data inherits the same hidden entry point.
  • Verification procedures would need to test world models with crafted adversarial inputs rather than only checking raw training sets.
  • Similar activation-style attacks could be adapted to other learned simulators used for policy training.

Load-bearing premise

The world model will generate compromised trajectories from the injected elements and the downstream learner will train on those trajectories without any filtering or detection.

What would settle it

An experiment in which a world model receives the poisoned dataset yet produces only safe trajectories that yield secure policies would show the attack does not work as described.

Figures

Figures reproduced from arXiv: 2606.09499 by Ahmed Agha, Alina Oprea, Christopher Amato, Ethan Rathbun, Eugene Bagdasarian, Saaduddin Mahmud.

Figure 1
Figure 1. Figure 1: Visual example of our proposed “Visual Prompt Hijacking” attack, which targets text [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Visualization of our proposed threat model. By targeting world models within the robot [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Evaluating a VTH backdoor attack against PPO policies trained in the Dino WM on the [PITH_FULL_IMAGE:figures/full_fig_p008_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Evaluating VTH attacks against the 2B Cosmos-Predict2.5 action conditioned world [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Interesting VPH attack success cases. In the top row we see a successful attack scenario [PITH_FULL_IMAGE:figures/full_fig_p016_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Interesting VPH attack failure cases. In the top row we see a rollout where the world model [PITH_FULL_IMAGE:figures/full_fig_p017_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Visualization of our VTH attack against Cosmos-Predict 2.5 Action Conditioned. In [PITH_FULL_IMAGE:figures/full_fig_p018_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Visualization of our VTH attack against the Dino WM. Here the top right cell represents [PITH_FULL_IMAGE:figures/full_fig_p019_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Example rollout of one of our backdoor poisoned PPO policies. The right column shows [PITH_FULL_IMAGE:figures/full_fig_p019_9.png] view at source ↗
read the original abstract

World models have recently seen a rapid growth in both their popularity and capability as more data efficient tools for generating robot training data or simulating real world environments, with many works proposing their integration into the robot learning pipeline. While highly practical, in this work we demonstrate that world models introduce a uniquely stealthy and effective data poisoning entry point into the robot learning supply chain that can result in the deployment of unsafe or otherwise compromised robotic policies despite training on seemingly safe ground truth training data. In contrast to traditional data poisoning techniques which directly implant dangerous trajectories into sold or uploaded datasets, our novel attack methods inject malicious prompts or compromising transition dynamics into visibly safe teleoperated datasets which are only activated once fed through a world model as input. This can result in the generation of synthetic, dangerous robot training trajectories and subsequently unsafe or compromised robot policies. We demonstrate the effectiveness of our attacks against both state of the art action conditioned and text conditioned world models, showing a full end-to-end backdoor on a downstream DRL policy and a proof-of-concept for the VLA setting. Overall these findings necessitate research into more secure world models and reevaluating their position within the robot learning supply chain.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims that world models introduce a uniquely stealthy and effective data poisoning entry point into the robot learning supply chain. Malicious prompts or compromising transition dynamics are injected into visibly safe teleoperated datasets; these are only activated when the data passes through the world model, generating synthetic dangerous trajectories that train unsafe or compromised downstream policies despite the original ground-truth data appearing safe. Effectiveness is demonstrated against state-of-the-art action-conditioned and text-conditioned world models, including a full end-to-end backdoor on a DRL policy and a proof-of-concept for the VLA setting.

Significance. If the empirical results hold, the work identifies a novel supply-chain attack vector specific to world models that could compromise robotic policies trained on seemingly clean data. This would necessitate research into secure world models and reevaluation of their position in robot learning pipelines, extending traditional data-poisoning concerns to generative components.

major comments (2)
  1. [Abstract and Experiments section] Abstract and Experiments section: The central claim that the attacks are 'stealthy' and 'effective' rests on the premise that trajectories generated by the compromised world model evade downstream filtering or safety checks (e.g., unsafe-state detection, high action variance, or reward violations) before policy training. No analysis, experiments, or discussion of standard validation steps in robot learning pipelines is provided, which is load-bearing for the supply-chain compromise argument.
  2. [Abstract] Abstract: No quantitative results, success rates, baselines, controls, or experimental details are described despite claims of demonstrations against SOTA world models and an end-to-end backdoor. This prevents assessment of whether the data supports the effectiveness claims.
minor comments (1)
  1. [Abstract] Abstract: The term 'VLA setting' is used without definition or expansion on first use.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which help improve the clarity and strength of our arguments regarding the vulnerabilities in world model-based robot learning pipelines. We respond to each major comment below.

read point-by-point responses
  1. Referee: [Abstract and Experiments section] The central claim that the attacks are 'stealthy' and 'effective' rests on the premise that trajectories generated by the compromised world model evade downstream filtering or safety checks (e.g., unsafe-state detection, high action variance, or reward violations) before policy training. No analysis, experiments, or discussion of standard validation steps in robot learning pipelines is provided, which is load-bearing for the supply-chain compromise argument.

    Authors: We agree that demonstrating evasion of standard downstream validation and safety checks is important for fully substantiating the stealthiness of the supply-chain attack. The current manuscript focuses on trajectory generation and policy impact but does not include explicit analysis or experiments on common filters such as unsafe-state detection or reward violations. We will add a new subsection to the Experiments section with discussion and preliminary experiments showing how the poisoned trajectories perform under these checks. revision: yes

  2. Referee: [Abstract] No quantitative results, success rates, baselines, controls, or experimental details are described despite claims of demonstrations against SOTA world models and an end-to-end backdoor. This prevents assessment of whether the data supports the effectiveness claims.

    Authors: The abstract is intentionally concise. We acknowledge that including quantitative highlights would strengthen the presentation of effectiveness. In revision we will update the abstract to include key quantitative results such as attack success rates on the evaluated world models and end-to-end backdoor performance, along with references to the baselines and controls used. revision: yes

Circularity Check

0 steps flagged

Empirical attack demonstration contains no derivation chain or self-referential elements

full rationale

The paper is an empirical security demonstration showing attacks on world models via malicious prompts or dynamics in input datasets. The provided abstract and description contain no equations, fitted parameters, predictions derived from inputs, or mathematical derivations. Claims rest on experimental results against action-conditioned and text-conditioned world models, with no self-citation load-bearing a derivation, no ansatz smuggling, and no renaming of known results as new unification. The central premise is an empirical observation about supply-chain compromise rather than any self-definitional or constructed prediction, making the work self-contained against external benchmarks with no circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claim rests on experimental demonstrations of the described attacks rather than mathematical axioms, free parameters, or new postulated entities. No such elements are referenced in the abstract.

pith-pipeline@v0.9.1-grok · 5752 in / 1154 out tokens · 27480 ms · 2026-06-27T16:03:17.108909+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

64 extracted references · 38 canonical work pages · 20 internal anchors

  1. [1]

    B. D. Argall, S. Chernova, M. Veloso, and B. Browning. A survey of robot learning from demonstration.Robotics Auton. Syst., 57(5):469–483, 2009. ISSN 0921-8890. doi:10.1016/j. robot.2008.10.024. URLhttps://www.sciencedirect.com/science/article/pii/S0 921889008001772

  2. [2]

    RT-1: Robotics Transformer for Real-World Control at Scale

    A. Brohan, N. Brown, J. Carbajal, Y . Chebotar, J. Dabis, C. Finn, K. Gopalakrishnan, K. Hausman, A. Herzog, J. Hsu, et al. RT-1: Robotics transformer for real-world control at scale.Robotics: Science and Systems, 2022. doi:10.48550/arXiv.2212.06817. URL https://arxiv.org/abs/2212.06817

  3. [3]

    RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control

    A. Brohan, N. Brown, J. Carbajal, Y . Chebotar, K. Choromanski, T. Ding, D. Driess, K. A. Dubey, C. Finn, P. R. Florence, et al. RT-2: Vision-Language-Action models transfer web knowledge to robotic control.Conference on Robot Learning, 2023. doi:10.48550/arXiv.230 7.15818. URLhttps://arxiv.org/abs/2307.15818

  4. [4]

    M. J. Kim, K. Pertsch, S. Karamcheti, T. Xiao, A. Balakrishna, S. Nair, R. Rafailov, E. Fos- ter, G. Lam, P. Sanketi, et al. OpenVLA: An open-source Vision-Language-Action model. Conference on Robot Learning, 2024. doi:10.48550/arXiv.2406.09246. URLhttps: //arxiv.org/abs/2406.09246

  5. [5]

    O. Team, D. Ghosh, H. Walke, K. Pertsch, K. Black, O. Mees, S. Dasari, J. Hejna, T. Kreiman, C. Xu, et al. Octo: An open-source generalist robot policy.Robotics: Science and Systems,

  6. [6]

    Octo: An Open-Source Generalist Robot Policy

    doi:10.48550/arXiv.2405.12213. URLhttps://arxiv.org/abs/2405.12213

  7. [7]

    Open X-Embodiment: Robotic Learning Datasets and RT-X Models

    Open X-Embodiment Collaboration et al. Open X-Embodiment: Robotic learning datasets and RT-X models.arXiv preprint arXiv:2310.08864, 2023. doi:10.48550/arXiv.2310.08864. URL https://arxiv.org/abs/2310.08864

  8. [8]

    $\pi_{0.5}$: a Vision-Language-Action Model with Open-World Generalization

    P. Intelligence, K. Black, N. Brown, J. Darpinian, K. Dhabalia, D. Driess, A. Esmail, M. Equi, C. Finn, N. Fusai, et al.π 0.5: a Vision-Language-Action model with open-world generaliza- tion.arXiv.org, 2025. doi:10.48550/arXiv.2504.16054

  9. [9]

    SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics

    M. Shukor, D. Aubakirova, F. Capuano, P. Kooijmans, S. Palma, A. Zouitine, M. Aractingi, C. Pascal, M. Russi, A. Marafioti, et al. SmolVLA: A Vision-Language-Action model for affordable and efficient robotics.arXiv.org, 2025. doi:10.48550/arXiv.2506.01844. URL https://arxiv.org/abs/2506.01844

  10. [10]

    MimicGen: A Data Generation System for Scalable Robot Learning using Human Demonstrations

    A. Mandlekar, S. Nasiriany, B. Wen, I. Akinola, Y . Narang, L. Fan, Y . Zhu, and D. Fox. MimicGen: A data generation system for scalable robot learning using human demonstrations. Conference on Robot Learning, 2023. doi:10.48550/arXiv.2310.17596. URLhttps: //arxiv.org/abs/2310.17596. Conference on Robot Learning (CoRL) 2023

  11. [11]

    N. A. Ali, J. Bai, M. Bala, Y . Balaji, A. Blakeman, T. Cai, J. Cao, T. Cao, E. Cha, Y .-W. Chao, et al. World simulation with video foundation models for physical ai.arXiv.org, 2025. doi:10.48550/arXiv.2511.00062

  12. [12]

    Hafner, W

    D. Hafner, W. Yan, and T. Lillicrap. Training agents inside of scalable world models.arXiv.org,

  13. [13]

    doi:10.48550/arXiv.2509.24527

  14. [14]

    Y . Guo, T. Lee, L. X. Shi, J. Chen, P. Liang, and C. Finn. Vlaw: Iterative co-improvement of vision-language-action policy and world model.arXiv.org, 2026. doi:10.48550/arXiv.2602.12 063

  15. [15]

    RISE: Self-Improving Robot Policy with Compositional World Model

    J. Yang, K.-L. C. Lin, J. Li, W. Zhang, T. Lin, L. Wu, Z. Su, H. Zhao, Y .-Q. Zhang, L. Chen, et al. Rise: Self-improving robot policy with compositional world model.arXiv.org, 2026. doi:10.48550/arXiv.2602.11075. 9

  16. [16]

    Evaluating gemini robotics policies in a veo world simulator,

    G. Team and G. DeepMind. Evaluating gemini robotics policies in a veo world simulator. arXiv.org, 2025. doi:10.48550/arXiv.2512.10675

  17. [17]

    G. Zhou, H. Pan, Y . LeCun, and L. Pinto. Dino-wm: World models on pre-trained visual features enable zero-shot planning.International Conference on Machine Learning, 2024. doi:10.48550/arXiv.2411.04983

  18. [18]

    L. Maes, Q. L. Lidec, D. Scieur, Y . LeCun, and R. Balestriero. Leworldmodel: Stable end- to-end joint-embedding predictive architecture from pixels.arXiv preprint arXiv:2603.19312, 2026

  19. [19]

    Enhance robot learning with synthetic trajectory data generated by world foundation models

    NVIDIA Research. Enhance robot learning with synthetic trajectory data generated by world foundation models. NVIDIA Technical Blog, June 2025. URLhttps://developer.nvid ia.com/blog/enhance-robot-learning-with-synthetic-trajectory-data-gen erated-by-world-foundation-models/. Accessed: 2026-05-13

  20. [20]

    Scaling GAIA-1: 9-billion parameter generative world model for autonomous driv- ing

    Wayve. Scaling GAIA-1: 9-billion parameter generative world model for autonomous driv- ing. Wayve Blog, Apr. 2024. URLhttps://wayve.ai/thinking/scaling-gaia-1/. Accessed: 2026-05-13

  21. [21]

    C. M. Jiang, X. Masotto, and B. Sun. The Waymo world model: A new frontier for autonomous driving simulation. Waymo Blog, Feb. 2026. URLhttps://waymo.com/blog/2026/02/t he-waymo-world-model-a-new-frontier-for-autonomous-driving-simulation/. Accessed: 2026-05-13

  22. [22]

    A. Zou, Z. Wang, J. Kolter, and M. Fredrikson. Universal and transferable adversarial attacks on aligned language models.arXiv.org, 2023

  23. [23]

    Carlini and D

    N. Carlini and D. Wagner. Towards evaluating the robustness of neural networks. InIEEE Symposium on Security and Privacy, pages 39–57. Ieee, 2016. doi:10.1109/SP.2017.49

  24. [24]

    Not what you’ve signed up for: Compromising real- world LLM-integrated applications with indirect prompt injection,

    K. Greshake, S. Abdelnabi, S. Mishra, C. Endres, T. Holz, and M. Fritz. Not what you’ve signed up for: Compromising real-world llm-integrated applications with indirect prompt in- jection. InAISec@CCS, pages 79–90. ACM, 2023. doi:10.1145/3605764.3623985

  25. [25]

    N. N. Agarwal, A. Ali, M. Bala, Y . Balaji, E. Barker, T. Cai, P. Chattopadhyay, Y . Chen, Y . Cui, Y . Ding, et al. Cosmos world foundation model platform for physical ai.arXiv.org,

  26. [26]

    doi:10.48550/arXiv.2501.03575

  27. [27]

    $\pi_0$: A Vision-Language-Action Flow Model for General Robot Control

    K. Black, N. Brown, D. Driess, A. Esmail, M. Equi, C. Finn, N. Fusai, L. Groom, K. Hausman, B. Ichter, et al.π 0: A Vision-Language-Action flow model for general robot control. In arXiv.org, 2024. doi:10.48550/arXiv.2410.24164

  28. [28]

    R. S. Sutton and A. G. Barto.Reinforcement learning: An introduction. 1998. doi:10.1109/ TNN.1998.712192

  29. [29]

    J. Hu, P. Stone, and R. Mart’in-Mart’in. Slac: Simulation-pretrained latent action space for whole-body real-world rl. InarXiv.org, pages 2966–2982. PMLR, 2025. doi:10.48550/arXiv .2506.04147

  30. [30]

    McCarthy, D

    R. McCarthy, D. C. Tan, D. Schmidt, F. Acero, N. Herr, Y . Du, T. G. Thuruthel, and Z. Li. Towards Generalist Robot Learning from Internet Video: A Survey.Journal of Artificial Intel- ligence Research, 2024. doi:10.1613/jair.1.17400. URLhttps://arxiv.org/abs/2404.1 9664

  31. [31]

    Intelligence, A

    P. Intelligence, A. Amin, R. Aniceto, A. Balakrishna, K. Black, K. Conley, G. Connors, J. Darpinian, K. Dhabalia, J. DiCarlo, et al.π ˚ 0.6: a vla that learns from experience.New Electronics, 52(12):16–17, 2025. doi:10.12968/s0047-9624(22)61323-3. 10

  32. [32]

    IEEE Transactions on Robotics39(3), 1706–1727 (2023) https://doi.org/10.1109/TRO.2023.3236952

    K. Darvish, L. Penco, J. Ramos, R. Cisneros, J. Pratt, E. Yoshida, S. Ivaldi, and D. Pucci. Teleoperation of Humanoid Robots: A Survey.IEEE Transactions on Robotics, 39(3):1706– 1727, 2023. doi:10.1109/TRO.2023.3236952. URLhttps://arxiv.org/abs/2301.043 17

  33. [33]

    Wensing, Ben- jamin Katz, Gerardo Bledt, and Sangbae Kim

    J. Tobin, R. Fong, A. Ray, J. Schneider, W. Zaremba, and P. Abbeel. Domain randomization for transferring deep neural networks from simulation to the real world. InIEEE/RJS International Conference on Intelligent RObots and Systems, pages 23–30. IEEE, 2017. doi:10.1109/IROS .2017.8202133

  34. [34]

    L. Da, J. Turnau, T. P. Kutralingam, A. Velasquez, P. Shakarian, and H. Wei. A survey of sim- to-real methods in rl: Progress, prospects and challenges with foundation models.arXiv.org,

  35. [35]

    doi:10.48550/arXiv.2502.13187

  36. [36]

    The Robotics Breakout Moment

    Salesforce Ventures. The Robotics Breakout Moment. Salesforce Ventures Industry Perspec- tives, 2025. URLhttps://salesforceventures.com/perspectives/the-robotic s-breakout-moment/. Accessed: 2026

  37. [37]

    Y . Park, J. S. Bhatia, L. Ankile, and P. Agrawal. DexHub and DART: Towards Internet Scale Robot Data Collection, 2024. URLhttps://arxiv.org/abs/2411.02214

  38. [38]

    Cheng, J

    X. Cheng, J. Li, S. Yang, G. Yang, and X. Wang. Open-TeleVision: Teleoperation with Im- mersive Active Visual Feedback, 2024. URLhttps://arxiv.org/abs/2407.01512

  39. [39]

    Verbin and E

    E. Verbin and E. Baroz. Request for Startups — Teleoperation for Robotics: Why Teleop- eration is the Key to Unlocking the $10T Robotics Stack. Lunar Ventures, Substack, 2025. URLhttps://verbine.substack.com/p/request-for-startups-teleoperation. Accessed: 2026

  40. [40]

    Guha and J

    A. Guha and J. Piotti. Sensei: Robotic Training Data at Scale. Y Combinator Company Profile,

  41. [41]

    Accessed: 2026

    URLhttps://www.ycombinator.com/companies/sensei. Accessed: 2026

  42. [42]

    Cad `ene, S

    R. Cad `ene, S. Aliberts, F. Capuano, M. Aractingi, A. Zouitine, P. Kooijmans, J. Choghari, M. Russi, C. Pascal, S. Palma, et al. Lerobot: An open-source library for end-to-end robot learning.arXiv.org, 2026. doi:10.48550/arXiv.2602.22818

  43. [43]

    Y . Tian, S. Yang, J. Zeng, P. Wang, D. Lin, H. Dong, and J. Pang. Predictive inverse dynamics models are scalable learners for robotic manipulation. InInternational Conference on Learning Representations, volume 2025, pages 92033–92052, 2024. doi:10.48550/arXiv.2412.15109

  44. [44]

    Y . Wang, R. Syed, F. Wu, M. Zhang, A. Onol, J. Barreiros, H. Nayyeri, T. Dear, H. Zhang, and Y . Li. Interactive world simulator for robot policy training and evaluation.arXiv preprint arXiv:2603.08546, 2026

  45. [45]

    L. Wang, Z. Javed, X. Wu, W. Guo, X. Xing, and D. Song. Backdoorl: Backdoor attack against competitive reinforcement learning.International Joint Conference on Artificial Intelligence, pages 3699–3705, 2021. doi:10.24963/ijcai.2021/509

  46. [46]

    J. Guo, W. Jiang, Y . Lin, Y . Liu, R. Zhang, G. Lu, A. Chen, X. Han, H. Li, and D. Niyato. State backdoor: Towards stealthy real-world poisoning attack on vision-language-action model in state space.arXiv.org, 2026. doi:10.48550/arXiv.2601.04266

  47. [47]

    Rathbun, W

    E. Rathbun, W. W. Lin, A. Oprea, and C. Amato. Beware untrusted simulators–reward-free backdoor attacks in reinforcement learning.arXiv.org, 2026. doi:10.48550/arXiv.2602.05089

  48. [48]

    X. Zhou, G. Tie, G. Zhang, H. Wang, P. Zhou, and L. Sun. Badvla: Towards backdoor attacks on vision-language-action models via objective-decoupled optimization. InarXiv.org, 2025. doi:10.48550/arXiv.2505.16640. 11

  49. [49]

    Rathbun, C

    E. Rathbun, C. Amato, and A. Oprea. Sleepernets: Universal backdoor poisoning attacks against reinforcement learning agents.Neural Information Processing Systems, 37:111994– 112024, 2024. doi:10.48550/arXiv.2405.20539

  50. [50]

    Kiourti, K

    P. Kiourti, K. Wardega, S. Jha, and W. Li. Trojdrl: Trojan attacks on deep reinforcement learning agents.arXiv.org, 2019

  51. [51]

    M. R. Luo, G. Cui, and B. Rigg. The development of the CIE 2000 colour-difference formula: CIEDE2000.Color Research & Application, 26(5):340–350, 2001. doi:10.1002/col.10 49

  52. [52]

    Z. Zhao, Z. Liu, and M. Larson. Towards large yet imperceptible adversarial image per- turbations with perceptual color distance. InComputer Vision and Pattern Recognition, pages 1036–1045, June 2019. doi:10.1109/CVPR42600.2020.00112. URLhttps: //arxiv.org/abs/1911.02466

  53. [53]

    Aydin and A

    A. Aydin and A. Temizel. Adversarial image generation by spatial transformation in perceptual colorspaces.Pattern Recognition Letters, 174:92–98, 2023. doi:10.1016/j.patrec.2023.09.003. URLhttps://arxiv.org/abs/2310.13950

  54. [54]

    A. Wang, B. Ai, B. Wen, C. Mao, C.-W. Xie, D. Chen, F. Yu, H. Zhao, J. Yang, J. Zeng, et al. Wan: Open and advanced large-scale video generative models.arXiv.org, 2025. doi: 10.48550/arXiv.2503.20314

  55. [55]

    Bagdasarian, R

    E. Bagdasarian, R. Jha, T. Zhang, and V . Shmatikov. Adversarial illusions in Multi-Modal embeddings. InUSENIX Security Symposium, 2023

  56. [56]

    J. Ho, A. Jain, and P. Abbeel. Denoising diffusion probabilistic models.Neural Information Processing Systems, 33:6840–6851, 2020

  57. [57]

    B. Liu, Y . Zhu, C. Gao, Y . Feng, Q. Liu, Y . Zhu, and P. Stone. Libero: Benchmarking knowl- edge transfer for lifelong robot learning.Advances in Neural Information Processing Systems, 36:44776–44791, 2023

  58. [58]

    D. Chen, R. Chen, S. Zhang, Y . Wang, Y . Liu, H. Zhou, Q. Zhang, Y . Wan, P. Zhou, and L. Sun. Mllm-as-a-judge: Assessing multimodal llm-as-a-judge with vision-language benchmark. In International Conference on Machine Learning, 2024

  59. [59]

    J. Jang, S. Ye, Z. Lin, J. Xiang, J. Bjorck, Y . Fang, F. Hu, S. Huang, K. Kundalia, Y .-C. Lin, et al. Dreamgen: Unlocking generalization in robot learning through video world models. arXiv preprint arXiv:2505.12705, 2025

  60. [60]

    Gymnasium: A Standard Interface for Reinforcement Learning Environments

    M. Towers, A. Kwiatkowski, J. K. Terry, J. U. Balis, G. Cola, T. Deleu, M. Goul ˜ao, A. Kallinteris, M. Krimmel, K. Arjun, et al. Gymnasium: A standard interface for reinforce- ment learning environments.arXiv.org, 2024. doi:10.48550/arXiv.2407.17032

  61. [61]

    DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset

    A. Khazatsky, K. Pertsch, S. Nair, A. Balakrishna, S. Dasari, S. Karamcheti, S. Nasiriany, M. K. Srirama, L. Chen, K. Ellis, et al. Droid: A large-scale in-the-wild robot manipulation dataset.Robotics: Science and Systems, 2024. doi:10.48550/arXiv.2403.12945

  62. [62]

    M. J. Kim, Y . Gao, T.-Y . Lin, Y .-C. Lin, Y . Ge, G. Lam, P. Liang, S. Song, M.-Y . Liu, C. Finn, et al. Cosmos policy: Fine-tuning video models for visuomotor control and planning.arXiv preprint arXiv:2601.16163, 2026

  63. [63]

    Madry, A

    A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu. Towards deep learning mod- els resistant to adversarial attacks. InInternational Conference on Learning Representations, 2018. 12

  64. [64]

    Post-Trained

    S. Huang, R. F. J. Dossa, C. Ye, J. Braga, D. Chakraborty, K. Mehta, and J. G. Ara´ujo. Cleanrl: High-quality single-file implementations of deep reinforcement learning algorithms.Journal of Machine Learning Research, 23(274):1–18, 2022. URLhttp://jmlr.org/papers/v2 3/21-1342.html. 13 Appendix Table of Contents Section Contents Appendix A Experimental Det...