pith. machine review for the scientific record. sign in

arxiv: 2604.04630 · v1 · submitted 2026-04-06 · 💻 cs.CV

Recognition: no theorem link

Multimodal Backdoor Attack on VLMs for Autonomous Driving via Graffiti and Cross-Lingual Triggers

Authors on Pith no claims yet

Pith reviewed 2026-05-10 20:27 UTC · model grok-4.3

classification 💻 cs.CV
keywords multimodal backdoor attackvisual language modelsautonomous drivinggraffiti triggerscross-lingual triggerspoisoning attackssafety-critical AIDriveVLM
0
0 comments X

The pith

Multimodal backdoor attacks on visual language models for autonomous driving succeed with graffiti patterns and cross-lingual text triggers while leaving clean performance intact or improved.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to show that visual language models integrated into self-driving systems can be compromised by a new attack called GLA that plants hidden triggers in both vision and language inputs. These triggers use graffiti-like patterns blended into road scenes and text prompts shifted across languages but kept semantically equivalent. If the demonstration holds, then standard ways of spotting backdoors by looking for drops in normal accuracy would fail, because the poisoned models perform as well or better on clean driving tasks. Readers should care because VLMs are moving into real-time control of vehicles, where an attacker who controls only a small fraction of training data could force dangerous responses to specific cues without anyone noticing degraded behavior on ordinary roads.

Core claim

GLA shows that stable diffusion inpainting can produce graffiti visual triggers that merge naturally into urban driving scenes and that cross-lingual text triggers can create consistent distributional shifts on the language side. Together these two channels allow implantation of a backdoor in DriveVLM such that a 10 percent poisoning ratio yields 90 percent attack success rate on triggered inputs, zero false positives on clean inputs, and no loss or even gains on standard metrics such as BLEU-1.

What carries the argument

The GLA dual-trigger mechanism that pairs generated graffiti visuals with semantically preserved cross-lingual text to form covert multimodal backdoor channels.

If this is right

  • Only a 10 percent poisoning ratio is needed to reach 90 percent attack success rate with zero false positives on clean data.
  • The backdoor can improve rather than degrade clean-task metrics such as BLEU-1, so performance-based detectors are ineffective.
  • Existing unimodal backdoor defenses are insufficient for multimodal models in safety-critical driving.
  • Backdoor evaluation for VLMs must now include tests for hidden performance gains on clean inputs.
  • Multimodal systems require new trigger designs that blend into the physical environment and language distributions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Developers of driving VLMs may need to add checks for generative-image artifacts and cross-lingual distributional shifts during both training and deployment.
  • The observed clean-metric gains suggest attackers could disguise backdoors as ordinary fine-tuning improvements.
  • Similar naturalistic trigger techniques could be adapted to other VLMs used in robotics or traffic monitoring if the inpainting and language-shift methods generalize beyond the tested model.
  • Testing regimes for autonomous systems should measure resilience across a range of poisoning ratios rather than assuming high ratios are required for effective attacks.

Load-bearing premise

The generated graffiti patterns stay visually indistinguishable from real graffiti in actual city driving scenes and the cross-lingual text shifts preserve meaning without introducing detectable language artifacts.

What would settle it

Running the graffiti patterns through a set of real-world driving videos and having both human observers and automated anomaly detectors flag them as artificial at rates significantly above background graffiti would show the visual trigger is not covert.

read the original abstract

Visual language model (VLM) is rapidly being integrated into safety-critical systems such as autonomous driving, making it an important attack surface for potential backdoor attacks. Existing backdoor attacks mainly rely on unimodal, explicit, and easily detectable triggers, making it difficult to construct both covert and stable attack channels in autonomous driving scenarios. GLA introduces two naturalistic triggers: graffiti-based visual patterns generated via stable diffusion inpainting, which seamlessly blend into urban scenes, and cross-language text triggers, which introduce distributional shifts while maintaining semantic consistency to build robust language-side trigger signals. Experiments on DriveVLM show that GLA requires only a 10\% poisoning ratio to achieve a 90\% Attack Success Rate (ASR) and a 0\% False Positive Rate (FPR). More insidiously, the backdoor does not weaken the model on clean tasks, but instead improves metrics such as BLEU-1, making it difficult for traditional performance-degradation-based detection methods to identify the attack. This study reveals underestimated security threats in self-driving VLMs and provides a new attack paradigm for backdoor evaluation in safety-critical multimodal systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces GLA, a multimodal backdoor attack on vision-language models (VLMs) for autonomous driving. It proposes two naturalistic triggers—graffiti patterns generated via stable diffusion inpainting for the visual channel and cross-lingual text triggers for the language channel—and evaluates them on DriveVLM. The central empirical claim is that a 10% poisoning ratio yields 90% attack success rate (ASR) and 0% false positive rate (FPR) while simultaneously improving clean-task metrics such as BLEU-1, rendering performance-degradation detectors ineffective.

Significance. If the reported attack success and clean-performance gains hold under rigorous controls, the work is significant for security research on multimodal models in safety-critical domains. It demonstrates a covert, low-poisoning-ratio threat that evades standard detection and supplies a concrete new attack paradigm for evaluating backdoors in VLMs used for autonomous driving.

major comments (2)
  1. [Experimental Results] Experimental section: the 90% ASR / 0% FPR result at 10% poisoning is load-bearing for the central claim, yet the manuscript provides no ablation isolating the contribution of the graffiti versus cross-lingual components, nor any comparison against prior unimodal or multimodal backdoor baselines on the same DriveVLM setup. Without these controls it is difficult to attribute the reported effectiveness specifically to the proposed multimodal design.
  2. [Trigger Generation] Trigger validation subsection: the claim that stable-diffusion inpainted graffiti remains undetectable in real urban scenes is central to the covertness argument, but the paper reports no quantitative detection metrics (e.g., human study accuracy or automated anomaly scores) or failure cases under varying lighting/weather conditions.
minor comments (2)
  1. [Abstract] The abstract states improved BLEU-1 but does not specify the exact clean-task evaluation protocol or dataset split used for this metric.
  2. [Preliminaries] Notation for the poisoning ratio and ASR/FPR definitions should be introduced once in a dedicated preliminaries section rather than repeated inline.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address each major comment below and outline the revisions we will make to strengthen the manuscript.

read point-by-point responses
  1. Referee: [Experimental Results] Experimental section: the 90% ASR / 0% FPR result at 10% poisoning is load-bearing for the central claim, yet the manuscript provides no ablation isolating the contribution of the graffiti versus cross-lingual components, nor any comparison against prior unimodal or multimodal backdoor baselines on the same DriveVLM setup. Without these controls it is difficult to attribute the reported effectiveness specifically to the proposed multimodal design.

    Authors: We agree that isolating the individual contributions of the graffiti and cross-lingual triggers, along with direct baseline comparisons, would strengthen attribution to the multimodal design. In the revised manuscript we will add an ablation table reporting ASR and FPR for graffiti-only, cross-lingual-only, and combined triggers at the 10% poisoning ratio. We will also adapt and evaluate representative prior backdoor methods (e.g., BadNet-style visual triggers and existing multimodal attacks) on the identical DriveVLM setup and report the results side-by-side with GLA to quantify the advantage of the proposed naturalistic triggers. revision: yes

  2. Referee: [Trigger Generation] Trigger validation subsection: the claim that stable-diffusion inpainted graffiti remains undetectable in real urban scenes is central to the covertness argument, but the paper reports no quantitative detection metrics (e.g., human study accuracy or automated anomaly scores) or failure cases under varying lighting/weather conditions.

    Authors: We acknowledge that quantitative evidence of undetectability is important for the covertness claim. We will add a human-subject study in which participants are asked to flag anomalous graffiti in a set of real urban images (with and without triggers) and report detection accuracy. We will also include automated anomaly scores using standard out-of-distribution detectors and provide a qualitative and quantitative analysis of trigger visibility under varied lighting and weather conditions, including any observed failure cases. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The manuscript is an empirical study of a proposed multimodal backdoor attack (GLA) on DriveVLM. All load-bearing claims consist of reported experimental outcomes (10% poisoning ratio yielding 90% ASR and 0% FPR, plus clean-task metric gains) obtained from controlled poisoning trials. No derivation chain, equations, or first-principles results are present that reduce to fitted parameters, self-definitions, or self-citations. The method description and trigger construction are presented as design choices validated by direct measurement rather than quantities derived from the target metrics themselves. This structure is self-contained and externally falsifiable via replication of the poisoning protocol.

Axiom & Free-Parameter Ledger

1 free parameters · 0 axioms · 0 invented entities

This is an empirical attack demonstration paper. No mathematical derivations or free parameters in a modeling sense are present; the poisoning ratio is an experimental hyperparameter chosen for the reported results. No new physical or mathematical entities are postulated.

free parameters (1)
  • poisoning ratio
    Experimental choice of 10% used to achieve the reported ASR; not derived from first principles.

pith-pipeline@v0.9.0 · 5516 in / 1333 out tokens · 46367 ms · 2026-05-10T20:27:17.601355+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

66 extracted references · 25 canonical work pages · 2 internal anchors

  1. [1]

    arXiv preprint arXiv:2403.19838 (2024)

    Gopalkrishnan, A., Greer, R., Trivedi, M.: Multi-frame, lightweight & efficient vision-language mod- els for question answering in autonomous driving. arXiv preprint arXiv:2403.19838 (2024)

  2. [2]

    In: European Conference on Computer Vision, pp

    Sima, C., Renz, K., Chitta, K., Chen, L., Zhang, H., Xie, C., Beißwenger, J., Luo, P., Geiger, A., Li, H.: Drivelm: Driving with graph visual question answering. In: European Conference on Computer Vision, pp. 256–274 (2024). Springer

  3. [3]

    Vlrm: Vision-language models act as reward models for image captioning.arXiv preprint arXiv:2404.01911,

    Dzabraev, M., Kunitsyn, A., Iva- niuta, A.: Vlrm: Vision-language models act as reward models for image captioning. arXiv preprint arXiv:2404.01911 (2024)

  4. [4]

    Expert Systems with Applications212, 118669 (2023)

    Salaberria, A., Azkune, G., Lacalle, O.L., Soroa, A., Agirre, E.: Image 14 captioning for effective use of lan- guage models in knowledge-based visual question answering. Expert Systems with Applications212, 118669 (2023)

  5. [5]

    Interna- tional Journal of Computer Vision (2025)

    Liang, J., Liang, S., Liu, A., Cao, X.: Vl-trojan: Multimodal instruction backdoor attacks against autoregres- sive visual language models. Interna- tional Journal of Computer Vision (2025)

  6. [6]

    In: Proceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recogni- tion (2024)

    Liang, S., Zhu, M., Liu, A., Wu, B., Cao, X., Chang, E.-C.: Bad- clip: Dual-embedding guided back- door attack on multimodal con- trastive learning. In: Proceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recogni- tion (2024)

  7. [7]

    Inter- national Journal of Computer Vision (2025)

    Liu, A., Liu, X., Zhang, X., Xiao, Y., Zhou, Y., Liang, S., Wang, J., Cao, X., Tao, D.: Pre-trained trojan attacks for visual recognition. Inter- national Journal of Computer Vision (2025)

  8. [8]

    arXiv preprint arXiv:2406.04031 (2024)

    Ying, Z., Liu, A., Zhang, T., Yu, Z., Liang, S., Liu, X., Tao, D.: Jailbreak vision language models via bi-modal adversarial prompt. arXiv preprint arXiv:2406.04031 (2024)

  9. [9]

    In: CVPR (2025)

    Liang, S., Liang, J., Pang, T., Du, C., Liu, A., Zhu, M., Cao, X., Tao, D.: Revisiting backdoor attacks against large vision-language models from domain shift. In: CVPR (2025)

  10. [10]

    arXiv preprint arXiv:2407.16307 (2024)

    Liu, X., Jia, X., Xun, Y., Liang, S., Cao, X.: Multimodal unlearnable examples: Protecting data against multimodal contrastive learning. arXiv preprint arXiv:2407.16307 (2024)

  11. [11]

    arXiv preprint arXiv:2402.11473 (2024)

    Liang, J., Liang, S., Liu, A., Jia, X., Kuang, J., Cao, X.: Poisoned forgery face: Towards backdoor attacks on face forgery detection. arXiv preprint arXiv:2402.11473 (2024)

  12. [12]

    Liu, X., Jia, X., Gu, J., Xun, Y., Liang, S., Cao, X.: Does few-shot learning suffer from backdoor attacks? arXiv preprint arXiv:2401.01377 (2023)

  13. [13]

    arXiv preprint arXiv:2209.05244 (2022)

    Wang, Y., Shi, H., Min, R., Wu, R., Liang, S., Wu, Y., Liang, D., Liu, A.: Universal backdoor attacks detec- tion via adaptive adversarial probe. arXiv preprint arXiv:2209.05244 (2022)

  14. [14]

    arXiv preprint arXiv:2403.16257 (2024)

    Liang, S., Liu, K., Gong, J., Liang, J., Xun, Y., Chang, E.- C., Cao, X.: Unlearning backdoor threats: Enhancing backdoor defense in multimodal contrastive learning via local token unlearning. arXiv preprint arXiv:2403.16257 (2024)

  15. [15]

    Advances in Neural Informa- tion Processing Systems (2024)

    Zhu, M., Liang, S., Wu, B.: Breaking the false sense of security in back- door defense through re-activation attack. Advances in Neural Informa- tion Processing Systems (2024)

  16. [16]

    IEEE Access7, 47230–47244 (2019) https://doi.org/ 10.1109/ACCESS.2019.2909068

    Gu, T., Liu, K., Dolan-Gavitt, B., Garg, S.: Badnets: Evaluat- ing backdooring attacks on deep neural networks. IEEE Access7, 47230–47244 (2019) https://doi.org/ 10.1109/ACCESS.2019.2909068

  17. [17]

    arXiv 15 preprint arXiv:2312.04913 (2023)

    He, B., Jia, X., Liang, S., Lou, T., Liu, Y., Cao, X.: Sa-attack: Improving adversarial transferabil- ity of vision-language pre-training models via self-augmentation. arXiv 15 preprint arXiv:2312.04913 (2023)

  18. [18]

    arXiv preprint arXiv:2311.11017 (2023)

    Liu, J., Zhu, S., Liang, S., Zhang, J., Fang, H., Zhang, W., Chang, E.-C.: Improving adversarial transferability by stable diffusion. arXiv preprint arXiv:2311.11017 (2023)

  19. [19]

    arXiv preprint arXiv:2304.10136 (2023) 24

    Wang, Z., Zhang, Z., Liang, S., Wang, X.: Diversifying the high- level features for better adversar- ial transferability. arXiv preprint arXiv:2304.10136 (2023)

  20. [20]

    In: Proceed- ings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2023)

    Liu, A., Tang, S., Liang, S., Gong, R., Wu, B., Liu, X., Tao, D.: Explor- ing the relationship between archi- tectural design and adversarially robust generalization. In: Proceed- ings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2023)

  21. [21]

    arXiv preprint arXiv:2402.09164 (2024)

    Chen, R., Zhang, H., Liang, S., Li, J., Cao, X.: Less is more: Fewer interpretable region via submodu- lar subset selection. arXiv preprint arXiv:2402.09164 (2024)

  22. [22]

    Semantic Mirror Jail- break: Genetic Algorithm Based Jailbreak Prompts Against Open-source LLMs

    Li, X., Liang, S., Zhang, J., Fang, H., Liu, A., Chang, E.-C.: Semantic mirror jailbreak: Genetic algorithm based jailbreak prompts against open-source llms. arXiv preprint arXiv:2402.14872 (2024)

  23. [23]

    In: Proceedings of the IEEE/CVF Con- ference on Computer Vision and Pat- tern Recognition, pp

    Hu, Y., Yang, J., Chen, L., Li, K., Sima, C., Zhu, X., Chai, S., Du, S., Lin, T., Wang, W.,et al.: Planning- oriented autonomous driving. In: Proceedings of the IEEE/CVF Con- ference on Computer Vision and Pat- tern Recognition, pp. 17853–17862 (2023)

  24. [24]

    IEEE Robotics and Automation Letters (2024)

    Xu, Z., Zhang, Y., Xie, E., Zhao, Z., Guo, Y., Wong, K.-Y.K., Li, Z., Zhao, H.: Drivegpt4: Interpretable end-to-end autonomous driving via large language model. IEEE Robotics and Automation Letters (2024)

  25. [25]

    In: Computer Vision–ECCV 2020 (2020)

    Liang, S., Wei, X., Yao, S., Cao, X.: Efficient adversarial attacks for visual object tracking. In: Computer Vision–ECCV 2020 (2020)

  26. [26]

    arXiv preprint arXiv:1811.12641 , year=

    Wei, X., Liang, S., Chen, N., Cao, X.: Transferable adversar- ial attacks for image and video object detection. arXiv preprint arXiv:1811.12641 (2018)

  27. [27]

    arXiv preprint arXiv:2201.08970 (2022)

    Liang, S., Wu, B., Fan, Y., Wei, X., Cao, X.: Parallel rectangle flip attack: A query-based black-box attack against object detection. arXiv preprint arXiv:2201.08970 (2022)

  28. [28]

    In: European Conference on Com- puter Vision (2022)

    Liang, S., Li, L., Fan, Y., Jia, X., Li, J., Wu, B., Cao, X.: A large-scale multiple-objective method for black- box attack against object detection. In: European Conference on Com- puter Vision (2022)

  29. [29]

    In: Proceedings of the 31st ACM International Confer- ence on Multimedia (2023)

    Guo, J., Zheng, X., Liu, A., Liang, S., Xiao, Y., Wu, Y., Liu, X.: Isola- tion and induction: Training robust deep neural networks against model stealing attacks. In: Proceedings of the 31st ACM International Confer- ence on Multimedia (2023)

  30. [30]

    CoRR (2023)

    Qi, X., Huang, K., Panda, A., Wang, M., Mittal, P.: Visual adversarial examples jailbreak large language models. CoRR (2023)

  31. [31]

    16 Information Fusion118, 102970 (2025)

    Sun, S., Lin, Z., Wu, X.: Halluci- nations of large multimodal mod- els: Problem and countermeasures. 16 Information Fusion118, 102970 (2025)

  32. [32]

    arXiv preprint arXiv:2406.00721 (2024)

    Wang, C., Wang, W., Yu, C., Mu, J.: Explore internal and external similarity for single image deraining with graph neural networks. arXiv preprint arXiv:2406.00721 (2024)

  33. [33]

    In: Proceedings of the 19th ACM Asia Conference on Computer and Communications Security, pp

    Liang, J., Pang, R., Li, C., Wang, T.: Model extraction attacks revis- ited. In: Proceedings of the 19th ACM Asia Conference on Computer and Communications Security, pp. 1231–1245 (2024)

  34. [34]

    In: European Symposium on Research in Computer Security, pp

    Tolpegin, V., Truex, S., Gursoy, M.E., Liu, L.: Data poisoning attacks against federated learning systems. In: European Symposium on Research in Computer Security, pp. 480–501 (2020). Springer

  35. [35]

    A survey on physical adversarial attack in computer vision.arXiv preprint arXiv:2209.14262, 2022

    Wang, D., Yao, W., Jiang, T., Tang, G., Chen, X.: A survey on physical adversarial attack in computer vision. arXiv preprint arXiv:2209.14262 (2022)

  36. [36]

    In: 10th USENIX Workshop on Offensive Technologies (WOOT 16) (2016)

    Davidson, D., Wu, H., Jellinek, R., Singh, V., Ristenpart, T.: Con- trolling{UAVs}with sensor input spoofing attacks. In: 10th USENIX Workshop on Offensive Technologies (WOOT 16) (2016)

  37. [37]

    In: Proceedings of the 32nd ACM International Conference on Multimedia (2024)

    Zhang, X., Liu, A., Zhang, T., Liang, S., Liu, X.: Towards robust physical-world backdoor attacks on lane detection. In: Proceedings of the 32nd ACM International Conference on Multimedia (2024)

  38. [38]

    arXiv preprint arXiv:2405.07595 (2024)

    Kong, D., Liang, S., Ren, W.: Environmental matching attack against unmanned aerial vehicles object detection. arXiv preprint arXiv:2405.07595 (2024)

  39. [39]

    arXiv preprint arXiv:2406.00934 (2024)

    Zhang, T., Wang, L., Li, H., Xiao, Y., Liang, S., Liu, A., Liu, X., Tao, D.: Lanevil: Benchmarking the robustness of lane detection to envi- ronmental illusions. arXiv preprint arXiv:2406.00934 (2024)

  40. [40]

    arXiv preprint arXiv:2403.05247 (2024)

    Lou, T., Jia, X., Gu, J., Liu, L., Liang, S., He, B., Cao, X.: Hide in thicket: Generating impercepti- ble and rational adversarial pertur- bations on 3d point clouds. arXiv preprint arXiv:2403.05247 (2024)

  41. [41]

    In: 2019 IEEE Sympo- sium on Security and Privacy (SP), pp

    Wang, B., Yao, Y., Shan, S., Li, H., Viswanath, B., Zheng, H., Zhao, B.Y.: Neural cleanse: Identifying and mitigating backdoor attacks in neu- ral networks. In: 2019 IEEE Sympo- sium on Security and Privacy (SP), pp. 707–723 (2019). IEEE

  42. [42]

    In: Proceedings of the 33rd ACM International Con- ference on Multimedia, pp

    Li, K., Ren, W., Wang, W., Zhang, L., Cao, X.: Detecting synthetic image by cross-modal commonal- ity interaction. In: Proceedings of the 33rd ACM International Con- ference on Multimedia, pp. 11367– 11375 (2025)

  43. [43]

    IEEE Transactions on Circuits and Systems for Video Technology (2025)

    Li, H., Wang, W., Wang, C., Wang, M., Zhang, X., Lan, L., Liu, X., Li, K., Cao, X.: Phrase grounding- based style transfer for single- domain generalized object detec- tion. IEEE Transactions on Circuits and Systems for Video Technology (2025)

  44. [44]

    In: Proceedings of 17 the IEEE/CVF Conference on Com- puter Vision and Pattern Recogni- tion, pp

    Liu, Y., Shen, G., Tao, G., Wang, Z., Ma, S., Zhang, X.: Complex back- door detection by symmetric fea- ture differencing. In: Proceedings of 17 the IEEE/CVF Conference on Com- puter Vision and Pattern Recogni- tion, pp. 15003–15013 (2022)

  45. [45]

    In: The Thirty-ninth Annual Conference on Neural Information Processing Systems

    Wang, W., Xia, H., Huang, C., Ding, Z., Wang, C., Li, H., Cao, X.: Rethinking joint maximum mean discrepancy for visual domain adap- tation. In: The Thirty-ninth Annual Conference on Neural Information Processing Systems

  46. [46]

    IEEE Transactions on Image Processing (2025)

    Wang, W., Wang, M., Huang, C., Wang, C., Mu, J., Nie, F., Cao, X.: Optimal graph learning based label propagation for cross-domain image classification. IEEE Transactions on Image Processing (2025)

  47. [47]

    In: Proceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recogni- tion, pp

    Liang, S., Zhu, M., Liu, A., Wu, B., Cao, X., Chang, E.-C.: Bad- clip: Dual-embedding guided back- door attack on multimodal con- trastive learning. In: Proceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recogni- tion, pp. 24645–24654 (2024)

  48. [48]

    Interna- tional Journal of Computer Vision, 1–20 (2025)

    Liang, J., Liang, S., Liu, A., Cao, X.: Vl-trojan: Multimodal instruction backdoor attacks against autoregres- sive visual language models. Interna- tional Journal of Computer Vision, 1–20 (2025)

  49. [49]

    Advances in Neural Information Processing Systems37, 57733–57764 (2024)

    Xu, Y., Yao, J., Shu, M., Sun, Y., Wu, Z., Yu, N., Goldstein, T., Huang, F.: Shadowcast: Stealthy data poisoning attacks against vision-language models. Advances in Neural Information Processing Systems37, 57733–57764 (2024)

  50. [50]

    Test-time backdoor attacks on multimodal large language models

    Lu, D., Pang, T., Du, C., Liu, Q., Yang, X., Lin, M.: Test-time back- door attacks on multimodal large language models. arXiv preprint arXiv:2402.08577 (2024)

  51. [51]

    In: Proceed- ings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp

    Jiang, W., Li, H., Xu, G., Zhang, T.: Color backdoor: A robust poisoning attack in color space. In: Proceed- ings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8133–8142 (2023)

  52. [52]

    In: Proceedings of the AAAI Conference on Artificial Intelligence, vol

    Saha, A., Subramanya, A., Pirsi- avash, H.: Hidden trigger backdoor attacks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 11957–11965 (2020)

  53. [53]

    2021 , month = jan, journal =

    Wallace, E., Feng, S., Kandpal, N., Gardner, M., Singh, S.: Univer- sal adversarial triggers for attacking and analyzing nlp. arXiv preprint arXiv:1908.07125 (2019)

  54. [54]

    In: Proceedings of the IEEE/CVF Con- ference on Computer Vision and Pat- tern Recognition Workshops (2023)

    Chen, J., Liu, X., Liang, S., Jia, X., Xun, Y.: Universal watermark vac- cine: Universal adversarial perturba- tions for watermark protection. In: Proceedings of the IEEE/CVF Con- ference on Computer Vision and Pat- tern Recognition Workshops (2023)

  55. [55]

    In: Proceedings of the 31st ACM International Conference on Multimedia (2023)

    Dong, X., Wang, R., Liang, S., Liu, A., Jing, L.: Face encryption via frequency-restricted identity- agnostic attacks. In: Proceedings of the 31st ACM International Conference on Multimedia (2023)

  56. [56]

    In: Proceedings of the 31st ACM International Conference on Multi- media (2023)

    Liang, J., Liang, S., Liu, A., Ma, K., Li, J., Cao, X.: Exploring inconsis- tent knowledge distillation for object detection with data augmentation. In: Proceedings of the 31st ACM International Conference on Multi- media (2023)

  57. [57]

    Muxue, L., Wang, C., Liang, S., Liu, 18 A., Liu, Z., Yang, L., Cao, X.: Adver- sarial instance attacks for interac- tions between human and object

  58. [58]

    In: Proceed- ings of the IEEE/CVF Conference on Computer Vision and Pat- tern Recognition, pp

    Rombach, R., Blattmann, A., Lorenz, D., Esser, P., Ommer, B.: High-resolution image synthesis with latent diffusion models. In: Proceed- ings of the IEEE/CVF Conference on Computer Vision and Pat- tern Recognition, pp. 10684–10695 (2022)

  59. [59]

    In: Proceedings of the IEEE/CVF Con- ference on Computer Vision and Pat- tern Recognition, pp

    Caesar, H., Bankiti, V., Lang, A.H., Vora, S., Liong, V.E., Xu, Q., Krish- nan, A., Pan, Y., Baldan, G., Bei- jbom, O.: nuscenes: A multimodal dataset for autonomous driving. In: Proceedings of the IEEE/CVF Con- ference on Computer Vision and Pat- tern Recognition, pp. 11621–11631 (2020)

  60. [60]

    An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

    Dosovitskiy, A.: An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020)

  61. [61]

    Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning

    Chen, X., Liu, C., Li, B., Lu, K., Song, D.: Targeted backdoor attacks on deep learning systems using data poisoning. arXiv preprint arXiv:1712.05526 (2017)

  62. [62]

    In: Proceedings of the IEEE/CVF Inter- national Conference on Computer Vision, pp

    Li, Y., Li, Y., Wu, B., Li, L., He, R., Lyu, S.: Invisible backdoor attack with sample-specific triggers. In: Proceedings of the IEEE/CVF Inter- national Conference on Computer Vision, pp. 16463–16472 (2021)

  63. [63]

    In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pp

    Papineni, K., Roukos, S., Ward, T., Zhu, W.-J.: Bleu: a method for auto- matic evaluation of machine trans- lation. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pp. 311–318 (2002)

  64. [64]

    In: Proceedings of the Acl Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation And/or Sum- marization, pp

    Banerjee, S., Lavie, A.: Meteor: An automatic metric for mt evalua- tion with improved correlation with human judgments. In: Proceedings of the Acl Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation And/or Sum- marization, pp. 65–72 (2005)

  65. [65]

    In: Text Summarization Branches Out, pp

    Lin, C.-Y.: Rouge: A package for automatic evaluation of summaries. In: Text Summarization Branches Out, pp. 74–81 (2004)

  66. [66]

    In: Proceedings of the IEEE Con- ference on Computer Vision and Pattern Recognition, pp

    Vedantam, R., Lawrence Zitnick, C., Parikh, D.: Cider: Consensus- based image description evaluation. In: Proceedings of the IEEE Con- ference on Computer Vision and Pattern Recognition, pp. 4566–4575 (2015) 19