A frustratingly simple yet highly effective attack baseline: Over 90% success rate against the strong black-box models of gpt-4.5/4o/o1

Zhaoyi Li, Xiaohan Zhao, Dong-Dong Wu, Jiacheng Cui, Zhiqiang Shen · 2025 · arXiv 2503.10635

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

baseline 1

citation-polarity summary

baseline 1

representative citing papers

Mosaic: Multimodal Jailbreak against Closed-Source VLMs via Multi-View Ensemble Optimization

cs.CV · 2026-04-10 · unverdicted · novelty 7.0

Mosaic combines text perturbation, multi-view image optimization, and surrogate model ensembles to reduce reliance on any single open-source model and achieve higher attack success rates on commercial closed-source VLMs.

DarkLLM: Learning Language-Driven Adversarial Attacks with Large Language Models

cs.CR · 2026-05-15 · unverdicted · novelty 6.0

DarkLLM trains an LLM to generate language-driven adversarial perturbations that unify targeted, untargeted, segmentation, and multi-model attacks on foundation models.

Adversarial Attacks Against MLLMs via Progressive Resolution Processing and Adaptive Feature Alignment

cs.CV · 2026-05-11 · unverdicted · novelty 6.0

PRAF-Attack improves targeted attack transferability on black-box MLLMs by using multi-scale progressive resolution and adaptive intermediate feature alignment instead of final-layer global features.

citing papers explorer

Mosaic: Multimodal Jailbreak against Closed-Source VLMs via Multi-View Ensemble Optimization

cs.CV · 2026-04-10 · unverdicted · none · ref 20

DarkLLM: Learning Language-Driven Adversarial Attacks with Large Language Models

cs.CR · 2026-05-15 · unverdicted · none · ref 31

Adversarial Attacks Against MLLMs via Progressive Resolution Processing and Adaptive Feature Alignment

cs.CV · 2026-05-11 · unverdicted · none · ref 23
