TARO builds a temporally guided score prior from high-noise and low-noise diffusion views to purify adversarial examples more robustly than uniform timestep methods.
Robust clip: Unsupervised adversarial fine-tuning of vision embeddings for robust large vision-language models
7 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 1polarities
unclear 1representative citing papers
A hierarchical adversarial fine-tuning method for VLMs aligns image and text embeddings at multiple hierarchy depths with theoretical margin connections to boost robustness to leaf and superclass attacks while using multiple trees for semantic variety.
FDA differentially subtracts function-word cross-attention from original attention heads to cut attack success rates by 18-90% across models and tasks while dropping performance by at most 0.6%.
Universal adversarial attacks cause output perturbation 90 times more often than precise target injection in VLMs, with only 2 verbatim successes out of 6615 tests.
ORCA is an agentic reasoning framework that enhances factual accuracy and adversarial robustness of pretrained LVLMs via an Observe-Reason-Critique-Act loop with small vision models, reporting accuracy gains of up to 40% on hallucination benchmarks and 20% under adversarial perturbations.
AGC is a training-free inference-time defense for CLIP that adaptively corrects features along geodesics to robust augmentations, claiming 44.4% higher average robust accuracy and 10x lower latency than prior baselines across eight datasets and three backbones.
Generative purification with consensus aggregation reduces adversarial illusion attack success rates to near zero on ImageBind while improving alignment on both clean and attacked inputs.
citing papers explorer
-
TARO: Temporal Adversarial Rectification Optimization Using Diffusion Models as Purifiers
TARO builds a temporally guided score prior from high-noise and low-noise diffusion views to purify adversarial examples more robustly than uniform timestep methods.
-
Hierarchically Robust Zero-shot Vision-language Models
A hierarchical adversarial fine-tuning method for VLMs aligns image and text embeddings at multiple hierarchy depths with theoretical margin connections to boost robustness to leaf and superclass attacks while using multiple trees for semantic variety.
-
Pay Less Attention to Function Words for Free Robustness of Vision-Language Models
FDA differentially subtracts function-word cross-attention from original attention heads to cut attack success rates by 18-90% across models and tasks while dropping performance by at most 0.6%.
-
VisInject: Disruption != Injection -- A Dual-Dimension Evaluation of Universal Adversarial Attacks on Vision-Language Models
Universal adversarial attacks cause output perturbation 90 times more often than precise target injection in VLMs, with only 2 verbatim successes out of 6615 tests.
-
ORCA: An Agentic Reasoning Framework for Hallucination and Adversarial Robustness in Vision-Language Models
ORCA is an agentic reasoning framework that enhances factual accuracy and adversarial robustness of pretrained LVLMs via an Observe-Reason-Critique-Act loop with small vision models, reporting accuracy gains of up to 40% on hallucination benchmarks and 20% under adversarial perturbations.
-
AGC: Adaptive Geodesic Correction for Adversarial Robustness on Vision-Language Models
AGC is a training-free inference-time defense for CLIP that adaptively corrects features along geodesics to robust augmentations, claiming 44.4% higher average robust accuracy and 10x lower latency than prior baselines across eight datasets and three backbones.
-
Breaking the Illusion: Consensus-Based Generative Mitigation of Adversarial Illusions in Multi-Modal Embeddings
Generative purification with consensus aggregation reduces adversarial illusion attack success rates to near zero on ImageBind while improving alignment on both clean and attacked inputs.