ImageNet: A Large-Scale Hierarchical Image Database
7 Pith papers cite this work.
Citations by year: 2026 (7)
Citing papers
- Exposing Functional Fusion: A New Class of Strategic Backdoor in Dynamic Prompt Architectures
  VIPER exposes Functional Fusion in dynamic prompt architectures, enabling a backdoor that resists pruning by tightly integrating attack and utility parameters in the same high-magnitude core.
- CLIP-Inspector: Model-Level Backdoor Detection for Prompt-Tuned CLIP via OOD Trigger Inversion
  CLIP-Inspector reconstructs OOD triggers to detect backdoors in prompt-tuned CLIP models with 94% accuracy and higher AUROC than baselines, plus a repair step via fine-tuning.
- Dual-Modality Anchor-Guided Filtering for Test-time Prompt Tuning
  Dual-modality anchors from text descriptions and test-time image statistics filter views and ensemble predictions to improve test-time prompt tuning, achieving SOTA on 15 datasets.
- Generative Event Pretraining with Foundation Model Alignment
  GEP transfers semantic knowledge from image foundation models to event data via alignment and generative pretraining on mixed sequences to create transferable event-based visual models.
- MaskDiME: Adaptive Masked Diffusion for Precise and Efficient Visual Counterfactual Explanations
  MaskDiME uses adaptive masked diffusion to produce 30x faster, localized, and semantically consistent visual counterfactual explanations without training, matching or exceeding prior performance on five datasets.
- RE-VLM: Event-Augmented Vision-Language Model for Scene Understanding
  RE-VLM fuses RGB and event data in a dual-stream VLM with a graph-based pipeline for generating training captions and QA pairs, plus two new datasets, showing gains over RGB-only and event-only baselines especially in challenging conditions.
- LIFT and PLACE: A Simple, Stable, and Effective Knowledge Distillation Framework for Lightweight Diffusion Models