VIPER exposes Functional Fusion in dynamic prompt architectures, enabling a backdoor that resists pruning by tightly integrating attack and utility parameters in the same high-magnitude core.
Learning gener- ative visual models from few training examples: An incre- mental bayesian approach tested on 101 object categories
7 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
CLIP-Inspector reconstructs OOD triggers to detect backdoors in prompt-tuned CLIP models with 94% accuracy and higher AUROC than baselines, plus a repair step via fine-tuning.
Dual-modality anchors from text descriptions and test-time image statistics filter views and ensemble predictions to improve test-time prompt tuning, achieving SOTA on 15 datasets.
GEP transfers semantic knowledge from image foundation models to event data via alignment and generative pretraining on mixed sequences to create transferable event-based visual models.
SigLino distills SigLIP2 and DINOv3 into efficient vision models via asymmetric relation-knowledge distillation, token-balanced batching, and hierarchical data sampling on a new 200M-image corpus, yielding better transfer to grounding VLMs than training from scratch.
BiCLIP recovers a structured geometric transformation from few-shot anchors to canonicalize domain features in VLMs and reports state-of-the-art results on 11 benchmarks.
LPT reduces overfitting during prompt tuning of VLMs by CLIP-based foreground filtering, a structural preservation constraint aligning features to frozen CLIP, and a hierarchical logit constraint at the output, improving generalization on base-to-novel, cross-dataset, and domain-generalization tasks
citing papers explorer
-
Exposing Functional Fusion: A New Class of Strategic Backdoor in Dynamic Prompt Architectures
VIPER exposes Functional Fusion in dynamic prompt architectures, enabling a backdoor that resists pruning by tightly integrating attack and utility parameters in the same high-magnitude core.
-
CLIP-Inspector: Model-Level Backdoor Detection for Prompt-Tuned CLIP via OOD Trigger Inversion
CLIP-Inspector reconstructs OOD triggers to detect backdoors in prompt-tuned CLIP models with 94% accuracy and higher AUROC than baselines, plus a repair step via fine-tuning.
-
Dual-Modality Anchor-Guided Filtering for Test-time Prompt Tuning
Dual-modality anchors from text descriptions and test-time image statistics filter views and ensemble predictions to improve test-time prompt tuning, achieving SOTA on 15 datasets.
-
Generative Event Pretraining with Foundation Model Alignment
GEP transfers semantic knowledge from image foundation models to event data via alignment and generative pretraining on mixed sequences to create transferable event-based visual models.
-
SigLino: Efficient Multi-Teacher Distillation for Agglomerative Vision Foundation Models
SigLino distills SigLIP2 and DINOv3 into efficient vision models via asymmetric relation-knowledge distillation, token-balanced batching, and hierarchical data sampling on a new 200M-image corpus, yielding better transfer to grounding VLMs than training from scratch.
-
BiCLIP: Domain Canonicalization via Structured Geometric Transformation
BiCLIP recovers a structured geometric transformation from few-shot anchors to canonicalize domain features in VLMs and reports state-of-the-art results on 11 benchmarks.
-
LPT: Less-overfitting Prompt Tuning for Vision-Language Model
LPT reduces overfitting during prompt tuning of VLMs by CLIP-based foreground filtering, a structural preservation constraint aligning features to frozen CLIP, and a hierarchical logit constraint at the output, improving generalization on base-to-novel, cross-dataset, and domain-generalization tasks