3D Object Representations for Fine-Grained Categorization
3 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CV 3
verdicts
UNVERDICTED 3
representative citing papers
BiCLIP recovers a structured geometric transformation from few-shot anchors to canonicalize domain features in VLMs and reports state-of-the-art results on 11 benchmarks.
LPT reduces overfitting during prompt tuning of VLMs through CLIP-based foreground filtering, a structural preservation constraint that aligns features to frozen CLIP, and a hierarchical logit constraint at the output, improving generalization on base-to-novel, cross-dataset, and domain-generalization tasks.
citing papers explorer
- Dual-Modality Anchor-Guided Filtering for Test-time Prompt Tuning
Dual-modality anchors from text descriptions and test-time image statistics filter views and ensemble predictions to improve test-time prompt tuning, achieving SOTA on 15 datasets.
- BiCLIP: Domain Canonicalization via Structured Geometric Transformation
BiCLIP recovers a structured geometric transformation from few-shot anchors to canonicalize domain features in VLMs and reports state-of-the-art results on 11 benchmarks.
- LPT: Less-overfitting Prompt Tuning for Vision-Language Model
LPT reduces overfitting during prompt tuning of VLMs through CLIP-based foreground filtering, a structural preservation constraint that aligns features to frozen CLIP, and a hierarchical logit constraint at the output, improving generalization on base-to-novel, cross-dataset, and domain-generalization tasks.