Introduces a text-guided backdoor attack using common textual words as triggers and visual perturbations for stealthy, adjustable control on multimodal pretrained models.
Sentence-level prompts benefit composed image retrieval
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
years
2026 2verdicts
UNVERDICTED 2representative citing papers
IMAGINE uses adaptive schema-imagery via dynamic multimodal prototypes to incorporate implicit semantics into composed video retrieval, claiming SOTA results on CVR and CIR benchmarks.
citing papers explorer
-
IMAGINE: Adaptive Schema-Imagery Enhanced Composition for Composed Video Retrieval
IMAGINE uses adaptive schema-imagery via dynamic multimodal prototypes to incorporate implicit semantics into composed video retrieval, claiming SOTA results on CVR and CIR benchmarks.