CLIP-Adapter: Better vision-language models with feature adapters. International Journal of Computer Vision, 132(2):581–595
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
citing papers explorer
- PERL: Parameter Efficient Reasoning in CLIP Latent Space
  PERL augments frozen CLIP with a shared recurrent reasoning module of roughly 6K parameters that iteratively refines representations via latent token injection, delivering strong base-to-novel and transfer performance across 15 benchmarks.
- Unlocking Patch-Level Features for CLIP-Based Class-Incremental Learning
  SPA unlocks patch-level features in CLIP for class-incremental learning via semantic-guided selection and optimal transport alignment with class descriptions, plus projectors and pseudo-feature replay to reduce forgetting.