SPA unlocks patch-level features in CLIP for class-incremental learning via semantic-guided selection and optimal transport alignment with class descriptions, plus projectors and pseudo-feature replay to reduce forgetting.
Boosting continual learning of vision-language models via mixture-of-experts adapters
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.CV 3years
2026 3verdicts
UNVERDICTED 3roles
background 1polarities
background 1representative citing papers
RaPO reduces catastrophic forgetting in visual continual learning by shaping rewards around policy drift and stabilizing advantages with cross-task exponential moving averages during reinforcement fine-tuning of multimodal models.
StructAlign uses simplex ETF geometry and cross-modal relation-preserving losses to mitigate intra- and cross-modal feature drift in continual text-to-video retrieval.
citing papers explorer
-
Unlocking Patch-Level Features for CLIP-Based Class-Incremental Learning
SPA unlocks patch-level features in CLIP for class-incremental learning via semantic-guided selection and optimal transport alignment with class descriptions, plus projectors and pseudo-feature replay to reduce forgetting.
-
Overcoming Catastrophic Forgetting in Visual Continual Learning with Reinforcement Fine-Tuning
RaPO reduces catastrophic forgetting in visual continual learning by shaping rewards around policy drift and stabilizing advantages with cross-task exponential moving averages during reinforcement fine-tuning of multimodal models.
-
StructAlign: Structured Cross-Modal Alignment for Continual Text-to-Video Retrieval
StructAlign uses simplex ETF geometry and cross-modal relation-preserving losses to mitigate intra- and cross-modal feature drift in continual text-to-video retrieval.