1 Pith paper cites this work. Polarity classification is still indexing.

Representative citing paper:
SwimVG: Step-wise multimodal fusion and adaption for visual grounding. arXiv preprint arXiv:2502.16786.

Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
CoLA: Cross-Modal Low-rank Adaptation for Multimodal Downstream Tasks
CoLA introduces a dual-path low-rank adaptation method that adds cross-modal learning to LoRA, delivering small gains over standard LoRA on visual grounding and audio-visual benchmarks while preserving parameter efficiency.
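The summary above describes CoLA only at a high level. As a rough illustration of what a dual-path low-rank adaptation scheme can look like, the sketch below pairs a modality-specific LoRA path with a shared cross-modal path on each input stream. All class names, parameter choices, and the specific way the shared path is combined are illustrative assumptions, not the paper's actual implementation.

```python
# Hedged sketch of dual-path cross-modal low-rank adaptation.
# Names (LowRankAdapter, DualPathCrossModalAdapter) are illustrative
# assumptions; this is NOT CoLA's actual architecture.
import numpy as np

rng = np.random.default_rng(0)

class LowRankAdapter:
    """Standard LoRA-style delta: (alpha/r) * x @ A.T @ B.T, rank r << d."""
    def __init__(self, d_in, d_out, r=4, alpha=8):
        self.A = rng.normal(0.0, 0.01, (r, d_in))  # down-projection
        self.B = np.zeros((d_out, r))              # up-projection, zero init
        self.scale = alpha / r

    def delta(self, x):
        # x: (batch, d_in) -> (batch, d_out); zero at init since B is zero
        return self.scale * (x @ self.A.T) @ self.B.T

class DualPathCrossModalAdapter:
    """Two modality-specific low-rank paths plus one shared path whose
    delta is added to both streams -- one plausible reading of a
    'dual-path' adapter with cross-modal learning."""
    def __init__(self, d, r=4):
        self.vision = LowRankAdapter(d, d, r)   # vision-only path
        self.audio = LowRankAdapter(d, d, r)    # audio-only path
        self.shared = LowRankAdapter(d, d, r)   # shared cross-modal path

    def __call__(self, x_v, x_a, W_v, W_a):
        # W_v, W_a: frozen pretrained weights, shape (d, d)
        out_v = x_v @ W_v.T + self.vision.delta(x_v) + self.shared.delta(x_v)
        out_a = x_a @ W_a.T + self.audio.delta(x_a) + self.shared.delta(x_a)
        return out_v, out_a

d = 16
adapter = DualPathCrossModalAdapter(d)
x_v = rng.normal(size=(2, d))
x_a = rng.normal(size=(2, d))
W_v = rng.normal(size=(d, d))
W_a = rng.normal(size=(d, d))
out_v, out_a = adapter(x_v, x_a, W_v, W_a)
# With zero-initialized up-projections, the adapted outputs equal the
# frozen projections, so fine-tuning starts from pretrained behavior.
```

Parameter efficiency comes from training only the low-rank factors: each adapter holds 2·r·d parameters instead of the d² of a full weight update, and the frozen weights W_v, W_a are never modified.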