LoRA: Low-rank adaptation of large language models.
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
citing papers explorer
- Information Router for Mitigating Modality Dominance in Vision-Language Models
  MoIR mitigates modality dominance in VLMs by explicitly enriching low-information tokens with routed data from stronger modalities prior to LLM processing, yielding more balanced contributions and improved robustness under degradation.
- KD-CVG: A Knowledge-Driven Approach for Creative Video Generation
  KD-CVG uses an Advertising Creative Knowledge Base plus Semantic-Aware Retrieval and Multimodal Knowledge Reference modules to improve semantic alignment and motion realism in text-to-video generation for advertising.
- TAPE: A two-stage parameter-efficient adaptation framework for foundation models in OCT-OCTA analysis
  TAPE decouples domain alignment from task fitting using parameter-efficient fine-tuning to adapt foundation models for superior OCT-OCTA segmentation with high efficiency.