ViSA proposes expert-driven token generation and dual-branch local fusion modules for view-aware semantic alignment in AGPReID, reporting up to 10.06% mAP gains on the CARGO benchmark.
Llava-mole: Sparse mixture of lora experts for mitigating data con- flicts in instruction finetuning mllms
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 1polarities
background 1representative citing papers
SMoES improves MoE-VLM performance and efficiency via soft modality-guided expert routing and inter-bin mutual information regularization, yielding 0.9-4.2% task gains and 56% communication reduction.
MoRAM frames continual learning as incremental addition of rank-1 adapters viewed as self-activating key-value associative memory units in a mixture-of-experts setup.
LoRA-Mixer routes modular LoRA experts into attention projection matrices with an adaptive Routing Specialization Loss to improve multi-task performance while using fewer trainable parameters than prior LoRA-MoE methods.
citing papers explorer
-
View-Aware Semantic Alignment for Aerial-Ground Person Re-Identification
ViSA proposes expert-driven token generation and dual-branch local fusion modules for view-aware semantic alignment in AGPReID, reporting up to 10.06% mAP gains on the CARGO benchmark.
-
SMoES: Soft Modality-Guided Expert Specialization in MoE-VLMs
SMoES improves MoE-VLM performance and efficiency via soft modality-guided expert routing and inter-bin mutual information regularization, yielding 0.9-4.2% task gains and 56% communication reduction.
-
Little by Little: Continual Learning via Incremental Mixture of Rank-1 Associative Memory Experts
MoRAM frames continual learning as incremental addition of rank-1 adapters viewed as self-activating key-value associative memory units in a mixture-of-experts setup.
-
LoRA-Mixer: Coordinate Modular LoRA Experts Through Serial Attention Routing
LoRA-Mixer routes modular LoRA experts into attention projection matrices with an adaptive Routing Specialization Loss to improve multi-task performance while using fewer trainable parameters than prior LoRA-MoE methods.