AdaMoLE: Adaptive mixture of LoRA experts

Zefang Liu, Jiahua Luo · 2024 · arXiv 2405.00361

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Beyond LoRA vs. Full Fine-Tuning: Gradient-Guided Optimizer Routing for LLM Adaptation

cs.CL · 2026-05-08 · unverdicted · novelty 6.0 · 2 refs

MoLF routes updates between full fine-tuning and LoRA at the optimizer level, matching or exceeding the better of the two static methods on SQL, medical QA, and counterfactual tasks. An efficient variant outperforms prior adaptive LoRA by up to 20%.

Sub-Token Routing in LoRA for Adaptation and Query-Aware KV Compression

cs.LG · 2026-04-23 · unverdicted · novelty 6.0

Sub-token routing in LoRA-adapted transformers adds a finer compression axis for KV caches, with query-independent and query-aware designs that improve efficiency under reduced budgets when combined with token-level selection.

citing papers explorer

Showing 2 of 2 citing papers.

Beyond LoRA vs. Full Fine-Tuning: Gradient-Guided Optimizer Routing for LLM Adaptation · cs.CL · 2026-05-08 · unverdicted · none · ref 35 · 2 links
Sub-Token Routing in LoRA for Adaptation and Query-Aware KV Compression · cs.LG · 2026-04-23 · unverdicted · none · ref 10
