BiRG-LoRA reaches 69.31% macro-average accuracy across CMB, CMExam, MedQA and MedMCQA, outperforming MoELoRA by 0.89 points with 28.1% fewer parameters under a matched single-seed protocol.
arXiv preprint arXiv:2501.15103
2 Pith papers cite this work. Polarity classification is still indexing.
DAG-MoE: From Simple Mixture to Structural Aggregation in Mixture-of-Experts
DAG-MoE uses a lightweight module to learn DAG-based structural aggregation of selected experts, expanding combination space and enabling intra-layer multi-step reasoning compared to standard weighted-sum MoE.