Dr. LLM retrofits frozen LLMs with MCTS-supervised per-layer routers for skip/execute/repeat decisions, delivering up to +3.4% accuracy and 5-layer savings on reasoning tasks with strong out-of-domain generalization.
Router-tuning: A simple and effective approach for enabling dynamic-depth in transformers.arXiv preprint arXiv:2410.13184,
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CL 1years
2025 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Dr.LLM: Dynamic Layer Routing in LLMs
Dr. LLM retrofits frozen LLMs with MCTS-supervised per-layer routers for skip/execute/repeat decisions, delivering up to +3.4% accuracy and 5-layer savings on reasoning tasks with strong out-of-domain generalization.