arxiv: 2604.19835 · 2 revisions
Expert Upcycling: Shifting the Compute-Efficient Frontier of Mixture-of-Experts