arxiv: 2605.05225 · 2 revisions
MACS: Modality-Aware Capacity Scaling for Efficient Multimodal MoE Inference