M⁴Fuse: Lightweight State-Space MoE with a Cross-Scale Gating Bridge for Brain Tumor Segmentation

Li Yang; Meihua Zhou; Xinyu Tong

arXiv: 2605.02444 · v1 · submitted 2026-05-04 · cs.CV · cs.LG

Pith reviewed 2026-05-08 18:35 UTC · model grok-4.3

keywords: brain tumor segmentation · lightweight 3D network · state-space model · mixture of experts · cross-scale gating · BraTS benchmark · medical image analysis · encoder-decoder balance

The pith

M4Fuse delivers higher brain tumor segmentation accuracy with 63 percent fewer parameters by using state-space mixing and sample-level experts even at half the usual input resolution.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces M4Fuse as a lightweight 3D segmentation network that corrects encoder-decoder imbalance in brain tumor models by replacing heavy depth expansion with three coordinated components. A grouped state-space mixer propagates long-range context at linear cost, a cross-scale dual-stage gating bridge cleans and aligns skip connections, and a sample-level mixture-of-experts absorbs scanner-to-scanner shifts. On BraTS2019 and BraTS2021, this design yields better average scores than competing lightweight methods while cutting parameters by 62.63 percent and still improving results when input size is halved to 64x128x128. The result shows that careful capacity balancing and shift-robust routing can maintain diagnostic utility under tight compute limits.

Core claim

M4Fuse prioritizes discriminative brain tumor cues over exhaustive appearance reconstruction by balancing encoder and decoder capacity, propagating long-range context with linear complexity via a grouped state-space mixer, denoising and aligning skip features with a cross-scale dual-stage gating bridge, and absorbing cross-site acquisition shifts with a sample-level mixture-of-experts, achieving superior parameter-to-accuracy efficiency on BraTS2019 and BraTS2021 even at the reduced input resolution of 64x128x128.
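
To make the sample-level routing idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify the gate, the number of experts, or how expert outputs are combined. The sketch assumes a softmax gate over one summary vector per scan and a weighted sum of expert outputs. All names (`softmax`, `route_sample`, `gate_weights`, `experts`) are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_sample(descriptor, gate_weights, experts):
    """Mix expert outputs for ONE sample (hypothetical sample-level MoE).

    descriptor   : list[float], one summary vector per scan (assumed: pooled features)
    gate_weights : one weight vector per expert; gate score = dot(weights, descriptor)
    experts      : list of callables mapping descriptor -> list[float]
    """
    scores = [sum(w * x for w, x in zip(ws, descriptor)) for ws in gate_weights]
    probs = softmax(scores)
    outputs = [expert(descriptor) for expert in experts]
    # Weighted sum of expert outputs, one mixing weight per expert for the whole sample.
    mixed = [sum(p * out[i] for p, out in zip(probs, outputs))
             for i in range(len(outputs[0]))]
    return probs, mixed

# Two toy experts standing in for site-specific feature adapters (illustrative only).
experts = [lambda v: [2.0 * x for x in v], lambda v: [x + 1.0 for x in v]]
gate_weights = [[1.0, 0.0], [0.0, 1.0]]

probs, mixed = route_sample([3.0, 0.0], gate_weights, experts)
```

The routing decision is made once per scan rather than per voxel or per token, which is one plausible reading of "sample-level": a scan from a given scanner is steered toward the experts that suit its acquisition style.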

What carries the argument

The synergistic combination of grouped state-space mixer, cross-scale dual-stage gating bridge, and sample-level mixture-of-experts that together replace depth expansion while preserving long-range context and shift robustness.
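
As a rough illustration of why a state-space mixer can carry long-range context at linear cost, the sketch below runs a generic scalar linear recurrence over a sequence and shares one parameter triple per channel group. It is an assumption-laden toy, not the paper's mixer: the recurrence form, the even split of channels into groups, and the names `ssm_scan` and `grouped_ssm` are all hypothetical.

```python
def ssm_scan(xs, a, b, c):
    """Run h_t = a*h_{t-1} + b*x_t, y_t = c*h_t over one sequence in O(len(xs))."""
    h, ys = 0.0, []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys

def grouped_ssm(channels, group_params):
    """Apply one (a, b, c) triple per channel group.

    channels     : list of sequences, one per channel
    group_params : list of (a, b, c); channels are split evenly across groups
    """
    per_group = max(1, len(channels) // len(group_params))
    out = []
    for idx, seq in enumerate(channels):
        group = min(idx // per_group, len(group_params) - 1)
        a, b, c = group_params[group]
        out.append(ssm_scan(seq, a, b, c))
    return out

# An impulse at position 0 still influences later outputs, decaying geometrically.
impulse = [1.0, 0.0, 0.0, 0.0]
ys = grouped_ssm([impulse, impulse], [(0.5, 1.0, 1.0), (0.9, 1.0, 1.0)])
```

Each sequence element is visited once, so the work grows in proportion to sequence length. Sharing parameters within a group is one way such a design could keep the parameter count low.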

If this is right

Accurate segmentation remains possible at input volumes half the size used by prior lightweight models.
Parameter counts drop by more than 60 percent relative to other high-performing lightweight networks on the same benchmarks.
Average segmentation performance improves by 0.09 percent despite the reduced model size.
Component ablations confirm that each of the three core modules contributes measurably to the observed efficiency.
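
The parameter figure can be read as simple arithmetic. The short sketch below converts the quoted 62.63 percent reduction into the fraction of parameters retained and the implied size ratio. The abstract names neither the baseline model nor absolute parameter counts, so `baseline_params` is a hypothetical input.

```python
reduction = 0.6263            # relative parameter reduction quoted in the abstract
remaining = 1.0 - reduction   # fraction of the baseline's parameters that is kept
ratio = 1.0 / remaining       # how many times larger the baseline would be

def params_after(baseline_params, reduction=reduction):
    """Parameter count implied by a relative reduction (baseline value is hypothetical)."""
    return baseline_params * (1.0 - reduction)
```

Under these numbers the model keeps about 37 percent of the baseline's parameters, so the baseline would be roughly 2.7 times larger.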

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The architecture could be adapted to other 3D medical segmentation tasks that suffer from scanner variability, such as liver or prostate imaging.
Lower memory footprint opens the possibility of running full 3D inference on edge devices in operating rooms or portable scanners.
The linear-complexity mixer may scale to higher-resolution volumes without the quadratic cost growth typical of attention-based alternatives.
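
The scaling point in the last item can be sketched numerically. The snippet assumes that "half" means halving the depth axis of a 128×128×128 input, and that sequence length is proportional to voxel count. Real models usually downsample before mixing, but the ratio is unchanged if the stride is the same. The helper names are hypothetical.

```python
def voxels(shape):
    """Number of voxels in a (depth, height, width) volume."""
    d, h, w = shape
    return d * h * w

reduced = voxels((64, 128, 128))
full = voxels((128, 128, 128))   # assumption: the reference input is 128^3

def cost_growth(n_from, n_to, exponent):
    """Relative cost when sequence length grows from n_from to n_to under O(n**exponent)."""
    return (n_to / n_from) ** exponent

linear_growth = cost_growth(reduced, full, 1)      # linear-cost mixer
quadratic_growth = cost_growth(reduced, full, 2)   # full self-attention
```

Doubling the volume doubles the cost of a linear-complexity mixer but quadruples the cost of a quadratic one, and the gap widens as resolution increases.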

Load-bearing premise

That the specific grouping of state-space mixing, cross-scale gating, and sample-level expert routing will continue to produce efficiency and accuracy gains on data from unseen scanners or acquisition protocols.

What would settle it

A head-to-head comparison on a new multi-center brain tumor dataset acquired with different scanners where M4Fuse requires more parameters than the next-best lightweight model to reach equal Dice scores.

Figures

Figures reproduced from arXiv: 2605.02444 by Li Yang, Meihua Zhou, Xinyu Tong.

**Figure 1.** (a) On the left is a standard segmentation architecture, …

**Figure 2.** Overall architecture of M4Fuse. (a) CSBridge (CSB, feature fusion stage): CSB connects the encoder and decoder to enhance multi-scale features by integrating the spatial-channel attention of SBridge and CBridge: Dec(t′) = Enc(t) + t·SBridge(t) + CBridge(t), where t′ = t·sx + cx + tx is fed to the decoder. Each decoder stage fuses with its corresponding subsampled encoder feature through residual skip conn…

**Figure 3.** Visualized segmentation of BraTS 2021 datasets, input resolution: 64×128×128, where red indicates tumor core, blue indicates …

Original abstract

Encoder-decoder imbalance and the reliance on large input volumes make many 3D brain tumor segmentation models both compute-heavy and brittle. We present M⁴Fuse, a lightweight network that prioritizes discriminative brain tumor cues over exhaustive appearance reconstruction. Our method balances encoder and decoder capacity and replaces depth expansion with a synergistic design: it propagates long-range context with linear complexity via a grouped state space mixer, denoises and aligns skip features using a cross-scale dual-stage gating bridge, and absorbs cross-site acquisition shifts with a sample-level mixture-of-experts. On the BraTS2019 and BraTS2021 benchmarks, M⁴Fuse outperforms other lightweight excellent methods in both parameter count and performance. Even at a challenging input resolution of 64×128×128 (half that of existing excellent models), M⁴Fuse reduces parameters by 62.63% and improves average performance by 0.09%. Ablations of key components validate the method's exceptional parameter-to-accuracy efficiency and robustness across diverse data centers.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

M4Fuse puts together a grouped state-space mixer, cross-scale gating bridge, and sample-level MoE for lighter 3D brain tumor segmentation, but the 0.09% gain at half resolution lacks the stats needed to show it is real.

The paper's core contribution is a new lightweight architecture for BraTS segmentation that balances encoder-decoder capacity through three pieces: a grouped state-space mixer for long-range context at linear cost, a cross-scale dual-stage gating bridge to clean up and align skip features, and a sample-level mixture-of-experts to absorb site shifts. This combination is not a direct copy of prior work and targets the practical problem of running volumetric models on modest hardware without losing too much accuracy. The design choices make sense for 3D data where full attention or deep CNN stacks get expensive fast, and the focus on parameter efficiency rather than raw depth is a reasonable direction. The abstract also notes ablations that check the pieces, which is better than many incremental papers.

On the positive side, the state-space component and the gating bridge look like they could transfer to other segmentation tasks where skip connections matter. The MoE part for handling acquisition differences is a standard trick but applied here at the sample level in a way that fits the clinical data story.

The main weakness is the experimental support for the headline numbers. The claim of 62.63% parameter reduction plus a 0.09% average performance lift at 64×128×128 resolution is presented without error bars, run-to-run variance, or significance tests. BraTS Dice scores often fluctuate by 0.5–2% across folds or seeds, so a 0.09% shift could sit inside noise. There is also no clear statement that the comparison baselines were re-run at the identical low resolution or that the ablations controlled for that setting. Without those details the efficiency story is harder to trust.

The paper is aimed at medical imaging researchers who need models that fit on limited clinical hardware. Someone working on efficient 3D segmentation or state-space models in vision would find the architecture worth reading and possibly extending.

It is coherent on its own terms and shows honest engagement with the efficiency constraints of the task, so it deserves a serious referee even though the current results section needs more rigor. I would send it for peer review with a request for statistical validation, matched-resolution baselines, and clearer ablation tables.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes M⁴Fuse, a lightweight encoder-decoder architecture for 3D brain tumor segmentation. It combines a grouped state-space mixer for long-range context with linear complexity, a cross-scale dual-stage gating bridge for denoising and aligning skip connections, and a sample-level mixture-of-experts to manage cross-site variations. The central claims are that it outperforms other lightweight methods on BraTS2019 and BraTS2021 in both accuracy and parameter count, and that even at a reduced input resolution of 64×128×128 it achieves a 62.63% parameter reduction while improving average performance by 0.09%.

Significance. Should the efficiency and accuracy claims be substantiated with rigorous statistical evidence and reproducible experiments, this approach could meaningfully advance the development of computationally efficient models for volumetric medical image segmentation, particularly in settings with limited computational resources or variable data acquisition protocols. The integration of state-space models with MoE and gating mechanisms offers a promising direction for balancing model capacity in 3D tasks.

major comments (2)

[Abstract] The reported 0.09% improvement in average performance at the challenging 64×128×128 resolution lacks any mention of error bars, standard deviations from multiple runs, or statistical significance testing. Since BraTS segmentation metrics typically exhibit run-to-run or cross-validation variances of 0.5–2%, this small gain cannot be confidently distinguished from noise without additional analysis, directly impacting the validity of the outperformance claim.
[Abstract] The comparison at halved resolution does not specify whether the competing lightweight models were evaluated under identical input conditions or if their architectures were adapted accordingly. Without such matched baselines or a dedicated table detailing per-model performance at 64×128×128, the 62.63% parameter reduction and performance delta may not be directly comparable.
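
The first major comment amounts to asking whether the reported gain exceeds the seed-to-seed spread. A minimal sketch of that check follows. The Dice values are placeholders invented for illustration and are NOT taken from the paper. The function name `gain_vs_noise` and the choice of the larger standard deviation as the noise scale are assumptions.

```python
from statistics import mean, stdev

def gain_vs_noise(baseline_runs, model_runs):
    """Compare the mean gain with the seed-to-seed spread of each method."""
    gain = mean(model_runs) - mean(baseline_runs)
    noise = max(stdev(baseline_runs), stdev(model_runs))
    return gain, noise, abs(gain) > noise

# Placeholder Dice scores (percent) for five seeds; NOT results from the paper.
baseline_runs = [84.1, 84.9, 84.4, 85.2, 84.6]
model_runs = [84.3, 84.8, 84.7, 85.1, 84.6]

gain, noise, exceeds = gain_vs_noise(baseline_runs, model_runs)
```

With spreads of this size, a gain of a few hundredths of a point does not clear the noise scale, which is the situation the referee is worried about.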

minor comments (2)

The abstract uses the phrase 'lightweight excellent methods,' which is imprecise; rephrasing to 'other lightweight state-of-the-art methods' would improve clarity.
While ablations are mentioned as validating the components, the abstract does not summarize the key quantitative findings from these ablations, which would strengthen the presentation of the synergistic design.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address each major comment below and will revise the manuscript to strengthen the statistical rigor and experimental clarity of our claims.

Referee: [Abstract] The reported 0.09% improvement in average performance at the challenging 64×128×128 resolution lacks any mention of error bars, standard deviations from multiple runs, or statistical significance testing. Since BraTS segmentation metrics typically exhibit run-to-run or cross-validation variances of 0.5–2%, this small gain cannot be confidently distinguished from noise without additional analysis, directly impacting the validity of the outperformance claim.

Authors: We agree that the reported 0.09% average improvement requires supporting statistical evidence to substantiate the outperformance claim. In the revised manuscript we will add mean and standard deviation values computed over multiple independent training runs (with different random seeds), along with the results of paired statistical significance tests (e.g., Wilcoxon signed-rank or paired t-test) against the strongest baseline. These additions will appear both in the abstract and in an expanded results table, allowing readers to evaluate whether the observed delta exceeds typical BraTS variance. revision: yes
Referee: [Abstract] The comparison at halved resolution does not specify whether the competing lightweight models were evaluated under identical input conditions or if their architectures were adapted accordingly. Without such matched baselines or a dedicated table detailing per-model performance at 64×128×128, the 62.63% parameter reduction and performance delta may not be directly comparable.

Authors: The referee correctly identifies an ambiguity in the abstract. All competing lightweight models were evaluated at the identical 64×128×128 input resolution without architectural modifications, ensuring a matched comparison. We will revise the abstract to state this explicitly and insert a new supplementary table that reports Dice scores, parameter counts, and FLOPs for every baseline at this resolution. This will make both the 62.63% parameter reduction and the 0.09% performance delta directly interpretable. revision: yes
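
The paired testing promised in the first response could look roughly like the sketch below. To stay within the standard library it uses an exact sign-flip permutation test on paired differences, which is a swapped-in alternative and not the Wilcoxon signed-rank or paired t-test the authors name. The function name and the example scores are hypothetical placeholders, not data from the paper.

```python
from itertools import product
from statistics import mean

def paired_sign_flip_pvalue(a, b):
    """Exact two-sided sign-flip permutation test on paired differences.

    Enumerates all 2**n sign assignments, so it is only practical for small n.
    """
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(mean(diffs))
    count = total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = [s * d for s, d in zip(signs, diffs)]
        # Small tolerance so floating-point ties count as "at least as extreme".
        if abs(mean(flipped)) >= observed - 1e-12:
            count += 1
    return count / total

# Placeholder per-case Dice scores for the same six cases; NOT from the paper.
model_scores = [0.90, 0.85, 0.88, 0.91, 0.80, 0.87]
baseline_scores = [0.89, 0.86, 0.88, 0.90, 0.81, 0.86]
p_value = paired_sign_flip_pvalue(model_scores, baseline_scores)
```

Pairing matters here because the same cases (or seeds) are scored by both models, so the test looks at per-pair differences rather than at two independent groups.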

Circularity Check

0 steps flagged

No circularity; empirical claims rest on external BraTS benchmarks without self-referential derivations

The paper introduces an architectural design (grouped state-space mixer, cross-scale gating bridge, sample-level MoE) for 3D segmentation and validates it via direct comparison to external BraTS2019/BraTS2021 leaderboards and parameter counts at fixed resolutions. No equations, uniqueness theorems, or first-principles derivations appear that reduce claimed performance deltas to quantities defined inside the paper by construction. Ablation results and benchmark scores are presented as independent measurements rather than tautological predictions. Any internal self-citations (if present in the full text) are not load-bearing for the headline numeric claims, which remain falsifiable against public datasets.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no equations, implementation details, or training procedures, so no free parameters, axioms, or invented entities can be extracted; the model components are described at the level of high-level design choices only.


[1] [1]

Reducing seg- mentation failures in cardiac mri via late feature fusion and gan-based augmentation.Computers in Biology and Medicine, 161:106973, 2023

Yasmina Al Khalil, Sina Amirrajab, Cristian Lorenz, J ¨urgen Weese, Josien Pluim, and Marcel Breeuwer. Reducing seg- mentation failures in cardiac mri via late feature fusion and gan-based augmentation.Computers in Biology and Medicine, 161:106973, 2023. 2

work page 2023

[2] Dianlong An, Panpan Liu, Yan Feng, Pengju Ding, Weifeng Zhou, and Bin Yu. Dynamic weighted knowledge distillation for brain tumor segmentation. Pattern Recognition, 155:110731, 2024.

[3] Jeffrey G Andrews, Sarabjot Singh, Qiaoyang Ye, Xingqin Lin, and Harpreet S Dhillon. An overview of load balancing in hetnets: Old myths and open problems. IEEE Wireless Communications, 21(2):18–25, 2014.

[4] Guogang Cao, Zhaojun Yang, Wanying Liang, Sai Zhang, Tao Zhong, Hongdong Mao, Dong Wang, and Ming Zong. Lcmf-net: A lightweight collaborative multimodal fusion network for brain tumor segmentation. Neural Networks, page 108257, 2025.

[5] Shiyi Cao, Shu Liu, Tyler Griggs, Peter Schafhalter, Xiaoxuan Liu, Ying Sheng, Joseph E Gonzalez, Matei Zaharia, and Ion Stoica. Moe-lightning: High-throughput moe inference on memory-constrained gpus. In Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1, pages 715–730, 2025.

[6] Özgün Çiçek, Ahmed Abdulkadir, Soeren S. Lienkamp, Thomas Brox, and Olaf Ronneberger. 3d u-net: Learning dense volumetric segmentation from sparse annotation. In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016, pages 424–432, Cham, 2016. Springer International Publishing.

[7] Rizhi Ding, Hui Lu, and Manhua Liu. Denseformer-moe: A dense transformer foundation model with mixture of experts for multi-task brain image analysis. IEEE Transactions on Medical Imaging, 2025.

[8] Yongkang Ding, Xiaoyin Wang, Hao Yuan, Meina Qu, and Xiangzhou Jian. Decoupling feature-driven and multimodal fusion attention for clothing-changing person re-identification. Artificial Intelligence Review, 58(8):241,

[9] Bowen Dong, Yilong Fan, Yutao Sun, Zhenyu Li, Tengyu Pan, Zhou Xun, and Jianyong Wang. Maximum score routing for mixture-of-experts. In Findings of the Association for Computational Linguistics: ACL 2025, pages 12619–12632,

[10] Shaofeng He, Qiu Cheng, Yu Huai, Zhongke Zhu, and Jie Ding. Mixture-of-experts for semantic segmentation of remoting sensing image. In International Conference on Image Processing and Artificial Intelligence (ICIPAI 2024), pages 478–483. SPIE, 2024.

[11] Yufan He, Vishwesh Nath, Dong Yang, Yucheng Tang, Andriy Myronenko, and Daguang Xu. Swinunetr-v2: Stronger swin transformers with stagewise convolutions for 3d medical image segmentation. In Medical Image Computing and Computer Assisted Intervention – MICCAI 2023, pages 416–426, Cham, 2023. Springer Nature Switzerland.

[12] Fabian Isensee, Paul F Jaeger, Simon AA Kohl, Jens Petersen, and Klaus H Maier-Hein. nnu-net: a self-configuring method for deep learning-based biomedical image segmentation. Nature methods, 18(2):203–211, 2021.

[13] Huafeng Li, Zengyi Yang, Yafei Zhang, Wei Jia, Zhengtao Yu, and Yu Liu. Mulfs-cap: Multimodal fusion-supervised cross-modality alignment perception for unregistered infrared-visible image fusion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025.

[14] Yunlong Liang, Fandong Meng, and Jie Zhou. THOR-MoE: Hierarchical task-guided and context-responsive routing for neural machine translation. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 21433–21445, Vienna, Austria, 2025. Association for Computational Linguistics.

[15] Weibin Liao, Yinghao Zhu, Xinyuan Wang, Chengwei Pan, Yasha Wang, and Liantao Ma. Lightm-unet: Mamba assists in lightweight unet for medical image segmentation. arXiv preprint arXiv:2403.05246, 2024.

[16] Huabing Liu, Zhengze Ni, Dong Nie, Dinggang Shen, Jinda Wang, and Zhenyu Tang. Multimodal brain tumor segmentation boosted by monomodal normal brain images. IEEE Transactions on Image Processing, 33:1199–1210, 2024.

[17] Xiao Liu, Peng Gao, Tao Yu, Fei Wang, and Ru-Yue Yuan. Cswin-unet: Transformer unet with cross-shaped windows for medical image segmentation. Information Fusion, 113:102634, 2025.

[18] Yi Liu, Lutao Chu, Guowei Chen, Zewu Wu, Zeyu Chen, Baohua Lai, and Yuying Hao. Paddleseg: A high-efficient development toolkit for image segmentation, 2021.

[19] Andriy Myronenko. 3d mri brain tumor segmentation using autoencoder regularization. In Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries, Cham,

[20] Springer International Publishing.

[21] Dichao Pan, Jianguo Shen, Zaid Al-Huda, and Mohammed AA Al-Qaness. Vcanet: Vision transformer with fusion channel and spatial attention module for 3d brain tumor segmentation. Computers in Biology and Medicine, 186:109662, 2025.

[22] Soujanya Poria, Erik Cambria, Rajiv Bajpai, and Amir Hussain. A review of affective computing: From unimodal analysis to multimodal fusion. Information fusion, 37:98–125,

[23] Abhiram Potlapalli and Seetharam Khetavath. Exploring the use of deep learning models for image compression in embedded systems: Encoder and decoder architectures. Journal of Intelligent Systems & Internet of Things, 15(1), 2025.

[24] Novsheena Rasool and Javaid Iqbal Bhat. A critical review on segmentation of glioma brain tumor and prediction of overall survival. Archives of Computational Methods in Engineering, 32(3):1525–1569, 2025.

[25] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical image computing and computer-assisted intervention, pages 234–241. Springer, 2015.

[26] Abhishek Srivastava, Debesh Jha, Sukalpa Chanda, Umapada Pal, Håvard D. Johansen, Dag Johansen, Michael A. Riegler, Sharib Ali, and Pål Halvorsen. Msrf-net: A multi-scale residual fusion network for biomedical image segmentation. IEEE Journal of Biomedical and Health Informatics, 26(5):2252–2263, 2022.

[27] Dayu Tan, Zhiyuan Yao, Xin Peng, Haiping Ma, Yike Dai, Yansen Su, and Weimin Zhong. Multi-level medical image segmentation network based on multi-scale and context information fusion strategy. IEEE Transactions on Emerging Topics in Computational Intelligence, 8(1):474–487, 2023.

[28] Haonan Wang, Peng Cao, Jinzhu Yang, and Osmar Zaiane. Narrowing the semantic gaps in u-net with learnable skip connections: The case of medical image segmentation. Neural Networks, 178:106546, 2024.

[29] Huiyan Wang, Ruihao Peng, Ming Ying, Fashuai Li, Jiuyi Zhang, Xiaolan Li, Yan Tian, and Guofeng Zhang. Mff-sdd: A bidirectional guidance and multiscale multimodal fusion model for small defect detection in industrial films. IEEE Transactions on Industrial Informatics, 2025.

[30] Wenxuan Wang, Chen Chen, Meng Ding, Hong Yu, Sen Zha, and Jiangyun Li. Transbts: Multimodal brain tumor segmentation using transformer. In Medical Image Computing and Computer Assisted Intervention – MICCAI 2021, pages 109–119, Cham, 2021. Springer International Publishing.

[31] Yikai Wang, Wenbing Huang, Fuchun Sun, Tingyang Xu, Yu Rong, and Junzhou Huang. Deep multimodal fusion by channel exchanging. Advances in neural information processing systems, 33:4835–4845, 2020.

[32] Zhaohu Xing, Tian Ye, Yijun Yang, Guang Liu, and Lei Zhu. Segmamba: Long-range sequential modeling mamba for 3d medical image segmentation. In Medical Image Computing and Computer Assisted Intervention – MICCAI 2024, pages 578–588, Cham, 2024. Springer Nature Switzerland.

[33] Guoan Xu, Juncheng Li, Guangwei Gao, Huimin Lu, Jian Yang, and Dong Yue. Lightweight real-time semantic segmentation network with efficient transformer and cnn. IEEE Transactions on Intelligent Transportation Systems, 24(12):15897–15906, 2023.

[34] Zihui Xue and Radu Marculescu. Dynamic multimodal fusion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2575–2584,

[35] Yuanhang Yang, Shiyi Qi, Wenchao Gu, Chaozheng Wang, Cuiyun Gao, and Zenglin Xu. XMoE: Sparse models with fine-grained and adaptive expert selection. In Findings of the Association for Computational Linguistics: ACL 2024, pages 11664–11674, Bangkok, Thailand, 2024. Association for Computational Linguistics.

[36] Zhiwen Yang, Haowei Chen, Ziniu Qian, Yang Yi, Hui Zhang, Dan Zhao, Bingzheng Wei, and Yan Xu. All-in-one medical image restoration via task-adaptive routing. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 67–77. Springer,

[37] Feng Yu, Jiacheng Cao, Li Liu, and Minghua Jiang. Superlightnet: Lightweight parameter aggregation network for multimodal brain tumor segmentation. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 5197–5206, 2025.

[38] Hong Zhang, Junxue Zhang, Wei Bai, Kai Chen, and Mosharaf Chowdhury. Resilient datacenter load balancing in the wild. In Proceedings of the Conference of the ACM Special Interest Group on Data Communication, pages 253–266, 2017.

[39] Xian Zhang, Zhibin Quan, Qiang Li, Dejun Zhu, and Wankou Yang. Sed: Searching enhanced decoder with switchable skip connection for semantic segmentation. Pattern Recognition, 149:110196, 2024.

[40] Zheng Zhang, Yaqi Xia, Hulin Wang, Donglin Yang, Chuang Hu, Xiaobo Zhou, and Dazhao Cheng. Mpmoe: Memory efficient moe for pre-trained models with adaptive pipeline parallelism. IEEE Transactions on Parallel and Distributed Systems, 35(6):998–1011, 2024.

[41] Meihua Zhou, Jun Feng, Tianlong Zheng, Min Cheng, and Li Yang. Contrast-aware hybrid attention network for medical image segmentation. Information Sciences, page 123000,

[42] Meihua Zhou, Xinyu Tong, Jiarui Zhao, Min Cheng, Li Yang, Lei Tian, and Nan Wan. Dcl-se: Dynamic curriculum learning for spatiotemporal encoding of brain imaging. arXiv preprint arXiv:2511.15151, 2025.

[43] Meihua Zhou, Tianlong Zheng, Zhihua Wu, Nan Wan, and Min Cheng. Damnet: Dynamic mobile architectures for alzheimer’s disease. Computers in Biology and Medicine, 185:109517, 2025.

[44] Sihang Zhou, Dong Nie, Ehsan Adeli, Jianping Yin, Jun Lian, and Dinggang Shen. High-resolution encoder–decoder networks for low-contrast medical image segmentation. IEEE Transactions on Image Processing, 29:461–475, 2019.

[45] Tongxue Zhou. M2gcnet: Multi-modal graph convolution network for precise brain tumor segmentation across multiple mri sequences. IEEE Transactions on Image Processing,

[46] Rongxin Zhu, Azzedine Boukerche, and Qiuling Yang. An efficient secure and adaptive routing protocol based on gmm-hmm-lstm for internet of underwater things. IEEE Internet of Things Journal, 11(9):16491–16504, 2024.

[47] Zhiqin Zhu, Ziyu Wang, Guanqiu Qi, Neal Mazur, Pan Yang, and Yu Liu. Brain tumor segmentation in mri with multi-modality spatial information enhancement and boundary shape correction. Pattern Recognition, 153:110553,