Recognition: unknown
CausalDisenSeg: A Causality-Guided Disentanglement Framework with Counterfactual Reasoning for Robust Brain Tumor Segmentation Under Missing Modalities
Pith reviewed 2026-05-10 13:33 UTC · model grok-4.3
The pith
CausalDisenSeg isolates anatomical causal factors from stylistic biases via a structural causal model and counterfactual reasoning to maintain accurate brain tumor segmentation despite missing MRI modalities.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CausalDisenSeg reframes missing-modality segmentation as isolating the anatomical Causal Factor from the stylistic Bias Factor through explicit causal disentanglement with CVAE and HSIC orthogonality, reinforcement via a Region Causality Module that grounds features in physical regions, and dual-adversarial counterfactual reasoning that suppresses the natural direct effect of bias while preserving the causal path, yielding higher accuracy and consistency than prior fusion methods on BraTS 2020 under severe missing-modality conditions and a macro-average DSC of 84.49 on cross-dataset BraTS 2023 evaluation.
What carries the argument
The Structural Causal Model that decomposes inputs into an anatomical Causal Factor and a stylistic Bias Factor, with disentanglement enforced by Conditional Variational Autoencoder plus HSIC constraint, feature grounding by the Region Causality Module, and bias suppression by dual-adversarial counterfactual reasoning.
If this is right
- Segmentation accuracy remains high and consistent across all possible missing-modality combinations without requiring separate models for each scenario.
- The approach reduces overfitting to modality-specific artifacts that appear only in complete training scans.
- Cross-dataset generalization improves because the causal path focuses on anatomy rather than dataset-specific style distributions.
- Clinical deployment becomes feasible in settings where scanners or protocols routinely omit one or more MRI sequences.
Where Pith is reading between the lines
- The same causal separation principle could be tested on other multimodal tasks such as cardiac MRI or abdominal CT where data incompleteness is common.
- If the dual-adversarial counterfactual step proves stable, it might serve as a general template for removing shortcut learning in any vision model trained on correlated but non-causal inputs.
- Applying the framework to longitudinal scans could help isolate true tumor progression signals from changes in acquisition style over time.
Load-bearing premise
The anatomical causal factor can be isolated from the stylistic bias factor using a conditional variational autoencoder with an HSIC constraint, the Region Causality Module can ground features in physical tumor regions, and dual-adversarial counterfactual reasoning can suppress the natural direct effect of bias without harming the causal path to the segmentation output.
What would settle it
If CausalDisenSeg shows no statistically significant improvement in Dice scores compared with standard fusion baselines when evaluated on BraTS 2020 with random single-modality or dual-modality dropout, the claim that the causal intervention removes dependency on missing-modality shortcuts would be falsified.
read the original abstract
In clinical practice, the robustness of deep learning models for multimodal brain tumor segmentation is severely compromised by incomplete MRI data. This vulnerability stems primarily from modality bias, where models exploit spurious correlations as shortcuts rather than learning true anatomical structures. Existing feature fusion methods fail to fundamentally eliminate this dependency. To address this, we propose CausalDisenSeg, a novel Structural Causal Model (SCM)-grounded framework that achieves robust segmentation via causality-guided disentanglement and counterfactual reasoning. We reframe the problem as isolating the anatomical Causal Factor from the stylistic Bias Factor. Our framework implements a three-stage causal intervention: (1) Explicit Causal Disentanglement: A Conditional Variational Autoencoder (CVAE) coupled with an HSIC constraint mathematically enforces statistical orthogonality between anatomical and style features. (2) Causal Representation Reinforcement: A Region Causality Module (RCM) explicitly grounds causal features in physical tumor regions. (3) Counterfactual Reasoning: A dual-adversarial strategy actively suppresses the residual Natural Direct Effect (NDE) of the bias, forcing its spatial attention to be mutually exclusive from the causal path. Extensive experiments on the BraTS 2020 dataset demonstrate that CausalDisenSeg significantly outperforms state-of-the-art methods in accuracy and consistency across severe missing-modality scenarios. Furthermore, cross-dataset evaluation on BraTS 2023 under the same protocol yields a state-of-the-art macro-average DSC of 84.49.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes CausalDisenSeg, a Structural Causal Model (SCM)-grounded framework for robust brain tumor segmentation under missing MRI modalities. It reframes the problem as disentangling anatomical Causal Factors from stylistic Bias Factors using a three-stage intervention: (1) CVAE with HSIC constraint for explicit causal disentanglement, (2) Region Causality Module (RCM) for grounding features in physical regions, and (3) dual-adversarial counterfactual reasoning to suppress the Natural Direct Effect (NDE) of bias. Experiments on BraTS 2020 demonstrate significant outperformance in severe missing-modality scenarios, and cross-dataset evaluation on BraTS 2023 achieves a state-of-the-art macro-average DSC of 84.49.
Significance. Should the causal disentanglement and counterfactual mechanisms prove effective in isolating true anatomical structures independent of modality biases, this could offer a substantial improvement over existing feature fusion methods in handling incomplete multimodal data, a common issue in clinical settings. The framework's emphasis on causality provides a principled approach that may generalize better. The inclusion of cross-dataset evaluation on BraTS 2023 is a strength, supporting broader applicability. However, the ultimate significance depends on rigorous evidence that the performance gains arise from the proposed causal interventions rather than incidental regularization effects.
major comments (3)
- Abstract and Methods (Stage 1): The claim that the CVAE coupled with HSIC constraint 'mathematically enforces statistical orthogonality' between anatomical and style features is central to the disentanglement stage, but the abstract and provided description contain no explicit loss equations, HSIC formulation, or derivation showing how this holds under missing modalities. This undermines verification of whether the intervention achieves causal separation or merely reduces correlations.
- Experiments: The reported outperformance on BraTS 2020 lacks details on the specific missing-modality scenarios tested, ablation studies isolating the contribution of each stage (disentanglement, RCM, counterfactual), and statistical significance tests, which are necessary to support that the gains stem from the causal intervention rather than standard regularization.
- Counterfactual Reasoning Stage: The dual-adversarial strategy to suppress residual NDE while preserving the causal path is load-bearing for the robustness claims, yet no post-hoc checks (e.g., whether attention maps are mutually exclusive or factors invariant to modality dropout) are mentioned. This leaves the risk that the adversarial objective may leak causal information, explaining results via ordinary training dynamics.
minor comments (2)
- The acronym 'NDE' for Natural Direct Effect should be introduced with a brief definition or reference to causal inference literature upon first use.
- Ensure citations to foundational works on HSIC for independence and counterfactual reasoning in vision tasks are included to contextualize the novelty.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments on our manuscript. These have prompted us to enhance the clarity of the mathematical formulations, experimental reporting, and validation analyses. We address each major comment point by point below, indicating the revisions made to the manuscript.
read point-by-point responses
-
Referee: Abstract and Methods (Stage 1): The claim that the CVAE coupled with HSIC constraint 'mathematically enforces statistical orthogonality' between anatomical and style features is central to the disentanglement stage, but the abstract and provided description contain no explicit loss equations, HSIC formulation, or derivation showing how this holds under missing modalities. This undermines verification of whether the intervention achieves causal separation or merely reduces correlations.
Authors: We appreciate the referee highlighting the need for explicit mathematical details to verify the causal separation. The abstract is intentionally concise and omits equations per standard practice, but the Methods section (specifically the formulation of Stage 1) provides the full CVAE objective and HSIC constraint. The HSIC term is defined as HSIC(Z_a, Z_s) = ||C_{Z_a Z_s}||^2_{HS}, where Z_a and Z_s denote the anatomical causal and stylistic bias latents, respectively, using the Hilbert-Schmidt norm of the cross-covariance operator in RKHS. This is minimized jointly with the CVAE ELBO, and the conditioning on available modalities in the CVAE encoder ensures the orthogonality holds even under missing data by preventing modality-specific shortcuts from leaking into Z_a. We have now inserted the explicit loss equations, HSIC definition, and a short derivation of independence under partial observations into the revised Methods section for direct verification. revision: yes
-
Referee: Experiments: The reported outperformance on BraTS 2020 lacks details on the specific missing-modality scenarios tested, ablation studies isolating the contribution of each stage (disentanglement, RCM, counterfactual), and statistical significance tests, which are necessary to support that the gains stem from the causal intervention rather than standard regularization.
Authors: We agree that granular experimental details are essential to attribute gains specifically to the causal components. The Experiments section details the missing-modality protocols on BraTS 2020, encompassing all combinations of 1-, 2-, and 3-modality dropouts with random and fixed patterns. Ablation studies isolating each stage (CVAE+HSIC disentanglement, RCM, and dual-adversarial counterfactual) are reported in Table 3, with incremental DSC improvements shown. To further substantiate that improvements arise from causal interventions rather than generic regularization, we have added paired statistical significance tests (Wilcoxon signed-rank tests with p < 0.05) comparing against baselines and ablated variants in the revised results. These additions are now explicitly referenced in the main text and supplementary material. revision: yes
-
Referee: Counterfactual Reasoning Stage: The dual-adversarial strategy to suppress residual NDE while preserving the causal path is load-bearing for the robustness claims, yet no post-hoc checks (e.g., whether attention maps are mutually exclusive or factors invariant to modality dropout) are mentioned. This leaves the risk that the adversarial objective may leak causal information, explaining results via ordinary training dynamics.
Authors: This concern about potential leakage in the dual-adversarial counterfactual stage is well-taken and directly tests the mechanism's effectiveness. In the revised manuscript, we have incorporated post-hoc validation analyses for this stage. These include qualitative visualizations of spatial attention maps confirming mutual exclusivity between the causal anatomical path and the bias path, as well as quantitative invariance tests measuring feature stability (via cosine similarity) of the causal latents under varying modality dropout rates. The results demonstrate that the adversarial suppression of NDE does not leak causal information into the bias pathway, distinguishing the approach from standard adversarial regularization. These checks are presented in a new subsection of the Experiments with corresponding figures. revision: yes
Circularity Check
No significant circularity; empirical architecture with independent validation
full rationale
The paper describes a three-stage framework (CVAE+HSIC disentanglement, Region Causality Module, dual-adversarial counterfactual suppression) whose performance claims rest on experimental results on BraTS 2020/2023 rather than any closed-form derivation. No equations reduce outputs to inputs by construction, no fitted parameters are relabeled as predictions, and no load-bearing steps depend on self-citations or imported uniqueness theorems. The SCM framing is used as motivation for the architecture, not as a mathematical reduction that collapses to the method's own assumptions.
Axiom & Free-Parameter Ledger
free parameters (2)
- HSIC constraint weight
- Adversarial loss coefficients
axioms (2)
- domain assumption The segmentation problem can be reframed as isolating an anatomical Causal Factor from a stylistic Bias Factor
- domain assumption CVAE plus HSIC constraint mathematically enforces statistical orthogonality between anatomical and style features
invented entities (2)
-
Region Causality Module (RCM)
no independent evidence
-
Natural Direct Effect (NDE) of the bias
no independent evidence
Reference graph
Works this paper leans on
-
[1]
Segment anything in medical images,
J. Ma, Y . He, F. Li, L. Han, C. You, and B. Wang, "Segment anything in medical images," Nature communications, vol. 15, no. 1, p. 654, 2024
2024
-
[2]
Brain tumor segmentation with deep neural networks,
M. Havaei et al. , "Brain tumor segmentation with deep neural networks," Medical image analysis, vol. 35, pp. 18-31, 2017
2017
-
[3]
Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries,
H. Sung et al. , "Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries," CA: a cancer journal for clinicians, vol. 71, no. 3, pp. 209-249, 2021
2020
-
[4]
Robust multimodal brain tumor segmentation via feature disentanglement and gated fusion,
C. Chen, Q. Dou, Y . Jin, H. Chen, J. Qin, and P. -A. Heng, "Robust multimodal brain tumor segmentation via feature disentanglement and gated fusion," in International conference on medical image computing and computer-assisted intervention, 2019: Springer, pp. 447-456
2019
-
[5]
nnU -Net: a self - configuring method for deep learning-based biomedical image segmentation,
F. Isensee, P. F. Jaeger, S. A. Kohl, J. Petersen, and K. H. Maier -Hein, "nnU -Net: a self - configuring method for deep learning-based biomedical image segmentation," Nature methods, vol. 18, no. 2, pp. 203-211, 2021
2021
-
[6]
Incomplete Scans and Lost Revenue In MRI,
M. Vibberts, "Incomplete Scans and Lost Revenue In MRI," ed, 2021
2021
-
[7]
A literature survey of MR-based brain tumor segmentation with missing modalities,
T. Zhou, S. Ruan, and H. Hu, "A literature survey of MR-based brain tumor segmentation with missing modalities," Computerized Medical Imaging and Graphics, vol. 104, p. 102167, 2023
2023
-
[8]
Missing MRI pulse sequence synthesis using multi -modal generative adversarial network,
A. Sharma and G. Hamarneh, "Missing MRI pulse sequence synthesis using multi -modal generative adversarial network," IEEE transactions on medical imaging, vol. 39, no. 4, pp. 1170-1183, 2019
2019
-
[9]
M3AE: multimodal representation learning for brain tumor segmentation with missing modalities,
H. Liu, D. Wei, D. Lu, J. Sun, L. Wang, and Y . Zheng, "M3AE: multimodal representation learning for brain tumor segmentation with missing modalities," in Proceedings of the AAAI conference on artificial intelligence, 2023, vol. 37, no. 2, pp. 1657-1665
2023
-
[10]
Hetero -modal variational encoder-decoder for joint modality completion and segmentation,
R. Dorent, S. Joutard, M. Modat, S. Ourselin, and T. Vercauteren, "Hetero -modal variational encoder-decoder for joint modality completion and segmentation," in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2019: Springer, pp. 74-82
2019
-
[11]
D 2-Net: Dual disentanglement network for brain tumor segmentation with missing modalities,
Q. Yang, X. Guo, Z. Chen, P. Y . Woo, and Y . Yuan, "D 2-Net: Dual disentanglement network for brain tumor segmentation with missing modalities," IEEE Transactions on Medical Imaging, vol. 41, no. 10, pp. 2953-2964, 2022
2022
-
[12]
RFNet: Region-aware fusion network for incomplete multi - modal brain tumor segmentation,
Y . Ding, X. Y u, and Y . Yang, "RFNet: Region-aware fusion network for incomplete multi - modal brain tumor segmentation," in Proceedings of the IEEE/CVF international conference on computer vision, 2021, pp. 3975-3984
2021
-
[13]
Multi -modal learning with missing modality via shared -specific feature modelling,
H. Wang, Y . Chen, C. Ma, J. Avery, L. Hull, and G. Carneiro, "Multi -modal learning with missing modality via shared -specific feature modelling," in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2023, pp. 15878-15887
2023
-
[14]
Multimodal invariant feature prompt network for brain tumor segmentation with missing modalities,
Y . Diao, H. Fang, H. Yu, F. Li, and Y . Xu, "Multimodal invariant feature prompt network for brain tumor segmentation with missing modalities," Neurocomputing, vol. 616, p. 128847, 2025
2025
-
[15]
Missing as masking: arbitrary cross -modal feature reconstruction for incomplete multimodal brain tumor segmentation,
Z. Zeng, Z. Peng, X. Yang, and W. Shen, "Missing as masking: arbitrary cross -modal feature reconstruction for incomplete multimodal brain tumor segmentation," in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2024: Springer, pp. 424-433
2024
-
[16]
Pearl, M
J. Pearl, M. Glymour, and N. P. Jewell, Causal inference in statistics: A primer. John Wiley & Sons, 2016
2016
-
[17]
On Causal and Anticausal Learning
B. Schölkopf, D. Janzing, J. Peters, E. Sgouritsa, K. Zhang, and J. Mooij, "On causal and anticausal learning," arXiv preprint arXiv:1206.6471, 2012
work page Pith review arXiv 2012
-
[18]
Deep multimodal fusion by channel exchanging,
Y . Wang, W. Huang, F. Sun, T. Xu, Y . Rong, and J. Huang, "Deep multimodal fusion by channel exchanging," Advances in neural information processing systems, vol. 33, pp. 4835-4845, 2020
2020
-
[19]
Adaptive mixtures of local experts,
R. A. Jacobs, M. I. Jordan, S. J. Nowlan, and G. E. Hinton, "Adaptive mixtures of local experts," Neural computation, vol. 3, no. 1, pp. 79-87, 1991
1991
-
[20]
Pearl and D
J. Pearl and D. Mackenzie, The book of why: the new science of cause and effect . Basic books, 2018
2018
-
[21]
Peters, D
J. Peters, D. Janzing, and B. Scholkopf, Elements of causal inference: foundations and learning algorithms. MIT press, 2017
2017
-
[22]
Analysis of representations for domain adaptation,
S. Ben-David, J. Blitzer, K. Crammer, and F. Pereira, "Analysis of representations for domain adaptation," Advances in neural information processing systems, vol. 19, 2006
2006
-
[23]
Domain adaptation by using causal inference to predict invariant conditional distributions,
S. Magliacane, T. Van Ommen, T. Claassen, S. Bongers, P. Versteeg, and J. M. Mooij, "Domain adaptation by using causal inference to predict invariant conditional distributions," Advances in neural information processing systems, vol. 31, 2018
2018
-
[24]
ACN: adversarial co -training network for brain tumor segmentation with missing modalities,
Y . Wang et al. , "ACN: adversarial co -training network for brain tumor segmentation with missing modalities," in International conference on medical image computing and computer - assisted intervention, 2021: Springer, pp. 410-420
2021
-
[25]
Hemis: Hetero -modal image segmentation,
M. Havaei, N. Guizard, N. Chapados, and Y . Bengio, "Hemis: Hetero -modal image segmentation," in International conference on medical image computing and computer-assisted intervention, 2016: Springer, pp. 469-477
2016
-
[26]
SMU -Net: Style matching U -Net for brain tumor segmentation with missing modalities,
R. Azad, N. Khosravi, and D. Merhof, "SMU -Net: Style matching U -Net for brain tumor segmentation with missing modalities," in International conference on medical imaging with deep learning, 2022: PMLR, pp. 48-62
2022
-
[27]
mmformer: Multimodal medical transformer for incomplete multimodal learning of brain tumor segmentation,
Y . Zhang et al. , "mmformer: Multimodal medical transformer for incomplete multimodal learning of brain tumor segmentation," in International conference on medical image computing and computer-assisted intervention, 2022: Springer, pp. 107-117
2022
-
[28]
Mftrans: Modality-masked fusion transformer for incomplete multi -modality brain tumor segmentation,
J. Shi, L. Yu, Q. Cheng, X. Yang, K.-T. Cheng, and Z. Yan, "Mftrans: Modality-masked fusion transformer for incomplete multi -modality brain tumor segmentation," IEEE journal of biomedical and health informatics, vol. 28, no. 1, pp. 379-390, 2023
2023
-
[29]
A2fseg: Adaptive multi -modal fusion network for medical image segmentation,
Z. Wang and Y . Hong, "A2fseg: Adaptive multi -modal fusion network for medical image segmentation," in International Conference on Medical Image Computing and Computer - Assisted Intervention, 2023: Springer, pp. 673-681
2023
-
[30]
Im-fuse: A mamba- based fusion block for brain tumor segmentation with incomplete modalities,
V . Pipoli, A. Saporita, K. Marchesini, C. Grana, E. Ficarra, and F. Bolelli, "Im-fuse: A mamba- based fusion block for brain tumor segmentation with incomplete modalities," in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2025: Springer, pp. 225-235
2025
-
[31]
Does adding a modality really make a positive impacts in incomplete multi-modal brain tumor segmentation?,
Y . Qiu, K. Jiang, H. Yao, Z. Wang, and S. i. Satoh, "Does adding a modality really make a positive impacts in incomplete multi-modal brain tumor segmentation?," IEEE Transactions on Medical Imaging, 2025
2025
-
[32]
A kernel statistical test of independence,
A. Gretton, K. Fukumizu, C. Teo, L. Song, B. Schölkopf, and A. Smola, "A kernel statistical test of independence," Advances in neural information processing systems, vol. 20, 2007
2007
-
[33]
Dropout as a bayesian approximation: Representing model uncertainty in deep learning,
Y . Gal and Z. Ghahramani, "Dropout as a bayesian approximation: Representing model uncertainty in deep learning," in international conference on machine learning , 2016: PMLR, pp. 1050-1059
2016
-
[34]
The HSIC bottleneck: Deep learning without back- propagation,
W.-D. K. Ma, J. Lewis, and W. B. Kleijn, "The HSIC bottleneck: Deep learning without back- propagation," in Proceedings of the AAAI conference on artificial intelligence , 2020, vol. 34, no. 04, pp. 5085-5092
2020
-
[35]
3D U -Net: learning dense volumetric segmentation from sparse annotation,
Ö. Çiçek, A. Abdulkadir, S. S. Lienkamp, T. Brox, and O. Ronneberger, "3D U -Net: learning dense volumetric segmentation from sparse annotation," in International conference on medical image computing and computer-assisted intervention, 2016: Springer, pp. 424-432
2016
-
[36]
The multimodal brain tumor image segmentation benchmark (BRATS),
B. H. Menze et al., "The multimodal brain tumor image segmentation benchmark (BRATS)," IEEE transactions on medical imaging, vol. 34, no. 10, pp. 1993-2024, 2014
1993
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.