arxiv: 2510.04705 · v4 · submitted 2025-10-06 · 💻 cs.CV

Label-Efficient Cross-Modality Generalization for Liver Segmentation in Multi-Phase MRI

Quang-Khai Bui-Tran, Minh-Toan Dinh, Thanh-Huy Nguyen, Ba-Thinh Lam, Mai-Anh Vu, Ulas Bagci

Pith reviewed 2026-05-18 10:28 UTC · model grok-4.3

classification 💻 cs.CV

keywords liver segmentation · multi-phase MRI · cross-modality generalization · label-efficient learning · cross pseudo supervision · foundation model adaptation · medical image segmentation · multi-vendor MRI

The pith

A fine-tuned foundation 3D model combined with cross pseudo supervision segments livers accurately in multi-phase MRI across vendors using limited labels and no registration.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to build a segmentation system that maintains high performance on multi-phase MRI even when labels exist only for the hepatobiliary phase and the other sequences are unlabeled. It fine-tunes a large pre-trained 3D backbone, adds co-training that generates pseudo-labels across phases, and applies a fixed preprocessing pipeline. The approach matters because liver segmentation supports fibrosis assessment, yet real clinical scans routinely arrive misaligned, with missing phases, and from different scanner vendors. By avoiding any explicit spatial registration step, the method shows that generalization can be learned directly from the mixed labeled and unlabeled data.

Core claim

The model integrates a foundation-scale 3D segmentation backbone that is adapted via fine-tuning, then co-trained with cross pseudo supervision on unlabeled volumes together with a standardized preprocessing pipeline; without requiring spatial registration, this combination produces robust liver segmentation across MRI phases and vendors in both labeled and unlabeled domains.

What carries the argument

Foundation-scale 3D segmentation backbone adapted by fine-tuning and co-trained via cross pseudo supervision on unlabeled multi-phase volumes.
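To make the co-training step concrete, here is a minimal sketch of a cross pseudo supervision loss on unlabeled voxels, in the standard form where each of two networks is trained against the other's hard pseudo-labels. It is an illustration under stated assumptions, not the authors' implementation: the function names (`softmax`, `cross_entropy`, `cps_loss`) and the list-of-logits input format are hypothetical, and a real training loop would use a deep-learning framework with the pseudo-labels detached from the gradient.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one voxel's class logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Cross-entropy of a predicted distribution against a hard label."""
    return -math.log(max(probs[label], 1e-12))

def cps_loss(logits_a, logits_b):
    """Cross pseudo supervision loss averaged over unlabeled voxels.

    logits_a, logits_b: per-voxel class logits from two networks
    (hypothetical inputs) evaluated on the same unlabeled volume.
    """
    total = 0.0
    for la, lb in zip(logits_a, logits_b):
        pa, pb = softmax(la), softmax(lb)
        # Hard pseudo-labels: the argmax class of each network.
        ya = max(range(len(pa)), key=pa.__getitem__)
        yb = max(range(len(pb)), key=pb.__getitem__)
        # Each network is supervised by the other's pseudo-label.
        total += cross_entropy(pa, yb) + cross_entropy(pb, ya)
    return total / len(logits_a)

# Two voxels, two classes (background, liver).
agree = cps_loss([[2.0, 0.0], [0.0, 3.0]], [[1.5, 0.0], [0.0, 2.0]])
disagree = cps_loss([[2.0, 0.0], [0.0, 3.0]], [[0.0, 1.5], [2.0, 0.0]])
print(agree < disagree)
```

The loss is small when the two networks agree and grows when they disagree, which is why noisy pseudo-labels are a concern whenever the networks see inconsistent inputs.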

If this is right

The same backbone and co-training scheme can be applied directly to non-contrast sequences without new annotations.
Segmentation accuracy holds across different scanner vendors without additional domain adaptation modules.
Real-world clinical pipelines can skip the registration step that is often unavailable or error-prone.
Foundation-model adaptation plus cross pseudo supervision forms a practical baseline for other label-scarce medical segmentation tasks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The technique could lower annotation costs enough to make routine multi-phase liver quantification feasible in smaller hospitals.
Similar pseudo-supervision patterns may transfer to other organs where phases or sequences are routinely misaligned.
Testing on longitudinal patient data would reveal whether the model tracks fibrosis progression without re-labeling each visit.

Load-bearing premise

Cross pseudo supervision can still produce reliable training signals from unlabeled MRI volumes even when the volumes are spatially misaligned, have missing phases, and come from different vendors.

What would settle it

The claim would be undermined if the Dice score on held-out non-contrast phases from a new vendor fell below a standard supervised baseline once spatial misalignment exceeds a few millimeters or when one or more phases are absent.
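For reference, the Dice score used in such a test can be computed as in the sketch below. This is a generic illustration, not code from the paper: `dice_score` is a hypothetical helper that takes binary masks flattened to 0/1 sequences, and returning 1.0 when both masks are empty is a convention assumed here.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    if denom == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return 2.0 * intersection / denom

# One of the two predicted voxels overlaps the single ground-truth voxel.
print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
```

Comparing this value between the proposed model and a supervised baseline on the same held-out volumes is the comparison the test above calls for.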

Figures

Figures reproduced from arXiv: 2510.04705 by Ba-Thinh Lam, Mai-Anh Vu, Minh-Toan Dinh, Quang-Khai Bui-Tran, Thanh-Huy Nguyen, Ulas Bagci.

**Figure 2.** Qualitative results on GED4 under different pretraining weights and preprocessing pipelines.

**Figure 3.** Qualitative results on GED4 using different semi-supervised methods. Our models are designed to address multiple subtasks concurrently.

Original abstract

Accurate liver segmentation in multi-phase MRI is vital for liver fibrosis assessment, yet labeled data is often scarce and unevenly distributed across imaging modalities and vendor systems. We propose a label-efficient segmentation approach that promotes cross-modality generalization under real-world conditions, where GED4 hepatobiliary-phase annotations are limited, non-contrast sequences (T1WI, T2WI, DWI) are unlabeled, and spatial misalignment and missing phases are common. Our method integrates a foundation-scale 3D segmentation backbone adapted via fine-tuning, co-training with cross pseudo supervision to leverage unlabeled volumes, and a standardized preprocessing pipeline. Without requiring spatial registration, the model learns to generalize across MRI phases and vendors, demonstrating robust segmentation performance in both labeled and unlabeled domains. Our results exhibit the effectiveness of our proposed label-efficient baseline for liver segmentation in multi-phase, multi-vendor MRI and highlight the potential of combining foundation model adaptation with co-training for real-world clinical imaging tasks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This adapts a foundation model with cross pseudo supervision for label-efficient liver segmentation across multi-phase multi-vendor MRI without registration, but the abstract gives no numbers to show whether the pseudo labels actually help or hurt under misalignment.

Colleague,

The punchline for this paper is that it offers a label-efficient method for liver segmentation in multi-phase MRI by fine-tuning a foundation model and using cross pseudo supervision on unlabeled volumes from different phases and vendors, all without needing spatial registration. This directly tackles the scarcity of annotations and the common issues of misalignment and missing data in clinical settings.

What the paper does well is to clearly describe how these pieces fit together: adapting the 3D segmentation backbone, co-training with pseudo labels to use the unlabeled T1WI, T2WI, and DWI data, and applying a standardized preprocessing pipeline. This combination allows generalization across phases and vendors, which is a practical step forward for tasks like fibrosis assessment where full labeled multi-phase datasets are rare. The authors engage honestly with the real-world constraints rather than assuming ideal conditions.

The soft spots are mainly around the evidence. The abstract claims robust performance but provides no specific metrics, baselines, or ablation studies, making it hard to evaluate if the approach truly improves over simpler fine-tuning or if the pseudo supervision adds value. The stress-test concern about pseudo labels being spatially inconsistent due to misalignment is important here. Without registration, the supervision signals could be noisy, especially with vendor and sequence shifts, potentially limiting the label efficiency. If the full paper includes experiments that measure this and show net benefits, that would be key; otherwise, it remains a question mark.

This work is for researchers in medical computer vision and clinical imaging teams dealing with multi-modal MRI data under limited labeling budgets. A reader focused on practical deployment in hospitals would appreciate the pipeline details and the emphasis on no-registration generalization.

It deserves serious peer review because the problem is clinically meaningful and the method is a reasonable extension of known techniques, even if the current presentation needs more empirical support to be fully convincing. I recommend engaging with it through review to see the detailed results and any additional validations.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a label-efficient segmentation method for liver in multi-phase MRI under real-world constraints: limited GED4 hepatobiliary-phase labels, unlabeled T1WI/T2WI/DWI volumes, spatial misalignment, missing phases, and multi-vendor shifts. The approach fine-tunes a foundation-scale 3D segmentation backbone, applies co-training with cross pseudo supervision on the unlabeled data, and uses a standardized preprocessing pipeline. It claims that the model generalizes across phases and vendors without any spatial registration step and achieves robust performance in both labeled and unlabeled domains.

Significance. If the empirical claims hold with proper validation, the work would supply a practical, registration-free baseline for label-efficient liver segmentation in heterogeneous clinical MRI datasets. The combination of foundation-model adaptation and cross pseudo supervision addresses a common pain point in multi-phase, multi-vendor imaging where labeled hepatobiliary data are scarce.

major comments (2)

[Abstract and §4 (Results)] The central claim of 'robust segmentation performance' and 'effectiveness of our proposed label-efficient baseline' is stated without any quantitative metrics (Dice, HD95, ASD), baseline comparisons, dataset sizes, or error bars. This absence leaves the label-efficiency and cross-modality generalization assertions without visible empirical support.
[§3.2 (Cross Pseudo Supervision)] The load-bearing assumption that cross pseudo supervision can generate reliable signals from misaligned, missing-phase, multi-vendor unlabeled volumes without registration is not accompanied by an ablation or pseudo-label quality analysis. In the presence of spatial inconsistency between phases, noisy pseudo-labels could degrade rather than improve generalization; a controlled experiment isolating this component is required to substantiate the claim.

minor comments (2)

[§3.1] The preprocessing pipeline is described as 'standardized' but the exact intensity normalization, resampling, and cropping parameters are not listed; these details should be provided for reproducibility.
[Figures] Figure captions and axis labels in the results figures should explicitly state the evaluation metric, number of test cases, and whether the reported values are means over cross-validation folds.
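To illustrate the kind of detail the first minor comment asks for, the sketch below shows one common way to specify intensity normalization: clip to a percentile window, then z-score. The paper's actual parameters are not given in the material above, so everything here is an assumption made for illustration, including the helper names (`percentile`, `normalize_intensities`) and the 0.5/99.5 percentile window. Resampling and cropping are omitted.

```python
import math

def percentile(sorted_vals, q):
    """Linearly interpolated percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

def normalize_intensities(voxels, low_q=0.5, high_q=99.5):
    """Clip to a percentile window, then z-score normalize.

    low_q and high_q are placeholder values, not the paper's settings.
    """
    ordered = sorted(voxels)
    lo, hi = percentile(ordered, low_q), percentile(ordered, high_q)
    clipped = [min(max(v, lo), hi) for v in voxels]
    mean = sum(clipped) / len(clipped)
    var = sum((v - mean) ** 2 for v in clipped) / len(clipped)
    std = math.sqrt(var) or 1.0  # guard against constant volumes
    return [(v - mean) / std for v in clipped]

# A toy "volume" with one bright outlier voxel.
normalized = normalize_intensities([0.0, 10.0, 20.0, 30.0, 1000.0])
print(len(normalized))
```

Reporting the equivalents of `low_q`, `high_q`, the target voxel spacing, and the crop size would be enough for a reader to reproduce the pipeline.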

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and have revised the manuscript to strengthen the empirical presentation and methodological validation.

Referee: [Abstract and §4 (Results)] The central claim of 'robust segmentation performance' and 'effectiveness of our proposed label-efficient baseline' is stated without any quantitative metrics (Dice, HD95, ASD), baseline comparisons, dataset sizes, or error bars. This absence leaves the label-efficiency and cross-modality generalization assertions without visible empirical support.

Authors: We agree that the abstract would benefit from explicit quantitative support to better convey the claims. The results section (§4) already contains comparative tables reporting Dice, HD95, and ASD across labeled and unlabeled phases/vendors, with baseline comparisons to supervised and semi-supervised methods, dataset sizes, and standard deviations. To address the concern directly, we have revised the abstract to include a concise summary of the key metrics (e.g., mean Dice improvements) and added a consolidated results table with error bars in §4 for immediate visibility.
Revision: yes

Referee: [§3.2 (Cross Pseudo Supervision)] The load-bearing assumption that cross pseudo supervision can generate reliable signals from misaligned, missing-phase, multi-vendor unlabeled volumes without registration is not accompanied by an ablation or pseudo-label quality analysis. In the presence of spatial inconsistency between phases, noisy pseudo-labels could degrade rather than improve generalization; a controlled experiment isolating this component is required to substantiate the claim.

Authors: We recognize the value of isolating the contribution of cross pseudo supervision (CPS) to confirm it does not introduce harmful noise under misalignment. Our design relies on the foundation backbone's robustness and phase-invariant feature learning via CPS. We have added a controlled ablation study in the revised manuscript comparing performance with and without CPS, along with pseudo-label quality metrics (e.g., overlap with available ground truth on validation subsets) and qualitative examples. These results demonstrate consistent gains from CPS without degradation, substantiating the approach.
Revision: yes

Circularity Check

0 steps flagged

No circularity: standard empirical ML pipeline with independent experimental validation

The paper describes an empirical label-efficient segmentation method using fine-tuning of a foundation-scale 3D backbone, cross pseudo supervision on unlabeled multi-phase MRI volumes, and a preprocessing pipeline. No derivation chain, equations, or first-principles result is presented that reduces by construction to fitted inputs or self-citations. Claims of cross-modality generalization rest on experimental outcomes across labeled and unlabeled domains rather than any definitional equivalence or renamed known result. The approach follows conventional semi-supervised and domain-adaptation practices without load-bearing self-citation chains or ansatz smuggling.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The method rests on standard deep-learning assumptions about pseudo-label quality in semi-supervised settings and the transferability of a 3D foundation model to MRI domains; no new entities or ad-hoc parameters are introduced in the abstract.

axioms (2)

domain assumption: Cross pseudo supervision produces reliable supervisory signals from unlabeled volumes despite domain shifts and missing phases.
Invoked when claiming generalization without registration.
domain assumption: A pre-trained 3D segmentation foundation model can be effectively adapted to multi-phase MRI via fine-tuning.
Central to the adaptation step described.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: "We utilize a dual-network training strategy ... Cross Pseudo Supervision Loss: ... $\mathcal{L}^{u}_{cps} = \ldots \ell_{ce}(p_{1i}, y_{2i}) + \ell_{ce}(p_{2i}, y_{1i})$"

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Paper passage: "STU-Net ... pretrained on the TotalSegmentator dataset ... fine-tune it on the ATLAS liver segmentation dataset"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
