pith. machine review for the scientific record.

arXiv: 2604.15893 · v1 · submitted 2026-04-17 · 💻 cs.CV

Recognition: unknown

PolarMAE: Efficient Fetal Ultrasound Pre-training via Semantic Screening and Polar-Guided Masking

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 08:59 UTC · model grok-4.3

classification 💻 cs.CV
keywords fetal ultrasound · pre-training · self-supervised learning · masked autoencoders · semantic screening · polar masking · medical image analysis · ultrasound imaging

The pith

PolarMAE tailors masked autoencoding for fetal ultrasound by using semantic screening to cut redundancy and polar-guided masking to focus on acoustic regions and radial patterns.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes PolarMAE as a pre-training framework for fetal ultrasound images that accounts for modality-specific traits ignored by standard methods. It introduces progressive visual-semantic screening to adaptively select high-value samples from redundant continuous scans, an acoustic-bounded region constraint to restrict attention to valid fan-shaped areas, and polar-texture collaborative masking to leverage beamforming priors for learning tissue structures. These changes aim to make unsupervised pre-training both faster and more effective for tasks where labeled data is scarce due to high annotation costs and operator variance. A sympathetic reader would care because successful adaptation could reduce reliance on expensive expert labels while improving interpretation accuracy in prenatal diagnosis.
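
As a concrete illustration of the screening idea, the sketch below greedily filters a continuous scan, keeping a frame only when its feature similarity to the most recently kept frame drops below a threshold. The frozen encoder, the cosine criterion, and the fixed threshold are all assumptions made here for illustration; the material above does not disclose PVSS's actual selection rule, and its "progressive" qualifier suggests a criterion that evolves during training, which this static filter does not capture.

```python
import numpy as np

def screen_redundant_frames(embeddings: np.ndarray, sim_thresh: float = 0.95) -> list[int]:
    """Greedy redundancy screening over a continuous ultrasound scan.

    `embeddings` is an (N, D) array of per-frame features from any frozen
    encoder. A frame is kept only if its cosine similarity to the most
    recently kept frame falls below `sim_thresh`. This is an illustrative
    stand-in for PVSS, not the paper's method.
    """
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    kept = [0]  # always keep the first frame
    for i in range(1, len(normed)):
        if normed[i] @ normed[kept[-1]] < sim_thresh:
            kept.append(i)
    return kept

# Example: 300 scan frames with 256-dim features -> indices of retained frames.
frames = np.random.randn(300, 256).astype(np.float32)
print(f"kept {len(screen_redundant_frames(frames))}/{len(frames)} frames")
```

On a real scan the threshold trades pre-training cost against coverage; any adaptive or semantic weighting PVSS applies would replace the fixed cutoff used here.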

Core claim

PolarMAE is a pre-training framework that mitigates severe data redundancy through Progressive Visual-Semantic Screening, enforces focus on valid acoustic regions via an Acoustic-Bounded Region Constraint, and captures radial imaging patterns with Polar-Texture Collaborative Masking. The claimed result is state-of-the-art performance across diverse fetal ultrasound datasets and downstream interpretation tasks, along with improved pre-training scalability and efficiency.

What carries the argument

Polar-Texture Collaborative Masking (PTCM), Progressive Visual-Semantic Screening (PVSS), and the Acoustic-Bounded Region Constraint (ABRC), which together adapt a masked autoencoder to ultrasound's data redundancy, fan-shaped locality, and polar-coordinate beamforming.
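
To make the polar component concrete, the sketch below masks whole radial beams on a ViT-style patch grid instead of sampling patches independently, assuming the probe apex sits just above the top centre of the image. Both the apex placement and the beam-sector construction are assumptions for illustration (the referee report below questions exactly this geometry), and the paper's actual PTCM also incorporates texture cues that this sketch omits.

```python
import numpy as np

def polar_beam_mask(grid_h: int, grid_w: int, n_beams: int = 16,
                    mask_ratio: float = 0.75, seed: int = 0) -> np.ndarray:
    """Mask whole radial beams rather than independent patches.

    Assumes the probe apex lies a quarter grid-height above the top centre,
    which is only an approximation; real scan-converted images would need
    the true probe geometry. Returns a (grid_h, grid_w) boolean array where
    True marks a masked patch.
    """
    rng = np.random.default_rng(seed)
    ys, xs = np.mgrid[0:grid_h, 0:grid_w].astype(float)
    apex_y, apex_x = -0.25 * grid_h, (grid_w - 1) / 2.0   # assumed apex
    theta = np.arctan2(xs - apex_x, ys - apex_y)          # beam angle per patch
    edges = np.linspace(theta.min(), theta.max(), n_beams + 1)[1:-1]
    beams = np.digitize(theta, edges)                     # sector id in [0, n_beams)
    masked = rng.choice(n_beams, size=round(mask_ratio * n_beams), replace=False)
    return np.isin(beams, masked)

mask = polar_beam_mask(14, 14)   # ViT-Base at 224px / patch 16 -> 14x14 grid
print(f"masked ~{mask.mean():.0%} of patches along radial beams")
```

Because entire beams disappear, reconstruction has to interpolate across arcs and extrapolate along rays, which is one plausible mechanism for learning the radial structure the paper targets.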

If this is right

  • The framework reduces the effects of data redundancy and operator variance during pre-training.
  • Models achieve higher accuracy on fetal ultrasound interpretation tasks with fewer labeled examples.
  • Pre-training becomes more scalable and computationally efficient across multiple datasets.
  • The approach enables better capture of radial beamforming patterns and critical tissue structures.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The semantic screening step could be tested on other high-redundancy medical imaging streams such as continuous video or CT slices.
  • Polar-guided masking might generalize to other fan-shaped modalities like certain radar or echocardiogram data.
  • If the acoustic region constraint proves robust, it could be adapted as a general locality prior for any imaging geometry with known invalid background zones (one reading of this is sketched after the list).
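
One way to read ABRC as a general locality prior is to weight the reconstruction loss by a binary validity mask over the fan region. Everything geometric below (apex position, angular half-width, depth limit) is an illustrative placeholder; a real system would take these from probe metadata, and the paper may enforce the constraint elsewhere, for example in attention rather than in the loss.

```python
import torch

def fan_region_mask(h: int, w: int, apex=(-0.25, 0.5),
                    half_angle_deg: float = 35.0, r_max_frac: float = 1.2) -> torch.Tensor:
    """Binary validity mask for an assumed fan-shaped acoustic region.

    `apex` is given as fractions of (h, w); pixels outside the angular span
    or beyond the maximum depth are marked invalid. All parameter values
    here are illustrative, not taken from the paper.
    """
    ys = torch.arange(h, dtype=torch.float32).unsqueeze(1).expand(h, w)
    xs = torch.arange(w, dtype=torch.float32).unsqueeze(0).expand(h, w)
    dy, dx = ys - apex[0] * h, xs - apex[1] * w
    theta = torch.rad2deg(torch.atan2(dx, dy).abs())   # angle from the central ray
    r = torch.sqrt(dx ** 2 + dy ** 2)                  # distance from the apex
    return (theta <= half_angle_deg) & (r <= r_max_frac * h)

def masked_reconstruction_loss(pred, target, valid):
    """MSE restricted to valid acoustic pixels; background contributes nothing."""
    se = (pred - target) ** 2 * valid
    return se.sum() / valid.sum().clamp(min=1)

valid = fan_region_mask(224, 224)
loss = masked_reconstruction_loss(torch.randn(224, 224), torch.randn(224, 224), valid)
```

The same two-line loss change applies to any modality where the invalid zone is known a priori, which is what makes the extrapolation in the bullet above plausible.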

Load-bearing premise

Existing pre-training methods are limited mainly because they ignore ultrasound-specific characteristics of severe data redundancy, fan-shaped locality, and polar coordinate beamforming.

What would settle it

A direct comparison in which a standard masked autoencoder pre-trained on the same fetal ultrasound datasets matches or exceeds PolarMAE performance on the same downstream interpretation tasks.

Figures

Figures reproduced from arXiv: 2604.15893 by Bo Du, Hang Su, Juhua Liu, Meng Lv, Yapeng Li.

Figure 1. Motivation of PolarMAE. Generic MIM methods (b) mismatch the unique physical characteristics of ultrasound data.
Figure 2. Overview of the proposed PolarMAE framework. Our pre-training method consists of three stages. First, semantic de…
Figure 3. Qualitative results on private downstream segmentation…
Figure 6. Training efficiency with progressive module introduction…
read the original abstract

Intelligent fetal ultrasound (US) interpretation is crucial for prenatal diagnosis, but high annotation costs and operator-induced variance make unsupervised pre-training a highly promising paradigm. However, existing pre-training methods largely ignore US-specific characteristics -- severe data redundancy, fan-shaped locality, and polar coordinate beamforming -- limiting their effectiveness in downstream tasks. To address this, we propose PolarMAE, a novel and efficient pre-training framework tailored for US images. Specifically, to mitigate continuous scanning redundancy, we introduce a Progressive Visual-Semantic Screening (PVSS) that adaptively extracts high-value samples, significantly boosting pre-training efficiency. Furthermore, we design an Acoustic-Bounded Region Constraint (ABRC) to accommodate US locality, forcing the model to focus strictly on valid acoustic regions rather than invalid dark backgrounds. Finally, leveraging the beamforming prior and local details, we propose a Polar-Texture Collaborative Masking (PTCM), enabling the model to capture underlying radial imaging patterns and critical tissue structures. Extensive experiments across diverse datasets and downstream interpretation tasks demonstrate that our method achieves state-of-the-art performance with strong pre-training scalability and efficiency.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes PolarMAE, a self-supervised pre-training framework for fetal ultrasound images that addresses data redundancy, fan-shaped locality, and polar beamforming via three components: Progressive Visual-Semantic Screening (PVSS) to adaptively select high-value samples, Acoustic-Bounded Region Constraint (ABRC) to restrict focus to valid acoustic regions, and Polar-Texture Collaborative Masking (PTCM) to capture radial imaging patterns. It claims state-of-the-art results on diverse downstream interpretation tasks with improved pre-training efficiency and scalability.

Significance. If the empirical results and attribution to the US-specific components hold, the work would be significant for domain-adapted self-supervised learning in medical imaging, as it targets ultrasound-specific challenges that generic MAE-style methods overlook, potentially enabling more efficient use of large unlabeled US datasets for prenatal diagnosis tasks.

major comments (2)
  1. PTCM description (and related ABRC): The claim that PTCM captures 'underlying radial imaging patterns' via beamforming priors rests on the assumption that native polar geometry is preserved. Standard public fetal US datasets use post-scan-conversion Cartesian grids, so polar masking requires an explicit inverse transform with probe parameters (depth, angle, curvature). The manuscript does not specify whether DICOM metadata is used or an approximation is applied. Without this, the polar-guided component risks reducing to heuristic masking whose benefit over random or block masking is unclear, which undermines attribution of any SOTA gains to the proposed US-specific design rather than to generic MAE improvements plus PVSS. (A sketch of the required inverse transform follows this report.)
  2. Experiments section: The abstract and central claim assert SOTA performance and strong scalability, yet the provided description contains no quantitative metrics, ablation tables, error analysis, or direct comparisons showing the contribution of PVSS, ABRC, and PTCM individually. This creates a verification gap for the soundness of the empirical claims.
minor comments (1)
  1. Abstract: Claims of 'state-of-the-art performance' and 'strong pre-training scalability and efficiency' would be strengthened by including at least one key quantitative result (e.g., mIoU or Dice improvement on a downstream task) to allow immediate assessment.
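
To make the first major comment concrete, the sketch below performs a nearest-neighbour inverse scan conversion, resampling a Cartesian B-mode frame onto an (r, θ) grid. Every parameter it needs (apex position, angular span, depth range) must come from acquisition metadata such as DICOM tags and cannot be recovered from the pixels alone, which is why the manuscript's silence on geometry handling matters.

```python
import numpy as np

def inverse_scan_conversion(img: np.ndarray, apex_row: float, apex_col: float,
                            half_angle_rad: float, r_min: float, r_max: float,
                            n_r: int = 256, n_theta: int = 128) -> np.ndarray:
    """Resample a scan-converted B-mode image back onto an (r, theta) grid.

    Nearest-neighbour inverse scan conversion. The probe parameters are free
    inputs: nothing in `img` itself determines them, so they must be supplied
    from acquisition metadata (e.g. DICOM tags).
    """
    rs = np.linspace(r_min, r_max, n_r)                             # depth samples
    thetas = np.linspace(-half_angle_rad, half_angle_rad, n_theta)  # beam angles
    rr, tt = np.meshgrid(rs, thetas, indexing="ij")
    rows = np.clip(np.round(apex_row + rr * np.cos(tt)).astype(int), 0, img.shape[0] - 1)
    cols = np.clip(np.round(apex_col + rr * np.sin(tt)).astype(int), 0, img.shape[1] - 1)
    return img[rows, cols]

img = np.random.rand(480, 640).astype(np.float32)   # stand-in for one frame
polar = inverse_scan_conversion(img, apex_row=-40.0, apex_col=320.0,
                                half_angle_rad=np.deg2rad(35.0),
                                r_min=60.0, r_max=500.0)
print(polar.shape)   # (256, 128): rows are depth, columns are beam angle
```

In practice one would use bilinear sampling and validate the recovered geometry against phantoms; the sketch's only point is that depth, span, and apex enter as free parameters the paper would need to document.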

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities; the framework introduces named techniques but no mathematical derivations or fitted constants are visible.

pith-pipeline@v0.9.0 · 5502 in / 1046 out tokens · 22013 ms · 2026-05-10T08:59:02.632141+00:00 · methodology

