arxiv: 2510.19239 · v2 · submitted 2025-10-22 · 📡 eess.IV

TinyUSFM: Towards Compact and Efficient Ultrasound Foundation Models

Pith reviewed 2026-05-18 05:20 UTC · model grok-4.3

classification 📡 eess.IV
keywords ultrasound · foundation model · knowledge distillation · lightweight model · image classification · image segmentation · medical imaging · coreset selection

The pith

A distilled compact ultrasound model matches its large predecessor on classification and segmentation using just 6 percent of the parameters and computation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to shrink a large ultrasound foundation model into a practical lightweight version without losing most of its ability to handle many organs and tasks. It does this by carefully picking a small set of the most useful training images and then transferring knowledge through a distillation process that respects different image domains. With only 200,000 images the resulting TinyUSFM reaches 84.91 percent average accuracy on classification and 85.78 percent average Dice on segmentation while cutting parameters and operations to roughly one-sixteenth of the original model. The new UniUS-Bench benchmark lets the authors test this across eight classification and ten segmentation datasets from fifteen organs and many different devices. If the approach works as described, advanced ultrasound analysis becomes feasible on ordinary clinical hardware instead of requiring large servers.

Core claim

TinyUSFM preserves the organ versatility and task adaptability of the large USFM through knowledge distillation on a curated subset of 200K images, using a feature-gradient driven coreset selection strategy together with domain-separated masked image modeling and consistency-driven dynamic distillation, while delivering 84.91 percent average classification accuracy and 85.78 percent average segmentation Dice score with only 6.36 percent of the parameters and 6.40 percent of the GFLOPs.
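
The efficiency figures in the core claim can be turned into reduction factors with one line of arithmetic. The sketch below only restates the two percentages quoted above; the helper name `reduction_factor` is an illustrative assumption, not something from the paper.

```python
def reduction_factor(percent_of_original: float) -> float:
    """Convert 'X percent of the original' into an 'N times smaller' factor."""
    if percent_of_original <= 0:
        raise ValueError("percentage must be positive")
    return 100.0 / percent_of_original


# Percentages quoted in the core claim above.
params_factor = reduction_factor(6.36)   # parameters
gflops_factor = reduction_factor(6.40)   # GFLOPs

print(f"parameters: about {params_factor:.1f}x fewer")
print(f"GFLOPs:     about {gflops_factor:.1f}x fewer")
```

Both factors land a little under 16, which is where the "roughly one-sixteenth" phrasing comes from.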

What carries the argument

Feature-gradient driven coreset selection to curate high-quality compact training data, paired with domain-separated masked image modeling that transfers spatial and frequency characteristics via teacher consistency across masks.

If this is right

  • TinyUSFM outperforms the vanilla model by 9.45 percent in classification and 7.72 percent in segmentation on the UniUS-Bench, and surpasses the state-of-the-art lightweight models it is compared against.
  • The model maintains strong results across diverse medical devices and imaging centers.
  • Deployment becomes feasible in resource-limited clinical settings that cannot host the full-scale USFM.
  • The UniUS-Bench provides a standardized public testbed for future ultrasound foundation models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same data-selection and distillation recipe could be tested on foundation models for CT or MRI to produce similarly compact versions.
  • Real-time inference on portable ultrasound hardware would become realistic if the model runs locally without cloud support.
  • Further reductions in size might allow embedding into even smaller diagnostic devices for point-of-care use.

Load-bearing premise

The chosen small set of images and the domain-aware distillation will carry over the essential ultrasound features from the teacher model without meaningful loss in performance.

What would settle it

Running TinyUSFM on a fresh collection of ultrasound scans collected from an entirely new medical center and scanner type and measuring whether average classification accuracy or segmentation Dice falls more than a few points below the reported figures.
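
The test described above amounts to comparing new-site averages against the reported ones with a tolerance. A minimal sketch follows, assuming "a few points" means 3.0 percentage points; that threshold, the metric keys, and the function name `shortfalls` are all illustrative assumptions.

```python
# Averages reported for TinyUSFM on UniUS-Bench (from the text above).
REPORTED = {"classification_accuracy": 84.91, "segmentation_dice": 85.78}


def shortfalls(new_site_metrics: dict, tolerance_points: float = 3.0) -> dict:
    """Return the metrics whose drop versus the reported value exceeds the tolerance.

    tolerance_points is an assumed reading of "a few points", not a value
    taken from the paper.
    """
    flagged = {}
    for metric, reported in REPORTED.items():
        drop = round(reported - new_site_metrics[metric], 2)
        if drop > tolerance_points:
            flagged[metric] = drop
    return flagged


# Hypothetical results from an unseen center and scanner.
print(shortfalls({"classification_accuracy": 80.0, "segmentation_dice": 85.0}))
```

An empty result would mean the reported figures held up at the new site within the chosen tolerance.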

Figures

Figures reproduced from arXiv: 2510.19239 by Chen Ma, Jing Jiao, Junhu Fu, Qin Wang, Shuyu Liang, Yi Guo, Yuanyuan Wang, Zeju Li.

Figure 1
Figure 1: Overview of proposed TinyUSFM. Traditional approaches adopt various selection criteria, such as prototype distance [27], gradient norm [28], influence function scores [29], and uncertainty metrics [30], often combined with clustering algorithms (e.g., K-means) to mitigate redundancy [31]. However, coreset selection in medical imaging poses unique challenges. Medical datasets commonly exhibit severe class…
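
The traditional recipe mentioned in this excerpt (a per-sample importance score combined with clustering to limit redundancy) can be sketched generically. This is not the paper's feature-gradient driven method, whose details are not given here; the scores and cluster ids are assumed to be precomputed (for example, gradient norms and K-means labels), and `select_coreset` is a hypothetical helper.

```python
from collections import defaultdict


def select_coreset(scores, cluster_ids, budget):
    """Pick `budget` sample indices, taking the highest-scoring samples
    from each cluster in round-robin order so no cluster dominates."""
    if len(scores) != len(cluster_ids):
        raise ValueError("scores and cluster_ids must have the same length")
    if not 0 <= budget <= len(scores):
        raise ValueError("budget must be between 0 and the number of samples")

    # Group (score, index) pairs by cluster, best score first.
    queues = defaultdict(list)
    for index, (score, cluster) in enumerate(zip(scores, cluster_ids)):
        queues[cluster].append((score, index))
    for items in queues.values():
        items.sort(key=lambda pair: (-pair[0], pair[1]))

    selected = []
    rank = 0
    while len(selected) < budget:
        for cluster in sorted(queues):
            if rank < len(queues[cluster]) and len(selected) < budget:
                selected.append(queues[cluster][rank][1])
        rank += 1
    return sorted(selected)


# Hypothetical scores for five images in two clusters.
print(select_coreset([0.9, 0.8, 0.1, 0.7, 0.2], [0, 0, 0, 1, 1], budget=3))
```

Round-robin selection is one simple way to keep small clusters represented, which matters when classes are heavily imbalanced.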
Figure 2
Figure 2: Details of Consistency-Driven Dynamic Distillation. Given the domain-separated masked inputs $I_{spa}$ and $I_{freq}$ obtained by the masking strategy, the teacher model $F_t(\cdot)$ extracts deep-layer representations as $h_t^{spa} = F_t(I_{spa})$, $h_t^{freq} = F_t(I_{freq})$ (8). The teacher consistency score is computed based on cosine similarity between the two representations: $s_{cons} = \frac{\cos(h_t^{spa}, h_t^{freq}) + 1}{2}$ (9), where $\cos(\cdot)$…
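
The teacher consistency score in this caption rescales cosine similarity from [-1, 1] to [0, 1]. A minimal sketch, assuming the two teacher representations are already available as flat lists of floats; the function names are illustrative, and how the score then weights the distillation loss is not shown because the excerpt is truncated.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def teacher_consistency_score(h_spa, h_freq):
    """(cos(h_spa, h_freq) + 1) / 2, so 1.0 means the teacher's outputs
    for the spatial and frequency masks point in the same direction."""
    return (cosine_similarity(h_spa, h_freq) + 1.0) / 2.0


print(teacher_consistency_score([1.0, 0.0], [1.0, 0.0]))   # aligned
print(teacher_consistency_score([1.0, 0.0], [0.0, 1.0]))   # orthogonal
print(teacher_consistency_score([1.0, 0.0], [-1.0, 0.0]))  # opposite
```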
Figure 3
Figure 3: Ablation study on coreset selection strategy. We investigate the impact of subset size by constructing training sets ranging from 10K to the full 3M-US dataset containing over 2.1 million images. This experiment investigates the relationship between dataset size and its effects on the performance of lightweight models and the efficiency of distillation. In addition, after determining 200K as the…
Figure 4
Figure 4: UMAP visualization of TinyUSFM feature representations. … improving from shallow layers to mid-level layers, peaking at the 8th layer with 84.91% average classification accuracy and 85.78% average segmentation Dice score. Notably, performance significantly degrades when applying reconstruction at the deep-layer for both tasks, dropping to 83.57% in classification and 84.34% in segmentation. This suggests th…
Figure 5
Figure 5: … b, the baseline without MIM achieves 83.50% classification accuracy and 84.59% segmentation Dice score. Incorporating spatial reconstruction (S-MIM) raises performance by +0.51% and +0.71%, while frequency reconstruction (F-MIM) further improves it by +1.06% and +0.18%, confirming that both domains contribute complementary information beneficial for ultrasound representation learning. The domain-separate…
Figure 6
Figure 6: Qualitative comparison of segmentation results across different organs on UniUS-Bench. Depicted in yellow is the ground truth, and in green is the predicted result. Tumor-level datasets are indicated by underlined titles. … acquisition conditions. These results demonstrate that our framework effectively transfers ultrasound-specific knowledge to lightweight architectures, surpassing all lightweight models and…
read the original abstract

Foundation models for medical imaging demonstrate superior generalization capabilities across diverse anatomical structures and clinical applications. Their outstanding performance relies on substantial computational resources, limiting deployment in resource-constrained clinical environments. This paper presents TinyUSFM, the first lightweight ultrasound foundation model that maintains superior organ versatility and task adaptability of our large-scale Ultrasound Foundation Model (USFM) through knowledge distillation with strategically curated small datasets, delivering significant computational efficiency without sacrificing performance. Considering the limited capacity and representation ability of lightweight models, we propose a feature-gradient driven coreset selection strategy to curate high-quality compact training data, avoiding training degradation from low-quality redundant images. To preserve the essential spatial and frequency domain characteristics during knowledge transfer, we develop domain-separated masked image modeling assisted consistency-driven dynamic distillation. This novel framework adaptively transfers knowledge from large foundation models by leveraging teacher model consistency across different domain masks, specifically tailored for ultrasound interpretation. For evaluation, we establish the UniUS-Bench, the largest publicly available ultrasound benchmark comprising 8 classification and 10 segmentation datasets across 15 organs. Using only 200K images in distillation, TinyUSFM matches USFM's performance with just 6.36% of parameters and 6.40% of GFLOPs. TinyUSFM significantly outperforms the vanilla model by 9.45% in classification and 7.72% in segmentation, surpassing all state-of-the-art lightweight models, and achieving 84.91% average classification accuracy and 85.78% average segmentation Dice score across diverse medical devices and centers.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces TinyUSFM, a lightweight ultrasound foundation model distilled from the larger USFM using a curated set of 200K images. It proposes a feature-gradient driven coreset selection strategy to curate high-quality compact data and a domain-separated masked image modeling assisted consistency-driven dynamic distillation framework to transfer knowledge while preserving spatial and frequency domain characteristics. The model is evaluated on the newly proposed UniUS-Bench (8 classification and 10 segmentation datasets across 15 organs and diverse devices/centers), achieving 84.91% average classification accuracy and 85.78% average segmentation Dice score with 6.36% of USFM's parameters and 6.40% of its GFLOPs, while outperforming vanilla lightweight models by 9.45% (classification) and 7.72% (segmentation).

Significance. If the empirical results hold after proper verification, this work would be significant for enabling deployment of ultrasound foundation models in resource-constrained clinical settings. The creation of UniUS-Bench as a large multi-task, multi-center benchmark is a clear positive contribution to the field. The reported efficiency gains without apparent performance loss could accelerate practical adoption of medical imaging AI. The paper provides concrete performance numbers on a held-out multi-dataset benchmark, which is a strength.

major comments (2)
  1. [Abstract and Experiments section] The central claim that the feature-gradient driven coreset selection combined with domain-separated MIM consistency distillation enables matching USFM performance on UniUS-Bench with only 200K images is load-bearing, yet the manuscript provides no ablation comparing the proposed coreset selection against random selection of 200K images or the domain-separated consistency distillation against standard knowledge distillation. Without these, the 9.45% classification and 7.72% segmentation gains cannot be confidently attributed to the described techniques rather than data volume or generic distillation.
  2. [UniUS-Bench and Results section] The reported average accuracies (84.91% classification, 85.78% Dice) across 15 organs/devices/centers lack per-dataset breakdowns, standard deviations, or statistical tests. This is critical because the multi-device, multi-center nature of the benchmark could mask uneven performance or data leakage if the 200K distillation images overlap with test sets; the absence of such details undermines the generalization claims.
minor comments (2)
  1. [Methods section] The method description would benefit from explicit pseudocode or a step-by-step algorithm box for the feature-gradient coreset selection to improve reproducibility.
  2. [Figures] Ensure all figures include error bars where applicable and clear legends distinguishing TinyUSFM, USFM, and all baselines.
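
The leakage concern in major comment 2 can be checked mechanically for exact duplicates. The sketch below hashes raw image bytes and reports files that appear in both a distillation set and a test set; the in-memory dictionaries and function names are illustrative assumptions, and near-duplicates (re-encoded or cropped frames) would need a perceptual comparison that this does not attempt.

```python
import hashlib


def content_digest(image_bytes: bytes) -> str:
    """SHA-256 hex digest of the raw file content."""
    return hashlib.sha256(image_bytes).hexdigest()


def find_overlap(distill_images: dict, test_images: dict) -> list:
    """Return (distill_name, test_name) pairs whose bytes are identical.

    Both arguments map a file name to that file's bytes.
    """
    test_index = {}
    for name, data in test_images.items():
        test_index.setdefault(content_digest(data), []).append(name)

    overlaps = []
    for name, data in distill_images.items():
        for test_name in test_index.get(content_digest(data), []):
            overlaps.append((name, test_name))
    return sorted(overlaps)


# Hypothetical toy data standing in for image files.
distill = {"a.png": b"\x00\x01", "b.png": b"\x02"}
test = {"t.png": b"\x00\x01"}
print(find_overlap(distill, test))
```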

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed review. We address each major comment below and have revised the manuscript to strengthen the presentation of our contributions and results.

read point-by-point responses
  1. Referee: [Abstract and Experiments section] The central claim that the feature-gradient driven coreset selection combined with domain-separated MIM consistency distillation enables matching USFM performance on UniUS-Bench with only 200K images is load-bearing, yet the manuscript provides no ablation comparing the proposed coreset selection against random selection of 200K images or the domain-separated consistency distillation against standard knowledge distillation. Without these, the 9.45% classification and 7.72% segmentation gains cannot be confidently attributed to the described techniques rather than data volume or generic distillation.

    Authors: We agree that explicit ablations are necessary to attribute performance gains to the proposed components. In the revised manuscript, we have added ablation studies in the Experiments section comparing (i) our feature-gradient driven coreset selection against random sampling of 200K images and (ii) the domain-separated masked image modeling assisted consistency-driven dynamic distillation against standard knowledge distillation. The results show that both proposed elements contribute measurably to the reported improvements over baselines, supporting the central claims. revision: yes

  2. Referee: [UniUS-Bench and Results section] The reported average accuracies (84.91% classification, 85.78% Dice) across 15 organs/devices/centers lack per-dataset breakdowns, standard deviations, or statistical tests. This is critical because the multi-device, multi-center nature of the benchmark could mask uneven performance or data leakage if the 200K distillation images overlap with test sets; the absence of such details undermines the generalization claims.

    Authors: We have revised the Results section to include full per-dataset performance tables for all 8 classification and 10 segmentation tasks, along with standard deviations computed over multiple runs and statistical significance tests (paired t-tests) against baselines. We also added an explicit statement confirming that the 200K distillation images were drawn exclusively from training splits with no overlap against any UniUS-Bench test sets; the data partitioning procedure is now described in detail to rule out leakage. revision: yes
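
For readers who want to reproduce the kind of paired comparison the rebuttal mentions, the t statistic for a paired t-test is short to compute. This is a minimal sketch with made-up per-dataset scores, assuming both models are evaluated on the same datasets in the same order; turning the statistic into a p-value requires a Student t distribution table or a statistics library.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for paired samples a and b."""
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    spread = statistics.stdev(diffs)  # sample standard deviation
    if spread == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.mean(diffs) / (spread / math.sqrt(n))
    return t, n - 1


# Hypothetical per-dataset scores, not values from the paper.
model = [86.0, 84.0, 88.0]
baseline = [85.0, 82.0, 85.0]
t, dof = paired_t_statistic(model, baseline)
print(f"t = {t:.3f} with {dof} degrees of freedom")
```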

Circularity Check

0 steps flagged

No significant circularity; empirical results on held-out benchmarks are self-contained

full rationale

The paper's central claims consist of empirical performance numbers (84.91% classification accuracy, 85.78% Dice) obtained by training TinyUSFM on 200K curated images and evaluating on the newly introduced UniUS-Bench across 18 datasets. These metrics are measured directly on held-out test sets and do not reduce to any author-defined equations, fitted parameters renamed as predictions, or self-referential derivations. The reference to the prior USFM teacher model supplies an external baseline whose performance is independently reported and falsifiable outside this paper; it does not render the student-model results circular. No load-bearing step equates a claimed outcome to its own inputs by construction.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The work relies on standard assumptions in knowledge distillation and self-supervised learning for medical images; no new physical axioms or invented entities are introduced. Hyperparameters for the distillation process and coreset size are implicit free parameters.

free parameters (2)
  • coreset selection ratio / size
    The number of images retained (200K) and the gradient-feature threshold for selection are chosen to balance quality and efficiency.
  • distillation temperature and loss weights
    Hyperparameters controlling consistency across domain masks are tuned to achieve the reported transfer performance.
axioms (2)
  • domain assumption: High-quality compact training data can be selected via feature gradients without losing representation of the full data distribution
    Invoked in the description of the coreset selection strategy to avoid training degradation.
  • domain assumption: Masked image modeling in separate spatial and frequency domains preserves ultrasound-specific characteristics during distillation
    Central to the consistency-driven dynamic distillation framework.
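
To make the "distillation temperature" entry in the ledger concrete, here is the standard temperature-scaled soft-target loss from reference [8]. It is the generic baseline, not the paper's consistency-driven dynamic distillation, and the default temperature of 2.0 is an assumed value for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by the temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by temperature**2 as in standard knowledge distillation."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kl = sum(p * math.log(p / q) for p, q in zip(teacher, student) if p > 0.0)
    return temperature ** 2 * kl


# Hypothetical logits for a three-class problem.
print(distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0]))  # identical outputs
print(distillation_loss([0.0, 0.0, 0.0], [2.0, 0.5, -1.0]))   # student disagrees
```

A higher temperature flattens both distributions, so the student is pushed to match the teacher's relative ranking of the less likely classes as well as the top prediction.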

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Defining Robust Ultrasound Quality Metrics via an Ultrasound Foundation Model

    eess.IV 2026-04 unverdicted novelty 6.0

    TinyUSFM-uLPIPS and TinyUSFM-NRQ provide task-linked, cross-organ, and clinically predictive quality assessment for ultrasound images that outperforms conventional metrics in calibration with segmentation performance ...

  2. Defining Robust Ultrasound Quality Metrics via an Ultrasound Foundation Model

    eess.IV 2026-04 unverdicted novelty 6.0

    Proposes TinyUSFM-uLPIPS and TinyUSFM-NRQ metrics that show better alignment with segmentation task performance and expert preference than PSNR or VGG-LPIPS in ultrasound imaging.

  3. Unified Ultrasound Intelligence Toward an End-to-End Agentic System

    cs.CV 2026-04 unverdicted novelty 6.0

    USTri is a tri-stage ultrasound system that trains a generalist model, fine-tunes specialists while frozen, and deploys an agent for workflow orchestration, claiming top performance across 4 task types and 27 datasets.

  4. Understanding Task Aggregation for Generalizable Ultrasound Foundation Models

    eess.IV 2026-03 unverdicted novelty 4.0

    Systematic tests of 27 ultrasound tasks show that unified training is more consistent than clinically-grouped training, with performance hinging on data availability and task characteristics.

Reference graph

Works this paper leans on

58 extracted references · 58 canonical work pages · cited by 3 Pith papers · 7 internal anchors

  1. [1]

    Deep learning in medical ultrasound analysis: a review,

    S. Liu, Y. Wang, X. Yang, B. Lei, L. Liu, S. X. Li, D. Ni, and T. Wang, “Deep learning in medical ultrasound analysis: a review,” Engineering, vol. 5, no. 2, pp. 261–275, 2019

  2. [2]

    An orchestration learning framework for ultrasound imaging: Prompt-guided hyper-perception and attention-matching downstream synchronization,

    Z. Lin, S. Li, S. Wang, Z. Gao, Y. Sun, C.-T. Lam, X. Hu, X. Yang, D. Ni, and T. Tan, “An orchestration learning framework for ultrasound imaging: Prompt-guided hyper-perception and attention-matching downstream synchronization,” Medical Image Analysis, p. 103639, 2025

  3. [3]

    Foundation models for generalist medical artificial intelligence,

    M. Moor, O. Banerjee, Z. S. H. Abad, H. M. Krumholz, J. Leskovec, E. J. Topol, and P. Rajpurkar, “Foundation models for generalist medical artificial intelligence,” Nature, vol. 616, no. 7956, pp. 259–265, 2023

  4. [4]

    Usfm: A universal ultrasound foundation model generalized to tasks and organs towards label efficient image analysis,

    J. Jiao, J. Zhou, X. Li, M. Xia, Y. Huang, L. Huang, N. Wang, X. Zhang, S. Zhou, Y. Wang, et al., “Usfm: A universal ultrasound foundation model generalized to tasks and organs towards label efficient image analysis,” Medical image analysis, vol. 96, p. 103202, 2024

  5. [5]

    General lightweight framework for vision foundation model supporting multi-task and multi-center medical image analysis,

    S. Lu, Y. Chen, Y. Chen, P. Li, J. Sun, C. Zheng, Y. Zou, B. Liang, M. Li, Q. Jin, et al., “General lightweight framework for vision foundation model supporting multi-task and multi-center medical image analysis,” Nature Communications, vol. 16, no. 1, p. 2097, 2025

  6. [6]

    MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer

    S. Mehta and M. Rastegari, “Mobilevit: light-weight, general-purpose, and mobile-friendly vision transformer,” arXiv preprint arXiv:2110.02178, 2021.

  7. [7]

    Efficientnet: Rethinking model scaling for convolutional neural networks,

    M. Tan and Q. Le, “Efficientnet: Rethinking model scaling for convolutional neural networks,” in International conference on machine learning, pp. 6105–6114, PMLR, 2019

  8. [8]

    Distilling the Knowledge in a Neural Network

    G. Hinton, O. Vinyals, and J. Dean, “Distilling the knowledge in a neural network,” arXiv preprint arXiv:1503.02531, 2015

  9. [9]

    Machine learning for medical ultrasound: status, methods, and future opportunities,

    L. J. Brattain, B. A. Telfer, M. Dhyani, J. R. Grajo, and A. E. Samir, “Machine learning for medical ultrasound: status, methods, and future opportunities,” Abdominal radiology, vol. 43, no. 4, pp. 786–799, 2018

  10. [10]

    Impact of imperfection in medical imaging data on deep learning-based segmentation performance: An experimental study using synthesized data,

    A. M. Güneş, W. van Rooij, S. Gulshad, B. Slotman, M. Dahele, and W. Verbakel, “Impact of imperfection in medical imaging data on deep learning-based segmentation performance: An experimental study using synthesized data,” Medical physics, vol. 50, no. 10, pp. 6421–6432, 2023

  11. [11]

    When more is less: Incorporating additional datasets can hurt performance by introducing spurious correlations,

    R. Compton, L. Zhang, A. Puli, and R. Ranganath, “When more is less: Incorporating additional datasets can hurt performance by introducing spurious correlations,” in Machine learning for healthcare conference, pp. 110–127, PMLR, 2023

  12. [12]

    Efficient ultrasound image analysis models with sonographer gaze assisted distillation,

    A. Patra, Y. Cai, P. Chatelain, H. Sharma, L. Drukker, A. T. Papageorghiou, and J. A. Noble, “Efficient ultrasound image analysis models with sonographer gaze assisted distillation,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 394–402, Springer, 2019

  13. [13]

    Large ai models in health informatics: Applications, challenges, and the future,

    J. Qiu, L. Li, J. Sun, J. Peng, P. Shi, R. Zhang, Y. Dong, K. Lam, F. P.-W. Lo, B. Xiao, et al., “Large ai models in health informatics: Applications, challenges, and the future,” IEEE Journal of Biomedical and Health Informatics, vol. 27, no. 12, pp. 6074–6087, 2023

  14. [14]

    Mis-fm: 3d medical image segmentation using foundation models pretrained on a large-scale unannotated dataset,

    G. Wang, J. Wu, X. Luo, X. Liu, K. Li, and S. Zhang, “Mis-fm: 3d medical image segmentation using foundation models pretrained on a large-scale unannotated dataset,” arXiv preprint arXiv:2306.16925, 2023

  15. [15]

    A foundation model for generalizable disease detection from retinal images,

    Y. Zhou, M. A. Chia, S. K. Wagner, M. S. Ayhan, D. J. Williamson, R. R. Struyven, T. Liu, M. Xu, M. G. Lozano, P. Woodward-Court, et al., “A foundation model for generalizable disease detection from retinal images,” Nature, vol. 622, no. 7981, pp. 156–163, 2023

  16. [16]

    Foundation model for endoscopy video analysis via large-scale self-supervised pre-train,

    Z. Wang, C. Liu, S. Zhang, and Q. Dou, “Foundation model for endoscopy video analysis via large-scale self-supervised pre-train,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 101–111, Springer, 2023

  17. [17]

    Segment anything in medical images,

    J. Ma, Y. He, F. Li, L. Han, C. You, and B. Wang, “Segment anything in medical images,” Nature Communications, vol. 15, no. 1, p. 654, 2024

  18. [18]

    On the challenges and perspectives of foundation models for medical image analysis,

    S. Zhang and D. Metaxas, “On the challenges and perspectives of foundation models for medical image analysis,” Medical image analysis, vol. 91, p. 102996, 2024

  19. [19]

    FitNets: Hints for Thin Deep Nets

    A. Romero, N. Ballas, S. E. Kahou, A. Chassang, C. Gatta, and Y. Bengio, “Fitnets: Hints for thin deep nets,” arXiv preprint arXiv:1412.6550, 2014

  20. [20]

    Structured knowledge distillation for semantic segmentation,

    Y. Liu, K. Chen, C. Liu, Z. Qin, Z. Luo, and J. Wang, “Structured knowledge distillation for semantic segmentation,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 2604–2613, 2019

  21. [21]

    Efficient medical image segmentation based on knowledge distillation,

    D. Qin, J.-J. Bu, Z. Liu, X. Shen, S. Zhou, J.-J. Gu, Z.-H. Wang, L. Wu, and H.-F. Dai, “Efficient medical image segmentation based on knowledge distillation,” IEEE Transactions on Medical Imaging, vol. 40, no. 12, pp. 3820–3831, 2021

  22. [22]

    Deep mutual distillation for semi-supervised medical image segmentation,

    Y. Xie, Y. Yin, Q. Li, and Y. Wang, “Deep mutual distillation for semi-supervised medical image segmentation,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 540–550, Springer, 2023

  23. [23]

    Mdvit: Multi-domain vision transformer for small medical image segmentation datasets,

    S. Du, N. Bayasi, G. Hamarneh, and R. Garbi, “Mdvit: Multi-domain vision transformer for small medical image segmentation datasets,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 448–458, Springer, 2023

  24. [24]

    Graph flow: Cross-layer graph flow distillation for dual efficient medical image segmentation,

    W. Zou, X. Qi, W. Zhou, M. Sun, Z. Sun, and C. Shan, “Graph flow: Cross-layer graph flow distillation for dual efficient medical image segmentation,” IEEE Transactions on Medical Imaging, vol. 42, no. 4, pp. 1159–1171, 2022

  25. [25]

    Exploring generalizable distillation for efficient medical image segmentation,

    X. Qi, Z. Wu, W. Zou, M. Ren, Y. Gao, M. Sun, S. Zhang, C. Shan, and Z. Sun, “Exploring generalizable distillation for efficient medical image segmentation,” IEEE Journal of Biomedical and Health Informatics, vol. 28, no. 7, pp. 4170–4183, 2024

  26. [26]

    Deep learning on a data diet: Finding important examples early in training,

    M. Paul, S. Ganguli, and G. K. Dziugaite, “Deep learning on a data diet: Finding important examples early in training,” Advances in neural information processing systems, vol. 34, pp. 20596–20607, 2021

  27. [27]

    Coresets for data-efficient training of machine learning models,

    B. Mirzasoleiman, J. Bilmes, and J. Leskovec, “Coresets for data-efficient training of machine learning models,” in International Conference on Machine Learning, pp. 6950–6960, PMLR, 2020

  28. [28]

    Not all samples are created equal: Deep learning with importance sampling,

    A. Katharopoulos and F. Fleuret, “Not all samples are created equal: Deep learning with importance sampling,” in International conference on machine learning, pp. 2525–2534, PMLR, 2018

  29. [29]

    Understanding black-box predictions via influence functions,

    P. W. Koh and P. Liang, “Understanding black-box predictions via influence functions,” in International conference on machine learning, pp. 1885–1894, PMLR, 2017

  30. [30]

    Deep Active Learning over the Long Tail

    Y. Geifman and R. El-Yaniv, “Deep active learning over the long tail,” arXiv preprint arXiv:1711.00941, 2017

  31. [31]

    Semantic Redundancies in Image-Classification Datasets: The 10% You Don't Need

    V. Birodkar, H. Mobahi, and S. Bengio, “Semantic redundancies in image-classification datasets: The 10% you don’t need,” arXiv preprint arXiv:1901.11409, 2019

  32. [32]

    Towards long-tailed, multi-label disease classification from chest x-ray: Overview of the cxr-lt challenge,

    G. Holste, Y. Zhou, S. Wang, A. Jaiswal, M. Lin, S. Zhuge, Y. Yang, D. Kim, T.-H. Nguyen-Mau, M.-T. Tran, et al., “Towards long-tailed, multi-label disease classification from chest x-ray: Overview of the cxr-lt challenge,” Medical Image Analysis, vol. 97, p. 103224, 2024

  33. [33]

    K. M. Meiburger, F. Marzola, G. Zahnd, F. Faita, C. P. Loizou, N. Lainé, C. Carvalho, D. A. Steinman, L. Gibello, R. M. Bruno, et al., “Carotid ultrasound boundary study (cubs): Technical considerations on an open multi-center analysis of computerized measurement systems for intima-media thickness measurement on common carotid artery longitudinal b-mod...

  34. [34]

    A lightweight hybrid model for the automatic recognition of uterine fibroid ultrasound images based on deep learning,

    P. Cai, T. Yang, Q. Xie, P. Liu, and P. Li, “A lightweight hybrid model for the automatic recognition of uterine fibroid ultrasound images based on deep learning,” Journal of Clinical Ultrasound, vol. 52, no. 6, pp. 753–762, 2024

  35. [35]

    Thyroid region prior guided attention for ultrasound segmentation of thyroid nodules,

    H. Gong, J. Chen, G. Chen, H. Li, G. Li, and F. Chen, “Thyroid region prior guided attention for ultrasound segmentation of thyroid nodules,” Computers in biology and medicine, vol. 155, p. 106389, 2023

  36. [36]

    Deep learning segmentation of transverse musculoskeletal ultrasound images for neuromuscular disease assessment,

    F. Marzola, N. Van Alfen, J. Doorduin, and K. M. Meiburger, “Deep learning segmentation of transverse musculoskeletal ultrasound images for neuromuscular disease assessment,” Computers in Biology and Medicine, vol. 135, p. 104623, 2021

  37. [37]

    Annotated ultrasound liver images,

    X. Yiming, Z. Bowen, L. Xiaohong, W. Tao, J. Jinxiu, W. Shijie, L. Yufan, Z. Hongjun, L. Tong, S. Ye, J. Rui, W. Guangyu, R. Jie, and C. Ting, “Annotated ultrasound liver images,” Nov. 2022

  38. [38]

    Dataset of breast ultrasound images,

    W. Al-Dhabyani, M. Gomaa, H. Khaled, and A. Fahmy, “Dataset of breast ultrasound images,” Data in brief, vol. 28, p. 104863, 2020

  39. [39]

    A multi-modality ovarian tumor ultrasound image dataset for unsupervised cross-domain semantic segmentation,

    Q. Zhao, S. Lyu, W. Bai, L. Cai, B. Liu, M. Wu, X. Sang, M. Yang, and L. Chen, “A multi-modality ovarian tumor ultrasound image dataset for unsupervised cross-domain semantic segmentation,” CoRR, 2022

  40. [40]

    Evaluation of deep convolutional neural networks for automatic classification of common maternal fetal ultrasound planes,

    X. P. Burgos-Artizzu, D. Coronado-Gutiérrez, B. Valenzuela-Alcaraz, E. Bonet-Carne, E. Eixarch, F. Crispi, and E. Gratacós, “Evaluation of deep convolutional neural networks for automatic classification of common maternal fetal ultrasound planes,” Scientific Reports, vol. 10, no. 1, p. 10200, 2020

  41. [41]

    Luminous database: lumbar multifidus muscle segmentation from ultrasound images,

    C. J. Belasso, B. Behboodi, H. Benali, M. Boily, H. Rivaz, and M. Fortin, “Luminous database: lumbar multifidus muscle segmentation from ultrasound images,” BMC Musculoskeletal Disorders, vol. 21, no. 1, p. 703, 2020

  42. [42]

    The open kidney ultrasound data set,

    R. Singla, C. Ringstrom, G. Hu, V. Lessoway, J. Reid, C. Nguan, and R. Rohling, “The open kidney ultrasound data set,” in International Workshop on Advances in Simplifying Medical Ultrasound, pp. 155–164, Springer, 2023

  43. [43]

    Query2: Query over queries for improving gastrointestinal stromal tumour detection in an endoscopic ultrasound,

    Q. He, S. Bano, J. Liu, W. Liu, D. Stoyanov, and S. Zuo, “Query2: Query over queries for improving gastrointestinal stromal tumour detection in an endoscopic ultrasound,” Computers in Biology and Medicine, vol. 152, p. 106424, 2023

  44. [44]

    An open access thyroid ultrasound image database,

    L. Pedraza, C. Vargas, F. Narváez, O. Durán, E. Muñoz, and E. Romero, “An open access thyroid ultrasound image database,” in 10th International symposium on medical information processing and analysis, vol. 9287, pp. 188–193, SPIE, 2015

  45. [45]

    Bus-bra: a breast ultrasound dataset for assessing computer-aided diagnosis systems,

    W. Gómez-Flores, M. J. Gregorio-Calas, and W. Coelho de Albuquerque Pereira, “Bus-bra: a breast ultrasound dataset for assessing computer-aided diagnosis systems,” Medical Physics, vol. 51, no. 4, pp. 3110–3123, 2024

  46. [46]

    Ultrasound nerve segmentation

    A. Montoya, Hasnin, kaggle446, shirzad, W. Cukierski, and yffud, “Ultrasound nerve segmentation.” https://kaggle.com/competitions/ultrasound-nerve-segmentation, 2016. Kaggle

  47. [47]

    Lung ultrasound covid phantom dataset used for training machine learning model,

    J. R. McLaughlan, L. Howell, and N. Ingram, “Lung ultrasound covid phantom dataset used for training machine learning model,” 2024

  48. [48]

    Pubic symphysis-fetal head segmentation and angle of progression,

    B. Jieyun and O. ZhanHong, “Pubic symphysis-fetal head segmentation and angle of progression,” Apr. 2023

  49. [49]

    Deep learning for segmentation using an open large-scale dataset in 2d echocardiography,

    S. Leclerc, E. Smistad, J. Pedrosa, A. Østvik, F. Cervenansky, F. Espinosa, T. Espeland, E. A. R. Berg, P.-M. Jodoin, T. Grenier, et al., “Deep learning for segmentation using an open large-scale dataset in 2d echocardiography,” IEEE transactions on medical imaging, vol. 38, no. 9, pp. 2198–2210, 2019

  50. [50]

    Deep residual learning for image recognition,

    K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778, 2016

  51. [51]

    An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

    A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, et al., “An image is worth 16x16 words: Transformers for image recognition at scale,” arXiv preprint arXiv:2010.11929, 2020

  52. [52]

    Training data-efficient image transformers & distillation through attention,

    H. Touvron, M. Cord, M. Douze, F. Massa, A. Sablayrolles, and H. Jégou, “Training data-efficient image transformers & distillation through attention,” in International conference on machine learning, pp. 10347–10357, PMLR, 2021

  53. [53]

    Simmim: A simple framework for masked image modeling,

    Z. Xie, Z. Zhang, Y. Cao, Y. Lin, J. Bao, Z. Yao, Q. Dai, and H. Hu, “Simmim: A simple framework for masked image modeling,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 9653–9663, 2022

  54. [54]

    TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation

    J. Chen, Y. Lu, Q. Yu, X. Luo, E. Adeli, Y. Wang, L. Lu, A. L. Yuille, and Y. Zhou, “Transunet: Transformers make strong encoders for medical image segmentation,” arXiv preprint arXiv:2102.04306, 2021

  55. [55]

    Swin-unet: Unet-like pure transformer for medical image segmentation,

    H. Cao, Y. Wang, J. Chen, D. Jiang, X. Zhang, Q. Tian, and M. Wang, “Swin-unet: Unet-like pure transformer for medical image segmentation,” in European conference on computer vision, pp. 205–218, Springer, 2022

  56. [56]

    Segformer: Simple and efficient design for semantic segmentation with transformers,

    E. Xie, W. Wang, Z. Yu, A. Anandkumar, J. M. Alvarez, and P. Luo, “Segformer: Simple and efficient design for semantic segmentation with transformers,” Advances in neural information processing systems, vol. 34, pp. 12077–12090, 2021

  57. [57]

    Seaformer: Squeeze-enhanced axial transformer for mobile semantic segmentation,

    Q. Wan, Z. Huang, J. Lu, G. Yu, and L. Zhang, “Seaformer: Squeeze-enhanced axial transformer for mobile semantic segmentation,” in The eleventh international conference on learning representations, 2023

  58. [58]

    Uniform manifold approximation and projection,

    J. Healy and L. McInnes, “Uniform manifold approximation and projection,” Nature Reviews Methods Primers, vol. 4, no. 1, p. 82, 2024