arxiv: 2605.16427 · v1 · pith:F5A3IZZJnew · submitted 2026-05-14 · 💻 cs.CV · cs.AI

EAGT: Echocardiography Augmentation for Generalisability and Transferability

Pith reviewed 2026-05-20 20:36 UTC · model grok-4.3

classification 💻 cs.CV cs.AI
keywords echocardiography segmentation · data augmentation · generalisability · transfer learning · geometric transformations · U-Net · left ventricle · cross-dataset evaluation

The pith

Anatomically plausible geometric augmentations improve cross-dataset performance in echocardiography segmentation models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The study conducts a large-scale evaluation of 29 data augmentation methods and their pairs for training U-Net models on left ventricular segmentation in echocardiography images from three different datasets. It finds that geometric transformations which maintain anatomical realism, including affine, shift-scale-rotate, perspective, and random horizontal flip, enhance the models' ability to perform well on data from unseen sources. In contrast, strong alterations to image intensity or addition of artifacts tend to reduce this transfer performance. Pairwise combinations, especially those centered on horizontal flips combined with affine transforms, deliver the most reliable improvements in cross-dataset tests. This approach offers practical ways to make deep learning models more adaptable without needing vast amounts of new labeled data from every clinical setting.

Core claim

The paper establishes through extensive experiments that geometric data augmentations preserving anatomical structure lead to better generalisability and transferability of echocardiography segmentation models across institutions and scanners, with specific pairwise combinations providing superior results compared to individual or intensity-based methods.

What carries the argument

Systematic testing of geometric transformations such as affine and random horizontal flip, applied during training of U-Net models for 2D left ventricular segmentation, to enhance model robustness to dataset shifts.
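As a minimal sketch of why geometric augmentation suits segmentation, the snippet below applies a random horizontal flip to an image and its mask together so the label stays aligned with the anatomy. It uses plain nested lists and hypothetical helper names (`hflip`, `random_hflip_pair`); the figure captions suggest the paper itself relies on Albumentations transforms, which this does not reproduce.

```python
import random

def hflip(grid):
    """Mirror a 2D grid (list of rows) left to right."""
    return [row[::-1] for row in grid]

def random_hflip_pair(image, mask, p=0.5, rng=random):
    """Flip image and mask together with probability p.

    Applying the same geometric transform to both keeps the
    segmentation label aligned with the pixels it describes.
    """
    if rng.random() < p:
        return hflip(image), hflip(mask)
    return image, mask

# Toy 2x3 "image" and binary mask (illustrative values only).
image = [[10, 20, 30],
         [40, 50, 60]]
mask = [[0, 1, 1],
        [0, 0, 1]]

flipped_image, flipped_mask = random_hflip_pair(image, mask, p=1.0)
print(flipped_image)  # [[30, 20, 10], [60, 50, 40]]
print(flipped_mask)   # [[1, 1, 0], [1, 0, 0]]
```

An intensity augmentation, by contrast, would change only `image` and leave `mask` untouched.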

If this is right

  • Geometric augmentations lead to higher Dice and IoU scores in cross-dataset evaluation scenarios.
  • Combinations of augmentations, particularly flip with affine, outperform single augmentations in transfer tasks.
  • Avoiding aggressive intensity augmentations prevents degradation of model generalisability.
  • These augmentation strategies provide empirically supported guidance for improving model transferability in echocardiography analysis.
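The Dice and IoU scores mentioned above can be sketched for binary masks as follows. This is an illustrative pure-Python version with a hypothetical function name; the paper's exact implementation, including how it treats empty masks, is not given in the source.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks given as lists of rows of 0/1."""
    p = [v for row in pred for v in row]
    t = [v for row in target for v in row]
    inter = sum(1 for a, b in zip(p, t) if a and b)
    p_sum = sum(1 for a in p if a)
    t_sum = sum(1 for b in t if b)
    union = p_sum + t_sum - inter
    if union == 0:
        # Both masks empty: treated here as perfect agreement (an assumed convention).
        return 1.0, 1.0
    dice = 2 * inter / (p_sum + t_sum)
    iou = inter / union
    return dice, iou

pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]

dice, iou = dice_and_iou(pred, target)
print(round(dice, 4), iou)  # 0.6667 0.5
```

The two metrics are monotonically related (Dice = 2·IoU / (1 + IoU)), so they usually rank methods the same way on a single image, though averages over a dataset can differ.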

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If validated further, these augmentation choices could reduce the need for extensive new data collection in clinical AI deployment.
  • Similar principles might apply to segmentation tasks in other ultrasound or imaging modalities facing domain shifts.
  • Exploring these augmentations on models beyond U-Net could test the broader applicability of the findings.
  • Integrating these policies into standard training pipelines may improve real-world performance of automated cardiac analysis tools.

Load-bearing premise

The variability across the three chosen datasets sufficiently represents differences in scanners, institutions, and patient populations encountered in practice.

What would settle it

Evaluating the top-performing augmentation combinations on an additional echocardiography dataset from a new source and finding no statistically significant improvement in cross-dataset metrics would falsify the central claim.

Figures

Figures reproduced from arXiv: 2605.16427 by Julie Wall, Massoud Zolgharni, Nasim Dadashi Serej, Sara Adibzadeh, Soroush Elyasi.

Figure 1: Overall framework illustration
Figure 2: Segmentation and masks by human annotator
Figure 3: Representative echocardiographic images from the datasets used in this study
Figure 4: Intensity and contrast statistics across the four datasets, computed after excluding zero-valued
Figure 5: Grouping of the 29 data augmentation techniques into five categories
Figure 6: Row-Mean Dice Performance Across Datasets and Relative Improvements Over the NONE
Figure 7: Dice performance heatmap for all augmentation strategies
Figure 8: Relative Dice performance with respect to the NONE
Figure 9: Row-Mean IoU Performance Across Datasets and Relative Improvements Over the NONE
Figure 10: IoU performance heatmap for all augmentation strategies
Figure 11: Relative IoU performance with respect to the NONE
Figure 12: Row-Mean Dice Performance Across Datasets and Relative Improvements Over the NONE for
Figure 13: Row-Mean IoU Performance Across Datasets and Relative Improvements Over the NONE for
Figure 14: Examples of our CNN-based fan-shaped ultrasound sector mask generation
Figure 15: Grad-CAM results on CS images
Figure 16: Grad-CAM results on CA and ED images
Figure 17: Cross-Dataset Dice Performance and Relative Gains Over the NONE for Pairwise Augmentations
Figure 18: Cross-Dataset IoU Performance and Relative Gains Over the NONE for Pairwise Augmentations
Figure 19: L: A.Affine(rotate=(-15, 15), translate_percent=(0.1, 0.1), scale=(0.7, 1.3), p=1.0)
Figure 20: H: A.Affine(rotate=(-30, 30), translate_percent=(0.2, 0.2), scale=(0.6, 1.5), p=0.7)
Figure 21: C1: A.Affine(rotate=(-25, 25), translate
Figure 22: L: A.CLAHE(clip_limit=2.0, tile_grid_size=(8, 8), p=0.2)
Figure 23: H: A.CLAHE(clip_limit=4.0, tile_grid_size=(4, 4), p=0.4)
Figure 24: C1: A.CLAHE(clip_limit=1.8, tile_grid_size=(8, 8), p=0.25)
Figure 25: L: ColorJitter(brightness=0.8, contrast=0.8, saturation=0.0, hue=0.0, p=0.8)
Figure 26: H: A.ColorJitter(brightness=1.0, contrast=1.0, saturation=0.0, hue=0.0, p=0.8)
Figure 27: C1: A.ColorJitter(brightness=1.5, contrast=1.5, saturation=0.0, hue=0.0, p=0.35)
Figure 28: C2: A.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.0, hue=0.0, p=0.5)
Figure 29: L: A.ElasticTransform(alpha=5.0, sigma=10.0, alpha
Figure 30: H: A.ElasticTransform(alpha=30.0, sigma=20.0, alpha
Figure 31: C1: A.ElasticTransform(alpha=7.5, sigma=13.0, alpha
Figure 32: L: A.GaussianBlur(blur_limit=(3, 3), p=0.15)
Figure 33: H: A.GaussianBlur(blur_limit=(5, 9), p=0.35)
Figure 34: C1: A.GaussianBlur(blur_limit=(3, 5), sigma_limit=(0.1, 0.6), p=0.15)
Figure 35: L: A.GaussNoise(var_limit=(5.0, 20.0), p=0.4)
Figure 36: H: A.GaussNoise(var_limit=(15.0, 50.0), p=0.5)
Figure 37: C1: A.GaussNoise(var_limit=(3.0, 15.0), p=0.35, per_channel=False)
Figure 38: C2: A.GaussNoise(var_limit=(0.001, 0.001), p=0.3)
Figure 39: L: A.MotionBlur(blur_limit=3, p=0.2)
Figure 40: H: A.MotionBlur(blur_limit=(5, 10), p=0.4)
Figure 41: C1: A.MotionBlur(blur_limit=4, p=0.30)
Figure 42: L: A.MultiplicativeNoise(multiplier=(0.9, 1.1), per
Figure 43: H: A.MultiplicativeNoise(multiplier=(0.8, 1.2), per
Figure 44: C1: A.MultiplicativeNoise(multiplier=(0.70, 1.3), per
Figure 45: L: A.Perspective(scale=(0.05, 0.1), p=1.0)
Figure 46: H: A.Perspective(scale=(0.08, 0.20), p=0.7)
Figure 47: C1: A.Perspective(scale=(0.07, 0.15), p=0.6)
Figure 48: L: A.RandomBrightnessContrast(brightness
Figure 49: H: A.RandomBrightnessContrast(brightness
Figure 50: C1: A.RandomBrightnessContrast(brightness
Figure 51: L: A.RandomGamma(gamma_limit=(80, 120), p=0.4)
Figure 52: H: A.RandomGamma(gamma_limit=(40, 160), p=0.5)
Figure 53: C1: A.RandomGamma(gamma_limit=(90, 110), p=0.30)
Figure 54: L: A.HorizontalFlip(p=0.5), H: A.HorizontalFlip(p=0.75), A.HorizontalFlip(p=0.30)
Figure 55: L: A.RandomResizedCrop(size=(512, 512), scale=(0.9, 1.0), ratio=(0.95, 1.05), p=1.0)
Figure 56: H: A.RandomResizedCrop(size=(512, 512), scale=(0.6, 1.0), ratio=(0.8, 1.2), p=0.7)
Figure 57: C1: A.RandomResizedCrop(size=(512, 512), scale=(0.50, 0.9), ratio=(0.7, 1.3), p=0.9)
Figure 58: L: A.Sharpen(alpha=(0.05, 0.10), lightness=(1.0, 1.0), p=0.15)
Figure 59: H: A.Sharpen(alpha=(0.15, 0.35), lightness=(0.8, 1.2), p=0.4)
Figure 60: C1: A.Sharpen(alpha=(0.06, 0.12), lightness=(1.0, 1.0), p=0.25)
Figure 61: L: A.ShiftScaleRotate(shift_limit=0.05, scale_limit=0.10, rotate_limit=10, border_mode=0, value=0, mask_value=0, p=0.9)
Figure 62: H: A.ShiftScaleRotate(shift_limit=0.15, scale_limit=0.25, rotate_limit=30, border_mode=0, value=0, mask_value=0, p=0.7)
Figure 63: C1: A.ShiftScaleRotate(shift_limit=0.25, scale_limit=0.35, rotate_limit=35, border_mode=0, value=0, mask_value=0, p=0.6)
Figure 64: L: A.CenterCrop(448, 448, p=1.0)
Figure 65: H: A.CenterCrop(384, 384, p=1.0)
Figure 66: C1: A.CenterCrop(480, 480, p=1.0)
Figure 67: L: A.CoarseDropout(max_holes=2, max_height=int(0.08 * H), max_width=int(0.08 * W), min_holes=1, min_height=int(0.03 * H), min_width=int(0.03 * W), fill_value=0, p=0.15)
Figure 68: C1: A.CLAHE(clip_limit=1.8, tile_grid_size=(8, 8), p=0.25)
Figure 69: H: A.CoarseDropout(max_holes=4, max_height=int(0.15 * H), max_width=int(0.15 * W), min_holes=2, min_height=int(0.05 * H), min_width=int(0.05 * W), fill_value=0, p=0.25)
Figure 70: C1: A.CoarseDropout(max_holes=1, min_holes=1, min_height=int(0.02 * H), min_width=int(0.02 * W), max_height=int(0.04 * H), max_width=int(0.04 * W), fill_value=0.2, p=0.05)
Figure 71: L: A.CropNonEmptyMaskIfExists(height=448, width=448, p=1.0)
Figure 72: H1: A.CropNonEmptyMaskIfExists(height=384, width=384, p=1.0)
Figure 73: C1: A.CropNonEmptyMaskIfExists(height=480, width=480, p=0.7)
Figure 74: L: A.Downscale(scale_min=0.65, scale_max=0.85, interpolation=1, p=0.25)
Figure 75: H: A.Downscale(scale_min=0.4, scale_max=0.8, interpolation=1, p=0.35)
Figure 76: C1: A.Downscale(scale_min=0.5, scale_max=0.80, interpolation=1, p=0.45)
Figure 77: C2: A.Downscale(scale_range=[0.35, 0.6], interpolation_pair={"upscale": 0, "downscale": 0})
Figure 78: L: A.ImageCompression(quality_lower=40, quality_upper=70, p=0.30)
Figure 79: H: A.ImageCompression(quality_lower=10, quality_upper=50, p=0.40)
Figure 80: C1: A.ImageCompression(quality_lower=30, quality_upper=80, p=0.35)
Figure 81: L: IntensityWindowing(window_center=(0.35, 0.65), window_width=(0.25, 0.55), p=0.5)
Figure 82: H: IntensityWindowing(window_center=(-0.20, 1.20), window_width=(0.08, 1.40), p=0.5)
Figure 83: C1: IntensityWindowing(window_center=(0.48, 0.55), window_width=(0.38, 0.48), p=0.15)
Figure 84: L: A.UnsharpMask(blur_limit=(3, 5), alpha=(0.10, 0.25), p=0.20)
Figure 85: H: A.UnsharpMask(blur_limit=(5, 13), alpha=(0.30, 0.70), p=0.60)
Figure 86: C1: A.UnsharpMask(blur_limit=(3, 7), alpha=(0.15, 0.40), p=0.30)
Figure 87: C2: A.UnsharpMask(blur_limit=(3, 7), alpha=(0.15, 0.40), p=0.30)
Figure 88: CH: A.UnsharpMask(blur_limit=(10, 20), sigma_limit=10, alpha=(1, 1), threshold=1, p=0.30)
Figure 89: L: A.GridDistortion(num_steps=5, distort_limit=0.2, p=0.4)
Figure 90: H: A.GridDistortion(num_steps=8, distort_limit=0.35, p=0.6)
Figure 91: C1: A.GridDistortion(num_steps=7, distort_limit=0.25, p=0.35)
Figure 92: C2: GridDistortion(num_steps=9, distort_limit=0.38, p=0.55)
Figure 93: C3: GridDistortion(num_steps=3, distort_limit=[-0.4, 0.4], interpolation=cv2.INTER_AREA, normalized=True, mask_interpolation=cv2.INTER_AREA, keypoint_remapping_method="mask", border_mode=cv2.BORDER_REPLICATE, fill=0, fill_mask=0)
Figure 94: L: T.RandomErasing(p=1.0, scale=(0.02, 0.06), ratio=(0.6, 1.6), value=0.0, inplace=False,
Figure 95: H: T.RandomErasing(p=1.0, scale=(0.02, 0.06), ratio=(0.6, 1.6), value=0.0, inplace=False,
Figure 96: C1: T.RandomErasing(p=0.30, scale=(0.01, 0.07), ratio=(0.15, 4.5),
Figure 97: C2: T.RandomErasing(p=0.35, scale=(0.01, 0.08), ratio=(0.2, 3.0),
Figure 98: L: sp_noise_uint8(img, amount=0.003, ratio=0.5, p=0.25)
Figure 99: H: sp_noise_uint8(img, amount=float(np.random.uniform(0.008, 0.030)), ratio=float(np.random.uniform(0.35, 0.65)), p=0.5)
Figure 100: C1: sp_noise_uint8(img, amount=float(np.random.uniform(0.004, 0.012)), ratio=float(np.random.uniform(0.45, 0.55)), p=0.20)
Figure 101: C2: sp_noise_uint8(img, amount=float(np.random.uniform(0.001, 0.004)), ratio=float(np.random.uniform(0.48, 0.52)), p=0.20)
Figure 102: C3: SaltAndPepperOnMask(amount=(0.03, 0.06), ratio=(0.48, 0.52), p=1.0) mask-based
Figure 103: L: DepthAttenuation(attenuation_rate=1.0, max_attenuation=0.6, p=0.3)
Figure 104: C1: DepthAttenuation(attenuation_rate=(0.25, 0.9), max_attenuation=0.08, p=0.2)
Figure 105: C2: DepthAttenuation(attenuation_rate=1, max_attenuation=2, p=0.3)
Figure 106: L: GaussianShadow(strength=0.2, sigma
Figure 107: C1: GaussianShadow(strength=(0.12, 0.28), sigma
Figure 108: L: HazeArtifact(radius=0.2, sigma=0.2, p=0.2)
Figure 109: C1: HazeArtifact(radius=(0.15, 0.45), sigma=(0.03, 0.06), p=0.15)
Figure 110: C1: SpeckleReduction(sigma_spatial=0.2, sigma_color=0.2, window_size=5, p=0.3)
Figure 111: C1: SpeckleReduction(sigma_spatial=0.2, sigma_color=0.2, window_size=5, p=0.3)

Original abstract

Deep learning models for echocardiography segmentation often struggle to generalise across institutions, scanners, and patient populations, where collecting large, consistently annotated datasets is infeasible. Data augmentation is widely used to improve the robustness of deep learning models; however, its role in enhancing cross-dataset generalisability in echocardiography remains insufficiently understood. This study presents a large-scale multi-dataset evaluation of 29 data augmentation techniques and their pairwise combinations for 2D left ventricular segmentation using a U-Net trained on Unity, CAMUS, and EchoNet Dynamic datasets. Each augmentation was explored under several hyperparameter settings and assessed through repeated runs using Dice and IoU in both in-domain and cross-dataset scenarios, with statistical significance quantified via independent t-tests. Results show that anatomically plausible geometric transformations, particularly affine, shift-scale-rotate, perspective, and random horizontal flip, substantially improve cross-dataset performance, whereas aggressive intensity- or artefact-based augmentations often degrade generalisability. Pairwise augmentation combinations outperform individual augmentations and show that moderate flip-centric combinations, especially random horizontal flip with affine, yield consistent gains across most transfer scenarios. These findings provide empirically grounded guidance for designing augmentation policies that enhance the robustness and transferability of echocardiography segmentation models.
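To make the scale of the protocol concrete, the sketch below enumerates train/test scenarios for the three named datasets and counts the unordered augmentation pairs. It assumes each model is trained on one dataset and tested on every dataset, and that every unordered pair of the 29 augmentations is a candidate; the abstract confirms neither detail, so the counts are illustrative.

```python
from itertools import product
from math import comb

# Dataset names as given in the abstract.
datasets = ["Unity", "CAMUS", "EchoNet-Dynamic"]

# Assumed protocol: train on one dataset, test on each dataset.
scenarios = list(product(datasets, repeat=2))
in_domain = [(train, test) for train, test in scenarios if train == test]
cross_dataset = [(train, test) for train, test in scenarios if train != test]

# Number of unordered pairs that can be formed from 29 augmentations.
n_augmentations = 29
n_pairs = comb(n_augmentations, 2)

print(len(in_domain), len(cross_dataset), n_pairs)  # 3 6 406
```

Multiplying these counts by the number of hyperparameter settings and repeated runs, neither of which is stated in the abstract, would give the full training budget.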

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript conducts a large-scale evaluation of 29 data augmentation techniques and all their pairwise combinations for 2D left-ventricular segmentation with a U-Net. Training and testing are performed on the Unity, CAMUS and EchoNet-Dynamic datasets, with performance measured by Dice and IoU in both in-domain and cross-dataset settings. Using repeated runs and independent t-tests, the authors report that anatomically plausible geometric transforms (affine, shift-scale-rotate, perspective, random horizontal flip) improve cross-dataset generalisability, that aggressive intensity- or artefact-based augmentations degrade it, and that moderate flip-centric pairwise combinations yield the most consistent gains.
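As a rough illustration of the kind of comparison summarised above, the sketch below computes a two-sample t statistic between repeated-run Dice scores for a baseline and an augmented model. It assumes Student's pooled-variance form (the source says only "independent t-tests"), and the scores are made-up values, not results from the paper.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Student's two-sample t statistic (equal variances assumed) and degrees of freedom."""
    na, nb = len(a), len(b)
    # Pooled sample variance across both groups.
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2

# Hypothetical Dice scores from three repeated runs each.
augmented = [0.85, 0.86, 0.84]
baseline = [0.80, 0.82, 0.81]

t_stat, dof = pooled_t(augmented, baseline)
print(round(t_stat, 2), dof)  # 4.9 4
```

Turning the statistic into a p-value needs the t distribution's CDF, which a statistics library would normally supply.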

Significance. If the statistical claims survive correction for multiple testing and the experimental details are fully reported, the work supplies empirically grounded, practical guidance for augmentation policy design in echocardiography segmentation—an area where domain shift remains a central obstacle. The breadth of the augmentation sweep and the explicit cross-dataset protocol constitute a clear methodological contribution.

major comments (2)
  1. [Results] Results / Statistical analysis: The central claims rest on statistical significance obtained from a large number of independent t-tests performed across 29 augmentations, multiple hyper-parameter settings per augmentation, repeated runs, and all pairwise combinations. No correction for multiple comparisons (Bonferroni, Holm, or FDR) is mentioned. This directly affects which specific geometric transforms can be confidently declared beneficial or detrimental for generalisability.
  2. [Methods] Methods: Hyper-parameter ranges, exact implementation details for each of the 29 augmentations, and complete numerical results tables (including per-run Dice/IoU values) are not supplied. Without these, independent verification of the reported cross-dataset improvements is impossible.
minor comments (1)
  1. [Abstract] The abstract and results text would benefit from a concise statement of the total number of statistical tests performed so that readers can immediately appreciate the multiple-testing burden.
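The multiple-comparison corrections named in the first major comment can be sketched as follows. This is a minimal pure-Python illustration of Bonferroni and Holm decisions at a family-wise level `alpha`; the p-values are invented for the example and do not come from the paper.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H0 for each test whose p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def holm(p_values, alpha=0.05):
    """Holm step-down procedure: compare sorted p-values to alpha / (m - rank)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values are retained
    return reject

# Hypothetical p-values for four augmentation-versus-baseline tests.
p_values = [0.001, 0.015, 0.03, 0.2]

print(bonferroni(p_values))  # [True, False, False, False]
print(holm(p_values))        # [True, True, False, False]
```

Holm controls the same family-wise error rate as Bonferroni and never rejects fewer hypotheses, which matters when the number of tests is in the hundreds.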

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their constructive and detailed review. We address each major comment below and outline the changes we will implement to enhance the statistical robustness and reproducibility of the manuscript.

point-by-point responses
  1. Referee: [Results] Results / Statistical analysis: The central claims rest on statistical significance obtained from a large number of independent t-tests performed across 29 augmentations, multiple hyper-parameter settings per augmentation, repeated runs, and all pairwise combinations. No correction for multiple comparisons (Bonferroni, Holm, or FDR) is mentioned. This directly affects which specific geometric transforms can be confidently declared beneficial or detrimental for generalisability.

    Authors: We acknowledge that the large number of t-tests performed raises a legitimate issue of multiple comparisons. Our primary conclusions, however, are grounded in consistent performance patterns observed across independent datasets and augmentation families, rather than isolated p-values. In the revised manuscript we will apply Bonferroni correction to all reported tests, present the adjusted p-values, and explicitly state which geometric augmentations retain statistical support after correction. We will also clarify that practical recommendations are informed by both significance and effect size consistency. revision: yes

  2. Referee: [Methods] Methods: Hyper-parameter ranges, exact implementation details for each of the 29 augmentations, and complete numerical results tables (including per-run Dice/IoU values) are not supplied. Without these, independent verification of the reported cross-dataset improvements is impossible.

    Authors: We agree that these details are essential for reproducibility. The revised manuscript will expand the Methods section to list all 29 augmentations together with their explored hyper-parameter ranges and exact Albumentations implementations. A new supplementary section will contain full results tables reporting mean and standard deviation of Dice and IoU for every setting and run. Raw per-run values will be deposited in a public repository linked from the paper. revision: yes

standing simulated objections not resolved
  • Whether every individual statistical claim will remain significant after Bonferroni correction, which requires re-analysis of the complete set of raw experimental results.

Circularity Check

0 steps flagged

No circularity: purely empirical evaluation of augmentations

full rationale

The paper performs an experimental study training U-Net models on Unity, CAMUS, and EchoNet Dynamic datasets, testing 29 augmentations and their pairwise combinations under multiple hyperparameters with repeated runs, then measuring Dice/IoU in in-domain and cross-dataset settings with independent t-tests. No equations, derivations, fitted parameters presented as predictions, or self-citations that bear the central load are present; all claims reduce directly to observed performance differences rather than any self-referential construction or renaming of inputs.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim depends on the representativeness of the three public datasets and on the assumption that U-Net behavior under these augmentations generalizes to other architectures and clinical settings.

free parameters (1)
  • augmentation hyperparameters
    Each of the 29 augmentations was tested under several hyperparameter settings whose exact values are not enumerated in the abstract.
axioms (1)
  • domain assumption: The selected datasets capture sufficient real-world scanner and population variability for cross-dataset evaluation.
    Invoked when defining in-domain versus cross-dataset performance scenarios.


  10. [10]

    Contrastive Pretraining for Echocardiography Segmentation with Limited Data,

    Saeed, M., Muhtaseb, R., and Yaqub, M., “Contrastive Pretraining for Echocardiography Segmentation with Limited Data,” (2022). Version Number: 3

  11. [11]

    Efficient deep learning-based automated diagnosis from echocardiography with contrastive self-supervised learning,

    Holste, G., Oikonomou, E. K., Mortazavi, B. J., Wang, Z., and Khera, R., “Efficient deep learning-based automated diagnosis from echocardiography with contrastive self-supervised learning,”Communications Medicine4, 133 (July 2024)

  12. [12]

    An improved contrastive learning network for semi-supervised multi-structure segmentation in echocardiography,

    Guo, Z., Zhang, Y., Qiu, Z., Dong, S., He, S., Gao, H., Zhang, J., Chen, Y., He, B., Kong, Z., Qiu, Z., Li, Y., and Li, C., “An improved contrastive learning network for semi-supervised multi-structure segmentation in echocardiography,”Frontiers in Cardiovascular Medicine10, 1266260 (Sept. 2023)

  13. [13]

    Enhancing Radio- logical Diagnosis: A Comprehensive Review of Image Quality Assessment and Optimization Strategies,

    Varghese, A. P., Naik, S., Asrar Up Haq Andrabi, S., Luharia, A., and Tivaskar, S., “Enhancing Radio- logical Diagnosis: A Comprehensive Review of Image Quality Assessment and Optimization Strategies,” Cureus(June 2024)

  14. [14]

    Bone shadow segmentation from ultrasound data for orthopedic surgery using GAN,

    Alsinan, A. Z., Patel, V. M., and Hacihaliloglu, I., “Bone shadow segmentation from ultrasound data for orthopedic surgery using GAN,”International Journal of Computer Assisted Radiology and Surgery15, 1477–1485 (Sept. 2020)

  15. [15]

    UltraAugment: Fan-shape and Artifact- based Data Augmentation for 2D Ultrasound Images,

    Ramakers, F., Vercauteren, T., Deprest, J., and Williams, H., “UltraAugment: Fan-shape and Artifact- based Data Augmentation for 2D Ultrasound Images,” in [2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)], 2422–2431, IEEE, Seattle, WA, USA (June 2024)

  16. [16]

    Medical ul- trasound image speckle reduction and resolution enhancement using texture compensated multi-resolution convolution neural network,

    Moinuddin, M., Khan, S., Alsaggaf, A. U., Abdulaal, M. J., Al-Saggaf, U. M., and Ye, J. C., “Medical ul- trasound image speckle reduction and resolution enhancement using texture compensated multi-resolution convolution neural network,”Frontiers in Physiology13, 961571 (Nov. 2022)

  17. [17]

    Enhancing fetal ultrasound image quality and anatomical plane recognition in low-resource settings using super-resolution models,

    Boumeridja, H., Ammar, M., Alzubaidi, M., Mahmoudi, S., Benamer, L. N., Agus, M., Househ, M., Lekadir, K., and El Habib Daho, M., “Enhancing fetal ultrasound image quality and anatomical plane recognition in low-resource settings using super-resolution models,”Scientific Reports15, 8376 (Mar. 2025)

  18. [18]

    Ultrasam: a foundation model for ul- trasound using large open-access segmentation datasets,

    Meyer, A., Murali, A., Zarin, F., Mutter, D., and Padoy, N., “Ultrasam: a foundation model for ul- trasound using large open-access segmentation datasets,”International Journal of Computer Assisted Radiology and Surgery(Sept. 2025)

  19. [19]

    GUDU: Geometrically-constrained Ultrasound Data augmentation in U-Net for echocardiography semantic segmentation,

    Sfakianakis, C., Simantiris, G., and Tziritas, G., “GUDU: Geometrically-constrained Ultrasound Data augmentation in U-Net for echocardiography semantic segmentation,”Biomedical Signal Processing and Control82, 104557 (Apr. 2023)

  20. [20]

    The impact of scanner domain shift on deep learning performance in medical imaging: an experimental study,

    Guo, B., Lu, D., Szumel, G., Gui, R., Wang, T., Konz, N., and Mazurowski, M. A., “The impact of scanner domain shift on deep learning performance in medical imaging: an experimental study,”arXiv preprint arXiv:2409.04368(2024)

  21. [21]

    Semi-supervised Active Learning for Left Ventricle Segmentation in Echocardiography,

    Alajrami, E., DadashiSerej, N., Jevsikov, J., Fernandes, P., Abdi, A., Ufumaka, I., Francis, D. P., and Zolgharni, M., “Semi-supervised Active Learning for Left Ventricle Segmentation in Echocardiography,” in [Medical Imaging with Deep Learning], 30

  22. [22]

    Self-supervised learning for label-free segmen- tation in cardiac ultrasound,

    Ferreira, D. L., Lau, C., Salaymang, Z., and Arnaout, R., “Self-supervised learning for label-free segmen- tation in cardiac ultrasound,”Nature Communications16, 4070 (Apr. 2025)

  23. [23]

    A self-supervised semi-supervised echocardiographic video left ventricle segmen- tation method,

    Wang, T. and Dai, Q., “A self-supervised semi-supervised echocardiographic video left ventricle segmen- tation method,”Biomedical Signal Processing and Control101, 107211 (Mar. 2025)

  24. [24]

    Consensus-guided evaluation of self-supervised learning in echocardiographic segmentation,

    Naidoo, P., Fernandes, P., Dadashi Serej, N., Manisty, C. H., Shun-Shin, M. J., Francis, D. P., and Zol- gharni, M., “Consensus-guided evaluation of self-supervised learning in echocardiographic segmentation,” Computers in Biology and Medicine198, 111148 (Nov. 2025)

  25. [25]

    Rethinking Self-Supervised Semantic Segmentation: Achieving End-to-End Segmentation,

    Liu, Y., Zeng, J., Tao, X., and Fang, G., “Rethinking Self-Supervised Semantic Segmentation: Achieving End-to-End Segmentation,”IEEE Transactions on Pattern Analysis and Machine Intelligence46, 10036– 10046 (Dec. 2024)

  26. [26]

    A Survey on Self-Supervised Learning: Algorithms, Applications, and Future Trends,

    Gui, J., Chen, T., Zhang, J., Cao, Q., Sun, Z., Luo, H., and Tao, D., “A Survey on Self-Supervised Learning: Algorithms, Applications, and Future Trends,”IEEE Transactions on Pattern Analysis and Machine Intelligence46, 9052–9071 (Dec. 2024)

  27. [27]

    Contrastive Learning for View Classification of Echocardiograms,

    Chartsias, A., Gao, S., Mumith, A., Oliveira, J., Bhatia, K., Kainz, B., and Beqiri, A., “Contrastive Learning for View Classification of Echocardiograms,” in [Simplifying Medical Ultrasound], Noble, J. A., Aylward, S., Grimwood, A., Min, Z., Lee, S.-L., and Hu, Y., eds.,12967, 149–158, Springer International Publishing, Cham (2021). Series Title: Lecture...

  28. [28]

    Segmenting Cardiac Ultrasound Videos Using Self-Supervised Learning,

    Lamoureux, E., Ayromlou, S., Ahmadi Amiri, S. N., and Rhodin, H., “Segmenting Cardiac Ultrasound Videos Using Self-Supervised Learning,” in [2023 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC)], 1–7, IEEE, Sydney, Australia (July 2023)

  29. [29]

    EchoFM: Foundation Model for Generalizable Echocardiogram Analysis,

    Kim, S., Jin, P., Song, S., Chen, C., Li, Y., Ren, H., Li, X., Liu, T., and Li, Q., “EchoFM: Foundation Model for Generalizable Echocardiogram Analysis,”IEEE Transactions on Medical Imaging44, 4049– 4062 (Oct. 2025)

  30. [30]

    Integrating Deep Metric Learning with Coreset for Active Learning in 3D Segmentation,

    Vepa, A. M., Yang, Z., Choi, A., Joo, J., Scalzo, F., and Sun, Y., “Integrating Deep Metric Learning with Coreset for Active Learning in 3D Segmentation,” (2024). Version Number: 1

  31. [31]

    MDAL: Modality- difference-based active learning for multimodal medical image analysis via contrastive learning and point- wise mutual information,

    Wang, H., Jin, Q., Du, X., Wang, L., Guo, Q., Li, H., Wang, M., and Song, Z., “MDAL: Modality- difference-based active learning for multimodal medical image analysis via contrastive learning and point- wise mutual information,”Computerized Medical Imaging and Graphics123, 102544 (July 2025)

  32. [32]

    Multi-ConDoS: Multimodal Contrastive Domain Sharing Generative Adversarial Networks for Self-Supervised Medical Image Segmentation,

    Zhang, J., Zhang, S., Shen, X., Lukasiewicz, T., and Xu, Z., “Multi-ConDoS: Multimodal Contrastive Domain Sharing Generative Adversarial Networks for Self-Supervised Medical Image Segmentation,” IEEE Transactions on Medical Imaging43, 76–95 (Jan. 2024)

  33. [33]

    EchoGNN with Contrastive Learning,

    Kondori, N., Fung, A., and Goco, J. A. D., “EchoGNN with Contrastive Learning,”

  34. [34]

    RadImageNet: An Open Radiologic Deep Learning Research Dataset for Effective Transfer Learning,

    Mei, X., Liu, Z., Robson, P. M., Marinelli, B., Huang, M., Doshi, A., Jacobi, A., Cao, C., Link, K. E., Yang, T., Wang, Y., Greenspan, H., Deyer, T., Fayad, Z. A., and Yang, Y., “RadImageNet: An Open Radiologic Deep Learning Research Dataset for Effective Transfer Learning,”Radiology: Artificial Intel- ligence4, e210315 (Sept. 2022)

  35. [35]

    Toward Generalizability in the Deployment of Artificial Intelligence in Radiology: Role of Computation Stress Testing to Overcome Underspecification,

    Eche, T., Schwartz, L. H., Mokrane, F.-Z., and Dercle, L., “Toward Generalizability in the Deployment of Artificial Intelligence in Radiology: Role of Computation Stress Testing to Overcome Underspecification,” Radiology: Artificial Intelligence3, e210097 (Nov. 2021). 31

  36. [36]

    Active learning for left ventricle segmentation in echocardiography,

    Alajrami, E., Ng, T., Jevsikov, J., Naidoo, P., Fernandes, P., Azarmehr, N., Dinmohammadi, F., Shun- shin, M. J., Dadashi Serej, N., Francis, D. P., and Zolgharni, M., “Active learning for left ventricle segmentation in echocardiography,”Computer Methods and Programs in Biomedicine248, 108111 (May 2024)

  37. [37]

    Rethinking deep active learning for medical image segmentation: A diffusion and angle-based framework,

    Qu, L., Jin, Q., Fu, K., Wang, M., and Song, Z., “Rethinking deep active learning for medical image segmentation: A diffusion and angle-based framework,”Biomedical Signal Processing and Control96, 106493 (Oct. 2024)

  38. [38]

    Two-View Left Ventricular Segmentation and Ejection Fraction Estimation in 2D Echocardiograms.,

    Tabuco, F. C. A., Magno, J. D. A., Orillaza Jr, N. S., Domingo, R. A. V., and Naval, P. C., “Two-View Left Ventricular Segmentation and Ejection Fraction Estimation in 2D Echocardiograms.,” in [BMVC], 176 (2022)

  39. [39]

    Fast and accurate view classification of echocar- diograms using deep learning,

    Madani, A., Arnaout, R., Mofrad, M., and Arnaout, R., “Fast and accurate view classification of echocar- diograms using deep learning,”npj Digital Medicine1, 6 (Mar. 2018)

  40. [40]

    Optimizing Object Detection Algorithms for Congenital Heart Diseases in Echocardiography: Exploring Bounding Box Sizes and Data Augmentation Techniques,

    Chen, S.-H., Weng, K.-P., Hsieh, K.-S., Chen, Y.-H., Shih, J.-H., Li, W.-R., Zhang, R.-Y., Chen, Y.-C., Tsai, W.-R., and Kao, T.-Y., “Optimizing Object Detection Algorithms for Congenital Heart Diseases in Echocardiography: Exploring Bounding Box Sizes and Data Augmentation Techniques,”Reviews in Cardiovascular Medicine25, 335 (Sept. 2024)

  41. [41]

    Unsupervised Echocardiography Registration Through Patch-Based MLPs and Transformers,

    Wang, Z., Yang, Y., Sermesant, M., and Delingette, H., “Unsupervised Echocardiography Registration Through Patch-Based MLPs and Transformers,” in [Statistical Atlases and Computational Models of the Heart. Regular and CMRxMotion Challenge Papers], Camara, O., Puyol-Ant´ on, E., Qin, C., Sermesant, M., Suinesiaputra, A., Wang, S., and Young, A., eds.,13593...

  42. [42]

    Artificial intelligence-based classification of echocardiographic views,

    Naser, J. A., Lee, E., Pislaru, S. V., Tsaban, G., Malins, J. G., Jackson, J. I., Anisuzzaman, D. M., Rostami, B., Lopez-Jimenez, F., Friedman, P. A., Kane, G. C., Pellikka, P. A., and Attia, Z. I., “Artificial intelligence-based classification of echocardiographic views,”European Heart Journal - Digital Health5, 260–269 (May 2024)

  43. [43]

    Generative augmentations for improved cardiac ultrasound segmentation using diffusion models,

    Van De Vyver, G., Lenz, A. T., Smistad, E., Olaisen, S. H., Grenne, B., Holte, E., Dalen, H., and Løvstakken, L., “Generative augmentations for improved cardiac ultrasound segmentation using diffusion models,”Scientific Reports15, 38013 (Oct. 2025)

  44. [44]

    Annotation Cost Minimization for Ultrasound Image Segmentation using Cross-domain Transfer Learning,

    Monkam, P., Jin, S., and Lu, W., “Annotation Cost Minimization for Ultrasound Image Segmentation using Cross-domain Transfer Learning,”IEEE Journal of Biomedical and Health Informatics, 1–11 (2023)

  45. [45]

    Myocardial Function Imaging in Echocardiography Using Deep Learning,

    Ostvik, A., Salte, I. M., Smistad, E., Nguyen, T. M., Melichova, D., Brunvand, H., Haugaa, K., Edvard- sen, T., Grenne, B., and Lovstakken, L., “Myocardial Function Imaging in Echocardiography Using Deep Learning,”IEEE Transactions on Medical Imaging40, 1340–1351 (May 2021)

  46. [46]

    Fast Speckle Noise Suppression Algorithm in Breast Ultrasound Image Using Three-Dimensional Deep Learning,

    Li, X., Wang, Y., Zhao, Y., and Wei, Y., “Fast Speckle Noise Suppression Algorithm in Breast Ultrasound Image Using Three-Dimensional Deep Learning,”Frontiers in Physiology13, 880966 (Apr. 2022)

  47. [47]

    Comprehensive echocardiogram evaluation with view primed vision language AI,

    Vukadinovic, M., Chiu, I.-M., Tang, X., Yuan, N., Chen, T.-Y., Cheng, P., Li, D., Cheng, S., He, B., and Ouyang, D., “Comprehensive echocardiogram evaluation with view primed vision language AI,”Nature (Nov. 2025). 32

  48. [48]

    Highlighting nerves and blood vessels for ultrasound-guided axillary nerve block procedures using neural networks,

    Smistad, E., Johansen, K. F., Iversen, D. H., and Reinertsen, I., “Highlighting nerves and blood vessels for ultrasound-guided axillary nerve block procedures using neural networks,”Journal of Medical Imaging5, 1 (Nov. 2018)

  49. [49]

    Video-based AI for beat-to-beat assessment of cardiac function,

    Ouyang, D., He, B., Ghorbani, A., Yuan, N., Ebinger, J., Langlotz, C. P., Heidenreich, P. A., Harrington, R. A., Liang, D. H., Ashley, E. A., and Zou, J. Y., “Video-based AI for beat-to-beat assessment of cardiac function,”Nature580, 252–256 (Apr. 2020)

  50. [50]

    EchoNet-Dynamic,

    Ouyang, D., He, B., Ghorbani, A., Yuan, N., Ebinger, J., Langlotz, C. P., Heidenreich, P. A., Harrington, R. A., Liang, D. H., Ashley, E. A., and Zou, J. Y., “EchoNet-Dynamic,” (2020)

  51. [51]

    Deep Learning for Segmentation Using an Open Large-Scale Dataset in 2D Echocardiography,

    Leclerc, S., Smistad, E., Pedrosa, J., Ostvik, A., Cervenansky, F., Espinosa, F., Espeland, T., Berg, E. A. R., Jodoin, P.-M., Grenier, T., Lartizien, C., Dhooge, J., Lovstakken, L., and Bernard, O., “Deep Learning for Segmentation Using an Open Large-Scale Dataset in 2D Echocardiography,”IEEE Trans- actions on Medical Imaging38, 2198–2210 (Sept. 2019)

  52. [52]

    Automated left ventricular dimension assessment using artificial intelligence developed and validated by a uk-wide collaborative,

    Howard, J. P., Stowell, C. C., Cole, G. D., Ananthan, K., Demetrescu, C. D., Pearce, K., Rajani, R., Sehmi, J., Vimalesvaran, K., Kanaganayagam, G. S., et al., “Automated left ventricular dimension assessment using artificial intelligence developed and validated by a uk-wide collaborative,”Circulation: Cardiovascular Imaging14(5), e011951 (2021)

  53. [53]

    Echonet- dynamic: a large new cardiac motion video data resource for medical machine learning,

    Ouyang, D., He, B., Ghorbani, A., Lungren, M. P., Ashley, E. A., Liang, D. H., and Zou, J. Y., “Echonet- dynamic: a large new cardiac motion video data resource for medical machine learning,”Conference on Neural Information Processing Systems (NeurIPS)33(2019)

  54. [54]

    Dsmri: domain shift analyzer for multi-center mri datasets,

    Kushol, R., Wilman, A. H., Kalra, S., and Yang, Y.-H., “Dsmri: domain shift analyzer for multi-center mri datasets,”Diagnostics13(18), 2947 (2023)

  55. [55]

    Mri manufacturer shift and adaptation: increasing the generalizability of deep learning segmentation for mr images acquired with different scanners,

    Yan, W., Huang, L., Xia, L., Gu, S., Yan, F., Wang, Y., and Tao, Q., “Mri manufacturer shift and adaptation: increasing the generalizability of deep learning segmentation for mr images acquired with different scanners,”Radiology: Artificial Intelligence2(4), e190195 (2020)

  56. [56]

    A simple framework for contrastive learning of visual representations,

    Chen, T., Kornblith, S., Norouzi, M., and Hinton, G., “A simple framework for contrastive learning of visual representations,” in [International conference on machine learning], 1597–1607, PmLR (2020)

  57. [57]

    Momentum contrast for unsupervised visual repre- sentation learning,

    He, K., Fan, H., Wu, Y., Xie, S., and Girshick, R., “Momentum contrast for unsupervised visual repre- sentation learning,” in [Proceedings of the IEEE/CVF conference on computer vision and pattern recog- nition], 9729–9738 (2020)

  58. [58]

    Bootstrap your own latent-a new approach to self-supervised learning,

    Grill, J.-B., Strub, F., Altch´ e, F., Tallec, C., Richemond, P., Buchatskaya, E., Doersch, C., Avila Pires, B., Guo, Z., Gheshlaghi Azar, M., et al., “Bootstrap your own latent-a new approach to self-supervised learning,”Advances in neural information processing systems33, 21271–21284 (2020). 33 APPENDIX A. ECHO-SPECIFIC FAN-SHAPE MASKING Some of the tar...