A European Multi-Center Breast Cancer MRI Dataset

Aitor Lopez; Alexandra Athanasiou; Alfredo Miguel Soro Busto; Christiane Kuhl; Daniel Truhn; Debora Jutz; Fiona J. Gilbert; Gustav Müller-Franzes; Jakob Nikolas Kather; JieFu Zhu

arxiv: 2506.00474 · v3 · pith:NWKUERH2new · submitted 2025-05-31 · 📡 eess.IV · cs.CV

Gustav Müller-Franzes, Lorena Escudero Sánchez, Nicholas Payne, Alexandra Athanasiou, Michael Kalogeropoulos, Aitor Lopez, Alfredo Miguel Soro Busto, Julia Camps Herrero, Nika Rasoolzadeh, Tianyu Zhang, Ritse Mann, Debora Jutz, Maike Bode, Christiane Kuhl, Yuan Gao, Wouter Veldhuis, Oliver Lester Saldanha, JieFu Zhu, Jakob Nikolas Kather, Daniel Truhn, Fiona J. Gilbert

Pith reviewed 2026-05-25 08:18 UTC · model grok-4.3

keywords breast cancer · MRI dataset · multi-center · public data · AI medical imaging · breast MRI · transformer model

The pith

A public dataset of 741 breast MRI examinations from six European sites is now available for AI research.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a new public breast MRI dataset collected from multiple clinical centers to help develop artificial intelligence tools for detecting breast cancer. Current AI efforts are limited by small or single-site datasets that do not reflect real clinical differences in equipment and procedures. By releasing this collection of 741 examinations, which covers cancer cases, benign findings, and normal exams acquired on varied scanners, the work aims to support more robust model training and testing. Baseline results from a transformer model are included to show how the data can be used.

Core claim

We present a publicly available, multi-centre breast MRI dataset collected across six clinical institutions in five European countries. The dataset comprises 741 examinations from women undergoing screening or diagnostic breast MRI and includes malignant, benign, and non-lesion cases. Data were acquired using heterogeneous scanners, field strengths, and acquisition protocols, reflecting real-world clinical variability. In addition, we report baseline benchmark experiments using a transformer-based model to illustrate potential use cases of the dataset and to provide reference performance for future methodological comparisons.

What carries the argument

The multi-center breast MRI dataset with 741 examinations reflecting real-world clinical variability across six sites.

If this is right

Supports development of AI methods for breast MRI interpretation that account for real-world variability.
Provides reference performance metrics from a transformer-based model for future comparisons.
Includes a mix of malignant, benign, and non-lesion cases to enable comprehensive model evaluation.
Allows testing of AI tools on data acquired under heterogeneous conditions typical of clinical practice.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Models trained on this data may generalize better to new clinical sites than those from single-center collections.
The dataset could be used to test whether multi-site training reduces the need for site-specific retraining in medical AI.
Future work might combine this with other public datasets to create even larger training resources for breast cancer detection.

Load-bearing premise

The differences in scanners, field strengths, and protocols across the six institutions are representative enough of clinical variation to improve AI generalization.

What would settle it

An experiment in which a model trained on the dataset shows no improvement in accuracy when tested on MRI scans from a new hospital compared to models trained on single-site data.
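
The experiment described above amounts to a leave-one-site-out evaluation: hold out every examination from one site, train on the rest, and repeat for each site. The sketch below only builds the splits; it is a generic illustration, not the paper's protocol. The site codes are borrowed from the Table S1 fragment in the reference list, while the examination IDs and the helper name `leave_one_site_out` are hypothetical.

```python
from collections import defaultdict


def leave_one_site_out(records):
    """Yield (held_out_site, train_ids, test_ids) once per site.

    `records` is an iterable of (exam_id, site) pairs. Every examination
    from the held-out site goes to the test set; all others go to training.
    """
    by_site = defaultdict(list)
    for exam_id, site in records:
        by_site[site].append(exam_id)

    sites = sorted(by_site)
    for held_out in sites:
        test_ids = list(by_site[held_out])
        train_ids = [e for s in sites if s != held_out for e in by_site[s]]
        yield held_out, train_ids, test_ids


# Toy records: the exam IDs are made up for illustration only.
records = [
    ("exam_001", "CAM"), ("exam_002", "CAM"),
    ("exam_003", "MHA"), ("exam_004", "RSH"),
    ("exam_005", "RUMC"), ("exam_006", "UKA"),
    ("exam_007", "UMCU"),
]

splits = list(leave_one_site_out(records))
for held_out, train_ids, test_ids in splits:
    print(held_out, len(train_ids), len(test_ids))
```

Comparing held-out-site performance of a model trained on the pooled sites against one trained on a single site would then address the question directly. A training loop and metric are deliberately left out here.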

Original abstract

Early detection of breast cancer is critical for improving patient outcomes. While mammography remains the primary screening modality, magnetic resonance imaging (MRI) is increasingly recommended as a supplemental tool for women with dense breast tissue and those at elevated risk. However, the acquisition and interpretation of multiparametric breast MRI are time-consuming and require specialized expertise, limiting scalability in clinical practice. Artificial intelligence (AI) methods have shown promise in supporting breast MRI interpretation, but their development is hindered by the limited availability of large, diverse, and publicly accessible datasets. To address this gap, we present a publicly available, multi-centre breast MRI dataset collected across six clinical institutions in five European countries. The dataset comprises 741 examinations from women undergoing screening or diagnostic breast MRI and includes malignant, benign, and non-lesion cases. Data were acquired using heterogeneous scanners, field strengths, and acquisition protocols, reflecting real-world clinical variability. In addition, we report baseline benchmark experiments using a transformer-based model to illustrate potential use cases of the dataset and to provide reference performance for future methodological comparisons.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

This paper releases a 741-case multi-center European breast MRI dataset with scanner and protocol variation, plus basic transformer baselines.

The main thing here is a new public dataset of 741 breast MRI exams collected from six sites in five countries. It covers screening and diagnostic cases, includes malignant, benign, and non-lesion examples, and was acquired on different vendors and field strengths. That combination is the concrete addition over smaller or single-center collections already out there. They also run a transformer model as a reference benchmark so others have a number to beat or compare against. That setup is useful on its face for groups that need varied training data without starting from scratch on collection and ethics approvals. The description of the acquisition protocols and the decision to release the data openly are the parts that actually move the needle for the field.

The paper does not claim the heterogeneity automatically produces generalizable models, and it does not present cross-site validation results, which keeps the claims aligned with what is shown. Inclusion and exclusion details are presumably in the full text, but the abstract leaves them light. For a dataset paper that is normal rather than a flaw. The baselines are presented as reference points, not as proof of superiority.

Readers working on breast MRI AI, especially those testing transformers or other architectures on real multi-vendor data, will find this directly usable. It is the kind of resource that lets more labs run experiments without reinventing the data pipeline. The work is straightforward and the central claim holds without circularity or unsupported leaps. I would send it to peer review so the community can check the release details, licensing, and any missing metadata before wider adoption.

Referee Report

3 major / 1 minor

Summary. The manuscript presents a publicly available multi-center breast MRI dataset collected across six clinical institutions in five European countries. The dataset comprises 741 examinations from women undergoing screening or diagnostic breast MRI, including malignant, benign, and non-lesion cases, acquired with heterogeneous scanners, field strengths, and protocols. Baseline benchmark experiments using a transformer-based model are reported to illustrate use cases and provide reference performance for future comparisons.

Significance. If released with complete documentation and accessibility, the dataset would address a documented gap in large, diverse public breast MRI resources for AI development. The multi-center, multi-country collection with real-world protocol variability could support studies on model generalization across clinical sites, providing a resource that single-center datasets cannot.

major comments (3)

[Abstract] The baseline experiments are described as providing 'reference performance' but no quantitative metrics, error bars, model details, or cross-site results are supplied, leaving the claimed utility for methodological comparisons unsupported by evidence in the text.
[Dataset description] No inclusion or exclusion criteria are stated for the 741 examinations, which is load-bearing for users to assess case composition, selection bias, and applicability of the malignant/benign/non-lesion distribution.
[Data availability] The central claim that the dataset is 'publicly available' is not accompanied by a repository link, access instructions, or licensing details, which prevents verification and use of the resource as asserted.
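
The first major comment asks for error bars on the baseline metrics. One common way to obtain them is a percentile bootstrap over the test set. The sketch below is a generic illustration under that assumption, not the paper's evaluation code: the labels and scores are toy values, and `roc_auc` and `bootstrap_ci` are hypothetical helper names.

```python
import random


def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (resampling cases)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that lost one of the classes
        stats.append(roc_auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy predictions for eight cases (illustrative values only).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.3, 0.4, 0.6, 0.35, 0.7, 0.8, 0.9]

auc = roc_auc(labels, scores)
lower, upper = bootstrap_ci(labels, scores)
print(f"AUC = {auc:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

With a multi-site test set, resampling within each site (or reporting one interval per site) may be preferable, since cases from the same scanner are unlikely to be independent draws from one pooled population.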

minor comments (1)

A table or supplementary material listing scanner vendors, field strengths, and sequence parameters per site would improve the clarity and reproducibility of the heterogeneity claim.
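
A per-site summary of this kind can be derived mechanically from per-examination metadata. The sketch below shows one way to do it, assuming each examination is a dictionary with a `site` key plus arbitrary metadata fields. The field name, the helper names, and the examination counts are hypothetical; the site and manufacturer pairings follow the Table S1 fragment in the reference list.

```python
from collections import defaultdict


def summarize_by_site(exams, fields):
    """Count examinations per site and collect distinct values per field."""
    summary = defaultdict(
        lambda: {"n_exams": 0, **{f: set() for f in fields}}
    )
    for exam in exams:
        row = summary[exam["site"]]
        row["n_exams"] += 1
        for f in fields:
            row[f].add(exam[f])
    return dict(summary)


def to_rows(summary, fields):
    """Render the summary as 'site | count | values...' text rows."""
    rows = []
    for site in sorted(summary):
        cells = [site, str(summary[site]["n_exams"])]
        for f in fields:
            cells.append("/".join(sorted(str(v) for v in summary[site][f])))
        rows.append(" | ".join(cells))
    return rows


# Toy metadata: the number of exams per site is made up for illustration.
exams = [
    {"site": "CAM", "manufacturer": "GE"},
    {"site": "CAM", "manufacturer": "GE"},
    {"site": "MHA", "manufacturer": "Siemens"},
    {"site": "UKA", "manufacturer": "Philips"},
]

fields = ["manufacturer"]
summary = summarize_by_site(exams, fields)
rows = to_rows(summary, fields)
for row in rows:
    print(row)
```

Field strength and sequence parameters could be added to `fields` in the same way once they are available per examination.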

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments on our manuscript. We address each major point below and indicate the corresponding revisions.

Referee: [Abstract] The baseline experiments are described as providing 'reference performance' but no quantitative metrics, error bars, model details, or cross-site results are supplied, leaving the claimed utility for methodological comparisons unsupported by evidence in the text.

Authors: We agree that the abstract would be strengthened by including quantitative details. In the revision we will add a concise summary of the key performance metrics (with error bars), model architecture details, and cross-site evaluation results from the benchmark experiments already reported in the main text. revision: yes

Referee: [Dataset description] No inclusion or exclusion criteria are stated for the 741 examinations, which is load-bearing for users to assess case composition, selection bias, and applicability of the malignant/benign/non-lesion distribution.

Authors: This is a valid observation. The revised manuscript will include an explicit subsection stating the inclusion and exclusion criteria applied to the 741 examinations, together with the rationale for the malignant/benign/non-lesion distribution. revision: yes

Referee: [Data availability] The central claim that the dataset is 'publicly available' is not accompanied by a repository link, access instructions, or licensing details, which prevents verification and use of the resource as asserted.

Authors: We acknowledge the omission. The revised version will provide the repository link, step-by-step access instructions, and licensing information in both the data availability statement and the main text. revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

The paper is a factual data-release description of a 741-examination multi-center breast MRI dataset collected across six European sites. Its central claim is the public availability and documented heterogeneity of the data; no mathematical derivations, predictions, fitted parameters, or uniqueness theorems are asserted. Baseline transformer experiments are presented only as reference performance for future work, not as cross-site validation results or self-referential claims. No load-bearing steps reduce to self-definition, fitted inputs renamed as predictions, or self-citation chains. The derivation chain is self-contained as a descriptive release with no internal circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is an empirical data-release paper. No free parameters, mathematical axioms, or invented entities are introduced or required by the central claim.

pith-pipeline@v0.9.0 · 5805 in / 1226 out tokens · 26307 ms · 2026-05-25T08:18:50.725105+00:00 · methodology

Reference graph

Works this paper leans on

13 extracted references · 13 canonical work pages · 1 internal anchor

[1] Barrios, C. H. Global challenges in breast cancer detection and treatment. The Breast 62, S3–S6 (2022)
[2] Wilkinson, L. & Gathani, T. Understanding breast cancer as a global health concern. The British Journal of Radiology 95, 20211033 (2022)
[3] Mann, R. M. et al. Breast cancer screening in women with extremely dense breasts recommendations of the European Society of Breast Imaging (EUSOBI). Eur Radiol 32, 4036–4045 (2022)
[4] Abdullah, K. A. et al. Deep learning-based breast cancer diagnosis in breast MRI: systematic review and meta-analysis. Eur Radiol (2025) doi:10.1007/s00330-025-11406-6
[5] Garrucho, L. et al. A large-scale multicenter breast cancer DCE-MRI benchmark dataset with expert segmentations. Sci Data 12, 453 (2025)
[6] Saha, A. et al. A machine learning approach to radiogenomics of breast cancer: a study of 922 subjects and 529 DCE-MRI features. Br J Cancer 119, 508–516 (2018)
[7] Zhao, X. et al. BreastDM: A DCE-MRI dataset for breast tumor image segmentation and classification. Computers in Biology and Medicine 164, 107255 (2023)
[8] Solomon, E. et al. FastMRI Breast: A Publicly Available Radial k-Space Dataset of Breast Dynamic Contrast-enhanced MRI. Radiology: Artificial Intelligence 7, e240345 (2025)
[9] Open Consortium for Decentralized Medical Artificial Intelligence | ODELIA Project | Fact Sheet | HORIZON. CORDIS | European Commission https://doi.org/10.3030/101057091
[10] Daniels, D., Last, D., Cohen, K., Mardor, Y. & Sklair-Levy, M. Standard and Delayed Contrast-Enhanced MRI of Malignant and Benign Breast Lesions with Histological and Clinical Supporting Data (Advanced-MRI-Breast-Lesions). The Cancer Imaging Archive https://doi.org/10.7937/C7X1-YN57 (2024)
[11] Saha, A. et al. Dynamic contrast-enhanced magnetic resonance images of breast cancer patients with tumor locations. The Cancer Imaging Archive https://doi.org/10.7937/TCIA.E3SV-RE93 (2022)
[12] Müller-Franzes, G. et al. Medical Slice Transformer: Improved Diagnosis and Explainability on 3D Medical Images with DINOv2. Preprint at https://doi.org/10.48550/ARXIV.2411.15802 (2024)
[13] Oquab, M. et al. DINOv2: Learning Robust Visual Features without Supervision. Preprint at https://doi.org/10.48550/ARXIV.2304.07193 (2023)

Supplementary Materials: Imaging Protocols
Table S1: Acquisition Hardware
Columns: CAM (BRAID1), CAM (TRICKS), MHA, RSH, RUMC, UKA, UMCU
Manufacturer: GE, GE, Siemens, Philips, Siemens, Philips, Philips
Scanner: SIGNA Artist, DISCOVERY MR750, ...
