
arXiv: 2406.17323 · v3 · submitted 2024-06-25 · 💻 cs.CV · astro-ph.IM · cs.LG

XAMI -- A Benchmark Dataset for Artefact Detection in XMM-Newton Optical Images

Pith reviewed 2026-05-24 00:27 UTC · model grok-4.3

classification 💻 cs.CV · astro-ph.IM · cs.LG
keywords XMM-Newton, artefact detection, instance segmentation, astronomical images, machine learning dataset, optical monitoring, data quality, scattered light

The pith

A dataset of 1000 hand-annotated XMM-Newton images trains machine learning models to detect and mask artefacts in optical telescope data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper creates XAMI, a public collection of 1000 images from the XMM-Newton Optical Monitoring camera that have been manually labeled for different kinds of artefacts produced by reflected or scattered light. It shows that instance segmentation models, built by combining convolutional neural networks with transformer components, can locate and outline these artefacts accurately enough to mask them out. The work addresses the shortage of labeled examples needed to train automated tools for cleaning astronomical images. A reader would care because artefact contamination can distort measurements in large surveys, and a shared starting point lets the community test and improve detection methods without starting from scratch each time.

Core claim

The paper claims that releasing the XAMI dataset of 1000 hand-annotated XMM-Newton images together with a hybrid CNN-transformer instance segmentation pipeline supplies a reproducible baseline for automated detection and masking of artefacts, directly tackling the absence of training data for machine learning methods applied to optical astronomy observations.

What carries the argument

The argument rests on the XAMI dataset of 1000 hand-annotated images, paired with a hybrid instance segmentation model that uses convolutional neural networks for local feature extraction and transformer layers for broader context to locate and mask artefacts.

If this is right

  • Automated masking reduces the time astronomers spend manually inspecting and cleaning images before scientific analysis.
  • Precise artefact outlines from instance segmentation preserve more usable pixels than simpler detection methods.
  • Public code and data let other groups retrain or extend the models on additional observations.
  • Consistent artefact removal across datasets improves the reliability of statistical results drawn from XMM-Newton surveys.
  • The hybrid architecture demonstrates one workable way to combine local and global cues for segmentation in this domain.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same labeling and modeling approach could be applied to images from other space telescopes that face scattered-light artefacts.
  • If the models generalize, they might support automated quality flagging during future observation planning.
  • Extending the dataset with images from different instruments would test whether the artefacts share enough structure to allow transfer learning.
  • Integration with existing astronomy pipelines could reduce systematic errors in source catalogs derived from optical monitoring data.

Load-bearing premise

The 1000 hand-annotated images capture the full range of artefacts that appear in XMM-Newton observations and the labels are consistent enough to serve as reliable training targets.

What would settle it

Apply the trained model to a fresh collection of XMM-Newton images that have been labeled independently by multiple astronomers and measure whether the detection and masking accuracy remains close to the level reported on the original 1000 images.

Figures

Figures reproduced from arXiv: 2406.17323 by Călin-Adrian Popa, Julia Dima, Pablo Gómez, Peter Kretschmar, Sandor Kruk, Simon Rosen.

Figure 1: Examples of artefacts in various space missions. (upper left) An optical ghost detected in Euclid’s First Light near-infrared images. (upper right) Ghost rays and stray light patterns present in NuSTAR mission. (bottom left) Star loops and dragon’s breath artefacts appearing in the Hubble Space Telescope images. (bottom right) Star loops and streaks present in the XMM-Newton Optical Monitor.
Figure 2: Artefacts appearing in the XMM-OM observation S0148740701 of the QSO 1939+7000 field (U filter).
Adjacent text: We use the stratified k-fold technique to maintain consistent class proportions across dataset splits, thus …
Figure 4: (left) Cumulative distribution of IoUs between predicted and true masks on training and validation sets. (right) Comparison of IoU distributions with higher median and consistency in training data and greater variability in validation data.
Adjacent text: … 16 steps, weight decay of 10^-5 and AdamW optimizer. We train the Mask Decoder only, while freezing the Image Encoder and Prompt Embedding layers. Following recommendati…
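
The IoU statistic plotted in Figure 4 can be made concrete with a small sketch. The Python below is an illustration only, not the authors' evaluation code: the helper names `mask_iou` and `mean_and_std`, the nested-list mask representation, the toy masks, and the choice of a population standard deviation are all assumptions. It computes the intersection over union of two binary masks and summarises a list of per-image IoUs as a mean and a spread.

```python
from statistics import mean, pstdev


def mask_iou(pred, true):
    """Intersection over union of two equally sized binary masks (rows of 0/1)."""
    inter = union = 0
    for pred_row, true_row in zip(pred, true):
        for p, t in zip(pred_row, true_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly; treating that as IoU 1.0 is an assumption.
    return inter / union if union else 1.0


def mean_and_std(values):
    """Mean and population standard deviation of a list of IoU values."""
    return mean(values), pstdev(values)


# Toy 3x4 masks (hypothetical data, not taken from XAMI).
true_mask = [[0, 1, 1, 0],
             [0, 1, 1, 0],
             [0, 0, 0, 0]]
pred_mask = [[0, 1, 1, 1],
             [0, 1, 0, 0],
             [0, 0, 0, 0]]

iou = mask_iou(pred_mask, true_mask)  # 3 shared pixels out of 5 in the union
avg, spread = mean_and_std([0.5, 0.7, 0.9])
print(f"IoU={iou:.2f}  mean={avg:.3f}  std={spread:.3f}")
```

A per-image list of such values is what a cumulative distribution or box plot of IoUs would be drawn from.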
Figure 3: Distribution of annotation bounding boxes across different classes in the XAMI dataset.
Adjacent text: … ensuring accurate performance estimation. Resulting class distributions can be seen in …
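
The stratified k-fold technique named in the text captured alongside the Figure 2 and Figure 3 captions can be sketched as follows. This is a minimal pure-Python illustration, not the authors' split code: it assumes a single class label per image (XAMI images may carry several annotated artefacts, so the paper's procedure may well differ), and the function name `stratified_kfold`, the seed, and the toy label counts are hypothetical. Images of each class are shuffled and dealt round-robin into k folds, so every fold keeps roughly the class proportions of the whole set.

```python
import random
from collections import Counter, defaultdict


def stratified_kfold(labels, k, seed=0):
    """Return k lists of indices whose class proportions mirror `labels`."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal this class's images round-robin across the folds.
        for i, idx in enumerate(indices):
            folds[(i + offset) % k].append(idx)
        offset += len(indices)  # start the next class where this one stopped
    return folds


# Hypothetical label counts; the class codes are borrowed from the results table.
labels = ["CR"] * 10 + ["SR"] * 5 + ["SL"] * 5
folds = stratified_kfold(labels, k=5)

for i, val_idx in enumerate(folds):
    train_idx = [j for f in folds if f is not val_idx for j in f]
    print(i, dict(Counter(labels[j] for j in val_idx)), len(train_idx))
```

Each fold serves once as the validation set while the remaining folds form the training set.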
Figure 5: Detected masks across eight fields within the validation set, with increasing mean IoU between predicted and ground-truth masks. The mean IoU on the validation set images is 0.658 ± 0.207.

    Category   Precision   Recall
    Overall    84.3        72.1
    CR         89.3        94.0
    SR         80.6        85.6
    SL         80.5        74.1
    ROS        71.1        73.3
    Other      100.0       33.3
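
For readers unfamiliar with how per-class precision and recall figures like those above are derived, here is a small sketch. The counts are hypothetical and chosen only to reproduce the shape of the "Other" row (precision 100.0, recall 33.3); the paper's actual true-positive, false-positive and false-negative counts are not given on this page, and `precision_recall` is an illustrative helper name.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) in percent from detection counts."""
    precision = 100.0 * tp / (tp + fp) if (tp + fp) else 0.0
    recall = 100.0 * tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts: one correct detection, no false alarms, two missed instances.
p, r = precision_recall(tp=1, fp=0, fn=2)
print(f"precision={p:.1f} recall={r:.1f}")  # precision=100.0 recall=33.3
```

A pattern like this means every detection that was made was correct while most instances were missed, which can easily happen when a class has very few examples.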
Original abstract

Reflected or scattered light produce artefacts in astronomical observations that can negatively impact the scientific study. Hence, automated detection of these artefacts is highly beneficial, especially with the increasing amounts of data gathered. Machine learning methods are well-suited to this problem, but currently there is a lack of annotated data to train such approaches to detect artefacts in astronomical observations. In this work, we present a dataset of images from the XMM-Newton space telescope Optical Monitoring camera showing different types of artefacts. We hand-annotated a sample of 1000 images with artefacts which we use to train automated ML methods. We further demonstrate techniques tailored for accurate detection and masking of artefacts using instance segmentation. We adopt a hybrid approach, combining knowledge from both convolutional neural networks (CNNs) and transformer-based models and use their advantages in segmentation. The presented method and dataset will advance artefact detection in astronomical observations by providing a reproducible baseline. All code and data are made available (https://github.com/ESA-Datalabs/XAMI-model and https://github.com/ESA-Datalabs/XAMI-dataset).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The manuscript introduces the XAMI dataset of 1000 hand-annotated XMM-Newton Optical Monitoring (OM) camera images containing various artefacts (e.g., reflected or scattered light). It trains a hybrid CNN-transformer instance segmentation model for artefact detection and masking, and releases the dataset and code as a reproducible baseline for ML-based artefact detection in astronomical imaging.

Significance. If the annotations are shown to be reliable and the sample representative, the work addresses a documented gap in annotated data for artefact detection in space-based optical observations. The public release of both the dataset (https://github.com/ESA-Datalabs/XAMI-dataset) and model code (https://github.com/ESA-Datalabs/XAMI-model) is a clear strength that supports reproducibility and future benchmarking.

major comments (3)
  1. [Dataset creation / annotation description] Dataset section: the claim that the 1000 hand-annotated images provide a reliable baseline rests on unstated details of the annotation process. No information is given on the number of annotators, their expertise, annotation protocol, or quantitative measures of label quality such as inter-annotator agreement (IoU or Dice on masks).
  2. [Dataset creation / sampling] Dataset section: no sampling strategy, coverage statistics, or diversity analysis is reported for the 1000 images relative to the full XMM-Newton OM archive. Without evidence that artefact types, observation conditions, and instrument states are adequately represented, the representativeness assumption remains unverified.
  3. [Model description and evaluation] Methods / Experiments: the hybrid CNN-transformer model is described only at a high level in the abstract. Quantitative results (precision, recall, mAP, or mask quality metrics on a held-out test set) and comparison to standard baselines are not provided in the available text, preventing assessment of whether the tailored techniques deliver the claimed accuracy.
minor comments (1)
  1. [Abstract] The abstract states that 'all code and data are made available' but does not include direct DOIs or persistent identifiers for the releases; adding these would improve citability.
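
As an illustration of the label-quality measure requested in major comment 1, the sketch below computes the Dice coefficient between two annotators' masks of the same artefact and converts it to IoU. It is a hypothetical example: masks are represented as sets of (row, col) pixel coordinates, the function names are made up for illustration, and nothing here reflects how the XAMI annotations were actually checked.

```python
def dice(mask_a, mask_b):
    """Dice coefficient of two masks given as sets of (row, col) pixels."""
    total = len(mask_a) + len(mask_b)
    # Two empty masks are treated as full agreement (an assumption).
    return 2 * len(mask_a & mask_b) / total if total else 1.0


def dice_to_iou(d):
    """For a single pair of masks, IoU = D / (2 - D)."""
    return d / (2 - d)


# Two hypothetical 4x4-pixel annotations, the second shifted down by one row.
annotator_1 = {(r, c) for r in range(0, 4) for c in range(0, 4)}
annotator_2 = {(r, c) for r in range(1, 5) for c in range(0, 4)}

d = dice(annotator_1, annotator_2)  # 12 shared pixels out of 16 + 16
print(f"Dice={d:.2f}  IoU={dice_to_iou(d):.2f}")
```

Averaging such scores over a sample of images labelled by more than one person would give one possible agreement figure.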

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments on our manuscript. We address each major comment below and indicate where revisions will be made.

Point-by-point responses
  1. Referee: [Dataset creation / annotation description] Dataset section: the claim that the 1000 hand-annotated images provide a reliable baseline rests on unstated details of the annotation process. No information is given on the number of annotators, their expertise, annotation protocol, or quantitative measures of label quality such as inter-annotator agreement (IoU or Dice on masks).

    Authors: We agree that the annotation process requires more explicit description to substantiate the reliability of the dataset. The annotations were performed by astronomers with direct experience in XMM-Newton OM data. In the revised manuscript we will add a subsection detailing the number of annotators, their expertise, the annotation protocol, and any computed inter-annotator agreement metrics. revision: yes

  2. Referee: [Dataset creation / sampling] Dataset section: no sampling strategy, coverage statistics, or diversity analysis is reported for the 1000 images relative to the full XMM-Newton OM archive. Without evidence that artefact types, observation conditions, and instrument states are adequately represented, the representativeness assumption remains unverified.

    Authors: We acknowledge the need to document sampling choices. The 1000 images were selected to encompass the main artefact categories observed across the XMM-Newton OM archive. We will expand the Dataset section with an explicit description of the sampling approach together with coverage statistics and a basic diversity analysis. revision: yes

  3. Referee: [Model description and evaluation] Methods / Experiments: the hybrid CNN-transformer model is described only at a high level in the abstract. Quantitative results (precision, recall, mAP, or mask quality metrics on a held-out test set) and comparison to standard baselines are not provided in the available text, preventing assessment of whether the tailored techniques deliver the claimed accuracy.

    Authors: The Methods section contains a fuller architectural description, yet we accept that quantitative performance numbers and baseline comparisons are required for proper evaluation. We will add held-out test-set metrics (precision, recall, mAP) and comparisons against standard instance-segmentation baselines in the revised manuscript. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical dataset release with standard ML application

full rationale

The paper's core contribution is the release of 1000 hand-annotated XMM-Newton OM images plus application of existing CNN/transformer segmentation models. No equations, fitted parameters, predictions, or uniqueness theorems are present. The abstract and described method contain no self-definitional loops, fitted-input predictions, or load-bearing self-citations. The work is self-contained as data collection plus off-the-shelf model use; external validation of annotation quality is a separate correctness issue, not circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The work rests on the accuracy and representativeness of manual annotations plus the applicability of standard computer-vision models; no free parameters, physical constants, or new entities are introduced.

axioms (1)
  • Domain assumption: Hand annotations by the authors accurately and consistently identify artefact locations and shapes across the 1000 images.
    The dataset and all downstream ML results depend directly on the quality of these labels (abstract).


