Euclid Quick Data Release (Q1). AstroVink: A vision transformer approach to find strong gravitational lens systems
Pith reviewed 2026-05-08 13:58 UTC · model grok-4.3
The pith
A vision transformer retrained on real Euclid data ranks all 110 known strong lenses within its top 500 candidates.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The retrained AstroVink model recovers all 110 known lens systems within the top 500 ranked candidates on the test set and reduces inspection effort to one lens per 4.5 inspected objects; when applied to the Q1 selection of 1.08 million targets followed by Space Warps inspection and expert vetting, it identifies eight Grade A and 26 Grade B new lens candidates.
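One way to read the two headline numbers: inspection effort is simply the size of the ranked shortlist divided by the number of lenses recovered within it. A minimal sketch of that arithmetic, using hypothetical score and label arrays rather than anything from the paper:

```python
# Minimal sketch: recall within the top-k ranked candidates and the implied
# inspection effort (objects inspected per lens found). Arrays are hypothetical.
import numpy as np

def recall_and_effort(scores: np.ndarray, is_lens: np.ndarray, top_k: int = 500):
    order = np.argsort(scores)[::-1]               # rank by descending lens score
    n_found = int(is_lens[order[:top_k]].sum())    # known lenses inside the shortlist
    recall = n_found / int(is_lens.sum())
    effort = top_k / n_found if n_found else float("inf")
    return recall, effort

# With the paper's counts: 88 lenses in the top 500 gives 500/88 ≈ 5.7 objects
# inspected per lens; all 110 in the top 500 gives 500/110 ≈ 4.5.
```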
What carries the argument
AstroVink, a vision transformer classifier built on a fine-tuned DINOv2 encoder, which ranks galaxy images by the likelihood that they contain strong gravitational lenses; its performance gains come from incorporating real negative examples rejected during visual inspection.
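Sketching that architecture concretely (this is not the authors' code; the DINOv2 torch.hub entry point, the ViT-S/14 backbone choice, the input size, and the single-logit head are illustrative assumptions):

```python
# Hypothetical sketch of a lens/non-lens classifier on a DINOv2 backbone.
import torch
import torch.nn as nn

class LensClassifier(nn.Module):
    def __init__(self, fine_tune_backbone: bool = True):
        super().__init__()
        # Public DINOv2 ViT-S/14; its forward() returns the pooled CLS embedding (dim 384).
        self.backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
        for p in self.backbone.parameters():
            p.requires_grad = fine_tune_backbone
        self.head = nn.Linear(384, 1)  # single logit: how lens-like the cutout is

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, 3, H, W) cutouts, with H and W multiples of the 14-pixel patch size
        return self.head(self.backbone(x)).squeeze(-1)

# Ranking usage: higher sigmoid score = more lens-like; inspect the top of the list.
model = LensClassifier()
scores = torch.sigmoid(model(torch.randn(4, 3, 224, 224)))
```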
If this is right
- Incorporating realistic negative examples from visual inspection plays a key role in improving generalization beyond training on simulated lenses alone.
- The model reduces the inspection effort required from one lens per 5.7 objects to one lens per 4.5 objects on the test set.
- Application to the full Q1 dataset of 1.08 million targets yields eight Grade A and 26 Grade B new lens candidates after further inspection.
- Transformer-based architectures recover strong lens candidates with high efficiency in real Euclid survey data while substantially reducing the number of objects needing visual review.
Where Pith is reading between the lines
- Iterative retraining with survey-specific rejected candidates could allow similar models to adapt quickly to the full Euclid dataset without starting from scratch.
- The same vision-transformer approach might reduce manual inspection workloads in other large astronomical surveys searching for rare objects such as supernovae or transients.
- If the newly identified candidates are spectroscopically confirmed, they would enlarge the sample available for statistical studies of dark-matter halos and cosmological parameters.
Load-bearing premise
The visual inspection labels used for retraining and final vetting are sufficiently complete and unbiased to serve as ground truth, allowing the model to generalize to the full Euclid survey without significant distribution shift.
What would settle it
Follow-up high-resolution imaging or spectroscopy showing that most of the new Grade A and B candidates are not genuine strong lenses, or an independent test in which the model fails to rank a fresh set of verified lenses highly, would show that the claimed recovery and efficiency gains do not hold.
Original abstract
We present AstroVink, a vision transformer classifier designed for automated identification of strong lens candidates in Euclid imaging. We build upon the DINOv2 encoder, fine-tuned to distinguish between lens and non-lens galaxies. Our base model, trained on simulated strong lens systems and labelled non-lenses, recovers 88 of the 110 lens candidates within the top 500 ranked candidates, corresponding to an inspection efficiency of one lens per 5.7 inspected objects in our test set. After the Q1 data release, which yielded about 500 lens candidates, we retrained the model using high-confidence lens candidates and new negatives, initially flagged as potential lenses by other classifiers but rejected during visual inspection. The retrained network further improves performance, achieving recovery of all 110 systems within the same ranking and reducing the inspection effort to one lens per 4.5 inspected objects, demonstrating that incorporating real examples significantly enhances model generalisation. An analysis of training subsets revealed that the inclusion of realistic negative examples played a key role in this improvement. Finally, we applied the retrained model to the Q1 original selection of 1.08M targets, followed by a new round of Space Warps citizen science inspection and expert vetting, where we identified a total of eight Grade A and 26 Grade B new lens candidates. These results demonstrate that transformer-based architectures can recover strong lens candidates with high efficiency in real Euclid data, while substantially reducing the number of candidates requiring visual inspection.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents AstroVink, a vision transformer classifier built on a fine-tuned DINOv2 encoder for automated detection of strong gravitational lens candidates in Euclid imaging. A base model trained on simulated lenses and labeled non-lenses recovers 88 of 110 known candidates in the top 500 ranked objects on a held-out test set (inspection efficiency of 1 lens per 5.7 objects). Retraining on high-confidence real lens candidates from Q1 plus rejected negatives improves this to 110/110 recovery (1 per 4.5 objects). The retrained model is applied to the Q1 selection of 1.08 million targets, followed by Space Warps citizen science and expert vetting, yielding 8 Grade A and 26 Grade B new candidates.
Significance. If the performance gains are free of overlap between the 110-lens test set and the high-confidence positives used in retraining, the work shows that fine-tuning vision transformers on realistic negatives meaningfully improves generalization for lens finding in large surveys. The reported reduction in inspection effort and the discovery of new Grade A/B candidates in Q1 data would be a useful methodological contribution for Euclid and similar wide-field programs. The emphasis on the role of real negatives is a concrete, testable insight.
major comments (2)
- [Abstract] The headline claim that retraining lifts recovery from 88/110 to 110/110 within the top 500 does not state whether the 110 known test lenses were excluded from the “high confidence lens candidates” added during retraining. Overlap would make perfect recovery the expected outcome of supervised fine-tuning rather than evidence of improved generalization to real Euclid data, directly affecting the asserted efficiency gain and the narrative that the model now generalizes better.
- [Abstract] No information is given on how the 1.08M target selection was performed, how the test set of 110 lenses was constructed to be disjoint from training data, or whether cross-validation or error bars accompany the recovery fractions. These details are load-bearing for the central performance claims.
minor comments (1)
- [Abstract] The phrase “initially flagged as potential lenses by other classifiers but rejected during visual inspection” for the new negatives is clear in intent but would benefit from a brief description of the other classifiers and the rejection criteria.
Simulated Author's Rebuttal
We thank the referee for their careful and constructive review. The points raised about potential ambiguity in data splits and the need for explicit procedural details are valid and will be addressed through revisions to improve clarity and strengthen the manuscript.
Point-by-point responses
- Referee: [Abstract] The headline claim that retraining lifts recovery from 88/110 to 110/110 within the top 500 does not state whether the 110 known test lenses were excluded from the “high confidence lens candidates” added during retraining. Overlap would make perfect recovery the expected outcome of supervised fine-tuning rather than evidence of improved generalization to real Euclid data, directly affecting the asserted efficiency gain and the narrative that the model now generalizes better.
Authors: We agree this clarification is essential. The 110 known test lenses were explicitly excluded from the high-confidence positives used in retraining. The retraining set drew from the remaining high-confidence candidates among the ~500 identified in Q1, combined with the new rejected negatives. This disjoint construction, together with our analysis of training subsets, supports that the gain to 110/110 recovery (and the improved efficiency) arises from better generalization enabled by realistic negatives rather than overlap. We will revise the abstract and add an explicit statement in the methods section confirming the test-set exclusion. revision: yes
- Referee: [Abstract] No information is given on how the 1.08M target selection was performed, how the test set of 110 lenses was constructed to be disjoint from training data, or whether cross-validation or error bars accompany the recovery fractions. These details are load-bearing for the central performance claims.
Authors: We acknowledge that the abstract omits these details and that they require explicit description. The 1.08 million targets correspond to the Q1 parent sample defined by the survey's photometric and morphological selection cuts (we will reference or briefly summarize the exact criteria). The 110-lens test set was assembled from known strong lenses lying within the Q1 footprint but withheld from both the initial simulated training and the real high-confidence positives used for retraining, ensuring complete disjointness. Recovery fractions are reported as exact counts on this fixed held-out set; cross-validation was not performed owing to the limited number of confirmed real lenses, and we will add a short discussion of this choice together with any appropriate uncertainty estimates. We will expand the methods section with a dedicated subsection on data selection, splits, and evaluation protocol. revision: yes
Circularity Check
No circularity: empirical ML training and held-out evaluation
full rationale
The paper describes a standard supervised learning pipeline: a vision transformer is pretrained on simulated strong-lens images plus labeled non-lenses, then fine-tuned on additional real high-confidence positives and rejected negatives from Q1. Performance is measured by ranking recovery on a fixed set of 110 known lenses described as the test set. No equations, fitted parameters renamed as predictions, or self-citation chains appear; the reported fractions (88/110 then 110/110) are direct empirical counts on the stated test objects rather than quantities forced by construction from the training inputs.
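A compact sketch of that two-stage pipeline, reusing the LensClassifier sketched earlier in this review; the datasets below are synthetic placeholders standing in for Euclid cutouts, and the label conventions and set sizes are assumptions, not the authors' setup:

```python
# Hypothetical two-stage training: stage 1 on simulated lenses plus labelled
# non-lenses, stage 2 adding real Q1 positives and visually rejected negatives.
import torch
from torch.utils.data import TensorDataset, ConcatDataset, DataLoader

def fake_cutouts(n: int, label: float) -> TensorDataset:
    # Placeholder random stamps; real inputs would be Euclid image cutouts.
    return TensorDataset(torch.randn(n, 3, 224, 224), torch.full((n,), label))

def train(model: torch.nn.Module, dataset, epochs: int = 1, lr: float = 1e-4):
    loader = DataLoader(dataset, batch_size=8, shuffle=True)
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = torch.nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        for images, labels in loader:
            opt.zero_grad()
            loss_fn(model(images), labels).backward()
            opt.step()
    return model

# Stage 1: simulated lenses (label 1) and labelled non-lenses (label 0).
base = ConcatDataset([fake_cutouts(32, 1.0), fake_cutouts(32, 0.0)])
model = train(LensClassifier(), base)

# Stage 2: add real Q1 high-confidence positives and the "realistic negatives"
# that other classifiers flagged but visual inspection rejected.
q1_extra = ConcatDataset([fake_cutouts(16, 1.0), fake_cutouts(16, 0.0)])
model = train(model, ConcatDataset([base, q1_extra]), lr=1e-5)
```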
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Human visual inspection provides reliable ground-truth labels for training and evaluation.
- domain assumption: Simulated lens images plus real non-lens galaxies are sufficiently representative for initial training.