arxiv: 2604.06143 · v1 · submitted 2026-04-07 · 🌌 astro-ph.IM

Recognition: 2 theorem links


Deep Spectroscopy with DESI for Photometric Redshift Training and Calibration

Biprateep Dey, Jeffrey A. Newman, Tianqing Zhang, J. Aguilar, S. Ahlen, A. Anand, B. Andrews, S. Bailey,


D. Bianchi, D. Brooks, F. J. Castander, T. Claybaugh, A. Cuceu, K. S. Dawson, A. de la Macorra, J. Della Costa, Arjun Dey, P. Doel, S. Ferraro, A. Font-Ribera, E. Gaztañaga, Satya Gontcho A Gontcho, D. Gruen, G. Gutierrez, J. Guy, H. K. Herrera-Alcantar, K. Honscheid, M. Ishak, R. Joyce, R. Kehoe, D. Kirkby, T. Kisner, A. Kremin, O. Lahav, M. Landriau, L. Le Guillou, A. Leauthaud, M. E. Levi, M. Manera, P. Martini, J. McCullough, A. Meisner, R. Miquel, J. Moustakas, A. D. Myers, J. Myles, S. Nadathur, N. Palanque-Delabrouille, W. J. Percival, F. Prada, I. Pérez-Ràfols, G. Rossi, L. Samushia, E. Sanchez, D. Schlegel, M. Schubnell, H. Seo, J. Silber, D. Sprayberry, G. Tarlé, B. A. Weaver, N. Weaverdyck, R. H. Wechsler, R. Zhou, H. Zou


Pith reviewed 2026-05-10 18:05 UTC · model grok-4.3

classification 🌌 astro-ph.IM

keywords DESI · photometric redshifts · LSST · spectroscopic calibration · deep spectroscopy · redshift distributions · weak lensing · cosmological surveys


The pith

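DESI on the 4m Mayall telescope measures redshifts for faint galaxies at success rates comparable to 10m-class telescopes with only about twice their integration time (rather than the ∼8× expected from aperture-area scaling), while observing about 30 times as many targets at once.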

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reports results from the DESI-Deep pilot program, which tested the ability of the Dark Energy Spectroscopic Instrument to obtain spectra for galaxies as faint as those needed for photometric redshift training in the early LSST survey. It establishes that DESI achieves success rates similar to those from larger telescopes while observing many more targets simultaneously and maintaining expected background-limited performance even in long exposures. This efficiency matters because accurate spectroscopic redshifts are required to calibrate photometric redshifts and reduce systematic uncertainties in cosmological measurements from wide-field imaging surveys. The work also supplies updated time estimates for obtaining full calibration samples across facilities and outlines a possible larger DESI-Deep survey together with its projected effects on cosmological constraints.

Core claim

The DESI-Deep pilot shows that DESI on the 4m Mayall telescope can measure redshifts for galaxies with m_i ≤ 24.5 at success rates comparable to 10m-class telescopes, requiring only ∼2× the integration time instead of the ∼8× expected from aperture-area scaling, while achieving ∼30 times larger multiplexing. The signal-to-noise ratio of the spectra follows the expected scaling for background-limited observations even for the longest exposures of ∼7 hours on the faintest targets. These results indicate that DESI could supply the benchmark spectroscopic sample for photo-z training and calibration in the early years of LSST with a modest investment of observing time.
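
The aperture-area expectation in the claim above comes from the background-limited regime, where the signal-to-noise ratio grows as the square root of collecting area times exposure time, so matching a larger telescope's SNR means scaling the exposure time by the area ratio. The following is a minimal Python sketch of that arithmetic, assuming idealized, unobstructed circular apertures; the function names are illustrative and do not come from the paper. Nominal 10 m and 4 m diameters give a ratio of 6.25, so the ∼8× quoted in the paper presumably reflects effective collecting areas that this sketch does not model.

```python
import math


def area_ratio(d_large_m: float, d_small_m: float) -> float:
    """Ratio of collecting areas for two idealized circular apertures."""
    return (d_large_m / d_small_m) ** 2


def background_limited_snr(area: float, t_exp: float) -> float:
    """Relative SNR in the background-limited regime: SNR ∝ sqrt(area * t_exp)."""
    return math.sqrt(area * t_exp)


# Nominal diameters only (an assumption; obstructions and throughput are ignored).
ratio = area_ratio(10.0, 4.0)

# Large telescope with unit exposure time versus the small telescope
# exposing `ratio` times longer: the relative SNRs should agree.
snr_large = background_limited_snr(area=ratio, t_exp=1.0)
snr_small = background_limited_snr(area=1.0, t_exp=ratio)

print(f"area ratio = {ratio:.2f}")
print(f"relative SNR (large, t=1) = {snr_large:.3f}")
print(f"relative SNR (small, t={ratio:.2f}) = {snr_small:.3f}")
```

Against this baseline, the pilot's reported ∼2× integration-time factor is what makes the measured efficiency notable.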

What carries the argument

The DESI-Deep pilot observations that compare measured redshift success rates and signal-to-noise scaling against aperture-area expectations and background-limited predictions.

Load-bearing premise

The small pilot sample and its target selection are representative of the full LSST lensing galaxy population, and the observed efficiency will continue to hold when scaled to the much larger samples needed for complete photo-z calibration.

What would settle it

A follow-up campaign that obtains spectra for a much larger set of LSST-like faint galaxies and checks whether the redshift success rate and background-limited scaling remain the same as in the pilot.

Figures

Figures reproduced from arXiv: 2604.06143 by A. Anand, A. Cuceu, A. de la Macorra, A. D. Myers, A. Font-Ribera, A. Kremin, A. Leauthaud, A. Meisner, Arjun Dey, B. Andrews, B. A. Weaver, Biprateep Dey, D. Bianchi, D. Brooks, D. Gruen, D. Kirkby, D. Schlegel, D. Sprayberry, E. Gaztañaga, E. Sanchez, F. J. Castander, F. Prada, G. Gutierrez, G. Rossi, G. Tarlé, H. K. Herrera-Alcantar, H. Seo, H. Zou, I. Pérez-Ràfols, J. Aguilar, J. Della Costa, Jeffrey A. Newman, J. Guy, J. McCullough, J. Moustakas, J. Myles, J. Silber, K. Honscheid, K. S. Dawson, L. Le Guillou, L. Samushia, M. E. Levi, M. Ishak, M. Landriau, M. Manera, M. Schubnell, N. Palanque-Delabrouille, N. Weaverdyck, O. Lahav, P. Doel, P. Martini, R. H. Wechsler, R. Joyce, R. Kehoe, R. Miquel, R. Zhou, S. Ahlen, Satya Gontcho A Gontcho, S. Bailey, S. Ferraro, S. Nadathur, T. Claybaugh, Tianqing Zhang, T. Kisner, W. J. Percival.

**Figure 1.** Locations of the 4863 objects observed in the DESI-XMMLSS (33.5° ≤ R.A. ≤ 37.5°; −7° ≤ Dec. ≤ −3°) and DESI-COSMOS (148° ≤ R.A. ≤ 152°; 0° ≤ Dec. ≤ 4°) fields. The colors of the points indicate the total effective exposure time for each object. The empty regions at the center of each field were filled by targets for other pilot observations for potential future DESI programs.

**Figure 2.** [image only; no caption text recovered]

**Figure 3.** Distribution of effective exposure times, shown as a stacked histogram. The median exposure time is about 94 min, with the 5th and 95th percentiles of the distribution at 17 and 308 min, respectively.

3.1.1. Adjusted Redshift Measurement Efficiency. Above z ∼ 1.6, only spectral features that are weak in most galaxies will lie within the wavelength range covered by the DESI spectrographs. As a result…

**Figure 4.** Spectra of four example galaxies from the DESI-Deep pilot sample, chosen to illustrate the diversity of galaxy types for which reliable redshifts were obtained. In each panel, the light blue shading represents the observed spectrum, while the solid dark blue line is the spectrum smoothed using an inverse-variance-weighted Gaussian kernel (with σ = 4 Å for the first two spectra and σ = 2.4 Å for the last …
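
The inverse-variance-weighted Gaussian smoothing named in the Figure 4 caption can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes a uniform wavelength grid longer than the kernel, and the function name, the 0.8 Å pixel size, and the synthetic data are placeholders chosen for the example. Each smoothed pixel is the kernel-weighted mean of its neighbours, with every neighbour additionally weighted by its inverse variance, so masked pixels (zero inverse variance) do not contribute.

```python
import numpy as np


def ivar_gaussian_smooth(wave, flux, ivar, sigma_angstrom):
    """Gaussian-smooth a spectrum, weighting each pixel by its inverse variance.

    Assumes `wave` is uniformly spaced and longer than the kernel.
    """
    wave = np.asarray(wave, dtype=float)
    flux = np.asarray(flux, dtype=float)
    ivar = np.asarray(ivar, dtype=float)

    # Convert the kernel width from Angstroms to pixels.
    sigma_pix = sigma_angstrom / (wave[1] - wave[0])
    half_width = int(np.ceil(4 * sigma_pix))
    offsets = np.arange(-half_width, half_width + 1)
    kernel = np.exp(-0.5 * (offsets / sigma_pix) ** 2)

    # Weighted mean: sum(K * ivar * flux) / sum(K * ivar).
    numerator = np.convolve(flux * ivar, kernel, mode="same")
    denominator = np.convolve(ivar, kernel, mode="same")
    return np.divide(numerator, denominator,
                     out=np.zeros_like(numerator), where=denominator > 0)


# Synthetic check: a flat spectrum with one masked, wildly wrong pixel.
wave = 6000.0 + 0.8 * np.arange(200)   # illustrative 0.8 Angstrom grid
flux = np.full(200, 3.0)
ivar = np.ones(200)
flux[50], ivar[50] = 1e6, 0.0          # bad pixel carries zero weight
smooth = ivar_gaussian_smooth(wave, flux, ivar, sigma_angstrom=4.0)
print(smooth[45:55])
```

Dividing by the smoothed inverse variance also normalizes the kernel near the array edges, which is why no explicit kernel normalization appears.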

**Figure 5.** The redshift distribution of objects for which spectra were observed and deemed good enough for reliable redshift measurements. Beyond redshift 1.6, most prominent spectral features of a galaxy shift out of the rest-frame observing range of the DESI spectrographs, leading to a lack of redshift measurements beyond this z.

To calculate SNRs, we exclusively used data from the red arm of the DESI spectrograph…

**Figure 6.** The Δχ² between the best and second-best redshift fit plotted against visual inspection (VI) quality flags for our data set. The shaded gray region shows the flags corresponding to objects with reliable redshift measurements. While the DESI main survey uses Δχ² for determining redshift measurement reliability, we observe a significant scatter between VI quality classifications and Δχ² values, without…

**Figure 7.** Redshift measurement success rate for the DESI-Deep pilot sample as a function of i-band magnitude, with separate panels corresponding to different ranges of effective exposure time. The blue curve shows the success rate obtained from our observations, with the 95% confidence interval shown using the shaded blue region. The success rates as a function of i-magnitude for the combined DEEP2 and DEEP3 survey …

**Figure 8.** Redshift measurement success rate for the DESI-Deep pilot sample as a function of i-band fiber magnitude, with separate panels corresponding to different ranges of effective exposure time. The blue curve shows the success rate obtained for our observations, with the corresponding 95% confidence interval shown as the shaded blue region. The success rates for the combined DEEP2 and DEEP3 survey dataset is pl…

**Figure 9.** Redshift measurement success rate for the DESI-Deep pilot sample as a function of exposure time, with separate panels corresponding to different bins of i-band magnitude. The blue curve shows the success rate obtained for our observations, with 95% confidence intervals shown using the shaded blue region. For every i magnitude bin, we observe that the success rate increases with increase in exposure time as…
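
Success-rate curves with 95% confidence bands, as in Figures 7 to 9, require an interval on a binomial proportion in each bin. The captions do not say which interval the authors used; the sketch below uses the Wilson score interval purely as an assumption, since it behaves sensibly for small bins and for rates near 0 or 1. The function name and the example counts are hypothetical.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.959963984540054):
    """Wilson score interval for a binomial proportion (default z gives ~95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1.0 + z * z / trials
    center = (p_hat + z * z / (2.0 * trials)) / denom
    half_width = z * math.sqrt(
        p_hat * (1.0 - p_hat) / trials + z * z / (4.0 * trials * trials)
    ) / denom
    # Clamp to [0, 1] to absorb floating-point round-off at the boundaries.
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical bin: 50 secure redshifts out of 100 targets.
low, high = wilson_interval(50, 100)
print(f"success rate = 0.50, 95% interval = [{low:.3f}, {high:.3f}]")
```

Applying this per magnitude or exposure-time bin yields a shaded band of the kind described in the captions, although the authors' actual choice of interval may differ.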

**Figure 10.** Test of the impact of adjusting for the abundance of z > 1.6 objects when calculating redshift success rates. The measured and adjusted redshift success rate are plotted as a function of i-band magnitude, with separate panels corresponding to different ranges of exposure time. As before, the solid blue curve shows the measured success rate as measured with 95% confidence intervals shown using the shaded b…

**Figure 11.** The ratio of empirically measured SNR and what we would expect from a background limited regime plotted as a function of i-fiber-magnitude in bins of exposure time. Each blue dot represents a single object and the orange curve denotes the average in ten equal population bins. The shaded orange region shows the 95% confidence interval on the mean. We observe that the instrument performance follows the back…

**Figure 12.** The ratio of empirically measured SNR and what we would expect from a background limited regime plotted as a function of exposure time in bins of i-fiber-magnitude. Each blue dot represents a single object and the orange curve denotes the average in ten equal population bins. The shaded orange region shows the 95% confidence interval on the mean. We observe that the instrument performance follows the back…
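
The binned averages in Figures 11 and 12 (means in ten equal-population bins with a 95% confidence interval on each mean) can be sketched as below. This is an illustration under stated assumptions: the interval is taken as 1.96 standard errors (a normal approximation the captions do not confirm), and the function name and synthetic data are placeholders.

```python
import numpy as np


def binned_mean_with_ci(x, y, n_bins=10):
    """Mean of y in equal-population bins of x, with a ~95% half-width on each mean.

    Assumes every bin receives at least two points.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)

    # array_split keeps bin populations equal to within one object.
    x_bins = np.array_split(x[order], n_bins)
    y_bins = np.array_split(y[order], n_bins)

    centers = np.array([b.mean() for b in x_bins])
    means = np.array([b.mean() for b in y_bins])
    std_errs = np.array([b.std(ddof=1) / np.sqrt(b.size) for b in y_bins])
    return centers, means, 1.96 * std_errs


# Synthetic example: SNR ratios scattered around 1 (not real DESI data).
rng = np.random.default_rng(0)
fiber_mag = rng.uniform(22.0, 25.0, size=1000)
snr_ratio = 1.0 + rng.normal(0.0, 0.1, size=1000)
centers, means, half_widths = binned_mean_with_ci(fiber_mag, snr_ratio)
for c, m, h in zip(centers, means, half_widths):
    print(f"mag ~ {c:.2f}: mean ratio = {m:.3f} +/- {h:.3f}")
```

A ratio consistent with 1 across all bins is the behaviour the captions describe as following the background-limited expectation.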

**Figure 13.** Redshift success rate versus the adjusted magnitude, m_fi = m_i − 1.25 log10(T_exp [sec] / 6000 [sec]). The blue curve shows the success rate in 15 equal width bins, with the blue shaded region representing the 95% confidence interval on the counts. The orange curve shows the generalized linear model fit to the data with the shaded orange region denoting the 95% prediction interval obtained from bootstrap …
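
The adjusted magnitude in the Figure 13 caption folds exposure time into an equivalent magnitude at a 6000 s reference. The factor 1.25 is 2.5 × 0.5: magnitudes scale as 2.5 log10 of flux, and in the background-limited regime SNR grows as the square root of exposure time. A minimal sketch follows; the function and constant names are illustrative rather than taken from the paper.

```python
import math

T_REF_SEC = 6000.0  # reference exposure time from the Figure 13 caption


def adjusted_magnitude(m_i: float, t_exp_sec: float,
                       t_ref_sec: float = T_REF_SEC) -> float:
    """Adjusted magnitude: m_i - 1.25 * log10(T_exp / T_ref)."""
    if t_exp_sec <= 0 or t_ref_sec <= 0:
        raise ValueError("exposure times must be positive")
    return m_i - 1.25 * math.log10(t_exp_sec / t_ref_sec)


# An object observed four times longer than the reference behaves like an
# object about 0.75 mag brighter observed for the reference time.
for t_exp in (1500.0, 6000.0, 24000.0):
    print(f"m_i = 24.0, T_exp = {t_exp:7.0f} s -> "
          f"adjusted = {adjusted_magnitude(24.0, t_exp):.3f}")
```

Collapsing magnitude and exposure time into one variable is what allows a single success-rate curve to be fit across all exposure-time bins.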

**Figure 14.** Predicted redshift measurement success rates plotted as a function of the total amount of dark time required for various magnitude limits. The magnitude limits are chosen to roughly represent the magnitude limits of the samples of various frontier weak lensing experiments. The lower horizontal axis displays the exposure times needed per object, while the upper horizontal axis shows the total time require…

**Figure 15.** The relationship between exposure time with the DESI instrument, redshift measurement success rate, and the LSST survey cosmological constraining power as quantified by the relative DETF figure of merit (FoM/FoM_fiducial), degraded due to parts of parameter space lacking calibration redshifts due to spectroscopic measurement failures. The top panels correspond to an LSST Year-1-like 3×2pt (weak lensing plu…

**Figure 16.** Figure 16 shows the on-sky locations of the observed targets in all three fields, along with the effective exposure times per object. The empty regions in the focal plane correspond to spectrographs that were unavailable during these observations. The data collected came from spectrographs that had not been cooled to their optimal operating temperatures. Figures 17 and 18 show DESI’s redshift measurement success ra…

**Figure 17.** As … [image only; caption truncated]

**Figure 18.** As … [image only; caption truncated]


Deep spectroscopic samples can be used to improve photometric redshift (photo-$z$) estimates and reduce uncertainties on redshift distributions. Such improvements can increase the cosmological constraining power of large imaging-based experiments such as the Vera C. Rubin Observatory's Legacy Survey of Space and Time (LSST) and mitigate what may be a limiting systematic effect. We present results from the "DESI-Deep pilot" program, which was designed to assess the capability of the Dark Energy Spectroscopic Instrument (DESI) on the 4m Mayall telescope to measure redshifts of galaxies as faint as expected lensing samples for early LSST data ($m_i \leq 24.5$). We find that DESI is remarkably efficient at this task, with redshift success rates comparable to the results of observations from 10m-class telescopes with only $\sim2\times$ longer integration time (rather than $\sim 8\times$ longer as would be expected from aperture-area scaling), while simultaneously achieving $\sim30$ times larger multiplexing. We also find that the signal-to-noise ratio of the spectra scales as expected for background-limited observations even for the longest exposure times ($\sim 7$ hours) and faintest targets in the program. These results demonstrate that DESI could provide the definitive redshift sample for the early years of LSST with a modest investment of observing time. Based upon the results of this program, we provide updated predictions for the time required to collect benchmark samples for photo-$z$ training and calibration using a variety of spectroscopic facilities. Finally, we describe a potential "DESI-Deep" survey designed to train and calibrate photo-$z$'s for imaging experiments, and provide forecasts of its impact on cosmological inference.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript reports results from the DESI-Deep pilot program using the DESI instrument on the 4m Mayall telescope to obtain redshifts for faint galaxies (m_i ≤ 24.5) as a test for LSST photometric redshift training and calibration. It claims DESI achieves redshift success rates comparable to 10m-class telescopes with only ~2× longer integration times (versus the ~8× expected from aperture-area scaling) while providing ~30× larger multiplexing, with S/N scaling as expected for background-limited observations even at ~7-hour exposures. The paper then supplies updated time estimates for spectroscopic samples across facilities and forecasts the impact of a proposed DESI-Deep survey on cosmological inference for LSST.

Significance. If the pilot results generalize, this work demonstrates a practical and efficient path to large spectroscopic training samples for LSST photo-z calibration, which could meaningfully reduce a key systematic uncertainty and improve cosmological constraints from imaging surveys. The empirical efficiency comparison and background-limited performance at long exposures are concrete strengths that support the modest-time forecasts.

major comments (2)

[§3 (target selection and sample definition)] The central scaling claims and DESI-Deep forecasts rest on the assumption that the pilot target selection and measured success rates are representative of the full LSST lensing population (in redshift distribution, galaxy types, and spectral features). No quantitative comparison of the pilot sample properties to the expected LSST distribution is provided, which is load-bearing for the linear extrapolation in the time estimates.
[§4 (results and efficiency comparison)] The comparison to 10m-class telescope results (success rates with ~2× vs. ~8× integration time) lacks a table or explicit listing of the reference observations, their exposure times, seeing conditions, and target properties, making it difficult to verify the aperture-scaling deviation.

minor comments (2)

[Figures 2-4] Figure captions should explicitly state the number of objects in each success-rate bin and any cuts applied to the pilot sample.
[§4.2] The notation for integration time scaling (e.g., the factor of ~2×) should be defined consistently with the aperture-area calculation in the text.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their positive assessment of the significance of this work and for the constructive comments. We have revised the manuscript to address the concerns about sample representativeness and the details of the efficiency comparison, as detailed below.


Referee: [§3 (target selection and sample definition)] The central scaling claims and DESI-Deep forecasts rest on the assumption that the pilot target selection and measured success rates are representative of the full LSST lensing population (in redshift distribution, galaxy types, and spectral features). No quantitative comparison of the pilot sample properties to the expected LSST distribution is provided, which is load-bearing for the linear extrapolation in the time estimates.

Authors: We agree that an explicit quantitative comparison strengthens the extrapolation and have added this to the revised manuscript. In §3 we now include a new figure and text comparing the pilot sample's redshift distribution, i-band magnitude distribution, and (g-r, r-i) colors to those of the expected LSST weak-lensing source population drawn from DESC DC2 mock catalogs. The pilot targets were selected to reach the same faint magnitude limit (m_i ≤ 24.5) and to sample a similar color space as LSST lensing galaxies; the added comparison shows substantial overlap in these properties, supporting the use of the measured success rates for the time estimates. We also note the small sample size as a caveat. revision: yes
Referee: [§4 (results and efficiency comparison)] The comparison to 10m-class telescope results (success rates with ~2× vs. ~8× integration time) lacks a table or explicit listing of the reference observations, their exposure times, seeing conditions, and target properties, making it difficult to verify the aperture-scaling deviation.

Authors: We agree that a tabulated summary improves verifiability and have added Table 2 in the revised §4. The table lists each reference observation (primarily the combined DEEP2 and DEEP3 survey data shown in Figures 7 and 8, along with similar programs cited in the paper), with columns for telescope/instrument, target m_i range, total exposure time, median seeing, number of targets observed, redshift success rate, and citation. This allows direct inspection of the ~2× integration-time scaling relative to the naive ~8× area scaling while underscoring DESI's multiplexing advantage. revision: yes

Circularity Check

0 steps flagged

No significant circularity: observational results and direct extrapolations are self-contained


The paper presents new empirical measurements from the DESI-Deep pilot program, including redshift success rates for faint galaxies and signal-to-noise scaling behavior. These are compared against independent expectations from aperture-area scaling and prior 10m-class telescope results. Updated time predictions for full surveys are straightforward linear extrapolations from the measured efficiencies and multiplexing factors, without any fitted parameters defined in terms of the target quantities or self-referential definitions. No load-bearing self-citations, ansatzes smuggled via prior work, or renaming of known results occur; the derivation chain relies on fresh data and standard scaling relations external to the paper's own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on empirical measurements from a pilot observing program and standard scaling arguments for telescope performance; no free parameters are introduced in the abstract.

axioms (1)

domain assumption: Redshift success is determined by the standard emission-line and continuum detection criteria used in the DESI pipeline. Invoked to define the reported success rates.



Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear


We find that DESI is remarkably efficient at this task, with redshift success rates comparable to the results of observations from 10m-class telescopes with only ∼2× longer integration time (rather than ∼8× longer as would be expected from aperture-area scaling), while simultaneously achieving ∼30 times larger multiplexing.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

the signal-to-noise ratio of the spectra scales as expected for background-limited observations even for the longest exposure times (∼7 hours) and faintest targets

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
