Lya2pcf: an efficient pipeline to estimate two- and three-point correlation functions of the Lyman-α forest
Pith reviewed 2026-05-19 06:52 UTC · model grok-4.3
The pith
Lya2pcf pipeline extends two-point estimators to measure anisotropic three-point correlations in Lyman-alpha forest data.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Lya2pcf implements the standard algorithms used in current surveys for the two-point correlation function together with its distortion matrix and covariance matrices, and naturally extends the two-point estimator to three-point correlations. GPU optimization yields substantial speed-ups relative to PICCA for both the two-point function and distortion matrix. Application to SDSS DR16 data and a DESI Y5 mock dataset demonstrates overall performance gains and delivers the first measurement of the anisotropic three-point correlation function on a large spectroscopic sample for all possible triangles with scales up to 80 Mpc/h, with signal-to-noise above one for many triangle configurations.
What carries the argument
The Lya2pcf pipeline, which directly extends standard two-point correlation estimators to three-point correlations while incorporating GPU acceleration for the distortion matrix and covariance calculations.
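As a rough illustration of what "directly extends" can mean here, the sketch below writes the weighted product estimator commonly used for Lyman-α flux fluctuations, xi_A = sum(w_i w_j delta_i delta_j) / sum(w_i w_j) over pixel pairs in bin A, next to its triplet analogue. The function names, the toy arrays, and the assumption that pairs and triplets have already been assigned to a bin are illustrative and are not taken from Lya2pcf; the distortion-matrix and covariance steps are omitted.

```python
import numpy as np

def xi_bin(delta, weight, pairs):
    """Weighted two-point estimate for one bin.

    `pairs` is a list of (i, j) pixel indices assumed to fall in the bin.
    """
    i, j = np.asarray(pairs).T
    w = weight[i] * weight[j]
    return np.sum(w * delta[i] * delta[j]) / np.sum(w)

def zeta_bin(delta, weight, triplets):
    """Triplet analogue: same structure with one extra pixel per term.

    `triplets` is a list of (i, j, k) pixel indices assumed to fall in
    one triangle configuration.
    """
    i, j, k = np.asarray(triplets).T
    w = weight[i] * weight[j] * weight[k]
    return np.sum(w * delta[i] * delta[j] * delta[k]) / np.sum(w)

# Toy flux fluctuations and weights (made-up numbers, not survey data).
delta = np.array([1.0, 2.0, 3.0])
weight = np.array([1.0, 1.0, 2.0])

xi_toy = xi_bin(delta, weight, [(0, 1), (0, 2), (1, 2)])
zeta_toy = zeta_bin(delta, weight, [(0, 1, 2)])
```

In a real analysis the index lists would come from a neighbour search over sightlines, which is the expensive step that GPU parallelism targets.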
If this is right
- Three-point statistics become computationally viable for inclusion in cosmological inference analyses with existing and upcoming large datasets.
- The measured signal-to-noise ratios above one for many triangle configurations support adding higher-order correlations to future analyses.
- Performance gains over PICCA, especially on GPUs, allow processing of the data volumes expected from Stage IV spectroscopic surveys.
- The first anisotropic three-point measurements on scales up to 80 Mpc/h demonstrate feasibility for constraining small-scale physics at high redshift.
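One common way to define the per-configuration signal-to-noise mentioned in the list above is |zeta_A| / sqrt(C_AA), with C the covariance matrix of the measurement. This page does not state which definition the paper uses, so the snippet below is only a sketch under that assumption, with made-up numbers and a hypothetical helper name.

```python
import numpy as np

def snr_per_bin(zeta, cov):
    """Return |zeta_A| / sqrt(C_AA) for every configuration A.

    Assumes the diagonal of `cov` holds the variance of each bin;
    off-diagonal correlations are ignored in this simple definition.
    """
    zeta = np.asarray(zeta, dtype=float)
    sigma = np.sqrt(np.diag(np.asarray(cov, dtype=float)))
    return np.abs(zeta) / sigma

# Toy measurement vector and diagonal covariance (illustrative only).
zeta_meas = [3.0, -1.0, 0.5]
cov_meas = np.diag([4.0, 4.0, 0.04])

snr = snr_per_bin(zeta_meas, cov_meas)
n_above_one = int(np.sum(snr > 1.0))
```

A count such as `n_above_one` is the kind of summary behind a statement like "S/N above one for many triangle configurations", although the exact figure depends on the binning and on how the covariance is estimated.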
Where Pith is reading between the lines
- Incorporating three-point statistics could provide additional leverage on matter clustering beyond what two-point functions alone supply.
- The same extension approach might be tested on other high-redshift tracers to broaden the range of available statistics.
- GPU-based implementations suggest that similar pipelines could scale to even larger volumes from next-generation surveys.
Load-bearing premise
That the standard two-point algorithms can be extended to three-point correlations while preserving accuracy and without introducing unquantified biases from the Lyman-alpha forest properties or survey geometry.
What would settle it
Running Lya2pcf on the same SDSS DR16 dataset and obtaining three-point correlation function values or signal-to-noise ratios that differ substantially from independent calculations performed with another code would indicate that the extension introduces biases.
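One way to make "differ substantially" concrete is to express the bin-by-bin difference between two codes in units of the statistical uncertainty. Because both codes would be run on the same data, the difference should sit far below one sigma if the implementations agree. The helper name, the 0.1-sigma threshold, and the numbers below are hypothetical choices for illustration.

```python
import numpy as np

def max_shift_in_sigma(meas_a, meas_b, sigma):
    """Largest absolute difference between two measurements, in units of sigma."""
    a, b, s = (np.asarray(x, dtype=float) for x in (meas_a, meas_b, sigma))
    return float(np.max(np.abs(a - b) / s))

# Toy outputs from two hypothetical codes on the same bins.
code_a = [1.0, 2.0]
code_b = [1.1, 1.8]
sigma_stat = [0.5, 0.4]

shift = max_shift_in_sigma(code_a, code_b, sigma_stat)
consistent = shift < 0.1  # assumed tolerance, not a value from the paper
```

Here the largest shift is about half a sigma, so the toy comparison would be flagged as inconsistent under the assumed tolerance.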
Original abstract
Studying the matter distribution in the universe through the Lyman-$\alpha$ forest allows us to constrain small-scale physics in the high-redshift regime. Spectroscopic quasar surveys are generating increasingly large datasets that require efficient algorithms to compute correlation functions. Moreover, cosmological analyses based on Lyman-$\alpha$ forests can significantly benefit from incorporating higher-order statistics alongside traditional two-point correlations. In this work, we present Lya2pcf, a pipeline designed to compute three-dimensional two-point and three-point correlation functions using Lyman-$\alpha$ forest data. The code implements standard algorithms widely used in current spectroscopic surveys for computing the two-point correlation function with its distortion matrix and covariance matrices, and it naturally extends the two-point estimator to three-point correlations. Thanks to GPU optimization, Lya2pcf achieves a substantial reduction in computational time for both the two-point correlation function and its distortion matrix when compared to the widely used PICCA code. We apply Lya2pcf to data from the Sloan Digital Sky Survey (SDSS) sixteenth data release (DR16) and a Dark Energy Spectroscopic Instrument Year-5 (DESI Y5) mock dataset, demonstrating overall performance gains over PICCA, especially on GPUs. We show the first measurement of the anisotropic three-point correlation function on a large spectroscopic sample for all possible triangles with scales up to 80 Mpc/h. The estimator's fast computation and the resulting signal-to-noise ratio -- above one for many triangle configurations -- demonstrate the viability of incorporating three-point statistics into future cosmological inference analyses, particularly with the larger datasets expected from Stage IV spectroscopic surveys.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces Lya2pcf, a GPU-optimized pipeline that implements standard algorithms for the three-dimensional two-point correlation function (2PCF) including its distortion matrix and covariance, then extends these to the three-point correlation function (3PCF). The code is applied to SDSS DR16 Lyman-α forest data and DESI Year-5 mocks, with reported speed-ups relative to PICCA; the central result is the first anisotropic 3PCF measurement on a large spectroscopic sample for all triangle configurations with scales up to 80 Mpc/h, where the signal-to-noise ratio exceeds one for many configurations, supporting the viability of three-point statistics for future cosmological analyses.
Significance. If the accuracy of the 3PCF extension is confirmed, the work supplies a practical, scalable tool that could enable inclusion of higher-order statistics in Lyman-α forest analyses. This would be valuable for tightening constraints on small-scale physics at high redshift with Stage-IV datasets, and the reported performance gains plus the demonstration of measurable anisotropic 3PCF signal constitute a concrete step toward that goal.
major comments (2)
- [Results section] 3PCF measurement: The manuscript asserts that the direct extension of the 2PCF estimator (with distortion matrix) to the 3PCF preserves accuracy, yet provides no quantitative validation, such as recovery tests on mocks with known input 3PCF, a comparison of the measured 2PCF against published PICCA results on the same DR16 sample, or an explicit error budget for continuum-fitting and metal-line systematics in the three-point estimator. This is load-bearing for the claim that S/N > 1 for many triangles demonstrates viability.
- [Method section] Estimator extension: The text states that the 3PCF extension is 'natural' but does not specify how the line-of-sight distortion matrix or survey geometry corrections are generalized from 2PCF pair counting to the triplet-counting case, nor does it quantify any residual bias introduced by the Lyman-α forest's redshift-space distortions or continuum estimation. A concrete demonstration (e.g., an explicit equation for the 3PCF estimator or a table of bias values on mocks) is needed to support the central claim.
minor comments (2)
- [Abstract] The abstract and introduction would benefit from a brief statement of the precise triangle binning scheme (e.g., side lengths and angles) used for the anisotropic 3PCF measurement.
- [Results section] Figure captions for the 3PCF results should explicitly state the number of triangles per configuration and the precise S/N definition employed.
Simulated Author's Rebuttal
We thank the referee for their careful and constructive review of our manuscript. The comments highlight important areas where additional detail and validation would strengthen the presentation of the 3PCF results. We address each major comment below and have revised the manuscript to incorporate the requested quantitative tests, explicit estimator equations, and systematic error discussion.
- Referee: [Results section] 3PCF measurement: The manuscript asserts that the direct extension of the 2PCF estimator (with distortion matrix) to the 3PCF preserves accuracy, yet provides no quantitative validation, such as recovery tests on mocks with known input 3PCF, a comparison of the measured 2PCF against published PICCA results on the same DR16 sample, or an explicit error budget for continuum-fitting and metal-line systematics in the three-point estimator. This is load-bearing for the claim that S/N > 1 for many triangles demonstrates viability.
  Authors: We agree that explicit validation is necessary to support the accuracy claim. The original manuscript emphasized the pipeline implementation and the first large-sample measurement, but did not include dedicated recovery tests for the 3PCF. In the revised version we have added a new subsection with recovery tests on the DESI Y5 mocks, demonstrating that the input 3PCF is recovered within statistical errors for the triangle configurations considered. We have also included a direct comparison of our 2PCF measurements on the SDSS DR16 sample against published PICCA results on the same data, showing consistency at the percent level. Finally, we have expanded the discussion of systematics to provide an error budget for continuum-fitting and metal-line contamination in the 3PCF estimator, derived from the same mock suite.
  Revision: yes
- Referee: [Method section] Estimator extension: The text states that the 3PCF extension is 'natural' but does not specify how the line-of-sight distortion matrix or survey geometry corrections are generalized from 2PCF pair counting to the triplet-counting case, nor does it quantify any residual bias introduced by the Lyman-α forest's redshift-space distortions or continuum estimation. A concrete demonstration (e.g., an explicit equation for the 3PCF estimator or a table of bias values on mocks) is needed to support the central claim.
  Authors: We acknowledge that the manuscript would benefit from a more explicit description of the generalization. The 3PCF estimator extends the 2PCF by replacing pair counts with triplet counts while applying the distortion matrix to each of the three line-of-sight pairs within a given triangle; survey geometry corrections are implemented via random triplet catalogs generated consistently with the data. We have now inserted the full mathematical expression for the distortion-matrix-corrected 3PCF estimator as a new equation in the Methods section. In addition, we have added a table reporting the residual bias measured on mocks due to redshift-space distortions and continuum estimation; these biases remain subdominant to the statistical uncertainties for the scales up to 80 Mpc/h where S/N > 1 is reported.
  Revision: yes
Circularity Check
No significant circularity in pipeline implementation
The paper describes a computational pipeline (Lya2pcf) that implements standard 2PCF algorithms with distortion matrix and covariance, then extends them to 3PCF estimation. The central results are performance benchmarks against PICCA on SDSS DR16 and DESI Y5 mocks, plus the first reported anisotropic 3PCF measurement on large samples with S/N >1 for many triangles. No derivations, first-principles predictions, or fitted parameters are presented that reduce by construction to quantities derived from the same dataset. The extension is described as 'natural' without load-bearing self-citations or ansatzes that presuppose the target measurement. The work is self-contained as an implementation and demonstration study against external benchmarks and mocks.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking (D=3 forcing) · unclear
  unclear: Relation between the paper passage and the cited Recognition theorem.
  Paper passage: "naturally extends the two-point estimator to three-point correlations... anisotropic three-point correlation function... five independent parameters (r1,r2,theta1,theta2,alpha)"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.