arxiv: 1906.11218 · v1 · pith:X2Q4XOSZnew · submitted 2019-06-26 · astro-ph.EP · astro-ph.IM · astro-ph.SR

A Principal Component Analysis-based method to analyse high-resolution spectroscopic data

Pith reviewed 2026-05-25 15:07 UTC · model grok-4.3

classification astro-ph.EP · astro-ph.IM · astro-ph.SR
keywords exoplanet atmospheres · high-resolution spectroscopy · principal component analysis · CRIRES · hot Jupiters · molecular detection · cross-correlation · atmospheric retrieval

The pith

A new pipeline uses Principal Component Analysis on CRIRES time-series spectra to isolate and detect CO and H2O signals in hot Jupiter atmospheres.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces an automatic analysis pipeline for high-resolution near-infrared spectra taken with the CRIRES instrument on the VLT. The method first applies Principal Component Analysis to time-series observations to separate the faint planetary atmospheric contribution from stronger telluric and stellar signals. It then cross-correlates the residuals against synthetic templates generated by the tau-REx code using high-temperature opacities from ExoMol. When applied to existing observations of the hot Jupiters HD209458b and HD189733b, the pipeline recovers the known presence of CO and H2O at levels consistent with prior literature results.

Core claim

The paper claims that a novel application of Principal Component Analysis to CRIRES time-series spectra, followed by cross-correlation with atmospheric models, successfully extracts the molecular signatures of CO and H2O from the atmospheres of HD209458b and HD189733b without manual intervention, yielding detections in agreement with previous studies.

What carries the argument

Principal Component Analysis applied to time-series spectra to isolate the planetary atmospheric signal before cross-correlation with model templates.
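
As a rough illustration of this step, the sketch here decomposes a frames-by-pixels matrix with an SVD and rebuilds it from a chosen range of components, which mirrors the component ranges quoted in the figure captions (for example components 33 to 43). It is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `pca_reconstruct`, the toy telluric line, the airmass ramp and the noise level are all hypothetical.

```python
import numpy as np

def pca_reconstruct(spectra, first, last):
    """Rebuild a (n_frames, n_pixels) matrix from PCA components first..last-1."""
    # Centre each wavelength column so the SVD describes variance about the mean.
    centred = spectra - spectra.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    keep = slice(first, last)
    return (u[:, keep] * s[keep]) @ vt[keep, :]

# Hypothetical toy data: one telluric line that deepens with airmass, plus noise.
rng = np.random.default_rng(0)
n_frames, n_pixels = 40, 200
airmass = np.linspace(1.0, 1.6, n_frames)
pixels = np.arange(n_pixels)
telluric = 1.0 - 0.3 * np.exp(-0.5 * ((pixels - 80) / 3.0) ** 2)
spectra = telluric[None, :] ** airmass[:, None]
spectra += 1e-3 * rng.standard_normal(spectra.shape)

# Drop the two leading components, which carry the airmass-driven variation here.
residual = pca_reconstruct(spectra, 2, n_frames)
```

In this toy case, dropping the two leading components leaves residuals close to the injected noise level. The right component range for real data is a separate choice, and the figure captions suggest it differs between detectors and molecules.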

If this is right

  • The pipeline automates detection of molecular species in CRIRES datasets of hot Jupiters.
  • Results for CO and H2O match those obtained by other techniques on the same targets.
  • The method can be run on additional CRIRES observations to search for the same or other molecules.
  • Consistency with existing literature supports the pipeline for routine atmospheric characterization.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach may reduce observer-dependent choices in component selection across multiple datasets.
  • Similar PCA isolation could be tested on spectra from other high-resolution instruments.
  • Automation opens the possibility of applying the same steps to larger samples of targets observed under comparable conditions.

Load-bearing premise

The pipeline assumes that Principal Component Analysis removes only noise and systematics, leaving the planetary atmospheric signal undistorted for the cross-correlation step.

What would settle it

Absence of a significant cross-correlation peak at the expected planetary radial velocity when testing the pipeline on the well-studied CO and H2O signals in HD209458b or HD189733b.
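
One way to picture this test is to shift each frame's cross-correlation function into the planet rest frame and co-add, so that a real signal piles up at zero lag. The sketch assumes a circular orbit with phase 0 at mid-transit. The names `planet_velocity` and `coadd_in_rest_frame` and the Gaussian toy CCFs are hypothetical; the value Kp = 145.041 km/s is the one quoted for HD209458b in the Figure 9 caption.

```python
import numpy as np

def planet_velocity(phase, kp, v_sys=0.0):
    """Planet radial velocity in km/s for a circular orbit (phase 0 = mid-transit)."""
    return v_sys + kp * np.sin(2.0 * np.pi * phase)

def coadd_in_rest_frame(ccf, lags, phase, kp, v_sys=0.0):
    """Shift each row of ccf (n_frames, n_lags) by the planet velocity and sum."""
    total = np.zeros(lags.shape, dtype=float)
    for row, vp in zip(ccf, planet_velocity(phase, kp, v_sys)):
        # Rest-frame lag v corresponds to observed lag v + vp.
        total += np.interp(lags + vp, lags, row)
    return total

# Toy CCFs (assumed): one Gaussian peak per frame, centred on the planet velocity.
lags = np.arange(-100.0, 101.0, 1.0)      # km/s
phase = np.linspace(-0.02, 0.02, 30)      # in-transit orbital phases
kp_true = 145.041                         # km/s, from the Figure 9 caption
vp = planet_velocity(phase, kp_true)
ccf = np.exp(-0.5 * ((lags[None, :] - vp[:, None]) / 2.0) ** 2)

aligned = coadd_in_rest_frame(ccf, lags, phase, kp_true)
misaligned = coadd_in_rest_frame(ccf, lags, phase, kp=0.0)
peak_lag = lags[np.argmax(aligned)]
```

With the correct Kp the co-added peak sits at zero lag and is much taller than the smeared sum obtained with a wrong Kp. A flat co-added CCF at the expected velocity would be the null result described here.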

Figures

Figures reproduced from arXiv: 1906.11218 by G. Micela, G. Tinetti, M. Damiano.

Figure 1: HD189733b dataset. In (a), the data are shown after calibration, normalisation and spikes correction. In (b), the data are shown after the median has been subtracted from each column. In (c), the results of PCA are shown. In (d), the data are shown after the application of PCA and the injection of the CO model.
Figure 2: HD209458b dataset. In (a), the data are shown after calibration, normalisation and spikes correction. In (b), the data are shown after the median has been subtracted from each column. In (c), the results of PCA are shown. In (d), the data are shown after the application of PCA and the injection of the CO model.
Figure 3: The box colours indicate different classes of action: green boxes represent an external input coming from other models or different sources (e.g. user), the red box includes the external reduction algorithm. Finally, the blue boxes contain the calculations developed for the analysis of the data. enough, we followed instead the procedure described in the literature (Birkby et al. 2013, 2017; Brogi et al. 2…
Figure 4: Left panels: first five eigenvectors of the TDM case. Right panels: first five eigenvectors of the WDM covariance matrix. • wavelength domain matrix (WDM); we transpose the matrix to have the spectra as columns and wavelength bins as rows; In the WDM/TDM case the principal components (eigenvectors) contain the information of the correlations in the wavelength/time domain. We consider, for example, the fir…
Figure 5: Linear relation between the first component of each detector in the time domain and the recorded airmass for the HD189733b dataset. of each principal component for every detector of both datasets are shown in
Figure 6: Detectors’ variances of the PCA decomposition relative to the HD189733b dataset. The first component always carries more than 75% of the information. However, the variance is different for each of the detectors. The green dashed lines indicate the calculated components range relative to the water vapour.
Figure 7: Detectors’ variances of the PCA decomposition relative to the HD209458b dataset. The first component always carries more than 75% of the information. However, the variance is different for each of the detectors. The red dashed lines highlight the determined components range relative to the CO, while the green dashed lines are relative to the H2O.
Figure 8: Top left panel: the four CCF of the four CRIRES’ detectors summed together of the HD209458b dataset. Bottom left panel: same as top but with the model injected. The injection is 1× the synthesised model (Rp/R⋆ ∼ 10⁻³). Top right panel: cross-correlation after changing the reference frame from the Earth to the rest frame of the exoplanet. In this frame the planetary cross-correlation signal is aligned to z…
Figure 9: Cross-correlations of water vapour co-added in-transit for the HD209458b dataset. The injected signal and the planetary signal are still present after using PCA. The co-added CCFs are relative to HD209458b rest frame (Kp = 145.041 km s⁻¹). This graph has been generated considering PCA components from 33 to 43.
Figure 10: Cross-correlations of the planetary signal and of the injected water vapour co-added in-transit for the HD189733b dataset. The co-added CCFs are calculated at the theoretical orbital velocity of the planet HD189733b (Kp = 152.564 km s⁻¹). The CCFs are the result of the combination of the PCA components from 12th to 27th. the planetary spectrum moved across time, we then realigned the single CCFs to the r…
Figure 11: Results for the HD209458b dataset. Top left panel: S/N map for the carbon monoxide. The maximum point is compatible with the planetary orbital parameters. Top right panel: distributions (i.e. in-trail and out-trail) used to compute the Welch’s T-Test. The null hypothesis is rejected with a confidence greater than 7σ. Bottom left panel: S/N map of the water vapour. The peak is compatible with the planetary…
Figure 12: Results for the HD189733b dataset. Top left panel: S/N map for the carbon monoxide. The maximum point is compatible with the result reported in Brogi et al. (2016) but it is not compatible with the expected value. Bottom left panel: S/N map of the water vapour. The peak is compatible with the planetary parameters. Bottom right panel: distribution used to compute the Welch’s T-Test. The null hypothesis is …
Original abstract

High-Resolution Spectroscopy (HRS) has been used to study the composition and dynamics of exoplanetary atmospheres. In particular, the spectrometer CRIRES installed on the ESO-VLT has been used to record high-resolution spectra in the Near-IR of gaseous exoplanets. Here we present a new automatic pipeline to analyze CRIRES data-sets. Said pipeline is based on a novel use of Principal Component Analysis (PCA) and Cross-Correlation Function (CCF). The exoplanetary atmosphere is modeled with the $\tau$-REx code using opacities at high temperature from the ExoMol project. In this work, we tested our analysis tools on the detection of CO and H$_2$O in the atmospheres of the hot-Jupiters HD209458b and HD189733b. The results of our pipeline are in agreement with previous results in the literature and other techniques.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript presents an automatic pipeline for processing CRIRES time-series spectra of exoplanet atmospheres. It applies Principal Component Analysis (PCA) to remove telluric and stellar variance, followed by cross-correlation of the residuals against τ-REx atmospheric models that incorporate high-temperature ExoMol opacities. The pipeline is tested on the detection of CO and H2O in the hot Jupiters HD 209458b and HD 189733b, with the abstract stating that the results agree with previous literature detections.

Significance. An automated, reproducible PCA+CCF pipeline could reduce analysis time for high-resolution spectroscopy datasets if the planetary signal is demonstrably preserved. The current manuscript provides no quantitative metrics of agreement or validation of signal fidelity, so the work does not yet establish a clear methodological advance over existing techniques.

major comments (2)
  1. [Abstract] Abstract: the central claim that 'the results of our pipeline are in agreement with previous results in the literature' is stated without any reported detection significances, CCF peak amplitudes, or direct numerical comparisons to prior studies, rendering the agreement assertion unverifiable from the given information.
  2. [Methods] Methods (implied PCA step): the pipeline relies on the assumption that the time-varying planetary Doppler signal remains orthogonal to the removed principal components and is not partially projected out; no injection-recovery tests, component-selection diagnostics, or covariance analysis are described that would confirm this preservation, which is load-bearing for any detection claim.
minor comments (1)
  1. [Abstract] Abstract: the phrase 'novel use of Principal Component Analysis' is used without specifying the technical novelty relative to prior PCA applications in HRS (e.g., component ordering or masking strategy).

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments. We address each major comment below and indicate the revisions planned.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the central claim that 'the results of our pipeline are in agreement with previous results in the literature' is stated without any reported detection significances, CCF peak amplitudes, or direct numerical comparisons to prior studies, rendering the agreement assertion unverifiable from the given information.

    Authors: We agree that the abstract's agreement claim would be strengthened by quantitative metrics. In the revised manuscript we will add the detection significances and CCF peak amplitudes from our pipeline together with direct numerical comparisons to the corresponding literature values for both planets. revision: yes

  2. Referee: [Methods] Methods (implied PCA step): the pipeline relies on the assumption that the time-varying planetary Doppler signal remains orthogonal to the removed principal components and is not partially projected out; no injection-recovery tests, component-selection diagnostics, or covariance analysis are described that would confirm this preservation, which is load-bearing for any detection claim.

    Authors: The assumption that the Doppler-shifted planetary signal is orthogonal to the dominant static components is standard in the PCA literature for high-resolution spectroscopy. Nevertheless, we acknowledge that explicit validation strengthens the work. We will add injection-recovery tests and component-selection diagnostics to the revised methods section. revision: yes
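
For readers unfamiliar with the term, an injection-recovery test of the kind promised here could look roughly like the following. This is a toy sketch under stated assumptions, not the authors' procedure: the single Gaussian line stands in for a model spectrum, and the wavelength grid, Kp of 150 km/s, line positions, noise level and all function names are hypothetical.

```python
import numpy as np

C_KMS = 299_792.458  # speed of light in km/s

def gaussian_line(wave, centre, width=2e-5):
    """Unit-depth Gaussian absorption profile (negative), a stand-in template."""
    return -np.exp(-0.5 * ((wave - centre) / width) ** 2)

def remove_leading_components(data, n_remove):
    """Subtract column means, then zero the n_remove largest principal components."""
    centred = data - data.mean(axis=0, keepdims=True)
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    s = s.copy()
    s[:n_remove] = 0.0
    return (u * s) @ vt

def injection_recovery(depth, n_remove=1, seed=0):
    """Return (peak lag in km/s, peak value) of the co-added rest-frame CCF."""
    rng = np.random.default_rng(seed)
    wave = np.linspace(2.300, 2.310, 512)           # micron (assumed grid)
    phase = np.linspace(-0.02, 0.02, 40)
    vp = 150.0 * np.sin(2.0 * np.pi * phase)        # assumed Kp of 150 km/s
    airmass = np.linspace(1.0, 1.5, phase.size)
    line0 = 2.305                                   # rest wavelength of toy line

    # Telluric line fixed in wavelength, with depth growing with airmass.
    telluric = 1.0 + 0.4 * gaussian_line(wave, 2.3035, width=5e-5)
    data = telluric[None, :] ** airmass[:, None]
    # Inject the planet line, Doppler shifted frame by frame.
    for i, v in enumerate(vp):
        data[i] *= 1.0 + depth * gaussian_line(wave, line0 * (1.0 + v / C_KMS))
    data += 2e-3 * rng.standard_normal(data.shape)

    cleaned = remove_leading_components(data, n_remove)

    # Cross-correlate against the template on a grid of rest-frame lags.
    lags = np.arange(-60.0, 61.0, 3.0)              # km/s
    ccf = np.zeros(lags.size)
    for j, lag in enumerate(lags):
        for i, v in enumerate(vp):
            shifted = gaussian_line(wave, line0 * (1.0 + (v + lag) / C_KMS))
            ccf[j] += np.dot(cleaned[i], shifted)
    k = int(np.argmax(ccf))
    return lags[k], ccf[k]

lag_inj, peak_inj = injection_recovery(depth=0.01)
lag_null, peak_null = injection_recovery(depth=0.0)
```

Comparing the injected run with the null run (same noise seed, zero depth) shows whether the signal survives the component removal. Repeating the call with larger `n_remove` is one way to probe how much of the injected signal is projected out.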

Circularity Check

0 steps flagged

No significant circularity; pipeline validated against external literature

Full rationale

The paper introduces a PCA+CCF pipeline for CRIRES spectra and reports detections of CO/H2O on HD209458b and HD189733b that match prior independent results. No equations, fitted parameters, or self-citations are shown that reduce the reported signals to quantities defined from the same data by construction. The central claim rests on external agreement rather than internal re-derivation.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

Abstract-only review; the pipeline depends on standard domain assumptions about data decomposition and atmospheric modeling whose details are not provided.

free parameters (1)
  • Number of PCA components retained
    Must be chosen to separate planetary signal from noise; value not stated in abstract.
axioms (1)
  • Domain assumption: the tau-REx atmospheric models with ExoMol opacities provide a sufficiently accurate template for cross-correlation with the observed spectra.
    Invoked when the pipeline cross-correlates data against the models to detect CO and H2O.
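
A common diagnostic when choosing how many components to treat as systematics is the fraction of variance each one carries, which is the quantity the Figure 6 and Figure 7 captions describe (first component above 75%). The sketch computes it from singular values on hypothetical toy data; `explained_variance_ratio` and the 99% threshold are illustrative assumptions, not a criterion the paper is stated to use.

```python
import numpy as np

def explained_variance_ratio(data):
    """Fraction of total variance carried by each principal component."""
    centred = data - data.mean(axis=0, keepdims=True)
    # Singular values come back sorted in decreasing order.
    s = np.linalg.svd(centred, compute_uv=False)
    variance = s ** 2
    return variance / variance.sum()

# Hypothetical toy data: absorption scaling linearly with airmass, plus noise.
rng = np.random.default_rng(1)
n_frames, n_pixels = 40, 200
airmass = np.linspace(1.0, 1.6, n_frames)
absorption = 0.3 * np.exp(-0.5 * ((np.arange(n_pixels) - 80) / 3.0) ** 2)
data = 1.0 - np.outer(airmass, absorption)
data += 1e-3 * rng.standard_normal(data.shape)

ratio = explained_variance_ratio(data)
cumulative = np.cumsum(ratio)
# Smallest number of components whose cumulative variance reaches 99%.
n_for_99 = int(np.searchsorted(cumulative, 0.99) + 1)
```

A variance threshold alone does not guarantee that the planetary signal is untouched, so a diagnostic like this would normally be paired with an injection test.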

Reference graph

Works this paper leans on

40 extracted references · 40 canonical work pages · 1 internal anchor

  1. Agol, E., Cowan, N. B., Knutson, H. A., et al. 2010, ApJ, 721, 1861, doi: 10.1088/0004-637X/721/2/1861
     Artigau, É., Astudillo-Defru, N., Delfosse, X., et al. 2014, in Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, Vol. 9149, Observatory Operations: Strategies, Processes, and Systems V, 914905
  2. Birkby, J. L. 2018, arXiv e-prints, arXiv:1806.04617. https://arxiv.org/abs/1806.04617
  3. Birkby, J. L., de Kok, R. J., Brogi, M., et al. 2013, MNRAS, 436, L35, doi: 10.1093/mnrasl/slt107
  4. Snellen, I. A. G. 2017, AJ, 153, 138, doi: 10.3847/1538-3881/aa5c87
  5. Bouchy, F., Udry, S., Mayor, M., et al. 2005, A&A, 444, L15, doi: 10.1051/0004-6361:200500201
  6. Bracewell, R. 1965, The Fourier Transform and Its Applications (New York: McGraw-Hill), 46 and 243
  7. Brogi, M., de Kok, R. J., Albrecht, S., et al. 2016, ApJ, 817, 106, doi: 10.3847/0004-637X/817/2/106
  8. Snellen, I. A. G. 2014, A&A, 565, A124, doi: 10.1051/0004-6361/201423537
  9. Brogi, M., Snellen, I. A. G., de Kok, R. J., et al. 2013, ApJ, 767, 27, doi: 10.1088/0004-637X/767/1/27
  10. Charbonneau, D., Brown, T. M., Noyes, R. W., & Gilliland, R. L. 2002, ApJ, 568, 377, doi: 10.1086/338770
  11. Tinetti, G. 2017, AJ, 154, 39, doi: 10.3847/1538-3881/aa738b
      de Kok, R. J., Brogi, M., Snellen, I. A. G., et al. 2013, A&A, 554, A82, doi: 10.1051/0004-6361/201321381
  12. Follert, R., Dorn, R. J., Oliva, E., et al. 2014, in Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, Vol. 9147, Ground-based and Airborne Instrumentation for Astronomy V, 914719
  13. Fraine, J., Deming, D., Benneke, B., et al. 2014, Nature, 513, 526, doi: 10.1038/nature13785
  14. Grillmair, C. J., Burrows, A., Charbonneau, D., et al. 2008, Nature, 456, 767, doi: 10.1038/nature07574
  15. Horne, K. 1986, Publications of the Astronomical Society of the Pacific, 98, 609, doi: 10.1086/131801
      Jolliffe, I. T. 2002, Principal component analysis
  16. Kipping, D. M. 2010, MNRAS, 407, 301, doi: 10.1111/j.1365-2966.2010.16894.x
  17. Knutson, H. A., Charbonneau, D., Noyes, R. W., Brown, T. M., & Gilliland, R. L. 2007, ApJ, 655, 564, doi: 10.1086/510111
  18. Kobayashi, N., Tokunaga, A. T., Terada, H., et al. 2000, in Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, Vol. 4008, Optical and IR Telescope Instrumentation and Detectors, ed. M. Iye & A. F. Moorwood, 1056–1066
  19. Linsky, J. L., Yang, H., France, K., et al. 2010, ApJ, 717, 1291, doi: 10.1088/0004-637X/717/2/1291
  20. Mazeh, T., Naef, D., Torres, G., et al. 2000, ApJ, 532, L55, doi: 10.1086/312558
  21. Nugroho, S. K., Kawahara, H., Masuda, K., et al. 2017, AJ, 154, 221, doi: 10.3847/1538-3881/aa9433
  22. Oliva, E., Origlia, L., Maiolino, R., et al. 2012, in Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, Vol. 8446, Ground-based and Airborne Instrumentation for Astronomy IV, 84463T
  23. Piskorz, D., Benneke, B., Crockett, N. R., et al. 2016, ApJ, 832, 131, doi: 10.3847/0004-637X/832/2/131
      Piskorz, D., Benneke, B., Crockett, N. R., et al. 2017, AJ, 154, 78, doi: 10.3847/1538-3881/aa7dd8
  24. Quirrenbach, A., Amado, P. J., Caballero, J. A., et al. 2014, in Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, Vol. 9147, Ground-based and Airborne Instrumentation for Astronomy V, 91471F
      Redfield, S., Endl, M., Cochran, W. D., & Koesterke, L. 2008, ApJ, 673, L87, doi: 10.1086/527475
  25. Ridden-Harper, A. R., Snellen, I. A. G., Keller, C. U., et al. 2016, A&A, 593, A129, doi: 10.1051/0004-6361/201628448
  26. Seager, S., & Mallén-Ornelas, G. 2003, ApJ, 585, 1038, doi: 10.1086/346105
  27. Sing, D. K., Fortney, J. J., Nikolov, N., et al. 2016, Nature, 529, 59, doi: 10.1038/nature16068
  28. Albrecht, S. 2010, Nature, 465, 1049, doi: 10.1038/nature09111
  29. Tamuz, O., Mazeh, T., & Zucker, S. 2005, MNRAS, 356, 1466, doi: 10.1111/j.1365-2966.2004.08585.x
  30. Tennyson, J., & Yurchenko, S. N. 2012, MNRAS, 425, 21, doi: 10.1111/j.1365-2966.2012.21440.x
  31. Tennyson, J., Yurchenko, S. N., Al-Refaie, A. F., et al. 2016, Journal of Molecular Spectroscopy, 327, 73, doi: 10.1016/j.jms.2016.05.002
  32. Tinetti, G., Vidal-Madjar, A., Liang, M.-C., et al. 2007, Nature, 448, 169, doi: 10.1038/nature06002
  33. Torres, G., Winn, J. N., & Holman, M. J. 2008, ApJ, 677, 1324, doi: 10.1086/529429
  34. Triaud, A. H. M. J., Queloz, D., Bouchy, F., et al. 2009, A&A, 506, 377, doi: 10.1051/0004-6361/200911897
  35. Tsiaras, A., Waldmann, I. P., Rocchetto, M., et al. 2016a, ApJ, 832, 202, doi: 10.3847/0004-637X/832/2/202
  36. Tsiaras, A., Rocchetto, M., Waldmann, I. P., et al. 2016b, ApJ, 820, 99, doi: 10.3847/0004-637X/820/2/99
  37. Tsiaras, A., Waldmann, I. P., Zingales, T., et al. 2018, AJ, 155, 156, doi: 10.3847/1538-3881/aaaf75
  38. Venot, O., Hébrard, E., Agúndez, M., et al. 2012, A&A, 546, A43, doi: 10.1051/0004-6361/201219310
  39. Waldmann, I. P., Rocchetto, M., Tinetti, G., et al. 2015a, ApJ, 813, 13, doi: 10.1088/0004-637X/813/1/13
  40. Waldmann, I. P., Tinetti, G., Rocchetto, M., et al. 2015b, ApJ, 802, 107, doi: 10.1088/0004-637X/802/2/107