pith. sign in

arxiv: 2606.13574 · v1 · pith:5S4WV6QJnew · submitted 2026-06-11 · 🌌 astro-ph.EP · astro-ph.IM· astro-ph.SR

AESTRA II: Generative Spectral Modeling of the Sun as a Star for Precise Radial Velocities

Pith reviewed 2026-06-27 05:29 UTC · model grok-4.3

classification 🌌 astro-ph.EP astro-ph.IMastro-ph.SR
keywords radial velocitiesexoplanetsstellar activitygenerative modelingtelluric absorptionSun-as-a-starNEIDEarth analogs
0
0 comments X

The pith

Generative spectral model recovers 238 of 500 injected planets, including 13 below 0.3 m/s, versus 9 for standard methods.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces AESTRA to model Sun-as-a-star spectra by breaking them into stellar line-shape changes, micro-telluric absorption, and continuum shifts using only the observations themselves. After subtracting the telluric and continuum parts, a compact representation of the remaining spectrum is trained to separate activity-induced velocity shifts from true planetary Doppler signals. In 500 injection tests spanning periods of 2.5 to 400 days and amplitudes from 0.1 to 0.7 m/s, the approach recovers 238 planets at a threshold that produces zero false positives, far outperforming cross-correlation-function detrending that finds only 9 and none below 0.5 m/s. This matters for extreme-precision radial velocity searches because it raises the chance of finding Earth analogs whose signals sit below the noise floor set by stellar activity.

Core claim

AESTRA empirically decomposes the spectra into stellar line-shape variability, micro-telluric absorption, and continuum variability without external templates. After removing the learned telluric and continuum components, it trains a low-dimensional representation of the spectrum to infer activity-driven apparent RVs jointly with candidate Doppler signals. In 500 single-planet injection-recovery tests, AESTRA recovers 238 injected planets including 13 with K < 0.3 m s^{-1}, whereas traditional CCF-based activity-indicator detrending recovers 9 planets and none below K = 0.5 m s^{-1}.

What carries the argument

AESTRA's empirical generative decomposition into line-shape variability, micro-telluric absorption, and continuum variability, followed by a low-dimensional spectral representation for joint RV inference.

If this is right

  • Planets with semi-amplitudes between 0.1 and 0.5 m/s become detectable at the same false-positive rate where traditional methods find none.
  • Activity and planetary signals can be inferred together from one low-dimensional model without separate activity indicators.
  • The zero-spurious-detection calibration allows higher true-positive rates while preserving search purity.
  • Reliance on external stellar or atmospheric templates is eliminated for this class of modeling.
  • The method scales to longer time series by continuing to learn components directly from the data.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Application to other stars observed with the same instrument could extend the recovery gain beyond the Sun.
  • The same decomposition might be tested on multi-planet systems to check whether multiple signals remain separable.
  • Combining the low-dimensional representation with existing activity proxies could further lower the detection threshold.
  • Archival spectra from other EPRV spectrographs offer a direct test of whether the performance gain transfers.

Load-bearing premise

The learned decomposition into activity, telluric, and continuum components leaves genuine planetary Doppler signals untouched.

What would settle it

A controlled test in which injected planetary signals below 0.3 m/s are systematically attenuated or removed after the decomposition step would falsify the recovery advantage.

Figures

Figures reproduced from arXiv: 2606.13574 by Joshua N. Winn, Peter Melchior, Quang H. Tran, Sicong Lu, Yan Liang.

Figure 1
Figure 1. Figure 1: Raw spectral variability in NEID solar observations. Top: CCF radial velocities from a representative spectral order versus time, with twelve selected observing segments highlighted (colored). Dashed vertical lines mark the seasonal phases of strong telluric absorption; the gray dashed line indicates the corresponding phase that was not observed during the fire-related shutdown. Middle: Weighted average sp… view at source ↗
Figure 2
Figure 2. Figure 2: Overview of the Æstra workflow. (a) Each echelle order is decomposed into telluric absorption, continuum variability, and stellar line-shape variability using a shared encoder and component-specific decoder branches. The telluric and continuum components are removed, while stellar variability is preserved, yielding per-order cleaned spectra that are concatenated into a cleaned and merged spectrum. (b) The … view at source ↗
Figure 3
Figure 3. Figure 3: Example telluric components extracted by the spectral decomposition model. Gray curves show observed NEID solar spectra, while colored curves show the inferred telluric model, color-coded by the depth of the strongest extracted telluric line. The vertical gray shaded region marks this line, whose depth is used as an empirical proxy of telluric strength. The extracted features remain fixed in Earth-frame wa… view at source ↗
Figure 4
Figure 4. Figure 4: Examples of the spectral decomposition into three components capturing distinct sources of variability. Top: The inferred telluric model consists of narrow absorption features; shaded regions indicate wavelengths identified by the telluric mask. Colors encode the empirical telluric strength, as defined in [PITH_FULL_IMAGE:figures/full_fig_p009_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Correlation between the leading stellar la￾tent component and an independent chromospheric activity tracer. The first principal component of the learned latent representation is strongly correlated with the NEID Ca II H&K index (r = 0.889), despite the Ca II H&K region being excluded from training. This out-of-sample agreement indi￾cates that the latent space captures physically meaningful stellar activity… view at source ↗
Figure 7
Figure 7. Figure 7 [PITH_FULL_IMAGE:figures/full_fig_p015_7.png] view at source ↗
Figure 6
Figure 6. Figure 6: Detection quality score for candidate signals in the injection–recovery experiments. Blue circles mark cor￾rect recoveries. Blue triangles mark cases whose top-ranked candidates match the injected planets but do not pass the de￾tection threshold. Gray circles mark top-ranked candidates that do not pass the detection threshold and do not match the injected signal. The dashed horizontal line indicates the fi… view at source ↗
Figure 8
Figure 8. Figure 8: Detection performance in period–amplitude space for the injection–recovery sample. Each point represents one injected planetary signal. Colored points denote successful detections, with color indicating the fractional period error, and gray points indicate non-detections. Detection thresholds for both methods are calibrated to yield zero spurious detections in this injection–recovery suite. Top: Recovery p… view at source ↗
Figure 9
Figure 9. Figure 9: Detection completeness as a function of in￾jected semi-amplitude K in the injection–recovery experi￾ments. The blue curve shows Æstra with the calibrated de￾tection criterion, the black curve shows the traditional peri￾odogram-based method calibrated to zero false positives, and the gray dashed curve shows the traditional method evalu￾ated at FAP = 0.1%. Shaded regions indicate Wilson bino￾mial confidence … view at source ↗
Figure 10
Figure 10. Figure 10: Corner plot comparing the first two principal components of the learned stellar latent representation with tradi￾tional CCF-based activity indicators, shown separately for the pre-fire (blue) and post-fire (orange) subsets. The two populations occupy partially distinct regions in both latent space and classical indicator space. In particular, the separation between the pre-fire and post-fire data is visib… view at source ↗
read the original abstract

The detection of Earth analogs with extreme-precision radial velocities (EPRVs) is limited by spectral variability from stellar activity, telluric absorption, and instrumental systematics. We apply AESTRA, a generative spectrum modeling framework, to NEID Sun-as-a-star observations. AESTRA empirically decomposes the spectra into stellar line-shape variability, micro-telluric absorption, and continuum variability without external atmospheric or stellar templates. After removing the learned telluric and continuum components, we train a low-dimensional representation of the spectrum to infer activity-driven apparent RVs jointly with candidate Doppler signals. We evaluate the method with 500 single-planet injection-recovery tests spanning periods of 2.5 to 400 days and semi-amplitudes of K = 0.1 to 0.7 m s^-1, calibrating the detection criterion to yield zero spurious detections. At this matched confidence level, AESTRA recovers 238 injected planets, including 13 with K < 0.3 m s^-1, whereas traditional CCF-based activity-indicator detrending recovers 9 planets and none below K = 0.5 m s^-1.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper presents AESTRA II, a generative framework that empirically decomposes NEID Sun-as-a-star spectra into stellar line-shape variability, micro-telluric absorption, and continuum variability without external templates. After removing the telluric and continuum components, a low-dimensional representation is trained on the residuals to jointly infer activity-driven apparent RVs and candidate planetary Doppler signals. Evaluated via 500 single-planet injection-recovery tests (periods 2.5–400 days, K = 0.1–0.7 m s⁻¹) with a detection threshold calibrated to zero false positives, the method recovers 238 injected planets (including 13 with K < 0.3 m s⁻¹), compared to 9 planets (none below K = 0.5 m s⁻¹) using traditional CCF-based activity-indicator detrending.

Significance. If the central performance claims hold after addressing the training-data circularity, the approach could meaningfully advance EPRV sensitivity by reducing the impact of stellar activity and tellurics on low-amplitude signals, offering a data-driven alternative to template-based methods for Sun-like stars.

major comments (3)
  1. [Abstract / generative decomposition description] Abstract and method description: The generative decomposition into line-shape, micro-telluric, and continuum components is trained on the same NEID spectra that contain the 500 injected planetary signals. Any correlation between the injected Doppler shifts and the learned components risks partial absorption of planetary signals into the removed terms or the subsequent activity model, directly undermining the reported recovery counts (238 planets, 13 with K < 0.3 m s⁻¹). An explicit test (e.g., training on uninjected data or measuring recovered K versus injected K in the absence of activity) is required to validate the weakest assumption.
  2. [Injection-recovery tests] Injection-recovery section: The detection criterion is calibrated to produce zero spurious detections, yet the manuscript provides no derivation of the threshold, error propagation details, or data exclusion rules. Without these, the quantitative superiority over CCF methods cannot be independently verified from the 500 tests.
  3. [Low-dimensional representation and joint inference] Low-dimensional representation training: The activity-driven RV inference is performed jointly with candidate signals on residuals from the same dataset used to learn the decomposition. This creates a circularity burden where the model may misattribute or attenuate genuine Doppler signals as activity variability, and no cross-validation or hold-out test separating training spectra from injected signals is described.
minor comments (2)
  1. [Method] Notation for the low-dimensional basis and activity indicators should be defined explicitly with equations rather than descriptive text only.
  2. [Figures] Figure captions for the injection-recovery results should include the exact number of tests per K bin and the precise false-positive calibration procedure.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their thorough review and for highlighting important concerns about potential circularity and methodological transparency. We address each major comment below. Where the comments identify gaps that require additional validation or documentation, we will revise the manuscript accordingly.

read point-by-point responses
  1. Referee: [Abstract / generative decomposition description] Abstract and method description: The generative decomposition into line-shape, micro-telluric, and continuum components is trained on the same NEID spectra that contain the 500 injected planetary signals. Any correlation between the injected Doppler shifts and the learned components risks partial absorption of planetary signals into the removed terms or the subsequent activity model, directly undermining the reported recovery counts (238 planets, 13 with K < 0.3 m s⁻¹). An explicit test (e.g., training on uninjected data or measuring recovered K versus injected K in the absence of activity) is required to validate the weakest assumption.

    Authors: We agree that this is a substantive concern. The decomposition is intended to isolate line-shape variability, micro-tellurics, and continuum effects, with planetary Doppler shifts handled in the subsequent low-dimensional RV inference step. However, because the decomposition was trained on the injected dataset, we cannot rule out partial absorption without further testing. We will add an explicit validation test training the decomposition exclusively on uninjected spectra and then applying the full pipeline to the injected set, reporting the resulting recovery statistics in the revised manuscript. revision: yes

  2. Referee: [Injection-recovery tests] Injection-recovery section: The detection criterion is calibrated to produce zero spurious detections, yet the manuscript provides no derivation of the threshold, error propagation details, or data exclusion rules. Without these, the quantitative superiority over CCF methods cannot be independently verified from the 500 tests.

    Authors: We accept this criticism. The current text states only that the threshold was calibrated to zero false positives but does not supply the derivation, error propagation, or exclusion criteria. In the revised manuscript we will include a dedicated subsection detailing the threshold derivation, the statistical procedure used to ensure zero spurious detections, the error propagation, and any data exclusion rules applied to the 500 tests. revision: yes

  3. Referee: [Low-dimensional representation and joint inference] Low-dimensional representation training: The activity-driven RV inference is performed jointly with candidate signals on residuals from the same dataset used to learn the decomposition. This creates a circularity burden where the model may misattribute or attenuate genuine Doppler signals as activity variability, and no cross-validation or hold-out test separating training spectra from injected signals is described.

    Authors: This point overlaps with the first comment. The joint inference occurs after the telluric/continuum removal step, but the low-dimensional representation is still learned from the same spectra. We will introduce a cross-validation or hold-out protocol that separates the spectra used to train the low-dimensional activity model from those used in the injection-recovery tests and will report the outcome of this test in the revision. revision: yes

Circularity Check

1 steps flagged

Low-dimensional representation trained on spectra containing injected signals, then used to recover those signals

specific steps
  1. fitted input called prediction [Abstract]
    "After removing the learned telluric and continuum components, we train a low-dimensional representation of the spectrum to infer activity-driven apparent RVs jointly with candidate Doppler signals. We evaluate the method with 500 single-planet injection-recovery tests spanning periods of 2.5 to 400 days and semi-amplitudes of K = 0.1 to 0.7 m s^-1, calibrating the detection criterion to yield zero spurious detections. At this matched confidence level, AESTRA recovers 238 injected planets, including 13 with K < 0.3 m s^-1"

    The low-dimensional representation is trained on the identical spectra containing the 500 injected planetary signals. The inference of activity RVs jointly with candidate Doppler signals, followed by counting recoveries from those same data, reduces the 'recovery' counts to a fitted outcome on the training inputs rather than an out-of-sample prediction.

full rationale

The paper's evaluation uses injection-recovery tests on the same NEID spectra from which the generative decomposition and low-dimensional representation are learned. This setup means the reported recovery of 238 planets (including weak ones) occurs after fitting the activity model to data that already embeds the injected Doppler signals, creating moderate circularity in claiming superior separation without independent validation that signals are not partially absorbed. No self-citation chains or definitional reductions are evident from the abstract; the issue is specific to the fitted-input nature of the recovery metric.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities; the method is described as purely empirical and data-driven.

pith-pipeline@v0.9.1-grok · 5756 in / 1243 out tokens · 24858 ms · 2026-06-27T05:29:50.739877+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

31 extracted references · 25 canonical work pages

  1. [1]

    2023, ARA&A, 61, 329, doi: 10.1146/annurev-astro-052920-103508

    Aigrain, S., & Foreman-Mackey, D. 2023, ARA&A, 61, 329, doi: 10.1146/annurev-astro-052920-103508

  2. [2]

    2022, A&A, 666, A196, doi: 10.1051/0004-6361/202243629 Astropy Collaboration, Robitaille, T

    Allart, R., Lovis, C., Faria, J., et al. 2022, A&A, 666, A196, doi: 10.1051/0004-6361/202243629 Astropy Collaboration, Robitaille, T. P., Tollerud, E. J., et al. 2013, A&A, 558, A33, doi: 10.1051/0004-6361/201322068 Astropy Collaboration, Price-Whelan, A. M., Sip˝ ocz, B. M., et al. 2018, AJ, 156, 123, doi: 10.3847/1538-3881/aabc4f Astropy Collaboration, ...

  3. [3]

    W., Foreman-Mackey, D., Montet, B

    Bedell, M., Hogg, D. W., Foreman-Mackey, D., Montet, B. T., & Luger, R. 2019, AJ, 158, 164, doi: 10.3847/1538-3881/ab40a7

  4. [4]

    2014, A&A, 564, A46, doi: 10.1051/0004-6361/201322383

    Bodichon, R. 2014, A&A, 564, A46, doi: 10.1051/0004-6361/201322383

  5. [5]

    , keywords =

    Boisse, I., Bouchy, F., H´ ebrard, G., et al. 2011, A&A, 528, A4, doi: 10.1051/0004-6361/201014354

  6. [6]

    A., Shephard, M

    Clough, S. A., Shephard, M. W., Mlawer, E. J., et al. 2005, JQSRT, 91, 233, doi: 10.1016/j.jqsrt.2004.05.058 Collier Cameron, A., Ford, E. B., Shahaf, S., et al. 2021, MNRAS, 505, 1699, doi: 10.1093/mnras/stab1323

  7. [7]

    2020, A&A, 633, A76, doi: 10.1051/0004-6361/201936548

    Lovis, C. 2020, A&A, 633, A76, doi: 10.1051/0004-6361/201936548

  8. [8]

    Cretignier, M., Dumusque, X., & Pepe, F. 2022, A&A, 659, A68, doi: 10.1051/0004-6361/202142435 23 Figure 10.Corner plot comparing the first two principal components of the learned stellar latent representation with tradi- tional CCF-based activity indicators, shown separately for the pre-fire (blue) and post-fire (orange) subsets. The two populations occu...

  9. [9]

    C., Figueira, P., et al

    Cunha, D., Santos, N. C., Figueira, P., et al. 2014, A&A, 568, A35, doi: 10.1051/0004-6361/201423723 de Beurs, Z. L., Vanderburg, A., Shallue, C. J., et al. 2022, AJ, 164, 49, doi: 10.3847/1538-3881/ac738e

  10. [10]

    Dumusque, X., Boisse, I., & Santos, N. C. 2014, ApJ, 796, 132, doi: 10.1088/0004-637X/796/2/132

  11. [11]

    Monteiro, M. J. P. F. G. 2011, A&A, 525, A140, doi: 10.1051/0004-6361/201014097

  12. [12]

    K., Vaughan, A

    Duncan, D. K., Vaughan, A. H., Wilson, O. C., et al. 1991, ApJS, 76, 383, doi: 10.1086/191572

  13. [13]

    B., Bender, C

    Ford, E. B., Bender, C. F., Blake, C. H., et al. 2024, arXiv e-prints, arXiv:2408.13318, doi: 10.48550/arXiv.2408.13318

  14. [14]

    B., Halverson, S., et al

    Gilbertson, C., Ford, E. B., Halverson, S., et al. 2024, arXiv e-prints, arXiv:2408.17289, doi: 10.48550/arXiv.2408.17289

  15. [15]

    D., Collier Cameron, A., Queloz, D., et al

    Haywood, R. D., Collier Cameron, A., Queloz, D., et al. 2014, MNRAS, 443, 2517, doi: 10.1093/mnras/stu1320

  16. [16]

    A., Collier Cameron, A., Faria, J

    John, A. A., Collier Cameron, A., Faria, J. P., et al. 2023, MNRAS, 525, 1687, doi: 10.1093/mnras/stad2381

  17. [17]

    2016, in Ground-based and airborne instrumentation for astronomy vi, Vol

    Jurgenson, C., Fischer, D., McCracken, T., et al. 2016, in Ground-based and airborne instrumentation for astronomy vi, Vol. 9908, SPIE, 2051–2070

  18. [18]

    1993, A&A, 271, 734

    Lallement, R., Bertin, P., Chassefiere, E., & Scott, N. 1993, A&A, 271, 734

  19. [19]

    Hodgkin, S. T. 2021, MNRAS, 502, 4392, doi: 10.1093/mnras/stab134

  20. [20]

    N., & Melchior, P

    Liang, Y., Winn, J. N., & Melchior, P. 2024, AJ, 167, 23, doi: 10.3847/1538-3881/ad0e01

  21. [21]

    Lin, A. S. J., Monson, A., Mahadevan, S., et al. 2022, AJ, 163, 184, doi: 10.3847/1538-3881/ac5622

  22. [22]

    2023, The Astronomical Journal, 166, 74 NEID SpecSoft Team

    Melchior, P., Liang, Y., Hahn, C., & Goulding, A. 2023, The Astronomical Journal, 166, 74 NEID SpecSoft Team. 2026, NEID RV Eras, https: //neid.ipac.caltech.edu/docs/NEID-DRP/rveras.html OpenAI. 2026, ChatGPT (2026 version), https://chatgpt.com/

  23. [23]

    A., Cristiani, S., Lopez, R

    Pepe, F. A., Cristiani, S., Lopez, R. R., et al. 2010, in Ground-based and Airborne Instrumentation for Astronomy III, Vol. 7735, SPIE, 209–217

  24. [24]

    2015, MNRAS, 452, 2269, doi: 10.1093/mnras/stv1428

    Roberts, S. 2015, MNRAS, 452, 2269, doi: 10.1093/mnras/stv1428

  25. [25]

    2019, Journal of Astronomical Telescopes, Instruments, and Systems, 5, 015003, doi: 10.1117/1.JATIS.5.1.015003

    Robertson, P., Anderson, T., Stefansson, G., et al. 2019, Journal of Astronomical Telescopes, Instruments, and Systems, 5, 015003, doi: 10.1117/1.JATIS.5.1.015003

  26. [26]

    M., & Bouchy, F

    Schmidt, T. M., & Bouchy, F. 2024, MNRAS, 530, 1252, doi: 10.1093/mnras/stae920

  27. [27]

    2016, in Ground-based and airborne instrumentation for astronomy VI, Vol

    Schwab, C., Rakich, A., Gong, Q., et al. 2016, in Ground-based and airborne instrumentation for astronomy VI, Vol. 9908, SPIE, 2220–2225

  28. [28]

    2015, A&A, 576, A77, doi: 10.1051/0004-6361/201423932 St¨ urmer, J., Buchhave, L

    Smette, A., Sana, H., Noll, S., et al. 2015, A&A, 576, A77, doi: 10.1051/0004-6361/201423932 St¨ urmer, J., Buchhave, L. A., Jessen, N. C., et al. 2024, in Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, Vol. 13096, Ground-based and Airborne Instrumentation for Astronomy X, ed. J. J

  29. [29]

    Motohara, & J

    Bryant, K. Motohara, & J. R. D. Vernet, 130968E, doi: 10.1117/12.3019107

  30. [30]

    J., Queloz, D., Baraffe, I., et al

    Thompson, S. J., Queloz, D., Baraffe, I., et al. 2016, in Ground-based and Airborne Instrumentation for Astronomy VI, Vol. 9908, SPIE, 1949–1961

  31. [31]

    2024, A&A, 687, A281, doi: 10.1051/0004-6361/202450022

    Zhao, Y., Dumusque, X., Cretignier, M., et al. 2024, A&A, 687, A281, doi: 10.1051/0004-6361/202450022