High-fidelity stellar extinction with Gaia and APOGEE -- I. The method and a new extinction curve

Dennis Stello; Giacomo Cordoni; Haibo Yuan; Hiep Nguyen; Ioana Ciucă; Jie Yu; John A. Taylor; Luca Casagrande; Ronald Drimmel; Shourya Khanna

arxiv: 2601.10595 · v2 · pith:4H6CYHVCnew · submitted 2026-01-15 · 🌌 astro-ph.SR · astro-ph.GA

Jie Yu, Luca Casagrande, John A. Taylor, Ioana Ciucă, Giacomo Cordoni, Ronald Drimmel, Shourya Khanna, Hiep Nguyen, Tomasz Różański, Dennis Stello, Haibo Yuan, Zhen Yuan

Pith reviewed 2026-05-16 13:59 UTC · model grok-4.3

keywords stellar extinction · reddening · Gaia · APOGEE · XGBoost · extinction curve · red clump · Milky Way

The pith

Stellar extinction for APOGEE stars is measured to a typical precision of 0.03 mag in Av by training an XGBoost model on minimally reddened stars and anchoring the extinction scale to the ratio A_BP / A_RP = 1.694.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a method for obtaining high-fidelity extinction values. An XGBoost model is first trained on stars with negligible dust to predict intrinsic colors from effective temperature, surface gravity, metallicity, and alpha-element abundance. These predicted intrinsic colors are then subtracted from the observed colors, taken from Gaia and other photometry, to isolate the reddening. The reddening is scaled to extinction using the fixed ratio A_BP / A_RP = 1.694 measured from red-clump-like stars, and the results are supplied for 39 filters together with a new empirical curve tuned to broadband passbands. The method yields typical uncertainties of 0.03 mag in Av and outperforms prior catalogs. It also shows that monochromatic extinction ratios can differ by up to 30 percent from passband-integrated values at wavelengths longer than 700 nm, so passband-specific coefficients are required for accurate extinction corrections when deriving stellar parameters from Gaia parallaxes.
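The color-excess step described above can be sketched in a few lines. This is a minimal illustration, not the paper's pipeline: a one-feature least-squares line in Teff stands in for the XGBoost regressor so that the sketch needs no external libraries, and the training values, the helper names `fit_line` and `color_excess`, and the target star are all hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares line y = a + b*x (a stand-in for the XGBoost regressor)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def color_excess(observed_color, teff, model):
    """Reddening = observed color minus the predicted intrinsic color."""
    a, b = model
    intrinsic_color = a + b * teff
    return observed_color - intrinsic_color


# Hypothetical minimally reddened training stars: Teff [K] and BP-RP color [mag].
train_teff = [4500.0, 4750.0, 5000.0, 5250.0]
train_color = [1.30, 1.20, 1.10, 1.00]
model = fit_line(train_teff, train_color)

# Hypothetical reddened target star.
e_bp_rp = color_excess(observed_color=1.35, teff=4875.0, model=model)
print(f"E(BP-RP) = {e_bp_rp:.3f} mag")
```

The real model uses four inputs (Teff, log g, [Fe/H], [alpha/Fe]) and also returns uncertainties on the intrinsic colors; the subtraction step is the only part this sketch is meant to show.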

Core claim

We determine reddening by comparing observed colors retrieved from photometric surveys or standardized synthetic magnitudes from Gaia BP/RP spectra to intrinsic colors predicted via an XGBoost model trained on minimally reddened stars using APOGEE stellar parameters. The derived reddening values are converted into extinctions using an anchor ratio of A_BP / A_RP = 1.694, and we provide extinction measurements in 39 filters across 10 photometric systems along with a new empirical extinction curve optimized for broadband passbands. The estimates achieve a typical precision of 0.03 mag in Av and reveal systematic deviations of up to 30 percent between monochromatic and passband-integrated extin

What carries the argument

XGBoost model trained on minimally reddened stars to predict intrinsic colors from Teff, log g, [Fe/H] and [alpha/Fe], combined with the anchor ratio A_BP/A_RP = 1.694 derived from red-clump stars and the resulting passband-specific empirical extinction curve.
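As a worked illustration of how a single anchor ratio fixes the extinction scale, the sketch below assumes the reddening in question is E(BP-RP) = A_BP - A_RP. Combining that definition with A_BP / A_RP = 1.694 gives A_RP = E(BP-RP) / (1.694 - 1). The function name and the input value are hypothetical, and the paper may apply the anchor through its extinction curve rather than in this direct form.

```python
ANCHOR_RATIO = 1.694  # A_BP / A_RP as quoted in the text


def extinctions_from_reddening(e_bp_rp, ratio=ANCHOR_RATIO):
    """Return (A_BP, A_RP) given E(BP-RP) = A_BP - A_RP and A_BP = ratio * A_RP."""
    a_rp = e_bp_rp / (ratio - 1.0)
    a_bp = ratio * a_rp
    return a_bp, a_rp


# Hypothetical star with E(BP-RP) = 0.10 mag.
a_bp, a_rp = extinctions_from_reddening(0.10)
print(f"A_BP = {a_bp:.4f} mag, A_RP = {a_rp:.4f} mag")
```

Because the denominator is ratio - 1, a small error in the anchor ratio is amplified in the derived extinctions, which is one reason the quoted uncertainty on the ratio matters.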

If this is right

Stellar parameters and distances derived from Gaia parallaxes will carry smaller systematic errors when these extinction values replace earlier maps.
Passband-integrated coefficients must replace monochromatic approximations when correcting photometry in filters redward of 700 nm.
Galactic stellar population studies gain a new high-precision extinction catalog that can be downloaded from Zenodo.
Existing maps such as Bayestar19, StarHorse and SEDEX can be cross-checked and potentially recalibrated against this benchmark.
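The difference between monochromatic and passband-integrated coefficients can be illustrated with a toy calculation. Everything in the sketch is an assumption made for illustration: a power-law extinction curve, a flat source spectrum, and a top-hat passband from 700 to 1000 nm. It does not use the paper's curve and is not expected to reproduce the 30 percent figure; it only shows that a broadband coefficient differs from the value at the band centre and drifts with the amount of dust.

```python
import math


def toy_extinction(wavelength_nm, a_ref, ref_nm=550.0, slope=1.5):
    """Hypothetical power-law curve: A(lambda) = a_ref * (lambda / ref)^(-slope)."""
    return a_ref * (wavelength_nm / ref_nm) ** (-slope)


def passband_extinction(a_ref, lo_nm=700.0, hi_nm=1000.0, n=2000):
    """Extinction through a top-hat band for a flat source spectrum.

    A_band = -2.5 log10(mean over the band of 10^(-0.4 A(lambda))),
    evaluated with a midpoint rule on n equal wavelength steps.
    """
    step = (hi_nm - lo_nm) / n
    total = 0.0
    for i in range(n):
        lam = lo_nm + (i + 0.5) * step
        total += 10.0 ** (-0.4 * toy_extinction(lam, a_ref))
    return -2.5 * math.log10(total / n)


def monochromatic_ratio(lo_nm=700.0, hi_nm=1000.0):
    """A(lambda_centre) / A_ref, evaluated at the middle of the band."""
    return toy_extinction(0.5 * (lo_nm + hi_nm), 1.0)


low_dust = passband_extinction(0.1) / 0.1    # coefficient at low extinction
high_dust = passband_extinction(5.0) / 5.0   # coefficient at high extinction
print(f"monochromatic: {monochromatic_ratio():.4f}")
print(f"passband, A_ref=0.1: {low_dust:.4f}")
print(f"passband, A_ref=5.0: {high_dust:.4f}")
```

In this toy setup the passband coefficient falls as the extinction grows, because heavier dust shifts the transmitted light toward the red edge of the band. A realistic calculation would also weight by the stellar spectrum and the true filter response.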

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The method could be extended to other large spectroscopic surveys to produce a uniform extinction framework across the sky.
The 30 percent deviations may alter color-based age or metallicity estimates for red stars observed in broad filters.
Testing the same pipeline on stars in the inner Galaxy or in known dust clouds would reveal whether the new extinction curve varies with environment.

Load-bearing premise

A training set of stars with truly negligible reddening can be identified so that the XGBoost model learns unbiased intrinsic colors.

What would settle it

Independent Av measurements obtained from high-resolution spectroscopy or from stars with precisely known distances in regions of very low dust would show whether the claimed 0.03 mag precision holds or whether systematic offsets appear.

Original abstract

The scarcity of high-fidelity extinction measurements remains a bottleneck in deriving accurate stellar properties from Gaia parallaxes. In this work, we aim to derive precision extinction estimates for APOGEE DR19 stars, establishing a new benchmark for Galactic stellar population studies. We first determine reddening by comparing observed colorsr, etrieved from photometric surveys or standardized synthetic magnitudes from Gaia BP/RP spectra, to intrinsic colors predicted via an XGBoost model. The model is trained on minimally reddened stars to infer intrinsic colors and their associated uncertainties, using APOGEE stellar parameters (Teff, logg, [Fe/H], and [alpha/Fe]). The derived reddening values are then converted into extinctions using an anchor ratio of A_BP / A_RP = 1.694 +/- 0.004, derived from red-clump-like stars. Here, we provide extinction measurements in 39 filters across 10 photometric systems and introduce a new empirical extinction curve optimized for broadband passbands. Our extinction estimates (Av) outperform existing results (Bayestar19, StarHorse, SEDEX), achieving a typical precision of 0.03 mag in Av. Notably, we identify systematic deviations of up to 30% between monochromatic and passband-integrated extinction ratios at wavelengths greater than 700 nm. This result highlights the necessity of adopting passband-specific coefficients when correcting extinction to derive stellar parameters. The derived extinction and reddening data are available to the community for download through Zenodo.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper gives a practical XGBoost pipeline for intrinsic colors plus a new passband extinction curve with 0.03 mag Av precision for APOGEE stars, but the minimally-reddened training set needs explicit zero-point checks.

The main thing to know is that this work delivers extinction values in 39 filters for APOGEE DR19 stars at a claimed 0.03 mag typical precision on Av, along with an empirical curve that shows up to 30% deviations from monochromatic ratios beyond 700 nm. They train XGBoost on APOGEE parameters to predict intrinsic colors, measure reddening against observed photometry, and anchor the scale with an A_BP/A_RP ratio of 1.694 derived from red-clump-like stars in the sample. The public Zenodo release of the catalog is a clear plus for anyone needing consistent corrections across surveys.

Referee Report

2 major / 1 minor

Summary. The manuscript presents a method to derive high-precision stellar extinctions (Av) for APOGEE DR19 stars by training an XGBoost model on minimally reddened stars to predict intrinsic colors from Teff, logg, [Fe/H], and [alpha/Fe], then converting reddening to extinction using an anchor ratio A_BP/A_RP = 1.694 derived from red-clump-like stars. It supplies extinction values in 39 filters across 10 photometric systems, introduces a new empirical extinction curve, claims 0.03 mag typical precision in Av (outperforming Bayestar19, StarHorse, and SEDEX), and reports up to 30% systematic deviations between monochromatic and passband-integrated extinction ratios at wavelengths >700 nm.

Significance. If the training-set validation and anchor-ratio independence can be demonstrated, the work would deliver a valuable high-fidelity extinction catalog for Galactic stellar-population studies and would usefully highlight the practical importance of passband-specific coefficients over monochromatic approximations. The public Zenodo release of the data and curve would further strengthen its utility.

major comments (2)

[Abstract] Abstract / method description: the claim of 0.03 mag precision in Av rests on the XGBoost intrinsic-color model trained on a 'minimally reddened' sample, yet no quantitative verification (e.g., cross-match with Planck 353 GHz or 3D dust maps, or zero-point test on known dust-free clusters) is provided to confirm residual E(B-V) < 0.01 mag; undetected bias would systematically shift all derived Av values.
[Anchor ratio derivation] Anchor-ratio derivation: A_BP/A_RP = 1.694 ± 0.004 is obtained from red-clump-like stars drawn from the same APOGEE+Gaia dataset used for the XGBoost training, creating partial circularity that must be quantified before the reported 30% monochromatic-vs-passband deviations at >700 nm can be considered robust.

minor comments (1)

[Abstract] Abstract contains an obvious typographical error ('colorsr, etrieved' should read 'colors retrieved').

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thorough and constructive report. We address each major comment below and will incorporate revisions to strengthen the validation of the training sample and to quantify the independence of the anchor-ratio derivation. These changes will be reflected in the revised manuscript.

Referee: [Abstract] Abstract / method description: the claim of 0.03 mag precision in Av rests on the XGBoost intrinsic-color model trained on a 'minimally reddened' sample, yet no quantitative verification (e.g., cross-match with Planck 353 GHz or 3D dust maps, or zero-point test on known dust-free clusters) is provided to confirm residual E(B-V) < 0.01 mag; undetected bias would systematically shift all derived Av values.

Authors: We agree that explicit external validation of the training-sample reddening is important for confirming the claimed precision. The selection of minimally reddened stars relied on Gaia photometry and APOGEE parameter quality cuts, but we acknowledge the absence of a direct zero-point test. In the revised manuscript we will add a new subsection (in Section 3) that cross-matches the training sample against Planck 353 GHz dust maps and performs a zero-point check on a set of well-studied, low-reddening open clusters. This will quantify the residual E(B-V) and, if necessary, include a small empirical correction.
revision: yes
Referee: [Anchor ratio derivation] Anchor-ratio derivation: A_BP/A_RP = 1.694 ± 0.004 is obtained from red-clump-like stars drawn from the same APOGEE+Gaia dataset used for the XGBoost training, creating partial circularity that must be quantified before the reported 30% monochromatic-vs-passband deviations at >700 nm can be considered robust.

Authors: We recognize the referee's concern about partial circularity. The red-clump-like subsample was defined by independent photometric and metallicity criteria (CMD position and [Fe/H] range) rather than by the XGBoost features themselves, and the anchor ratio is computed after the intrinsic-color model has been applied. Nevertheless, to demonstrate robustness we will add an explicit overlap analysis and a sensitivity test that derives the anchor ratio from a held-out 20% subset of the red-clump stars. These results will be presented in a new paragraph in Section 4.2; we expect the 30% deviations at >700 nm to remain statistically unchanged.
revision: partial

Circularity Check

2 steps flagged

The anchor ratio, fitted from red-clump stars in the same dataset, and the minimally reddened training set both introduce partial dependence into the Av derivation.

specific steps

fitted input called prediction [Abstract]
"The derived reddening values are then converted into extinctions using an anchor ratio of A_BP / A_RP = 1.694 +/- 0.004, derived from red-clump-like stars."

The numerical anchor value is extracted from red-clump-like stars inside the same APOGEE DR19 sample that supplies all target stars; applying this fitted ratio back to the full set forces consistency by construction for the reddening-to-extinction step.
fitted input called prediction [Abstract]
"The model is trained on minimally reddened stars to infer intrinsic colors and their associated uncertainties, using APOGEE stellar parameters (Teff, logg, [Fe/H], and [alpha/Fe])."

Selection of the 'minimally reddened' training subset occurs within the identical photometric and spectroscopic dataset; any undetected residual reddening in that subset shifts the learned intrinsic-color locus, directly biasing the reddening (and thus Av) inferred for every subsequent star.
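A short worked estimate makes the size of this bias concrete. It assumes a uniform residual reddening \delta, expressed as E(B-V), in the training subset, and uses the conventional diffuse-ISM value R_V = 3.1 purely for illustration; neither the uniformity nor that R_V value is taken from the paper.

```latex
\begin{align}
c_{\mathrm{int}}^{\mathrm{learned}} &= c_{\mathrm{int}}^{\mathrm{true}} + \delta, \\
E^{\mathrm{inferred}} &= c_{\mathrm{obs}} - c_{\mathrm{int}}^{\mathrm{learned}}
                       = E^{\mathrm{true}} - \delta, \\
\Delta A_V &= -R_V\,\delta \approx -3.1 \times 0.01~\mathrm{mag} = -0.031~\mathrm{mag}.
\end{align}
```

Under these assumptions, a residual of 0.01 mag in E(B-V) would shift every derived Av by an amount comparable to the quoted 0.03 mag precision, although it would act as a common offset rather than as star-to-star scatter.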

full rationale

The central pipeline derives reddening via XGBoost intrinsic-color predictions trained on a 'minimally reddened' subset of the APOGEE data, then converts to Av using an A_BP/A_RP anchor ratio also extracted from red-clump-like stars in the identical sample. This matches the 'fitted_input_called_prediction' pattern: both the training locus and the conversion factor are determined internally rather than from fully independent external calibrators. The 0.03 mag precision claim and the 30% monochromatic-vs-passband deviation therefore rest partly on these self-derived quantities. No self-citation chain or explicit self-definition of the final Av is present, so the circularity remains partial and the overall result retains independent content from the XGBoost architecture and cross-survey comparisons.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

Central claim depends on the fitted anchor ratio and the domain assumption that minimally reddened stars can be isolated for training without circular bias.

free parameters (1)

A_BP / A_RP anchor ratio = 1.694
Fitted from red-clump-like stars to convert reddening to extinction; value given as 1.694 +/- 0.004.

axioms (1)

domain assumption: A set of minimally reddened stars can be identified to train the XGBoost model for unbiased intrinsic colors
Invoked to justify the training procedure for predicting intrinsic colors from APOGEE parameters.

pith-pipeline@v0.9.0 · 5836 in / 1227 out tokens · 35908 ms · 2026-05-16T13:59:38.721466+00:00 · methodology
