High-fidelity stellar extinction with Gaia and APOGEE -- I. The method and a new extinction curve
Pith reviewed 2026-05-16 13:59 UTC · model grok-4.3
The pith
Stellar extinction for APOGEE stars is measured to 0.03 mag precision by training an XGBoost model on minimally reddened stars and anchoring to a Gaia BP/RP ratio of 1.694.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We determine reddening by comparing observed colors retrieved from photometric surveys or standardized synthetic magnitudes from Gaia BP/RP spectra to intrinsic colors predicted via an XGBoost model trained on minimally reddened stars using APOGEE stellar parameters. The derived reddening values are converted into extinctions using an anchor ratio of A_BP / A_RP = 1.694, and we provide extinction measurements in 39 filters across 10 photometric systems along with a new empirical extinction curve optimized for broadband passbands. The estimates achieve a typical precision of 0.03 mag in Av and reveal systematic deviations of up to 30 percent between monochromatic and passband-integrated extinction ratios at wavelengths greater than 700 nm.
What carries the argument
XGBoost model trained on minimally reddened stars to predict intrinsic colors from Teff, log g, [Fe/H] and [alpha/Fe], combined with the anchor ratio A_BP/A_RP = 1.694 derived from red-clump stars and the resulting passband-specific empirical extinction curve.
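The role of the anchor ratio can be illustrated with a short sketch. It assumes only the two definitions E(BP-RP) = A_BP - A_RP and R = A_BP / A_RP, with R = 1.694 taken from the abstract; the function name is hypothetical and this is not the authors' code. Conversion to Av or to other filters needs further coefficients that are not quoted here, so the sketch stops at the two Gaia bands.

```python
ANCHOR_RATIO = 1.694  # A_BP / A_RP, the value quoted in the abstract


def extinction_from_reddening(e_bp_rp, ratio=ANCHOR_RATIO):
    """Return (A_BP, A_RP) from a color excess E(BP-RP).

    Uses E = A_BP - A_RP and ratio = A_BP / A_RP, which together give
    A_RP = E / (ratio - 1) and A_BP = ratio * A_RP.
    """
    if ratio <= 1.0:
        raise ValueError("ratio must exceed 1 for a positive color excess")
    a_rp = e_bp_rp / (ratio - 1.0)
    a_bp = ratio * a_rp
    return a_bp, a_rp


# Illustrative input only: a color excess of 0.1 mag in BP-RP.
a_bp, a_rp = extinction_from_reddening(0.1)
print(f"A_BP = {a_bp:.3f} mag, A_RP = {a_rp:.3f} mag")
```

Because the denominator is R - 1 = 0.694, any error in the color excess is amplified by a factor of about 1.44 in A_RP.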
If this is right
- Stellar parameters and distances derived from Gaia parallaxes will carry smaller systematic errors when these extinction values replace earlier maps.
- Passband-integrated coefficients must replace monochromatic approximations when correcting photometry in filters redward of 700 nm.
- Galactic stellar population studies gain a new high-precision extinction catalog that can be downloaded from Zenodo.
- Existing maps such as Bayestar19, StarHorse and SEDEX can be cross-checked and potentially recalibrated against this benchmark.
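The distinction between monochromatic and passband-integrated extinction in the list above can be sketched numerically. Everything specific in this example is an assumption made for illustration: the power-law extinction curve, the top-hat filter, the flat source spectrum, and the energy-weighted integral are stand-ins, not the paper's curve or any real passband.

```python
import math


def toy_extinction_law(wavelength_nm, a_ref, ref_nm=550.0, alpha=1.5):
    """Hypothetical power-law curve: A(lambda) = a_ref * (lambda / ref)^(-alpha)."""
    return a_ref * (wavelength_nm / ref_nm) ** (-alpha)


def band_extinction(lo_nm, hi_nm, a_ref, n=2001, spectrum=lambda w: 1.0):
    """Extinction integrated over a top-hat passband from lo_nm to hi_nm.

    A_band = -2.5 * log10(sum(F * 10^(-0.4 A)) / sum(F)) on a uniform grid;
    the grid step cancels in the ratio. Energy weighting is assumed here;
    a photon-counting detector would add a factor of lambda to both sums.
    """
    step = (hi_nm - lo_nm) / (n - 1)
    grid = [lo_nm + i * step for i in range(n)]
    unreddened = sum(spectrum(w) for w in grid)
    reddened = sum(
        spectrum(w) * 10 ** (-0.4 * toy_extinction_law(w, a_ref)) for w in grid
    )
    return -2.5 * math.log10(reddened / unreddened)


# Compare the band value with the monochromatic value at the band center.
band = band_extinction(600.0, 1000.0, a_ref=3.0)
mono = toy_extinction_law(800.0, a_ref=3.0)
print(f"passband-integrated: {band:.4f} mag, monochromatic: {mono:.4f} mag")
```

In a sketch like this the two numbers coincide for a very narrow filter and can separate for a broad one, by an amount that depends on the filter shape, the source spectrum, and the total extinction. That dependence is why a single monochromatic coefficient cannot be assumed to stand in for a passband-specific one.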
Where Pith is reading between the lines
- The method could be extended to other large spectroscopic surveys to produce a uniform extinction framework across the sky.
- The 30 percent deviations may alter color-based age or metallicity estimates for red stars observed in broad filters.
- Testing the same pipeline on stars in the inner Galaxy or in known dust clouds would reveal whether the new extinction curve varies with environment.
Load-bearing premise
A training set of stars with truly negligible reddening can be identified so that the XGBoost model learns unbiased intrinsic colors.
What would settle it
Independent Av measurements obtained from high-resolution spectroscopy or from stars with precisely known distances in regions of very low dust would show whether the claimed 0.03 mag precision holds or whether systematic offsets appear.
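A comparison of this kind usually reduces to two numbers: a systematic offset and a scatter. The sketch below is one plausible way to compute them, using the median difference and a scatter scaled from the median absolute deviation; the function name and the input values are hypothetical and do not come from the paper.

```python
import statistics


def offset_and_scatter(av_catalog, av_reference):
    """Return (median offset, robust scatter) of catalog minus reference Av.

    The scatter is 1.4826 * MAD, which matches the standard deviation
    for Gaussian errors while staying insensitive to outliers.
    """
    diffs = [a - b for a, b in zip(av_catalog, av_reference)]
    offset = statistics.median(diffs)
    mad = statistics.median([abs(d - offset) for d in diffs])
    return offset, 1.4826 * mad


# Made-up values, for illustration only.
catalog = [0.10, 0.22, 0.31, 0.43, 0.50]
reference = [0.08, 0.20, 0.30, 0.40, 0.49]
offset, scatter = offset_and_scatter(catalog, reference)
print(f"offset = {offset:.3f} mag, scatter = {scatter:.3f} mag")
```

An offset significantly different from zero would point to a zero-point problem, while a scatter well above 0.03 mag would contradict the claimed precision, provided the reference values are themselves more precise than that.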
Original abstract
The scarcity of high-fidelity extinction measurements remains a bottleneck in deriving accurate stellar properties from Gaia parallaxes. In this work, we aim to derive precision extinction estimates for APOGEE DR19 stars, establishing a new benchmark for Galactic stellar population studies. We first determine reddening by comparing observed colorsr, etrieved from photometric surveys or standardized synthetic magnitudes from Gaia BP/RP spectra, to intrinsic colors predicted via an XGBoost model. The model is trained on minimally reddened stars to infer intrinsic colors and their associated uncertainties, using APOGEE stellar parameters (Teff, logg, [Fe/H], and [alpha/Fe]). The derived reddening values are then converted into extinctions using an anchor ratio of A_BP / A_RP = 1.694 +/- 0.004, derived from red-clump-like stars. Here, we provide extinction measurements in 39 filters across 10 photometric systems and introduce a new empirical extinction curve optimized for broadband passbands. Our extinction estimates (Av) outperform existing results (Bayestar19, StarHorse, SEDEX), achieving a typical precision of 0.03 mag in Av. Notably, we identify systematic deviations of up to 30% between monochromatic and passband-integrated extinction ratios at wavelengths greater than 700 nm. This result highlights the necessity of adopting passband-specific coefficients when correcting extinction to derive stellar parameters. The derived extinction and reddening data are available to the community for download through Zenodo.
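The reddening step described in the abstract amounts to a subtraction with error propagation. A minimal sketch follows, assuming the intrinsic color and its uncertainty have already been predicted by the trained model and that the observed and intrinsic errors are independent; the function name and the numbers are hypothetical.

```python
import math


def reddening(observed_color, observed_err, intrinsic_color, intrinsic_err):
    """Return (E, sigma_E) for one star and one color index.

    E = observed - intrinsic. The intrinsic color would come from the
    trained model; here it is passed in as a plain number.
    """
    excess = observed_color - intrinsic_color
    # Assumption: the two error terms are independent, so they add in quadrature.
    sigma = math.hypot(observed_err, intrinsic_err)
    return excess, sigma


# Illustrative values only, not taken from the catalog.
e, sigma_e = reddening(1.50, 0.03, 1.20, 0.04)
print(f"E = {e:.2f} +/- {sigma_e:.2f} mag")
```

The quadrature sum shows why the uncertainty of the intrinsic-color model matters: the reddening can never be more precise than the model prediction it is compared against.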
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents a method to derive high-precision stellar extinctions (Av) for APOGEE DR19 stars by training an XGBoost model on minimally reddened stars to predict intrinsic colors from Teff, logg, [Fe/H], and [alpha/Fe], then converting reddening to extinction using an anchor ratio A_BP/A_RP = 1.694 derived from red-clump-like stars. It supplies extinction values in 39 filters across 10 photometric systems, introduces a new empirical extinction curve, claims 0.03 mag typical precision in Av (outperforming Bayestar19, StarHorse, and SEDEX), and reports up to 30% systematic deviations between monochromatic and passband-integrated extinction ratios at wavelengths >700 nm.
Significance. If the training-set validation and anchor-ratio independence can be demonstrated, the work would deliver a valuable high-fidelity extinction catalog for Galactic stellar-population studies and would usefully highlight the practical importance of passband-specific coefficients over monochromatic approximations. The public Zenodo release of the data and curve would further strengthen its utility.
major comments (2)
- [Abstract] Abstract / method description: the claim of 0.03 mag precision in Av rests on the XGBoost intrinsic-color model trained on a 'minimally reddened' sample, yet no quantitative verification (e.g., cross-match with Planck 353 GHz or 3D dust maps, or zero-point test on known dust-free clusters) is provided to confirm residual E(B-V) < 0.01 mag; undetected bias would systematically shift all derived Av values.
- [Anchor ratio derivation] Anchor-ratio derivation: A_BP/A_RP = 1.694 ± 0.004 is obtained from red-clump-like stars drawn from the same APOGEE+Gaia dataset used for the XGBoost training, creating partial circularity that must be quantified before the reported 30% monochromatic-vs-passband deviations at >700 nm can be considered robust.
minor comments (1)
- [Abstract] Abstract contains an obvious typographical error ('colorsr, etrieved' should read 'colors retrieved').
Simulated Author's Rebuttal
We thank the referee for their thorough and constructive report. We address each major comment below and will incorporate revisions to strengthen the validation of the training sample and to quantify the independence of the anchor-ratio derivation. These changes will be reflected in the revised manuscript.
Point-by-point responses
- Referee: [Abstract] Abstract / method description: the claim of 0.03 mag precision in Av rests on the XGBoost intrinsic-color model trained on a 'minimally reddened' sample, yet no quantitative verification (e.g., cross-match with Planck 353 GHz or 3D dust maps, or zero-point test on known dust-free clusters) is provided to confirm residual E(B-V) < 0.01 mag; undetected bias would systematically shift all derived Av values.
  Authors: We agree that explicit external validation of the training-sample reddening is important for confirming the claimed precision. The selection of minimally reddened stars relied on Gaia photometry and APOGEE parameter quality cuts, but we acknowledge the absence of a direct zero-point test. In the revised manuscript we will add a new subsection (in Section 3) that cross-matches the training sample against Planck 353 GHz dust maps and performs a zero-point check on a set of well-studied, low-reddening open clusters. This will quantify the residual E(B-V) and, if necessary, include a small empirical correction.
  Revision: yes
- Referee: [Anchor ratio derivation] Anchor-ratio derivation: A_BP/A_RP = 1.694 ± 0.004 is obtained from red-clump-like stars drawn from the same APOGEE+Gaia dataset used for the XGBoost training, creating partial circularity that must be quantified before the reported 30% monochromatic-vs-passband deviations at >700 nm can be considered robust.
  Authors: We recognize the referee's concern about partial circularity. The red-clump-like subsample was defined by independent photometric and metallicity criteria (CMD position and [Fe/H] range) rather than by the XGBoost features themselves, and the anchor ratio is computed after the intrinsic-color model has been applied. Nevertheless, to demonstrate robustness we will add an explicit overlap analysis and a sensitivity test that derives the anchor ratio from a held-out 20% subset of the red-clump stars. These results will be presented in a new paragraph in Section 4.2; we expect the 30% deviations at >700 nm to remain statistically unchanged.
  Revision: partial
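The held-out sensitivity test mentioned in the rebuttal could be organized along the following lines. This is a sketch of the bookkeeping only: the actual anchor-ratio estimator is not described on this page, so it is passed in as a hypothetical callable `estimate_ratio`, and the split fraction and seed are illustrative defaults.

```python
import random


def held_out_split(items, fraction=0.2, seed=0):
    """Randomly split items into (held_out, remainder) with a fixed seed."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_held = max(1, round(fraction * len(shuffled)))
    return shuffled[:n_held], shuffled[n_held:]


def ratio_shift(stars, estimate_ratio, fraction=0.2, seed=0):
    """Difference between the ratio from the held-out subset and the rest.

    estimate_ratio is a placeholder for whatever procedure turns a list
    of red-clump stars into an A_BP / A_RP value.
    """
    held, rest = held_out_split(stars, fraction, seed)
    return estimate_ratio(held) - estimate_ratio(rest)
```

Repeating the split over several seeds and comparing the spread of shifts with the quoted uncertainty of 0.004 would indicate whether the anchor ratio is stable against the choice of stars.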
Circularity Check
The anchor ratio is fitted from red-clump stars in the same dataset, and the minimally reddened training set is drawn from it as well; both create a partial dependence in the Av derivation.
specific steps
- fitted input called prediction
  [Abstract]
  "The derived reddening values are then converted into extinctions using an anchor ratio of A_BP / A_RP = 1.694 +/- 0.004, derived from red-clump-like stars."
  The numerical anchor value is extracted from red-clump-like stars inside the same APOGEE DR19 sample that supplies all target stars; applying this fitted ratio back to the full set forces consistency by construction for the reddening-to-extinction step.
- fitted input called prediction
  [Abstract]
  "The model is trained on minimally reddened stars to infer intrinsic colors and their associated uncertainties, using APOGEE stellar parameters (Teff, logg, [Fe/H], and [alpha/Fe])."
  Selection of the 'minimally reddened' training subset occurs within the identical photometric and spectroscopic dataset; any undetected residual reddening in that subset shifts the learned intrinsic-color locus, directly biasing the reddening (and thus Av) inferred for every subsequent star.
full rationale
The central pipeline derives reddening via XGBoost intrinsic-color predictions trained on a 'minimally reddened' subset of the APOGEE data, then converts to Av using an A_BP/A_RP anchor ratio also extracted from red-clump-like stars in the identical sample. This matches the 'fitted_input_called_prediction' pattern: both the training locus and the conversion factor are determined internally rather than from fully independent external calibrators. The 0.03 mag precision claim and the 30% monochromatic-vs-passband deviation therefore rest partly on these self-derived quantities. No self-citation chain or explicit self-definition of the final Av is present, so the circularity remains partial and the overall result retains independent content from the XGBoost architecture and cross-survey comparisons.
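The size of both dependencies can be estimated with a short worked derivation. It assumes that the color in question is BP - RP, that the training sample carries a uniform residual reddening δ in that color (so the learned intrinsic colors are too red by δ and every inferred color excess is too small by δ), and that the anchor ratio R has the quoted uncertainty σ_R = 0.004. These are simplifying assumptions for illustration, not results from the paper.

```latex
\begin{align}
E(BP-RP) &= A_{BP} - A_{RP}, \qquad R \equiv \frac{A_{BP}}{A_{RP}} = 1.694 \\
A_{RP} &= \frac{E(BP-RP)}{R-1}, \qquad A_{BP} = \frac{R\,E(BP-RP)}{R-1} \\
\Delta A_{RP} &= -\frac{\delta}{R-1} \approx -1.44\,\delta \\
\frac{\sigma_{A_{RP}}}{A_{RP}} &= \frac{\sigma_R}{R-1} = \frac{0.004}{0.694} \approx 0.0058
\end{align}
```

Under these assumptions a residual reddening of δ = 0.01 mag in the training sample would lower every A_RP by about 0.014 mag, whereas the quoted uncertainty on the anchor ratio contributes only about 0.6 percent of A_RP at fixed color excess. The training-sample zero point is therefore the more sensitive of the two inputs.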
Axiom & Free-Parameter Ledger
free parameters (1)
- A_BP / A_RP anchor ratio = 1.694
axioms (1)
- domain assumption: A set of minimally reddened stars can be identified to train the XGBoost model for unbiased intrinsic colors.