Chemical decoding of kinematic substructures in the Galactic halo
Pith reviewed 2026-05-21 22:44 UTC · model grok-4.3
The pith
Kinematic substructures in the Galactic halo are chemical mixtures rather than single-origin populations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Halo stars chemically compatible with GSE are spread throughout the E-Lz space and considerably contaminate every halo substructure studied in this work. None of these substructures appears to be a unique population of stars with its own origin. In addition to GSE, they all appear to be mixtures of stars chemically compatible either with the metal-poor disc, Sagittarius, ω Cen, or with a combination of them.
What carries the argument
A Gaussian Mixture Model, applied star by star to seven elemental abundances, chemically tags and compares stars across kinematically selected substructures.
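To make "star-by-star" concrete, the sketch below computes the posterior probability that one star belongs to each component of an already fitted mixture. It is an illustration under simplifying assumptions, not the paper's pipeline: it uses two abundances instead of seven, diagonal covariances, and component weights, means, and variances that are invented placeholders.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log-density of a Gaussian with diagonal covariance evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def responsibilities(x, components):
    """Posterior probability that star x belongs to each mixture component.

    components: list of (weight, mean, var) tuples describing a fitted mixture.
    """
    logs = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in components]
    top = max(logs)  # subtract the maximum before exponentiating (log-sum-exp)
    unnorm = [math.exp(value - top) for value in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical ([Fe/H], [Mg/Fe]) components; NOT values taken from the paper.
components = [
    (0.6, [-1.2, 0.20], [0.09, 0.0025]),  # placeholder "accreted-like" group
    (0.4, [-0.8, 0.32], [0.04, 0.0025]),  # placeholder "disc-like" group
]
star = [-1.3, 0.18]
post = responsibilities(star, components)
```

Summing these per-star probabilities over a kinematically selected sample would give the expected number of stars attributed to each chemical group.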
If this is right
- Sequoia shares a chemical origin with GSE.
- Heracles, Thamnos and the Helmi Stream each contain large fractions of both GSE and heated disc stars.
- The Helmi Stream also contains stars chemically matching Sagittarius while Thamnos contains stars matching ω Cen.
- GSE itself is contaminated by Sagittarius stars.
- Chemically GSE-like stars appear across the full E-Lz plane and affect every substructure examined.
Where Pith is reading between the lines
- Kinematic selections alone are likely to misattribute the origins of halo stars, so future work will need tighter chemistry-dynamics combinations to reconstruct accretion events.
- Milky Way formation simulations can be tested against the observed degree of mixing in E-Lz space.
- Larger samples with more precise abundances from upcoming surveys should allow quantitative fractions for each progenitor inside every substructure.
- The heated metal-poor disc's contribution suggests that in-situ heating during the GSE merger was more important than some models currently assume.
Load-bearing premise
The selected abundances plus the mixture model can separate stellar origins even when measurement errors, internal scatter inside each progenitor, and possible overlaps between progenitors are present.
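One way to probe this premise is to redraw each star's abundances within their quoted errors and check how often its label survives. The sketch below does this with a deliberately simple nearest-mean classifier standing in for the mixture model; the component means, the star coordinates, and the error values are all hypothetical.

```python
import random

def nearest_component(x, means):
    """Index of the closest component mean (Euclidean); a stand-in classifier."""
    dists = [sum((xi - mi) ** 2 for xi, mi in zip(x, m)) for m in means]
    return dists.index(min(dists))

def assignment_stability(x, errors, means, n_draws=1000, seed=0):
    """Fraction of noisy redraws of star x that keep its original label."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    base = nearest_component(x, means)
    kept = 0
    for _ in range(n_draws):
        draw = [rng.gauss(xi, ei) for xi, ei in zip(x, errors)]
        if nearest_component(draw, means) == base:
            kept += 1
    return kept / n_draws

# Hypothetical ([Fe/H], [Mg/Fe]) group centres; NOT values from the paper.
means = [[-1.2, 0.20], [-0.8, 0.32]]
secure = assignment_stability([-1.5, 0.15], [0.01, 0.01], means)
borderline = assignment_stability([-1.02, 0.26], [0.10, 0.05], means)
```

A star far from the boundary with small errors keeps its label in every draw, while a star near the midpoint between the two groups flips label in a large share of draws, which is the regime where the premise is most at risk.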
What would settle it
High-resolution spectra of the same stars would falsify the claimed chemical mixtures if they revealed abundance patterns that separate cleanly into groups matching none of the proposed GSE, disc, Sagittarius, or ω Cen templates.
Original abstract
In the hierarchical assembly framework, the accretion history of the Milky Way is crucial to understand its evolution. However, in massive mergers, integrals of motion are not strictly conserved, redistributing accreted stars across dynamical spaces, such as energy-angular momentum ($E-L_z$). Additionally, the in situ disc becomes kinematically heated, acquiring halo-like orbits. Consequently, even for minor mergers, which should preserve dynamical coherence, we expect their kinematic-defined samples to be contaminated by both the massive merger(s) and the disc stars. This study aims at quantifying this contamination in known accreted halo substructures. As they are defined by kinematics, we aim at cleaning their samples analysing only chemical properties. We applied the kinematic selection criteria for the halo substructures to the Gaia EDR3 and APOGEE DR17 data. Then we adopted a Gaussian Mixture Model approach to chemically compare different substructures on a star-by-star basis, taking into account several abundances (Fe, Mg, Si, Ca, Mn, Al, and C). We argue that the chemical properties of Sequoia point towards a shared origin with GSE. Heracles, Thamnos and the Helmi Stream all likely comprise GSE and heated disc stars in a significant amount. Besides these two populations, we identified stars with chemical and orbital properties compatible with Sagittarius in the Helmi Stream and with $\omega$ Cen in Thamnos. Finally, GSE itself is contaminated by Sagittarius. Halo stars chemically compatible with GSE are spread throughout the $E-L_z$ space and considerably contaminate every halo substructure studied in this work. None of these substructures appears to be a unique population of stars with its own origin. In addition to GSE, they all appear to be mixtures of stars chemically compatible either with the metal-poor disc, Sagittarius, $\omega$ Cen, or with a combination of them.
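The kinematic step described in the abstract amounts to placing each star in the energy-angular momentum plane and keeping those inside a chosen region. A minimal sketch follows, assuming Galactocentric Cartesian coordinates and a simple rectangular selection; the box limits are placeholders, not the criteria used in the paper, and the orbital energy is taken as given because it depends on an adopted Galactic potential. Sign conventions for L_z differ between studies.

```python
def angular_momentum_z(x, y, vx, vy):
    """z-component of specific angular momentum, L_z = x*v_y - y*v_x."""
    return x * vy - y * vx

def in_box(energy, lz, e_range, lz_range):
    """True if (energy, lz) lies inside a rectangular E-L_z selection."""
    return (e_range[0] <= energy <= e_range[1]
            and lz_range[0] <= lz <= lz_range[1])

# Positions in kpc, velocities in km/s, so L_z is in kpc km/s.
lz_example = angular_momentum_z(8.0, 0.0, 0.0, 220.0)

# Placeholder box; energy in arbitrary units of an assumed potential.
HYPOTHETICAL_E_RANGE = (-1.5, -1.0)
HYPOTHETICAL_LZ_RANGE = (-500.0, 500.0)
selected = in_box(-1.2, 100.0, HYPOTHETICAL_E_RANGE, HYPOTHETICAL_LZ_RANGE)
```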
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript applies kinematic selections from Gaia EDR3 to APOGEE DR17 stars to isolate known halo substructures (GSE, Sequoia, Heracles, Thamnos, Helmi Stream) and then uses a Gaussian Mixture Model on seven abundances (Fe, Mg, Si, Ca, Mn, Al, C) to perform star-by-star chemical classification. The central claim is that GSE-compatible stars are distributed across E-Lz space and contaminate every substructure, rendering none of them chemically unique; instead, the substructures are reported as mixtures of GSE, metal-poor disc, Sagittarius, and ω Cen stars.
Significance. If the GMM-based separations hold after proper validation, the result would demonstrate that dynamical mixing from massive mergers and disc heating renders purely kinematic halo substructures chemically composite. This has direct implications for interpreting the Milky Way's accretion history and for the reliability of substructure catalogs derived from integrals of motion alone. The analysis is grounded in public catalogs, which in principle supports reproducibility.
major comments (3)
- [GMM methods] Section describing the GMM implementation (likely §3): no details are provided on how measurement uncertainties in the seven abundances are propagated into the mixture model, how the number of components is chosen (e.g., via BIC, cross-validation, or silhouette score), or on initialization stability. Because the reported contamination fractions and mixture interpretations (Heracles/Thamnos as GSE+disc; Helmi with added Sgr) rest directly on the per-star posterior assignments, these omissions make the quantitative claims sensitive to untested modeling choices.
- [Results for Helmi and Thamnos] Results sections on individual substructures (e.g., Helmi Stream and Thamnos): the identification of Sagittarius- or ω Cen-compatible stars is presented without a confusion-matrix test, synthetic-population validation, or quantification of overlap probabilities between the fitted Gaussians. Given known intrinsic scatter within progenitors and APOGEE abundance errors, the evidence that these specific contaminants are required (rather than being absorbed into broader GSE or disc components) is not yet load-bearing.
- [Data and sample selection] Sample selection and cleaning (likely §2): the manuscript does not specify the quality cuts applied to APOGEE DR17 abundances or how stars with large uncertainties or missing values in the chosen elements are handled before GMM fitting. This directly affects the reliability of the chemical comparison across substructures.
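The "overlap probabilities between the fitted Gaussians" requested in the second major comment could be quantified in several ways. One option, shown here as a sketch rather than as the method the authors use, is the Bhattacharyya coefficient, which equals 1 for identical distributions and approaches 0 for well-separated ones. The diagonal-covariance form is an assumption made to keep the example short.

```python
import math

def bhattacharyya_1d(mu1, var1, mu2, var2):
    """Bhattacharyya coefficient of two univariate Gaussians (1 = identical)."""
    s = var1 + var2
    dist = (0.25 * (mu1 - mu2) ** 2 / s
            + 0.5 * math.log(s / (2.0 * math.sqrt(var1 * var2))))
    return math.exp(-dist)

def bhattacharyya_diag(mean1, var1, mean2, var2):
    """Coefficient for diagonal-covariance Gaussians: product over dimensions."""
    coeff = 1.0
    for m1, v1, m2, v2 in zip(mean1, var1, mean2, var2):
        coeff *= bhattacharyya_1d(m1, v1, m2, v2)
    return coeff

identical = bhattacharyya_1d(0.0, 1.0, 0.0, 1.0)
separated = bhattacharyya_1d(0.0, 1.0, 10.0, 1.0)
```

A coefficient close to 1 between, say, a putative Sagittarius component and the GSE component would indicate that the two cannot be told apart at the available abundance precision.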
minor comments (3)
- [Abstract] The abstract states that 'GSE itself is contaminated by Sagittarius' but does not quantify the fraction or show the corresponding E-Lz distribution; adding a brief numerical summary would improve clarity.
- [Methods] Notation for the abundance vector and the GMM covariance matrices should be defined explicitly in the methods section to avoid ambiguity when readers attempt to reproduce the fits.
- [Figures] Figure captions for the E-Lz and abundance projections could explicitly state the number of stars assigned to each GMM component per substructure.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments, which highlight important areas for improving the methodological transparency and validation of our analysis. We address each major comment point by point below and will revise the manuscript accordingly where needed.
- Referee: [GMM methods] Section describing the GMM implementation (likely §3): no details are provided on how measurement uncertainties in the seven abundances are propagated into the mixture model, how the number of components is chosen (e.g., via BIC, cross-validation, or silhouette score), or on initialization stability. Because the reported contamination fractions and mixture interpretations (Heracles/Thamnos as GSE+disc; Helmi with added Sgr) rest directly on the per-star posterior assignments, these omissions make the quantitative claims sensitive to untested modeling choices.
  Authors: We agree that additional details on the GMM procedure are required for full reproducibility. In the revised manuscript we will expand the methods section to describe: propagation of APOGEE abundance uncertainties via Monte Carlo resampling of the input data before fitting; selection of the number of components using the Bayesian Information Criterion with explicit reporting of BIC values for 2–8 components; and assessment of initialization stability through 50 random initializations, retaining only assignments that remain consistent across runs. These additions will directly support the robustness of the reported contamination fractions.
  Revision: yes
- Referee: [Results for Helmi and Thamnos] Results sections on individual substructures (e.g., Helmi Stream and Thamnos): the identification of Sagittarius- or ω Cen-compatible stars is presented without a confusion-matrix test, synthetic-population validation, or quantification of overlap probabilities between the fitted Gaussians. Given known intrinsic scatter within progenitors and APOGEE abundance errors, the evidence that these specific contaminants are required (rather than being absorbed into broader GSE or disc components) is not yet load-bearing.
  Authors: We acknowledge that the current presentation would benefit from explicit validation of the additional components. In the revision we will add a supplementary section that generates synthetic populations drawn from literature abundance distributions for GSE, metal-poor disc, Sagittarius, and ω Cen, injects realistic APOGEE uncertainties, and quantifies the overlap probabilities and recovery rates of the fitted Gaussians. This will demonstrate whether the Sagittarius and ω Cen components are statistically required or can be absorbed into broader mixtures.
  Revision: yes
- Referee: [Data and sample selection] Sample selection and cleaning (likely §2): the manuscript does not specify the quality cuts applied to APOGEE DR17 abundances or how stars with large uncertainties or missing values in the chosen elements are handled before GMM fitting. This directly affects the reliability of the chemical comparison across substructures.
  Authors: We thank the referee for noting this omission. The revised Section 2 will explicitly list the applied quality criteria, including ASPCAPFLAG and STARFLAG cuts, a minimum S/N threshold of 100, and the exclusion of stars with abundance uncertainties exceeding 0.2 dex in any of the seven elements or with missing values in more than one element. This will clarify the sample definition and allow direct assessment of its impact on the chemical tagging results.
  Revision: yes
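The quality criteria listed in the last response can be written as a single per-star filter. The sketch below encodes only the thresholds quoted there (S/N of at least 100, uncertainties of at most 0.2 dex, at most one missing element). The record layout and key names are hypothetical, and the ASPCAPFLAG and STARFLAG cuts are reduced to one boolean because the rebuttal does not say which bits are used.

```python
ELEMENTS = ["Fe", "Mg", "Si", "Ca", "Mn", "Al", "C"]

def passes_quality(star, min_snr=100.0, max_err=0.2, max_missing=1):
    """Apply the cuts quoted in the rebuttal to one star record.

    star: dict with hypothetical keys 'snr', 'flags_ok', 'abund', 'abund_err';
    the last two map element name -> value, with None marking a missing value.
    """
    if not star["flags_ok"] or star["snr"] < min_snr:
        return False
    missing = [el for el in ELEMENTS if star["abund"].get(el) is None]
    if len(missing) > max_missing:
        return False
    for el in ELEMENTS:
        if el in missing:
            continue  # a single missing element is tolerated
        err = star["abund_err"].get(el)
        if err is None or err > max_err:
            return False
    return True

good = {"snr": 150.0, "flags_ok": True,
        "abund": {el: -1.0 for el in ELEMENTS},
        "abund_err": {el: 0.05 for el in ELEMENTS}}
noisy = dict(good, abund_err=dict(good["abund_err"], Mn=0.3))
```

Note that a star with exactly one missing element passes this filter, so the mixture model would still need a rule for handling that gap.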
Circularity Check
No circularity: data-driven GMM classification from public catalogs
The paper applies standard kinematic cuts to Gaia EDR3 and APOGEE DR17, then runs a Gaussian Mixture Model on the seven abundances (Fe, Mg, Si, Ca, Mn, Al, C) to assign chemical membership on a star-by-star basis. All reported contamination fractions and mixture interpretations follow directly from these empirical assignments and orbital comparisons; no equation or parameter is defined in terms of the final claim, no self-citation supplies a uniqueness theorem, and no fitted input is relabeled as an independent prediction. The derivation therefore remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- Number of GMM components
axioms (1)
- Domain assumption: stars accreted from the same progenitor share sufficiently distinct multi-element abundance patterns to be separable by GMM despite observational errors.
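The single free parameter in this ledger, the number of components, is what the rebuttal proposes to fix with the Bayesian Information Criterion over 2 to 8 components. The sketch below shows how the penalty grows with the component count, assuming full covariance matrices in seven dimensions; the covariance structure is an assumption, and the log-likelihood values in the usage lines are invented for illustration.

```python
import math

def gmm_free_parameters(n_components, n_dims):
    """Free parameters of a full-covariance Gaussian mixture."""
    means = n_components * n_dims
    covariances = n_components * n_dims * (n_dims + 1) // 2
    weights = n_components - 1  # weights sum to one
    return means + covariances + weights

def bic(log_likelihood, n_components, n_dims, n_stars):
    """Bayesian Information Criterion, k ln(n) - 2 ln(L); lower is better."""
    k = gmm_free_parameters(n_components, n_dims)
    return k * math.log(n_stars) - 2.0 * log_likelihood

def best_n_components(log_likelihoods, n_dims, n_stars):
    """log_likelihoods: dict {n_components: maximised total log-likelihood}."""
    return min(log_likelihoods,
               key=lambda k: bic(log_likelihoods[k], k, n_dims, n_stars))

# Invented log-likelihoods for 1000 stars; NOT results from the paper.
hypothetical_logl = {2: -5000.0, 3: -4800.0, 4: -4790.0}
chosen = best_n_components(hypothetical_logl, n_dims=7, n_stars=1000)
```

With seven abundances each extra full-covariance component adds 36 parameters, so small substructure samples can only support a few components before the penalty dominates.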
Forward citations
Cited by 2 Pith papers
- Probing the Milky Way Halo with RR Lyrae Stars from Gaia Data Release 3
  RR Lyrae stars yield metallicity measurements for Milky Way halo substructures including Gaia Sausage/Enceladus at [Fe/H] = -1.57 dex, with some features like ED-1 showing disk contamination and others like Shiva and ...
- The Nephele ecosystem: stars, globular clusters, and stellar streams associated with the progenitor galaxy of $\omega$ Centauri
  Using 8D chemical clustering on APOGEE DR17 and kinematic matching to e-TidalGCs simulations, the authors report 470 ω Cen-like stars including 6 kinematically consistent with its stream and additional links to stream...