pith. machine review for the scientific record.

arxiv: 2604.15182 · v1 · submitted 2026-04-16 · 🌌 astro-ph.GA

Recognition: unknown

Understanding the regulation of star formation within TNG100 galaxies on kpc-scales using machine learning I: Global versus local


Pith reviewed 2026-05-10 10:18 UTC · model grok-4.3

classification 🌌 astro-ph.GA
keywords galaxy quenching · star formation · machine learning · TNG100 · AGN feedback · environmental quenching · local stellar density · Random Forest

The pith

Machine learning on TNG100 data shows black hole mass dominates galaxy quenching predictions while local stellar mass density controls star formation rates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper trains Random Forest and XGBoost models on approximately 63,000 spatially resolved annular bins from 6,189 TNG100 galaxies to rank which properties best predict whether a galaxy is quenched or actively forming stars. Both algorithms agree that black hole mass is the strongest predictor of quenching for central galaxies and high-mass satellites, while halo mass dominates for low-mass satellites. In contrast, predictions of star formation rate surface density are led by local stellar mass surface density in every star-forming galaxy type. This split indicates that quenching depends on global galaxy or halo properties, whereas active star formation is regulated locally. A sympathetic reader cares because the result separates the physical scales and mechanisms that turn galaxies off from those that keep them forming stars inside cosmological simulations.

Core claim

Feature importance rankings recovered by both Random Forest and XGBoost on TNG100 annular-bin data show black hole mass as the leading predictor of quenching classification for central galaxies and high-mass satellites, halo mass for low-mass satellites, and local stellar mass surface density as the leading predictor of star formation rate surface density regression across all star-forming galaxies, supporting AGN-driven global quenching and locally regulated star formation.

What carries the argument

Random Forest and XGBoost feature-importance rankings applied to global and local galaxy properties extracted from ~63,000 annular bins across 6,189 TNG100 galaxies.
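
To make the mechanism concrete, here is a minimal sketch of an impurity-based feature-importance ranking using scikit-learn's `RandomForestClassifier`. It is not the paper's pipeline: the data are random draws rather than TNG100 bins, the feature names are hypothetical, and the labelling rule (quenched when the black-hole-mass column is high) is an assumption built in so that the expected ranking is known in advance.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_bins = 2000  # synthetic stand-in for annular bins, not TNG100 data

# Hypothetical feature names; the values are standardized random draws.
feature_names = ["log_bh_mass", "log_halo_mass", "log_sigma_star"]
X = rng.normal(size=(n_bins, len(feature_names)))

# Toy labelling rule (an assumption): a bin is "quenched" when the
# black-hole-mass column, plus a little noise, exceeds zero.
y = (X[:, 0] + 0.2 * rng.normal(size=n_bins) > 0).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X, y)

# Impurity-based importances; scikit-learn normalizes them to sum to 1.
ranking = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda item: item[1],
    reverse=True,
)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

Because the label is constructed from one column, the sketch can only show that the ranking recovers a known driver. It says nothing about whether a ranking on simulation output reflects causal physics.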

If this is right

  • Quenching in central and high-mass satellite galaxies proceeds primarily through AGN feedback tied to black hole mass.
  • Low-mass satellites experience environmental quenching driven by their host halo mass.
  • Active star formation remains a local process controlled by stellar mass surface density in every star-forming galaxy type.
  • The two machine learning methods produce consistent rankings, with XGBoost spreading importance more evenly across correlated features.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same machine learning pipeline applied to MaNGA or SAMI observational maps could test whether the same features dominate in real galaxies.
  • If confirmed, galaxy evolution models would need to ensure AGN feedback prescriptions produce the observed dominance of black hole mass in quenching statistics.
  • The local dominance for star formation suggests that sub-grid star formation recipes in simulations should be tested primarily against small-scale density thresholds rather than global galaxy properties.

Load-bearing premise

The rankings recovered by the machine learning models reflect true causal physical drivers rather than correlations or artifacts from the simulation's sub-grid prescriptions.

What would settle it

Applying the same classification and regression tasks to resolved observational data from integral-field surveys and finding that local gas density or other variables outrank stellar mass surface density for star formation, or that halo mass outranks black hole mass for central-galaxy quenching, would falsify the claim.

Figures

Figures reproduced from arXiv: 2604.15182 by Ansa Brew-Smith, Asa F.L. Bluck, Bryanne McDonough, Joanna Piotrowska, Sathvika S. Iyengar.

Figure 1: Average feature importances from the Random Forest quenching classification models, shown for spatially-resolved bins drawn from all ...
Figure 2: Feature importances from XGBoost quenching classification models, shown for low-mass and high-mass subsets of both centrals and ...
Figure 3: Feature importances from RF regression model for predicting SFR, shown for low-mass and high-mass subsets of both centrals and ...
Figure 4: Feature importances from XGBoost regression model for predicting SFR, shown for low-mass and high-mass subsets of both centrals ...
Figure 5: Principal component analysis of local and global parameters. In both panels, we plot the global hyperparameter, PC ...
Original abstract

We apply Random Forest and XGBoost machine learning algorithms to determine which galaxy properties most effectively predict star formation and quenching in simulated galaxies. Using spatially-resolved data from approximately 63,000 annular bins across 6,189 TNG100 galaxies, we train classification models to predict quenching states and regression models to predict star formation rate surface densities. Despite their different algorithmic approaches, both methods produce consistent feature importance rankings, with XGBoost distributing importance more evenly among correlated features. For central galaxies and high-mass satellites, black hole mass dominates quenching predictions, consistent with quenching via active galactic nuclei (AGN) feedback. Classification of low-mass satellites shows overwhelming importance for halo mass, indicating environmental quenching. Star formation predictions are dominated by local stellar mass surface density across all star-forming galaxy types, confirming that active star formation is a local process while quenching is driven by global properties.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper applies Random Forest and XGBoost machine learning algorithms to spatially-resolved annular-bin data from approximately 63,000 regions across 6,189 TNG100 galaxies. It trains classification models to predict quenching states and regression models to predict star-formation-rate surface densities, reporting consistent feature-importance rankings from both algorithms: black-hole mass dominates quenching predictions for central galaxies and high-mass satellites (interpreted as AGN feedback), halo mass dominates for low-mass satellites (environmental quenching), and local stellar-mass surface density dominates star-formation predictions across all star-forming galaxy types (indicating a local process).

Significance. If the reported feature importances can be shown to be robust to multicollinearity and not simply reconstructions of the simulation's sub-grid prescriptions, the work would provide a useful data-driven decomposition of global versus local regulation of star formation within a large cosmological simulation, reinforcing the distinction between AGN-driven quenching in massive systems and local density-driven star formation.

major comments (2)
  1. [Abstract and Results] The central interpretive claim (abstract and results) that black-hole-mass dominance confirms AGN-feedback quenching for centrals and high-mass satellites is load-bearing yet potentially circular: TNG100's sub-grid AGN model is explicitly parameterized by black-hole mass, and star formation is tied to local density thresholds. Without ablation studies, feature-orthogonalization tests, or comparisons against simulations with alternate sub-grid implementations, the rankings may recover the input prescriptions rather than emergent physics.
  2. [Methods] The manuscript provides no description of multicollinearity handling, cross-validation strategy, or robustness checks against correlated features (e.g., stellar mass, halo mass, and black-hole mass). Given that both RF and XGBoost rankings are presented as evidence for specific physical drivers, the absence of these methodological details is a load-bearing gap for the reliability of the dominance claims.
minor comments (1)
  1. [Abstract] The abstract states the sample size but does not list the full set of input features or any preprocessing (normalization, imputation) applied before training.
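
The multicollinearity concern in major comment 2 can be illustrated with a variance inflation factor (VIF) check. The sketch below is a hedged illustration in plain NumPy, not anything taken from the manuscript: the four columns are synthetic, their names in the comments are hypothetical, and the helper `variance_inflation_factors` is introduced here only for the example. It uses the standard definition VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing feature j on all the other features.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return VIF_j = 1 / (1 - R_j^2) for each column j of X."""
    n_samples, n_features = X.shape
    vifs = np.empty(n_features)
    for j in range(n_features):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Ordinary least squares of column j on the others, with intercept.
        design = np.column_stack([np.ones(n_samples), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residual = target - design @ coef
        r_squared = 1.0 - residual.var() / target.var()
        vifs[j] = 1.0 / (1.0 - r_squared)
    return vifs


rng = np.random.default_rng(0)
n = 5000

# Synthetic, hypothetical columns: three mutually correlated "mass-like"
# features and one independent feature. None of this is TNG100 data.
stellar = rng.normal(size=n)
halo = stellar + 0.3 * rng.normal(size=n)
black_hole = stellar + 0.3 * rng.normal(size=n)
independent = rng.normal(size=n)

X = np.column_stack([stellar, halo, black_hole, independent])
vifs = variance_inflation_factors(X)
print(np.round(vifs, 2))
```

In this toy setup the correlated columns get VIFs well above 1 and the independent column stays close to 1. Which threshold counts as problematic is a judgment call that the sketch does not settle.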

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their constructive comments, which have helped us clarify the scope and limitations of our analysis. We address each major comment below and have revised the manuscript accordingly to improve methodological transparency and interpretive caution.

point-by-point responses
  1. Referee: [Abstract and Results] The central interpretive claim (abstract and results) that black-hole-mass dominance confirms AGN-feedback quenching for centrals and high-mass satellites is load-bearing yet potentially circular: TNG100's sub-grid AGN model is explicitly parameterized by black-hole mass, and star formation is tied to local density thresholds. Without ablation studies, feature-orthogonalization tests, or comparisons against simulations with alternate sub-grid implementations, the rankings may recover the input prescriptions rather than emergent physics.

    Authors: We agree that the feature importances must be interpreted within the context of TNG100's specific sub-grid prescriptions, and that black-hole mass is an explicit parameter in the AGN feedback model. Our results demonstrate that, within this simulation, black-hole mass is the most predictive feature for quenching in centrals and high-mass satellites, consistent with the model's implementation of AGN feedback. We will revise the abstract, results, and discussion to explicitly caveat that these findings reflect the TNG100 framework rather than universal emergent physics, and we will expand the discussion of how sub-grid choices may influence the rankings. However, ablation studies, orthogonalization tests, or direct comparisons to alternate simulations are not possible here, as they would require new simulation data. revision: partial

  2. Referee: [Methods] The manuscript provides no description of multicollinearity handling, cross-validation strategy, or robustness checks against correlated features (e.g., stellar mass, halo mass, and black-hole mass). Given that both RF and XGBoost rankings are presented as evidence for specific physical drivers, the absence of these methodological details is a load-bearing gap for the reliability of the dominance claims.

    Authors: We acknowledge this gap in the original submission. In the revised manuscript we will expand the Methods section to describe the cross-validation procedure (stratified 5-fold cross-validation for classification and 5-fold CV for regression, with hyperparameter tuning via grid search), include a correlation matrix and variance inflation factor analysis for the input features, and report permutation-based importance as a robustness check against multicollinearity. We will also note that the consistency of rankings between Random Forest and XGBoost (with the latter's known tendency to distribute importance across correlated features) provides supporting evidence, while adding explicit discussion of remaining limitations. revision: yes
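
The robustness check promised in the second response, permutation-based importance under stratified 5-fold cross-validation, might look like the sketch below. It is only an illustration under stated assumptions: the data are synthetic, the columns `driver`, `proxy`, and `unrelated` are hypothetical, hyperparameters are fixed rather than tuned by grid search, and nothing here reproduces the authors' configuration. The calls used are scikit-learn's `StratifiedKFold` and `sklearn.inspection.permutation_importance`.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(1)
n = 1500

# Synthetic, hypothetical features: a true driver of the label, a noisy
# proxy correlated with it, and an unrelated column.
driver = rng.normal(size=n)
proxy = driver + 0.5 * rng.normal(size=n)
unrelated = rng.normal(size=n)
X = np.column_stack([driver, proxy, unrelated])

# Toy label (an assumption): determined by the driver plus small noise.
y = (driver + 0.2 * rng.normal(size=n) > 0).astype(int)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
fold_importances = []
for train_idx, test_idx in cv.split(X, y):
    model = RandomForestClassifier(n_estimators=50, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    # Importance = mean drop in held-out accuracy when a column is shuffled.
    result = permutation_importance(
        model, X[test_idx], y[test_idx], n_repeats=5, random_state=0
    )
    fold_importances.append(result.importances_mean)

mean_importance = np.mean(fold_importances, axis=0)
for name, value in zip(["driver", "proxy", "unrelated"], mean_importance):
    print(f"{name}: {value:.3f}")
```

Scoring on held-out folds keeps the importance estimate tied to predictive performance rather than to training-set impurity. Correlated columns can still share credit, so a check like this complements a VIF analysis rather than replacing it.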

standing simulated objections not resolved
  • Ablation studies, feature-orthogonalization tests, or comparisons against simulations with alternate sub-grid implementations, as these require new simulation runs or data not available for TNG100.

Circularity Check

0 steps flagged

No significant circularity in the ML feature-importance analysis


The paper applies standard Random Forest and XGBoost algorithms to ~63,000 annular bins from TNG100 galaxies, trains classification and regression models, and reports feature-importance rankings. The abstract states that black-hole mass dominates quenching predictions for centrals and high-mass satellites and that local stellar-mass surface density dominates star-formation predictions. No equations, self-citations, or ansatzes are quoted that reduce any claimed result to an input by construction, nor is any fitted parameter renamed as an independent prediction. The derivation consists of applying off-the-shelf ML to simulation output and interpreting the resulting importances; this chain remains self-contained and does not match any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The analysis treats TNG100 outputs as faithful representations of real galaxy physics and assumes ML feature importances map to causal drivers.

axioms (1)
  • domain assumption TNG100 simulation accurately captures the relevant galaxy-formation physics including AGN feedback and environmental processes
    All conclusions rest on the simulation being a reliable proxy for reality; no independent observational validation is mentioned in the abstract.

pith-pipeline@v0.9.0 · 5475 in / 1223 out tokens · 32408 ms · 2026-05-10T10:18:13.738622+00:00 · methodology

