Understanding the regulation of star formation within TNG100 galaxies on kpc-scales using machine learning I: Global versus local
Pith reviewed 2026-05-10 10:18 UTC · model grok-4.3
The pith
Machine learning on TNG100 data shows black hole mass dominates galaxy quenching predictions while local stellar mass density controls star formation rates.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Random Forest and XGBoost feature-importance rankings on TNG100 annular-bin data agree on the leading predictors. For quenching classification, black hole mass ranks first for central galaxies and high-mass satellites, while halo mass ranks first for low-mass satellites. For regression of star formation rate surface density, local stellar mass surface density ranks first across all star-forming galaxies. Together, these rankings support AGN-driven global quenching and locally regulated star formation.
What carries the argument
Random Forest and XGBoost feature-importance rankings applied to global and local galaxy properties extracted from ~63,000 annular bins across 6,189 TNG100 galaxies.
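To make the annular-bin inputs concrete, the following is a minimal sketch of how a surface density per annulus could be computed from projected particle radii and masses. The helper name `annular_surface_density`, the bin edges, and the toy particle values are illustrative assumptions; the paper's actual binning scheme is not described in this summary.

```python
import math

def annular_surface_density(radii_kpc, masses_msun, edges_kpc):
    """Sum mass in each annulus [edges[i], edges[i+1]) and divide by its area.

    Hypothetical helper for illustration only; returns M_sun / kpc^2 per bin.
    """
    totals = [0.0] * (len(edges_kpc) - 1)
    for r, m in zip(radii_kpc, masses_msun):
        for i in range(len(totals)):
            if edges_kpc[i] <= r < edges_kpc[i + 1]:
                totals[i] += m
                break  # each particle belongs to at most one annulus
    areas = [math.pi * (edges_kpc[i + 1] ** 2 - edges_kpc[i] ** 2)
             for i in range(len(totals))]
    return [t / a for t, a in zip(totals, areas)]

# Toy projected radii (kpc) and stellar masses (M_sun); made-up values.
radii = [0.5, 0.8, 1.5, 2.5, 2.9]
masses = [4e8, 2e8, 3e8, 1e8, 1e8]
edges = [0.0, 1.0, 2.0, 3.0]
sigma_star = annular_surface_density(radii, masses, edges)
```

The same pattern, applied to star formation rates instead of stellar masses, would give a star formation rate surface density per bin, which is the regression target named in the abstract.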
If this is right
- Quenching in central and high-mass satellite galaxies proceeds primarily through AGN feedback tied to black hole mass.
- Low-mass satellites experience environmental quenching driven by their host halo mass.
- Active star formation remains a local process controlled by stellar mass surface density across all star-forming galaxy types.
- The two machine learning methods produce consistent rankings, with XGBoost spreading importance more evenly across correlated features.
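One way to quantify "consistent rankings" between two models is a Spearman rank correlation over their importance vectors. The sketch below assumes tie-free importances; the feature names and numbers are hypothetical placeholders, not values from the paper.

```python
def ranks(values):
    """Rank values from 1 (largest) to n; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    out = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out

def spearman_rho(a, b):
    """Spearman rank correlation for tie-free inputs of equal length."""
    n = len(a)
    ra, rb = ranks(a), ranks(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

# Hypothetical importances, NOT taken from the paper.
features = ["M_BH", "M_halo", "M_star", "Sigma_star"]
rf_importance = [0.70, 0.15, 0.10, 0.05]   # concentrated on one feature
xgb_importance = [0.40, 0.25, 0.20, 0.15]  # spread more evenly
rho = spearman_rho(rf_importance, xgb_importance)
```

In this toy case the two vectors have different spreads but the same ordering, so the rank correlation is 1. That is the sense in which two methods can agree on rankings while distributing importance differently.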
Where Pith is reading between the lines
- The same machine learning pipeline applied to MaNGA or SAMI observational maps could test whether the same features dominate in real galaxies.
- If confirmed, galaxy evolution models would need to ensure AGN feedback prescriptions produce the observed dominance of black hole mass in quenching statistics.
- The local dominance for star formation suggests that sub-grid star formation recipes in simulations should be tested primarily against small-scale density thresholds rather than global galaxy properties.
Load-bearing premise
The rankings recovered by the machine learning models reflect true causal physical drivers rather than correlations or artifacts from the simulation's sub-grid prescriptions.
What would settle it
Applying the same classification and regression tasks to resolved observational data from integral-field surveys and finding that local gas density or other variables outrank stellar mass surface density for star formation, or that halo mass outranks black hole mass for central-galaxy quenching, would falsify the claim.
Original abstract
We apply Random Forest and XGBoost machine learning algorithms to determine which galaxy properties most effectively predict star formation and quenching in simulated galaxies. Using spatially-resolved data from approximately 63,000 annular bins across 6,189 TNG100 galaxies, we train classification models to predict quenching states and regression models to predict star formation rate surface densities. Despite their different algorithmic approaches, both methods produce consistent feature importance rankings, with XGBoost distributing importance more evenly among correlated features. For central galaxies and high-mass satellites, black hole mass dominates quenching predictions, consistent with quenching via active galactic nuclei (AGN) feedback. Classification of low-mass satellites shows overwhelming importance for halo mass, indicating environmental quenching. Star formation predictions are dominated by local stellar mass surface density across all star-forming galaxy types, confirming that active star formation is a local process while quenching is driven by global properties.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper applies Random Forest and XGBoost machine learning algorithms to spatially-resolved annular-bin data from approximately 63,000 regions across 6,189 TNG100 galaxies. It trains classification models to predict quenching states and regression models to predict star-formation-rate surface densities, reporting consistent feature-importance rankings from both algorithms: black-hole mass dominates quenching predictions for central galaxies and high-mass satellites (interpreted as AGN feedback), halo mass dominates for low-mass satellites (environmental quenching), and local stellar-mass surface density dominates star-formation predictions across all star-forming galaxy types (indicating a local process).
Significance. If the reported feature importances can be shown to be robust to multicollinearity and not simply reconstructions of the simulation's sub-grid prescriptions, the work would provide a useful data-driven decomposition of global versus local regulation of star formation within a large cosmological simulation, reinforcing the distinction between AGN-driven quenching in massive systems and local density-driven star formation.
major comments (2)
- [Abstract and Results] The central interpretive claim (abstract and results) that black-hole-mass dominance confirms AGN-feedback quenching for centrals and high-mass satellites is load-bearing yet potentially circular: TNG100's sub-grid AGN model is explicitly parameterized by black-hole mass, and star formation is tied to local density thresholds. Without ablation studies, feature-orthogonalization tests, or comparisons against simulations with alternate sub-grid implementations, the rankings may recover the input prescriptions rather than emergent physics.
- [Methods] The manuscript provides no description of multicollinearity handling, cross-validation strategy, or robustness checks against correlated features (e.g., stellar mass, halo mass, and black-hole mass). Given that both RF and XGBoost rankings are presented as evidence for specific physical drivers, the absence of these methodological details is a load-bearing gap for the reliability of the dominance claims.
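As an illustration of the multicollinearity check the referee asks for, the sketch below computes variance inflation factors (VIFs) as the diagonal of the inverse feature correlation matrix. The synthetic scaling relations, scatter values, and sample size are invented for the example and are not TNG100 measurements.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting for a small square matrix."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif(columns):
    """VIF_j is the j-th diagonal element of the inverse correlation matrix."""
    corr = [[pearson(a, b) for b in columns] for a in columns]
    inv = invert(corr)
    return [inv[j][j] for j in range(len(columns))]

# Synthetic features with made-up scaling relations and scatter.
rng = random.Random(0)
log_mstar = [rng.gauss(10.5, 0.5) for _ in range(500)]
log_mhalo = [m + 1.5 + rng.gauss(0, 0.2) for m in log_mstar]
log_mbh = [1.2 * m - 4.5 + rng.gauss(0, 0.3) for m in log_mstar]
log_sigma = [rng.gauss(8.0, 0.5) for _ in range(500)]  # independent of the rest
vifs = vif([log_mstar, log_mhalo, log_mbh, log_sigma])
```

In this toy setup the three mass-like features yield large VIFs while the independent feature stays near 1, which is the signature that would flag stellar, halo, and black-hole mass as hard to separate by importance ranking alone.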
minor comments (1)
- [Abstract] The abstract states the sample size but does not list the full set of input features or any preprocessing (normalization, imputation) applied before training.
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which have helped us clarify the scope and limitations of our analysis. We address each major comment below and have revised the manuscript accordingly to improve methodological transparency and interpretive caution.
- Referee: [Abstract and Results] The central interpretive claim (abstract and results) that black-hole-mass dominance confirms AGN-feedback quenching for centrals and high-mass satellites is load-bearing yet potentially circular: TNG100's sub-grid AGN model is explicitly parameterized by black-hole mass, and star formation is tied to local density thresholds. Without ablation studies, feature-orthogonalization tests, or comparisons against simulations with alternate sub-grid implementations, the rankings may recover the input prescriptions rather than emergent physics.
Authors: We agree that the feature importances must be interpreted within the context of TNG100's specific sub-grid prescriptions, and that black-hole mass is an explicit parameter in the AGN feedback model. Our results demonstrate that, within this simulation, black-hole mass is the most predictive feature for quenching in centrals and high-mass satellites, consistent with the model's implementation of AGN feedback. We will revise the abstract, results, and discussion to explicitly caveat that these findings reflect the TNG100 framework rather than universal emergent physics, and we will expand the discussion of how sub-grid choices may influence the rankings. However, ablation studies, orthogonalization tests, or direct comparisons to alternate simulations are not possible here, as they would require new simulation data.
revision: partial
- Referee: [Methods] The manuscript provides no description of multicollinearity handling, cross-validation strategy, or robustness checks against correlated features (e.g., stellar mass, halo mass, and black-hole mass). Given that both RF and XGBoost rankings are presented as evidence for specific physical drivers, the absence of these methodological details is a load-bearing gap for the reliability of the dominance claims.
Authors: We acknowledge this gap in the original submission. In the revised manuscript we will expand the Methods section to describe the cross-validation procedure (stratified 5-fold cross-validation for classification and 5-fold CV for regression, with hyperparameter tuning via grid search), include a correlation matrix and variance inflation factor analysis for the input features, and report permutation-based importance as a robustness check against multicollinearity. We will also note that the consistency of rankings between Random Forest and XGBoost (with the latter's known tendency to distribute importance across correlated features) provides supporting evidence, while adding explicit discussion of remaining limitations.
revision: yes
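For readers unfamiliar with permutation-based importance, here is a minimal sketch of the idea: shuffle one feature column at a time and record the mean drop in accuracy. The `model` function is a hand-written stand-in for a trained classifier, and the data, threshold, and scatter are invented for illustration; none of it reproduces the authors' pipeline.

```python
import random

def accuracy(model, rows, labels):
    """Fraction of rows where the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(model, rows, labels, n_repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    drops = []
    for j in range(len(rows[0])):
        total = 0.0
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            total += baseline - accuracy(model, shuffled, labels)
        drops.append(total / n_repeats)
    return drops

# Toy data: the "quenched" label depends only on log M_BH (made-up threshold).
data_rng = random.Random(1)
rows, labels = [], []
for _ in range(400):
    log_mbh = data_rng.uniform(6.0, 10.0)
    log_mhalo = log_mbh + 4.0 + data_rng.gauss(0, 0.3)  # correlated proxy
    rows.append([log_mbh, log_mhalo])
    labels.append(log_mbh > 8.0)

def model(row):
    # Stand-in for a trained classifier; it uses only the first feature.
    return row[0] > 8.0

importances = permutation_importance(model, rows, labels)
```

Because the stand-in model ignores the second column, the correlated proxy scores exactly zero even though it carries nearly the same information. Permutation importance describes what a fitted model relies on, so it is most informative when read together with the correlation matrix and VIF analysis the authors mention.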
- Not addressed: ablation studies, feature-orthogonalization tests, and comparisons against simulations with alternate sub-grid implementations, since these would require new simulation runs or data not available for TNG100.
Circularity Check
No significant circularity in the ML feature-importance analysis
The paper applies standard Random Forest and XGBoost algorithms to ~63,000 annular bins from TNG100 galaxies, trains classification and regression models, and reports feature-importance rankings. The abstract states that black-hole mass dominates quenching predictions for centrals and high-mass satellites and that local stellar-mass surface density dominates star-formation predictions. No equations, self-citations, or ansatzes are quoted that reduce any claimed result to an input by construction, nor is any fitted parameter renamed as an independent prediction. The derivation consists of applying off-the-shelf ML to simulation output and interpreting the resulting importances; this chain remains self-contained and does not match any of the enumerated circularity patterns.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The TNG100 simulation accurately captures the relevant galaxy-formation physics, including AGN feedback and environmental processes.