Classifying Supermassive Black Hole Growth Regimes to Observables Across Cosmological Simulations with Forecasts for LSST
Pith reviewed 2026-05-10 09:48 UTC · model grok-4.3
The pith
Machine learning on LSST photometry distinguishes over-massive from under-massive supermassive black hole growth regimes at 91 to 94 percent accuracy in simulations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Forward-modeling SIMBA, IllustrisTNG, and EAGLE into LSST bands produces an ensemble machine-learning classifier that separates over-massive and under-massive SMBH growth regimes at 91-94 percent accuracy within SIMBA and IllustrisTNG. Cross-simulation transfer with rank-normalized features reaches 83-89 percent accuracy, indicating that the relative photometric ordering of the regimes is preserved across different sub-grid SMBH feedback prescriptions. Signal decomposition shows the separation is driven mainly by host-galaxy colors and the shape of the accretion-state spectral energy distribution rather than any direct inversion of the luminosity prescription.
What carries the argument
Ensemble machine-learning classifier trained on rank-normalized broadband photometry obtained by forward-modeling cosmological simulations into LSST bands
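To make the role of rank normalization concrete, here is a minimal sketch, not the paper's implementation. It assumes each feature is converted to a fractional rank separately within each simulation; the function name `rank_normalize`, the mid-rank tie convention, and the color values are all hypothetical.

```python
from bisect import bisect_left, bisect_right


def rank_normalize(values):
    """Map each value to its fractional rank in (0, 1) within its own sample.

    Ties receive the average of the ranks they span (mid-rank convention,
    an assumption of this sketch).
    """
    ordered = sorted(values)
    n = len(ordered)
    ranks = []
    for v in values:
        below = bisect_left(ordered, v)         # entries strictly below v
        at_or_below = bisect_right(ordered, v)  # entries at or below v
        ranks.append((below + at_or_below) / (2.0 * n))
    return ranks


# Hypothetical g - r colors for four galaxies in two simulations.
# sim_b is offset and stretched relative to sim_a, but the ordering matches.
sim_a_color = [0.35, 0.80, 0.55, 0.62]
sim_b_color = [0.60, 1.50, 1.00, 1.14]

print(rank_normalize(sim_a_color))  # [0.125, 0.875, 0.375, 0.625]
print(rank_normalize(sim_b_color))  # [0.125, 0.875, 0.375, 0.625]
```

Because only the ordering survives, a classifier trained on one simulation's ranks can be applied to another's even when the absolute colors differ, which is the property the transfer experiments rely on.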
Load-bearing premise
The relative ordering of photometric signatures for over-massive versus under-massive SMBH growth regimes remains consistent across different sub-grid feedback prescriptions and will hold for real galaxies.
What would settle it
A large sample of galaxies with independent black-hole mass measurements from reverberation mapping or stellar dynamics whose LSST photometry yields growth-regime assignments that disagree with the mass-based labels at a rate well above 10 percent.
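The test described above reduces to a disagreement rate between two label sets. The sketch below shows one way it might be computed, assuming binary "over"/"under" labels and a normal approximation to the binomial for the excess over the 10 percent reference. The function names and the eight example labels are hypothetical.

```python
import math


def disagreement_rate(photometric_labels, mass_based_labels):
    """Fraction of galaxies whose two regime labels differ."""
    if len(photometric_labels) != len(mass_based_labels):
        raise ValueError("label lists must have the same length")
    if not photometric_labels:
        raise ValueError("need at least one galaxy")
    mismatches = sum(p != m for p, m in zip(photometric_labels, mass_based_labels))
    return mismatches / len(photometric_labels)


def excess_over_reference(rate, n, reference=0.10):
    """Normal-approximation z-score of `rate` above `reference` for n galaxies."""
    standard_error = math.sqrt(reference * (1.0 - reference) / n)
    return (rate - reference) / standard_error


# Hypothetical labels: photometry-based assignments versus labels from
# independent black-hole mass measurements.
photometric = ["over", "over", "under", "under", "over", "under", "over", "under"]
mass_based = ["over", "under", "under", "under", "over", "over", "over", "under"]

rate = disagreement_rate(photometric, mass_based)
print(rate, excess_over_reference(rate, len(photometric)))
```

With only eight galaxies the approximation is crude; the point of asking for a large sample is that the standard error shrinks as the square root of the sample size, so a disagreement rate well above 10 percent becomes unambiguous.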
Original abstract
The possibility of over-massive black holes suggested by James Webb Space Telescope photometric discoveries of 'little red dots' may disfavor light supermassive black hole (SMBH) seeds. However, what should constitute the mass (range) of 'heavy' seeds remains relatively unconstrained. Moreover, the Vera C. Rubin Observatory's Legacy Survey of Space and Time (LSST) will photometrically characterize galaxies without direct black hole mass measurements. We forward-model the SIMBA, IllustrisTNG, and EAGLE cosmological simulations into the photometric bands of LSST to train an ensemble machine learning classifier. Our framework achieves $91\%$--$94\%$ accuracy across SIMBA and IllustrisTNG in distinguishing between over-massive and under-massive SMBH growth regimes under LSST magnitude limits, using only broadband photometry. Furthermore, cross-simulation transfer experiments (training on one cosmological simulation and evaluating on another using rank-normalized features) achieve $83\%$--$89\%$ accuracy. This suggests the relative photometric ordering of growth regimes is largely preserved even across fundamentally different sub-grid SMBH feedback prescriptions. Signal decomposition shows our classification is driven by host galaxy colors ($82\%$--$87\%$ accuracy) and, relatedly, the accretion-state's spectral energy distribution shape as opposed to an inversion of our forward model's analytical luminosity prescription. Given that the evaluated simulations employ heavy seed prescriptions ($\geq 10^{4}~M_\odot$), our methodology establishes a validated baseline for classifying post-seeding growth regimes.
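A short sketch may help show why a color-driven signal differs from inverting a luminosity prescription. Colors are differences of magnitudes, so a uniform change in brightness cancels out. The code assumes adjacent-band colors built from the six LSST bands (u, g, r, i, z, y); the paper's actual feature set is not specified here, and the magnitudes and the helper name `adjacent_colors` are hypothetical.

```python
LSST_BANDS = ("u", "g", "r", "i", "z", "y")


def adjacent_colors(mags):
    """Return adjacent-band colors (u-g, g-r, ...) from a dict of magnitudes."""
    return {
        f"{blue}-{red}": mags[blue] - mags[red]
        for blue, red in zip(LSST_BANDS, LSST_BANDS[1:])
    }


# Hypothetical apparent magnitudes for one galaxy.
galaxy = {"u": 24.5, "g": 23.75, "r": 23.25, "i": 23.0, "z": 22.875, "y": 22.75}

# The same galaxy made 1.5 mag brighter in every band.
brighter = {band: mag - 1.5 for band, mag in galaxy.items()}

print(adjacent_colors(galaxy))
print(adjacent_colors(brighter) == adjacent_colors(galaxy))  # True
```

A classifier that performs well on such colors alone is responding to the shape of the spectral energy distribution rather than to its overall normalization.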
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops an ensemble machine learning classifier trained on forward-modeled LSST broadband photometry from the SIMBA, IllustrisTNG, and EAGLE cosmological simulations to distinguish over-massive versus under-massive SMBH growth regimes. It reports 91–94% accuracy within SIMBA and IllustrisTNG, 83–89% accuracy in cross-simulation transfer using rank-normalized features, and shows via signal decomposition that host galaxy colors drive the classification (82–87% accuracy) rather than direct inversion of the analytical luminosity model. All evaluated simulations employ heavy seeds (≥10^4 M_⊙), and the work positions itself as a validated baseline for post-seeding regime classification from photometry alone.
Significance. If the reported accuracies and cross-simulation robustness hold, the result is significant for providing an observationally practical route to classify SMBH growth regimes in the large LSST photometric sample without requiring direct black-hole mass measurements. The preservation of relative photometric ordering across differing sub-grid feedback prescriptions is a useful finding given current uncertainties in SMBH modeling, and the color-driven decomposition aligns with known observational degeneracies between AGN and host properties. The use of multiple simulations and explicit focus on LSST magnitude limits are strengths that enhance the forecast value.
major comments (2)
- [§3] §3 (label definition): The over-massive and under-massive labels are defined internally via deviations from each simulation’s M_BH–M_* relation; the manuscript must state the precise threshold (e.g., number of sigma or percentile cut) and demonstrate that the reported accuracies are insensitive to reasonable variations in this cut, because the training labels are load-bearing for all accuracy claims.
- [§4.2] §4.2 (cross-simulation transfer): The 83–89% transfer accuracies are obtained with rank-normalized features, yet no table or figure shows the per-feature importance or the distribution of the dominant color features for over- versus under-massive samples across the three simulations; without this, the claim that “relative photometric ordering is largely preserved” lacks direct quantitative support.
minor comments (3)
- [Abstract] Abstract: EAGLE is listed among the simulations used for forward modeling, but accuracy numbers are quoted only for SIMBA and IllustrisTNG; the role of EAGLE (training, testing, or only forward modeling) should be stated explicitly.
- [Figures] Figure captions (e.g., Figure 2 or 3): Add the exact LSST bands employed and the magnitude limit applied when reporting the color-only accuracies.
- [Methods] Methods: The ensemble classifier details (base learners, hyper-parameter search, and cross-validation scheme) are only summarized; a short table of the final hyper-parameters and the number of independent runs used for the quoted accuracies would improve reproducibility.
Simulated Author's Rebuttal
We thank the referee for their constructive review and recommendation for minor revision. We address each major comment point by point below.
- Referee: [§3] §3 (label definition): The over-massive and under-massive labels are defined internally via deviations from each simulation’s M_BH–M_* relation; the manuscript must state the precise threshold (e.g., number of sigma or percentile cut) and demonstrate that the reported accuracies are insensitive to reasonable variations in this cut, because the training labels are load-bearing for all accuracy claims.
  Authors: We agree that the exact label threshold must be stated explicitly for reproducibility. In the revised manuscript we will add the precise definition used (deviation from the simulation-specific M_BH–M_* relation at fixed stellar mass) together with a sensitivity test showing that the quoted accuracies remain stable under reasonable variations of the cut (e.g., ±0.3 dex).
  Revision: yes
- Referee: [§4.2] §4.2 (cross-simulation transfer): The 83–89% transfer accuracies are obtained with rank-normalized features, yet no table or figure shows the per-feature importance or the distribution of the dominant color features for over- versus under-massive samples across the three simulations; without this, the claim that “relative photometric ordering is largely preserved” lacks direct quantitative support.
  Authors: We acknowledge that direct quantitative support for the preservation of relative photometric ordering would strengthen the cross-simulation claim. In the revised manuscript we will add a table of ensemble feature importances for each simulation and a figure displaying the distributions of the dominant color features for the over-massive versus under-massive populations across SIMBA, IllustrisTNG, and EAGLE.
  Revision: yes
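The label-definition exchange above can be illustrated with a minimal sketch. The paper's exact threshold is not stated on this page, so the code assumes a straight-line fit of log M_BH against log M_* and a symmetric residual cut in dex. The function names, the "typical" middle class, the cut values, and the six example galaxies are all hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def label_regimes(log_mstar, log_mbh, cut_dex):
    """Label each galaxy by its residual from the fitted M_BH-M_* relation."""
    slope, intercept = fit_line(log_mstar, log_mbh)
    labels = []
    for ms, mb in zip(log_mstar, log_mbh):
        residual = mb - (slope * ms + intercept)  # offset at fixed stellar mass
        if residual > cut_dex:
            labels.append("over")
        elif residual < -cut_dex:
            labels.append("under")
        else:
            labels.append("typical")
    return labels


# Hypothetical log10 stellar and black-hole masses (solar units).
log_mstar = [9.0, 10.0, 11.0, 9.0, 10.0, 11.0]
log_mbh = [6.6, 7.1, 8.6, 5.4, 6.9, 7.4]

# Sensitivity check: compare labels from several cuts against a baseline cut.
baseline = label_regimes(log_mstar, log_mbh, 0.3)
for cut in (0.2, 0.3, 0.5):
    labels = label_regimes(log_mstar, log_mbh, cut)
    changed = sum(a != b for a, b in zip(labels, baseline))
    print(cut, labels, "changed vs baseline:", changed)
```

In this toy sample no labels change across the three cuts. The referee's request amounts to showing the same stability for the real training labels and then for the resulting accuracies.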
Circularity Check
No significant circularity identified
The paper defines over-massive and under-massive SMBH growth regimes from internal simulation quantities (e.g., BH mass relative to host properties) in SIMBA, IllustrisTNG, and EAGLE. It then forward-models broadband LSST photometry from the same simulations and trains an ensemble ML classifier whose reported accuracies (91-94% intra-simulation, 83-89% cross-simulation with rank-normalized features) are measured on held-out data or transferred simulations. These are standard empirical performance metrics, not reductions by construction. Cross-simulation transfer explicitly tests generalization across independent sub-grid feedback prescriptions rather than re-deriving the labels. Signal decomposition further isolates the role of host colors versus direct luminosity inversion, providing an internal control. No self-definitional loop, fitted parameter renamed as prediction, load-bearing self-citation, or smuggled ansatz appears in the methodology; the derivation chain remains self-contained against the simulation ground truth.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: The cosmological simulations (SIMBA, IllustrisTNG, EAGLE) produce realistic photometric signatures for over-massive and under-massive SMBH regimes under their respective sub-grid prescriptions.
- domain assumption: Forward modeling of simulation outputs into LSST bands introduces no systematic biases that would alter the relative ordering of growth regimes.