A Halo Merger Tree Generation and Evaluation Framework
Pith reviewed 2026-05-25 18:27 UTC · model grok-4.3
The pith
A generative adversarial network can generate realistic halo merger trees by learning from simulation matrices.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Halo merger tree construction can be treated as a matrix generation problem. A generative adversarial network trained on merger trees from the EAGLE simulation suite produces new trees that exhibit realistic features, specifically the absence of drastic changes in halo mass or jumps in physical locations.
What carries the argument
A generative adversarial network trained to output matrices that represent halo merger trees and to match their statistical properties from simulation data.
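To make the matrix framing concrete, here is a minimal sketch of one way a small tree could be written as a matrix. The layout (rows as snapshots, columns as branches, log10 mass as the entry, 0.0 for "no halo") and the names `encode_tree` and `EMPTY` are assumptions made for illustration; the paper's actual encoding is not described in this summary and may differ.

```python
import math

EMPTY = 0.0  # assumed placeholder for "no halo in this branch at this snapshot"

def encode_tree(branches, n_snapshots):
    """Encode branches as an (n_snapshots x n_branches) matrix of log10 masses.

    `branches` is a list of dicts mapping snapshot index -> halo mass
    (hypothetical input format, masses in solar masses).
    """
    matrix = [[EMPTY] * len(branches) for _ in range(n_snapshots)]
    for col, branch in enumerate(branches):
        for snap, mass in branch.items():
            matrix[snap][col] = math.log10(mass)
    return matrix

# Toy tree: a main branch present at all 4 snapshots, and a secondary branch
# that exists at snapshots 0-1 and is absent afterwards (merged into the main one).
main = {0: 1e10, 1: 2e10, 2: 5e10, 3: 6e10}
secondary = {0: 1e9, 1: 2e9}
tree_matrix = encode_tree([main, secondary], n_snapshots=4)
for row in tree_matrix:
    print(["%.2f" % value for value in row])
```

A fixed-size matrix like this is the kind of object a convolutional generator can emit directly, which is presumably why the framing is attractive; the price is that tree validity is no longer guaranteed by construction.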
If this is right
- Semi-analytic models gain access to large numbers of realistic merger trees without the expense of full simulations.
- The generated trees maintain smooth mass evolution and spatial continuity matching the training data.
- The framework generates trees at modest computational cost compared with running full simulations.
- Quality can be assessed by direct statistical comparison to EAGLE merger trees.
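One simple form such a statistical comparison could take is a two-sample Kolmogorov-Smirnov statistic on a summary quantity such as final halo mass. The sketch below is a plain-Python illustration, not the evaluation used in the paper; the function name `ks_statistic` and the toy log-mass values are hypothetical.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    max_gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance past every value equal to x in both samples (handles ties).
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        max_gap = max(max_gap, abs(i / len(a) - j / len(b)))
    # Once one sample is exhausted its CDF is 1, so the gap can only shrink.
    return max_gap

# Toy log10 halo masses (made-up numbers, for illustration only).
reference_masses = [10.0, 10.5, 11.0, 11.5]
generated_masses = [10.1, 10.4, 11.2, 11.4]
ks = ks_statistic(reference_masses, generated_masses)
print("KS statistic:", ks)
```

A value near 0 means the two empirical distributions are close; a value near 1 means they barely overlap. A single summary statistic cannot establish realism on its own, so it would need to be paired with checks on the tree structure itself.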
Where Pith is reading between the lines
- If the network has learned the underlying distribution, the same approach could produce trees for cosmologies or initial conditions different from those in the training set.
- The matrix representation might allow the method to be applied to other tree-structured data in astrophysics, such as galaxy assembly histories.
- Testing the generated trees against hydrodynamic simulations run with different codes would reveal whether the captured features are universal or tied to EAGLE specifics.
Load-bearing premise
The statistical features of halo merger trees can be captured by a GAN trained only on simulation outputs without explicit physical constraints or validation on other simulations.
What would settle it
Generate trees with the trained network and compare them to merger trees from an independent simulation suite; the claim fails if the generated trees show frequent unphysical mass jumps or position discontinuities absent in the independent set.
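A check of this kind can be sketched in a few lines. The code below flags snapshots along a single branch where the mass or the position changes abruptly. The thresholds (0.5 dex in mass, 1.0 length unit in position), the input format, and the name `find_jumps` are illustrative assumptions, and the distance ignores periodic box wrapping.

```python
import math

def find_jumps(branch, max_dlog_mass=0.5, max_step=1.0):
    """Return snapshot indices where mass or position changes too abruptly.

    `branch` is a list of (mass, (x, y, z)) tuples ordered by snapshot.
    Both thresholds are assumed values, not taken from the paper.
    """
    flagged = []
    for k in range(1, len(branch)):
        (m0, p0), (m1, p1) = branch[k - 1], branch[k]
        dlog_mass = abs(math.log10(m1) - math.log10(m0))
        step = math.dist(p0, p1)  # Euclidean distance, no periodic wrapping
        if dlog_mass > max_dlog_mass or step > max_step:
            flagged.append(k)
    return flagged

# Toy branch: a mass jump at snapshot 2 and a position jump at snapshot 3.
branch = [
    (1.0e10, (0.0, 0.0, 0.0)),
    (1.5e10, (0.1, 0.0, 0.0)),
    (9.0e10, (0.2, 0.0, 0.0)),
    (1.0e11, (5.0, 0.0, 0.0)),
]
print(find_jumps(branch))  # → [2, 3]
```

Running the same function over generated trees and over trees from an independent simulation, then comparing the fraction of flagged snapshots, would turn the test described above into a number.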
Original abstract
Semi-analytic models are best suited to compare galaxy formation and evolution theories with observations. These models rely heavily on halo merger trees, and their realistic features (i.e., no drastic changes on halo mass or jumps on physical locations). Our aim is to provide a new framework for halo merger tree generation that takes advantage of the results of large volume simulations, with a modest computational cost. We treat halo merger tree construction as a matrix generation problem, and propose a Generative Adversarial Network that learns to generate realistic halo merger trees. We evaluate our proposal on merger trees from the EAGLE simulation suite, and show the quality of the generated trees.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes treating halo merger tree construction as an unsupervised matrix generation task and introduces a Generative Adversarial Network trained on merger trees extracted from the EAGLE simulation suite; the central claim is that the resulting synthetic trees reproduce realistic features such as smooth mass evolution and spatial continuity at modest computational cost for use in semi-analytic models.
Significance. If the generated trees satisfy the necessary physical validity constraints and match independent simulation statistics, the approach would supply a fast, data-driven alternative to full N-body runs for populating merger trees in galaxy formation modeling, enabling larger parameter explorations than currently feasible.
major comments (3)
- [Abstract] Abstract: the claim that 'we evaluate our proposal on merger trees from the EAGLE simulation suite, and show the quality of the generated trees' is unsupported by any reported metrics, baselines, or quantitative comparison; without these the central assertion of realism cannot be assessed.
- [Method] Method section (matrix-generation framing): the architecture is described as learning from EAGLE matrices without explicit loss terms, architectural constraints, or post-processing to enforce tree invariants such as parent mass >= sum of child masses, consistent ancestry across time steps, or bounded position jumps; this omission leaves open the possibility that discriminator matching of aggregate statistics alone permits invalid configurations.
- [Evaluation] Evaluation: no cross-validation against an independent simulation suite (e.g., Illustris or Bolshoi) is described, so it remains unclear whether the GAN has learned simulation-specific artifacts rather than universal merger-tree statistics.
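The invariants named in the second comment can be checked mechanically. The sketch below covers two of them, ancestry consistency and the mass inequality, on a toy tree. It reads "parent" as the descendant halo and "children" as its progenitors, which is an interpretation; the dictionary layout and the name `check_invariants` are hypothetical.

```python
from collections import defaultdict

def check_invariants(halos):
    """Return a list of human-readable violations for a toy merger tree.

    `halos` maps halo id -> dict with keys "snap", "mass", "descendant"
    (descendant is a halo id, or None for the final halo).
    """
    violations = []
    progenitors = defaultdict(list)
    for hid, halo in halos.items():
        desc = halo["descendant"]
        if desc is None:
            continue
        if desc not in halos:
            violations.append(f"{hid}: descendant {desc} is missing")
            continue
        if halos[desc]["snap"] <= halo["snap"]:
            violations.append(f"{hid}: descendant {desc} is not at a later snapshot")
        progenitors[desc].append(hid)
    # Mass check: each descendant should be at least as massive as its progenitors combined.
    for desc, kids in progenitors.items():
        total = sum(halos[k]["mass"] for k in kids)
        if halos[desc]["mass"] < total:
            violations.append(f"{desc}: mass below sum of progenitor masses")
    return violations

valid = {
    "a": {"snap": 0, "mass": 1.0e10, "descendant": "c"},
    "b": {"snap": 0, "mass": 3.0e9, "descendant": "c"},
    "c": {"snap": 1, "mass": 1.5e10, "descendant": None},
}
broken = dict(valid, c={"snap": 1, "mass": 1.0e10, "descendant": None})
print(check_invariants(valid))
print(check_invariants(broken))
```

Reporting the fraction of generated trees that pass such a validator, alongside the same fraction for the EAGLE training trees, would show whether adversarial training alone yields valid configurations.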
minor comments (2)
- [Abstract] The abstract contains minor grammatical issues ('changes on halo mass' should read 'changes in halo mass'; 'jumps on physical locations' should read 'jumps in physical locations').
- Notation for the matrix representation of trees is introduced without an explicit diagram or example showing how ancestry and mass accretion are encoded.
Simulated Author's Rebuttal
Thank you for the opportunity to respond to the referee's comments. We address each major comment point-by-point below, indicating where revisions will be made to the manuscript.
- Referee: [Abstract] Abstract: the claim that 'we evaluate our proposal on merger trees from the EAGLE simulation suite, and show the quality of the generated trees' is unsupported by any reported metrics, baselines, or quantitative comparison; without these the central assertion of realism cannot be assessed.
  Authors: We agree with this assessment. The abstract is overly general and does not specify the evaluation approach. In the revised version, we will modify the abstract to include references to the quantitative metrics, visual comparisons, and any baselines used in the evaluation section to support the claim of quality.
  Revision: yes
- Referee: [Method] Method section (matrix-generation framing): the architecture is described as learning from EAGLE matrices without explicit loss terms, architectural constraints, or post-processing to enforce tree invariants such as parent mass >= sum of child masses, consistent ancestry across time steps, or bounded position jumps; this omission leaves open the possibility that discriminator matching of aggregate statistics alone permits invalid configurations.
  Authors: This is a valid concern. The manuscript as currently written does not detail any explicit enforcement mechanisms beyond the data-driven learning. To address this, we will revise the method section to include a post-processing step that enforces the key tree invariants (mass conservation, ancestry consistency, and position continuity) after generation, and discuss how this ensures physical validity.
  Revision: yes
- Referee: [Evaluation] Evaluation: no cross-validation against an independent simulation suite (e.g., Illustris or Bolshoi) is described, so it remains unclear whether the GAN has learned simulation-specific artifacts rather than universal merger-tree statistics.
  Authors: We acknowledge that testing on an independent simulation suite would provide stronger evidence of generalizability. The current evaluation is limited to EAGLE. In the revision, we will add a section discussing potential simulation-specific features and compare key statistics (such as merger rates) with published results from other simulations like Illustris where available, to partially address this.
  Revision: partial
Circularity Check
No significant circularity; framework trains on external EAGLE data
The paper frames halo merger tree generation as an unsupervised matrix-generation task solved by a standard GAN trained directly on merger trees extracted from the independent EAGLE simulation suite. No equations, loss terms, or evaluation metrics are shown to reduce the generated outputs to quantities defined by the model's own fitted parameters. The central claim rests on the external training data and conventional adversarial training, not on a self-definitional step, a fitted input renamed as a prediction, or a load-bearing self-citation. The derivation chain is therefore self-contained against external benchmarks.
A. Details of the GAN Architecture
We selected a convolutional architecture after first evaluating a fully connected (FC) architecture with different numbers of layers. Although the FC-based GAN successfully learned the merger tree structure, it failed to reproduce the correct mass range of the halos in branches different ...