pith. machine review for the scientific record.

arxiv: 2604.24863 · v1 · submitted 2026-04-27 · 🌌 astro-ph.CO · astro-ph.GA

Bound or blown: the fate of hot gas in galaxy groups

Pith reviewed 2026-05-08 01:32 UTC · model grok-4.3

classification 🌌 astro-ph.CO astro-ph.GA
keywords galaxy groups · AGN feedback · X-ray scaling relations · hydrodynamical simulations · hot gas content · XMM-Newton observations · selection function modeling

The pith

Intermediate AGN feedback strengths match the hot gas properties of galaxy groups observed by XMM-Newton, while the strongest ejection models do not.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests how different strengths of AGN feedback affect the amount of hot gas retained in galaxy groups by comparing real X-ray data to a set of simulations. It builds realistic mock observations that include the same detection limits and measurement effects as the actual survey, then checks multiple properties such as luminosity-temperature and gas-mass-temperature relations at once. Intermediate feedback levels reproduce the observed relations with little tension, but models that eject far more gas than usual are clearly inconsistent with the data. This approach shows that the thermodynamic state of gas in these systems can discriminate among feedback prescriptions when selection effects are properly modeled.

Core claim

By generating end-to-end XMM-Newton mock observations from FLAMINGO hydrodynamical simulations that vary AGN feedback strength, the analysis finds that the normalization of the scaling relations provides the strongest test. The fgas-2sigma model yields the lowest overall tension of 0.8 sigma with the X-GAP sample, whereas the fgas-8sigma model is excluded at more than 4 sigma. Number counts fluctuate by more than 20 percent due to cosmic variance and are therefore a weaker discriminator than the relations themselves.
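To make "normalization of the scaling relations" concrete, here is a minimal sketch of fitting the normalization of a power-law relation at fixed slope in log space. The function name, the pivot temperature, the slope, and the toy data are all illustrative assumptions; the paper's actual fitting procedure is not described in this review.

```python
import numpy as np

def fit_normalisation(temp_kev, lum, slope, pivot_kev=1.0):
    """Log-normalisation of L = A * (T / pivot)^slope with the slope held fixed.

    Hypothetical helper: returns log10(A) and its standard error, estimated
    from the mean and scatter of the residuals in log space.
    """
    log_t = np.log10(np.asarray(temp_kev, dtype=float) / pivot_kev)
    log_l = np.log10(np.asarray(lum, dtype=float))
    resid = log_l - slope * log_t          # what is left once the slope is removed
    log_a = resid.mean()                   # least-squares normalisation at fixed slope
    err = resid.std(ddof=1) / np.sqrt(resid.size)
    return log_a, err

# Toy data only: invented normalisation (42.5), slope (3.0) and scatter (0.2 dex).
rng = np.random.default_rng(0)
toy_temp = rng.uniform(0.7, 2.0, size=50)
toy_lum = 10 ** (42.5 + 3.0 * np.log10(toy_temp) + rng.normal(0.0, 0.2, size=50))
log_a, err = fit_normalisation(toy_temp, toy_lum, slope=3.0)
print(f"log10(A) = {log_a:.2f} +/- {err:.2f}")
```

Holding the slope fixed isolates an overall offset between data and model, which is the quantity the review identifies as the strongest discriminator.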

What carries the argument

Forward modeling of the full X-GAP selection function, detection thresholds, and observational systematics applied to simulated groups, producing mock X-ray images and catalogs analyzed identically to the real data.

If this is right

  • Thermodynamic properties of galaxy groups favor feedback stronger than the fiducial FLAMINGO calibration but rule out the most ejective scenarios.
  • The normalization of L-T and Mgas-T relations serves as the primary discriminator between feedback models.
  • Cosmic variance causes greater than 20 percent fluctuations in the number of detected groups, weakening counts as a standalone test.
  • Multi-observable constraints combined with forward modeling are required to probe the fate of hot baryons in low-mass halos.
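The cosmic-variance point above can be illustrated with a short sketch that measures the fractional scatter in the number of selected groups across independent mock light cones and compares it with the Poisson-only expectation. The function name and the toy counts are invented for illustration and are not taken from the paper.

```python
import numpy as np

def count_scatter(counts):
    """Return (fractional scatter, Poisson-only expectation) for group counts.

    `counts` holds the number of selected groups in each mock light cone.
    """
    counts = np.asarray(counts, dtype=float)
    mean = counts.mean()
    frac = counts.std(ddof=1) / mean       # measured realisation-to-realisation scatter
    poisson = 1.0 / np.sqrt(mean)          # shot-noise-only expectation
    return frac, poisson

# Invented counts for six hypothetical light cones.
toy_counts = [40, 52, 47, 61, 38, 55]
frac, poisson = count_scatter(toy_counts)
print(f"fractional scatter = {frac:.2f}, Poisson-only = {poisson:.2f}")
```

A measured scatter well above the Poisson value would point to large-scale structure, rather than shot noise, as the driver of the fluctuations.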

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar forward-modeling techniques could tighten constraints on feedback when applied to next-generation X-ray surveys with larger group samples.
  • The retained hot gas fraction in groups may influence how baryons are distributed on larger scales in the cosmic web.
  • Future simulations could be calibrated directly against these multi-observable tensions to reduce uncertainty in AGN feedback prescriptions.

Load-bearing premise

The forward model accurately recovers input luminosities, gas masses, and core-excised temperatures for regular systems, enabling direct comparison in observable space.

What would settle it

A larger X-ray sample or deeper observations that show scaling-relation normalizations matching the fgas-8sigma simulation at high significance would falsify the preference for intermediate feedback.

Figures

Figures reproduced from arXiv: 2604.24863 by A. Finoguenov, B. D. Oppenheimer, D. Eckert, E. O'Sullivan, F. Gastaldello, G. Gozaliasl, H. Khalil, J. Braspenning, J. Schaye, K. Kolokythas, L. Lovisari, M. A. Bourne, M. Schaller, M. Sun, R. Santra, R. Seppi, Y. E. Bahar.

Figure 1. Expected properties of an X-GAP-like sample selected from different FLAMINGO models. The L1_m8 includes tests for cosmic variance (CV) and uncertainties on the selection function. The top panel shows the number of groups within an SDSS-like area of 7430 deg², the bottom one shows the median temperature of the selected sample. The latter is a promising discriminator between FLAMINGO models.
Figure 2. Gas fraction as a function of mass for the selected systems (solid lines and shaded areas) and the full sample (dashed lines). Both are true input quantities. The top panel shows the mass distribution of the selected systems: skewed to higher masses for strong feedback models. The bottom panel denotes the ratio between the gas fraction in the X-GAP-like selected sample and the whole population. While the L…
Figure 3. Workflow for the end-to-end XMM-Newton simulation of FLAMINGO galaxy groups down to the direct comparison with X-GAP.
Figure 4. (caption not available)
Figure 5. Observables used for the comparison between X-GAP and various FLAMINGO models: the normalisation of the scaling relation between X-ray luminosity and temperature (both core-excised), between gas mass within 400 kpc and core-excised temperature, the total number of groups, the mean temperature and galaxy member velocity dispersion. X-GAP shows the best agreement with the fgas-2σ model.
Original abstract

The impact of AGN feedback on the hot gas content of galaxy groups remains a key uncertainty in galaxy formation and its connection to the large scale structure of the Universe. We aim to compare the XMM-Newton Group AGN Project (X-GAP) sample to the hydrodynamical FLAMINGO simulations, which span a wide range of AGN feedback prescriptions. We construct X-GAP analogues by forward-modelling the full selection function, including detection and observational systematics, and generate end-to-end XMM-Newton mock observations analysed consistently with the data. We study multiple observables, including the L–T and Mgas–T relations, number of groups, mean temperature, and velocity dispersion, accounting for their covariance. The forward model accurately recovers input luminosities, gas masses, and core-excised temperatures for regular systems, enabling direct comparison in observable space. The normalisation of the scaling relations is the best discriminator between feedback models, while cosmic variance introduces > 20% fluctuations in the number of detected systems, making counts alone a weak discriminator. Models with intermediate feedback strength provide the best agreement with X-GAP, with the fgas-2sigma model yielding the lowest tension of only 0.8σ, while the most extreme feedback scenario (fgas-8sigma) is ruled out at > 4σ. Our results indicate that the thermodynamic properties of galaxy groups favour feedback stronger than the fiducial FLAMINGO calibration, but disfavour the most ejective models. This highlights the importance of combining forward modelling and multi-observable constraints to probe the fate of hot baryons in low-mass haloes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper compares the X-GAP sample of galaxy groups from XMM-Newton observations to FLAMINGO hydrodynamical simulations spanning a range of AGN feedback strengths. By forward-modeling the full selection function, generating end-to-end XMM mock observations analyzed identically to the data, and comparing multiple observables (L-T and Mgas-T scaling relations, group counts, mean temperature, velocity dispersion) while accounting for their covariance, the authors find that intermediate feedback models best match the observations. Specifically, the fgas-2sigma model yields the lowest tension (0.8σ), while the most extreme fgas-8sigma model is ruled out at >4σ. The work concludes that group thermodynamic properties favor feedback stronger than the fiducial FLAMINGO calibration but disfavor the most ejective scenarios.

Significance. If the central results hold, this provides important empirical constraints on AGN feedback efficiency in low-mass halos, directly addressing uncertainties in how baryons are ejected or retained and their effects on large-scale structure. The methodological approach of full forward modeling of selection effects combined with multi-observable covariance-aware comparison is a clear strength, enabling more robust discrimination between feedback variants than single-relation studies. It also quantifies the limited discriminating power of number counts due to cosmic variance (>20% fluctuations).

major comments (2)
  1. [Abstract and forward-modeling section] The claim that the forward model 'accurately recovers input luminosities, gas masses, and core-excised temperatures for regular systems' underpins the direct observable-space comparison and the reported tensions (0.8σ and >4σ). However, no quantitative validation is provided for non-regular/disturbed systems, and the fraction of the X-GAP sample (or simulated analogues) meeting the regularity criteria is not reported. Disturbed systems are common at group scales; any recovery biases in L, Mgas or T would shift scaling-relation normalizations and covariance matrices, altering which feedback model is preferred and the strength of the >4σ exclusion.
  2. [Results and tension calculation] Tension calculation and covariance treatment (implied in results section): The manuscript states that observables are compared 'accounting for their covariance,' but full details on covariance matrix construction, error propagation, and the exact tension metric are not provided. Since the central claim of ruling out extreme models at >4σ rests on this multi-observable statistic, insufficient documentation prevents full verification of the quoted significances.
minor comments (2)
  1. [Simulation descriptions] Clarify the precise parameter differences between the fgas-2sigma and fgas-8sigma variants (e.g., in a table of feedback parameters) to aid reproducibility.
  2. [Figures and methods] Figure captions and text should explicitly state the regularity criteria used in the recovery tests to allow readers to assess applicability to the full sample.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive report and positive assessment of the work's significance. We address each major comment below with the strongest honest response possible. Where the manuscript is incomplete, we agree revisions are needed and will incorporate the requested details.

Point-by-point responses
  1. Referee: [Abstract and forward-modeling section] The claim that the forward model 'accurately recovers input luminosities, gas masses, and core-excised temperatures for regular systems' lacks quantitative validation for non-regular/disturbed systems. The fraction of the X-GAP sample (or simulated analogues) meeting regularity criteria is not reported. Disturbed systems are common; biases could alter scaling relations and the reported tensions.

    Authors: We agree the manuscript does not report the fraction of regular systems in X-GAP or provide quantitative recovery tests for disturbed systems. The validation statement applies specifically to regular systems, which form the core of the X-GAP thermodynamic analysis. In the revised manuscript we will add a dedicated paragraph in the forward-modeling section stating the regularity criteria applied, the observed fraction of X-GAP groups meeting them, and recovery statistics from simulated disturbed analogues to quantify any residual biases in L, Mgas and T. revision: yes

  2. Referee: [Results and tension calculation] The manuscript states that observables are compared 'accounting for their covariance,' but full details on covariance matrix construction, error propagation, and the exact tension metric are not provided. This prevents verification of the quoted significances including the >4σ exclusion.

    Authors: We acknowledge that the covariance construction and tension metric are described only at a high level. The covariance matrix was built from the joint posterior of the simulated observables (including cosmic variance), and tension was evaluated via a multivariate chi-squared statistic. In the revised manuscript we will add an appendix that explicitly gives the covariance matrix elements, the error propagation procedure, and the precise tension formula used, allowing full reproduction of the 0.8σ and >4σ results. revision: yes
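As a rough illustration of the kind of statistic the simulated rebuttal describes, the sketch below estimates a covariance matrix from mock realisations, forms a multivariate chi-squared, and converts it to an equivalent two-sided Gaussian significance. The function name, the Gaussian approximation, and the toy inputs are assumptions made here; the manuscript's exact metric may differ.

```python
import numpy as np
from scipy import stats

def chi2_tension(observed, mock_samples):
    """Chi-squared tension between data and a model's mock realisations.

    observed:     (n_observables,) measured values
    mock_samples: (n_realisations, n_observables) values from the mocks
    Assumes the mock distribution is close to multivariate Gaussian.
    """
    mock_samples = np.asarray(mock_samples, dtype=float)
    diff = np.asarray(observed, dtype=float) - mock_samples.mean(axis=0)
    cov = np.atleast_2d(np.cov(mock_samples, rowvar=False))
    chi2 = float(diff @ np.linalg.solve(cov, diff))   # d^T C^-1 d without explicit inverse
    p_value = stats.chi2.sf(chi2, df=diff.size)       # survival function of chi-squared
    n_sigma = stats.norm.isf(p_value / 2.0)           # two-sided Gaussian equivalent
    return chi2, p_value, n_sigma

# Toy example: five uncorrelated observables, data offset by 0.5 in each.
rng = np.random.default_rng(1)
toy_mocks = rng.multivariate_normal(np.zeros(5), np.eye(5), size=500)
toy_observed = np.full(5, 0.5)
chi2, p_value, n_sigma = chi2_tension(toy_observed, toy_mocks)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}, tension = {n_sigma:.2f} sigma")
```

With a finite number of mocks the inverse covariance is biased, so a real analysis would likely need a correction or a direct empirical calibration of the statistic.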

Circularity Check

0 steps flagged

No significant circularity; external data-simulation comparison

Full rationale

The paper's derivation chain consists of forward-modeling the X-GAP selection function on FLAMINGO simulation variants, generating end-to-end XMM mocks, and comparing multi-observable statistics (L-T, Mgas-T, counts, etc.) with covariance to the independent X-GAP dataset. The abstract states that the forward model 'accurately recovers input luminosities, gas masses, and core-excised temperatures for regular systems,' which is presented as a supporting validation test rather than a definitional step. No equation or claim reduces a 'prediction' to a fitted parameter by construction, nor does any load-bearing premise rely on a self-citation chain or imported uniqueness theorem. The tension values (0.8sigma vs. >4sigma) emerge from direct observable-space comparison to external observations, satisfying the criterion of being self-contained against external benchmarks. No circular steps are identifiable from the provided text.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The claim rests on the accuracy of the FLAMINGO hydrodynamical simulations, the completeness of the X-GAP selection function, and the assumption that forward-modeled mocks faithfully reproduce observational systematics.

axioms (2)
  • standard math Standard Lambda-CDM cosmology and hydrodynamical equations govern the simulations
    Invoked throughout the FLAMINGO runs used for comparison
  • domain assumption The X-GAP sample selection function and observational systematics are fully captured by the forward model
    Central to enabling direct observable-space comparison

pith-pipeline@v0.9.0 · 5678 in / 1328 out tokens · 84622 ms · 2026-05-08T01:32:47.558679+00:00 · methodology


Reference graph

Works this paper leans on

7 extracted references · 1 canonical work pages

  1. [1]

    Abbott, T. M. C., Aguena, M., Alarcon, A., et al. 2022, Phys. Rev. D, 105, 023520
    Abril-Pla, O., Andreani, V., Carroll, C., et al. 2023, PeerJ Computer Science, 9
    Akino, D., Eckert, D., Okabe, N., et al. 2022, PASJ, 74, 175
    Alam, S., Albareti, F. D., Allende Prieto, C., et al. 2015, ApJS, 219, 12
    Aricò, G., Angulo, R. E., Zennaro, M., et al. 2023, A&A, 6...

  2. [2]

    Posterior values are reported from the third column onward, for each case labelled in the top row

    0.02±0.01 Notes. The symbol U(M,N) denotes a uniform prior between the values M and N. Posterior values are reported from the third column onward, for each case labelled in the top row. mate, but do not necessarily exactly reproduce, SDSS Petrosian magnitudes. At the low redshifts considered here (z<0.05), such differences are expected to be subdominant rel...

  3. [3]

    The result is presented in Fig. B.3. The selection bias follows a similar pattern: incompleteness at low masses and a down-scattered population at high masses caused by the upper radius cut. However, the effect is more pronounced for the L–T relation because gas fraction is less directly tied to the selection observable (X-ray flux). We measure deviatio...

  4. [4]

    In addition, a slight offset and a portion of the observed scatter may originate from the cylindrical correction (see Eq

    orapec (this work) does not impact the measurement of X-ray luminosity in our framework. In addition, a slight offset and a portion of the observed scatter may originate from the cylindrical correction (see Eq. 2). The correction is applied to the reference true luminosities, whereas the reconstructed values are derived independently from the mock p...

  5. [5]

    4.2) we also store the input temperature and input velocity dispersion

    Appendix C.1: Observables correlation. From the N-light cone generation procedure (explained in Sect. 4.2) we also store the input temperature and input velocity dispersion. Although these are not identical to the measured quantities, they provide a physically motivated baseline to quantify correlations between observables driven by halo population and s...

  6. [6]

    observable we compute a two-sided tail probability $p_j$ and define the summary statistic $S$ following the Fisher method such that: $p_j = 2 \times \min\left(F_j(x_j),\, 1 - F_j(x_j)\right)$, $S = -2 \sum_{j=1}^{5} \log p_j$. (C.1) Because the $p_j$ are correlated, we do not expect $S$ to necessarily follow the standard $\chi^2$ distribution. We ...

  7. [7]

    In contrast, the M_gas–T relation is more sensitive: an enhanced L_X would require lower M_gas to match the observations

    The L–T normalisation is largely unaffected, as luminosity drives the selection. In contrast, the M_gas–T relation is more sensitive: an enhanced L_X would require lower M_gas to match the observations. We recompute the input X-ray luminosities with pyXSIM by fixing the metallicity of gas particles within R_500c to 0.3 Z⊙ for haloes with M_500c > 5×10^12 M⊙. Th...
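The Fisher-type summary statistic quoted in extracted reference [6] can be sketched as follows: a two-sided tail probability per observable from the empirical distribution of the mocks, combined as S = -2 Σ log p_j, with the null distribution of S calibrated on the mocks themselves because the p_j are correlated. The function names, the half-count smoothing, and the leave-one-out calibration are assumptions made for this illustration, since the extract is truncated before the paper's own calibration is described.

```python
import numpy as np

def two_sided_p(x, mock_values):
    """p = 2 * min(F(x), 1 - F(x)) with F the empirical CDF of the mocks."""
    mock_values = np.asarray(mock_values, dtype=float)
    # Half-count smoothing (an assumption) keeps p > 0 so that log(p) is finite.
    f = (np.sum(mock_values <= x) + 0.5) / (mock_values.size + 1.0)
    return 2.0 * min(f, 1.0 - f)

def fisher_statistic(x, mocks):
    """S = -2 * sum_j log p_j for one vector of observables."""
    p = [two_sided_p(x[j], mocks[:, j]) for j in range(mocks.shape[1])]
    return -2.0 * float(np.sum(np.log(p)))

def calibrated_p_value(observed, mocks):
    """Empirical p-value of S, using leave-one-out mocks as the null sample."""
    mocks = np.asarray(mocks, dtype=float)
    s_obs = fisher_statistic(np.asarray(observed, dtype=float), mocks)
    s_null = np.array([
        fisher_statistic(mocks[i], np.delete(mocks, i, axis=0))
        for i in range(mocks.shape[0])
    ])
    p_value = (np.sum(s_null >= s_obs) + 1.0) / (s_null.size + 1.0)
    return s_obs, p_value

# Toy example with five correlated observables (invented covariance).
rng = np.random.default_rng(3)
toy_cov = 0.5 * np.eye(5) + 0.5
toy_mocks = rng.multivariate_normal(np.zeros(5), toy_cov, size=300)
s_obs, p_value = calibrated_p_value(np.full(5, 1.0), toy_mocks)
print(f"S = {s_obs:.2f}, calibrated p = {p_value:.3f}")
```

Calibrating on the mocks avoids assuming a chi-squared reference distribution, which only holds when the individual p-values are independent.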