arXiv: 2602.07651 · v2 · submitted 2026-02-07 · 🌌 astro-ph.CO · astro-ph.GA

Recognition: 2 theorem links

Cosmology with one galaxy: An analytic formula relating Ω_m with galaxy properties

Kito Liao, Francisco Villaescusa-Navarro, Romain Teyssier, Natalí S. M. de Santi

Pith reviewed 2026-05-16 06:04 UTC · model grok-4.3

keywords: Ω_m · galaxy properties · symbolic regression · cosmological inference · hydrodynamical simulations · baryonic retention

The pith

A compact analytic formula recovers the cosmic matter density Ω_m from the properties of a single galaxy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper derives an analytic expression that directly connects the matter density parameter Ω_m to observable galaxy properties such as stellar mass, gas content, and gravitational potential depth. This relation emerges from symbolic regression applied to hydrodynamical simulations and remains consistent across independent simulation suites after modest coefficient adjustments. A sympathetic reader would care because it implies cosmological information can be read from individual galaxies rather than requiring statistical measurements of galaxy clustering or halo abundances. The result supplies a physically motivated bridge between baryon retention inside galaxies and the large-scale cosmic density.

Core claim

Symbolic regression on the CAMELS hydrodynamical simulations yields a compact functional form that recovers Ω_m from intrinsic galaxy-scale observables. The expression admits a transparent interpretation in terms of baryonic retention and enrichment efficiency regulated by gravitational potential depth, and it generalizes across multiple simulation codes (IllustrisTNG, ASTRID, SIMBA, Swift-EAGLE) with only small recalibrations of coefficients.
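
The phrase "same functional form, recalibrated coefficients" can be illustrated with a minimal sketch. The paper's actual expression is not reproduced here, so the linear form below, the feature names `x1` and `x2`, the suite labels, and all numbers are hypothetical stand-ins on synthetic data. The point is only the workflow: hold the form fixed and refit its coefficients separately for each suite.

```python
import numpy as np

def design_matrix(x1, x2):
    # Hypothetical fixed form: omega_m ~ a + b*x1 + c*x2 (NOT the paper's formula).
    return np.column_stack([np.ones_like(x1), x1, x2])

def recalibrate(x1, x2, omega_m):
    # Refit only the coefficients (a, b, c) by ordinary least squares.
    coeffs, *_ = np.linalg.lstsq(design_matrix(x1, x2), omega_m, rcond=None)
    return coeffs

def predict(coeffs, x1, x2):
    # Evaluate the fixed form with a given set of coefficients.
    return design_matrix(x1, x2) @ coeffs

# Synthetic "suites" with slightly different true coefficients (illustrative only).
rng = np.random.default_rng(0)
true_coeffs = {"suite_A": (0.30, 0.10, -0.05), "suite_B": (0.28, 0.12, -0.04)}
fitted = {}
for name, (a, b, c) in true_coeffs.items():
    x1 = rng.normal(size=500)
    x2 = rng.normal(size=500)
    omega_m = a + b * x1 + c * x2 + rng.normal(scale=0.01, size=500)
    fitted[name] = recalibrate(x1, x2, omega_m)

for name, coeffs in fitted.items():
    print(name, np.round(coeffs, 3))
```

In this toy setup the shared structure lives in `design_matrix`, while everything suite-specific is confined to the fitted coefficient vector.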

What carries the argument

A symbolic-regression-derived analytic expression linking Ω_m to galaxy baryonic content and gravitational potential depth.

If this is right

Cosmological parameters can be inferred from single galaxies without population statistics or clustering measurements.
The relation holds across different hydrodynamical codes after limited recalibration of coefficients.
It supplies a direct physical account of why Ω_m appears in local galaxy properties through baryon retention regulated by potential depth.
The approach bypasses traditional large-scale structure probes and creates a new synergy between galaxy formation modeling and precision cosmology.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Observational catalogs from surveys such as SDSS or DESI could be used to test whether the simulated relation produces consistent Ω_m values.
At high redshift the method might provide cosmological constraints where clustering measurements are observationally difficult.
Discrepancies between the formula and real data could directly constrain the strength of feedback processes in galaxy formation models.

Load-bearing premise

The functional form found in the simulations correctly describes galaxies in the real universe rather than depending on the specific subgrid physics of the training runs.

What would settle it

Apply the formula to a real galaxy with well-measured stellar mass, gas fraction, and metallicity. If the predicted Ω_m deviates substantially from the value measured independently by CMB or supernova surveys, the relation does not transfer from the simulations to the real universe.
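
One possible way to quantify a substantial deviation is a simple Gaussian tension statistic. The sketch below assumes independent, Gaussian uncertainties on both values. The function name `tension_sigma` and the numbers in the example are illustrative assumptions, not values from the paper or from any survey.

```python
import math

def tension_sigma(value_a, err_a, value_b, err_b):
    # Difference between two measurements in units of their combined
    # 1-sigma uncertainty, assuming independent Gaussian errors.
    return abs(value_a - value_b) / math.hypot(err_a, err_b)

# Hypothetical numbers: a single-galaxy estimate versus an independent anchor.
t = tension_sigma(0.40, 0.04, 0.31, 0.03)
print(f"tension = {t:.2f} sigma")
```

How large a tension counts as "substantial" is a judgment call that the source does not specify.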

Original abstract

Standard cosmological analyses typically treat galaxy formation and cosmological parameter inference as decoupled problems, relying on population-level statistics such as clustering, lensing, or halo abundances. However, classical studies of baryon fractions in massive galaxy clusters have long suggested that gravitationally bound systems may retain cosmological information through their baryonic content. Building on this insight, we present the first analytic and physically interpretable cosmological tracer that links the matter density parameter, $\Omega_m$, directly to intrinsic galaxy-scale observables, demonstrating that cosmological information can be extracted from individual galaxies. Using symbolic regression applied to state-of-the-art hydrodynamical simulations from the CAMELS project, we identify a compact functional form that robustly recovers $\Omega_m$ across multiple simulation suites (IllustrisTNG, ASTRID, SIMBA, and Swift-EAGLE), requiring only modest recalibration of a small number of coefficients. The resulting expression admits a transparent physical interpretation in terms of baryonic retention and enrichment efficiency regulated by gravitational potential depth, providing a clear explanation for why $\Omega_m$ is locally encoded in galaxy properties. Our work establishes a direct, interpretable bridge between small-scale galaxy physics and large-scale cosmology, opening a complementary pathway to cosmological inference that bypasses traditional clustering-based statistics and enables new synergies between galaxy formation theory and precision cosmology.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Symbolic regression on CAMELS hydro runs produces a compact analytic formula that recovers input Ω_m from single-galaxy observables across four codes after small coefficient tweaks, but the result is still a fitted mapping whose reach outside those specific simulations is untested.

The main result is a short analytic expression, discovered by symbolic regression, that combines a few galaxy properties to recover the matter density parameter used in the CAMELS simulations. The same basic form works in IllustrisTNG, SIMBA, ASTRID and Swift-EAGLE once a couple of coefficients are adjusted, and the authors supply a physical reading in terms of how potential depth controls baryon retention and enrichment efficiency. That cross-suite consistency and the move to an explicit, interpretable formula are the clearest advances over earlier black-box approaches. The work is therefore new in giving a concrete, physically motivated bridge between one galaxy and Ω_m rather than relying on clustering or halo statistics. The execution looks careful enough on the simulation side to merit checking the details.

The obvious limitation is that every training run shares the same family of subgrid prescriptions for star formation and feedback. Even with the recalibration, the relation could still be capturing how those particular models respond to changes in background density instead of a more universal galaxy-scale feature. The abstract gives no quantitative error analysis or overfitting checks, and there is no comparison to real observations with independent Ω_m anchors. Until those steps are done, the formula remains a successful fit to the training data rather than a ready-to-use cosmological probe.

This paper is for people working at the galaxy-formation and cosmology boundary who want to explore whether small-scale observables can carry cosmological information. A reader looking for an alternative to traditional large-scale structure methods would get value from seeing the idea laid out explicitly. I would send it to peer review because the cross-code robustness already demonstrated makes the claim worth a full technical evaluation rather than a desk rejection.

Referee Report

3 major / 2 minor

Summary. The paper claims to derive, via symbolic regression on CAMELS hydrodynamical simulations, a compact analytic formula that directly relates the matter density parameter Ω_m to intrinsic galaxy-scale observables (e.g., stellar mass, metallicity, and star-formation rate). The authors report that this expression recovers Ω_m robustly across four simulation suites (IllustrisTNG, SIMBA, ASTRID, Swift-EAGLE) after only modest coefficient recalibration and admits a physical interpretation in terms of baryonic retention efficiency regulated by gravitational potential depth.

Significance. If the claimed relation proves robust to variations in subgrid physics and generalizes to observational data, the result would be significant: it would establish a direct, analytic bridge between single-galaxy properties and cosmology, bypassing traditional clustering or abundance statistics and enabling new synergies between galaxy-formation modeling and parameter inference. The analytic, interpretable form is a clear strength relative to black-box machine-learning approaches.

major comments (3)

[§3.2] (symbolic regression procedure): the functional form is obtained by fitting to simulation outputs in which Ω_m is an explicit input parameter; the reported 'recovery' therefore amounts to learning a mapping from the simulation's response to Ω_m rather than an independent derivation, and the manuscript must demonstrate that the expression encodes cosmological information beyond the specific subgrid implementations used in CAMELS.
[§4.3] (cross-suite validation): while modest recalibration of coefficients yields acceptable recovery across suites, the manuscript provides no quantitative assessment of residual scatter, systematic offsets, or sensitivity to the precise range of Ω_m varied in the training set; without these metrics it is unclear whether the formula remains predictive outside the narrow CAMELS prior.
[§5.1] (physical interpretation): the link to gravitational potential depth and baryonic retention is presented post-hoc; an a-priori test (e.g., controlled runs varying only the potential depth while holding subgrid parameters fixed) is needed to show that the relation is not an artifact of how the CAMELS feedback models respond to background density.

minor comments (2)

[Abstract] The abstract asserts this is the 'first' such analytic tracer; a brief discussion of earlier analytic work on baryon fractions in clusters (e.g., references to Kravtsov et al. or related studies) would place the novelty claim in context.
[Figure 3] The recovery plots lack error bars on the inferred Ω_m values and do not show the distribution of residuals as a function of galaxy mass or environment.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the constructive and detailed report. We address each major comment below and have revised the manuscript to incorporate additional quantitative analysis and clarification where feasible.

Referee: [§3.2] (symbolic regression procedure): the functional form is obtained by fitting to simulation outputs in which Ω_m is an explicit input parameter; the reported 'recovery' therefore amounts to learning a mapping from the simulation's response to Ω_m rather than an independent derivation, and the manuscript must demonstrate that the expression encodes cosmological information beyond the specific subgrid implementations used in CAMELS.

Authors: We agree that the symbolic regression identifies a mapping learned from the simulation outputs in which Ω_m is varied as an input. The manuscript's central claim, however, rests on the fact that the identical functional form recovers Ω_m across four independent simulation suites that employ substantially different subgrid physics implementations. We have revised §3.2 to include an expanded discussion of this cross-suite robustness, emphasizing that the persistence of the same analytic expression (with only coefficient recalibration) across IllustrisTNG, SIMBA, ASTRID, and Swift-EAGLE indicates that the relation encodes cosmological dependence rather than being an artifact of any single subgrid model.
revision: yes

Referee: [§4.3] (cross-suite validation): while modest recalibration of coefficients yields acceptable recovery across suites, the manuscript provides no quantitative assessment of residual scatter, systematic offsets, or sensitivity to the precise range of Ω_m varied in the training set; without these metrics it is unclear whether the formula remains predictive outside the narrow CAMELS prior.

Authors: We acknowledge the absence of these quantitative metrics in the original submission. In the revised manuscript we have added a new table and accompanying text in §4.3 that report the root-mean-square error, mean bias, and scatter for each suite after recalibration, together with an explicit assessment of performance across the CAMELS Ω_m range (0.1–0.5). We also discuss the expected degradation for values outside this prior and note the limitations for extrapolation.
revision: yes
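
The three metrics named in this response could be computed along the following lines. This is a minimal sketch, not the authors' code. The function name `recovery_metrics` is hypothetical, "scatter" is assumed to mean the standard deviation of the residuals about their mean, and the example values are illustrative only.

```python
import numpy as np

def recovery_metrics(omega_true, omega_pred):
    # Residuals are defined as predicted minus true Omega_m.
    omega_true = np.asarray(omega_true, dtype=float)
    omega_pred = np.asarray(omega_pred, dtype=float)
    residual = omega_pred - omega_true
    return {
        "rmse": float(np.sqrt(np.mean(residual**2))),  # root-mean-square error
        "bias": float(np.mean(residual)),              # mean systematic offset
        "scatter": float(np.std(residual)),            # spread about the bias (assumed definition)
    }

# Illustrative values spanning the quoted CAMELS range, with a constant offset.
m = recovery_metrics([0.1, 0.3, 0.5], [0.12, 0.32, 0.52])
print(m)
```

With these definitions the squared RMSE equals the squared bias plus the squared scatter, so a table reporting all three lets a reader separate a systematic offset from random error.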

Referee: [§5.1] (physical interpretation): the link to gravitational potential depth and baryonic retention is presented post-hoc; an a-priori test (e.g., controlled runs varying only the potential depth while holding subgrid parameters fixed) is needed to show that the relation is not an artifact of how the CAMELS feedback models respond to background density.

Authors: The physical interpretation was developed after the functional form was identified. We have expanded §5.1 to provide additional supporting evidence drawn from correlations between the expression terms and halo properties already present in the CAMELS data, as well as references to the existing literature on baryonic retention. However, performing new controlled hydrodynamical runs that vary only gravitational potential depth while holding all subgrid parameters fixed lies outside the scope of the present study.
revision: partial

standing simulated objections not resolved

Request for new controlled hydrodynamical simulations varying only gravitational potential depth while holding subgrid parameters fixed

Circularity Check

1 step flagged

Symbolic regression formula fitted to CAMELS inputs presented as independent analytic tracer

specific steps

fitted input called prediction [Abstract]
"Using symbolic regression applied to state-of-the-art hydrodynamical simulations from the CAMELS project, we identify a compact functional form that robustly recovers Ω_m across multiple simulation suites (IllustrisTNG, ASTRID, SIMBA, and Swift-EAGLE), requiring only modest recalibration of a small number of coefficients."

The functional form is obtained by symbolic regression on simulation data where Ω_m is a known input; the reported 'recovery' of Ω_m from galaxy observables is therefore the output of the same fitting procedure applied to the training inputs rather than an independent prediction or derivation.

full rationale

The paper's derivation chain consists of applying symbolic regression to hydrodynamical simulation outputs (CAMELS suites) in which Ω_m is a controlled input parameter, then reporting that the resulting compact functional form 'recovers' Ω_m from galaxy properties. This matches the fitted_input_called_prediction pattern: the expression is discovered by optimizing against the known simulation inputs, so the reported recovery on the same or closely related data is a direct mapping by construction. Cross-suite consistency after coefficient recalibration is noted but does not remove the dependence on the original fitting process. No self-definitional equations, load-bearing self-citations, or imported uniqueness theorems appear in the provided text; the central claim remains a post-hoc fit rather than a first-principles derivation.
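
One common guard against the pattern described above is to evaluate the fitted expression only on simulations that were held out of the fit. Galaxies from the same simulation share one Ω_m value, so the split has to be made per simulation rather than per galaxy. The sketch below is a generic grouped split, not the paper's procedure. The function name `grouped_split`, the `sim_ids` array, and the sizes are hypothetical.

```python
import numpy as np

def grouped_split(sim_ids, test_fraction=0.2, seed=0):
    # Assign whole simulations to train or test so that no simulation
    # (and hence no single Omega_m realization) appears in both sets.
    sim_ids = np.asarray(sim_ids)
    unique_ids = np.unique(sim_ids)
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(unique_ids)
    n_test = max(1, int(round(test_fraction * len(unique_ids))))
    test_mask = np.isin(sim_ids, shuffled[:n_test])
    return ~test_mask, test_mask

# Hypothetical catalog: 10 simulations with 50 galaxies each.
sim_ids = np.repeat(np.arange(10), 50)
train_mask, test_mask = grouped_split(sim_ids, test_fraction=0.2, seed=1)
print(train_mask.sum(), test_mask.sum())
```

Recovery reported on the `test_mask` galaxies would then be an out-of-sample check within a suite, although it would still not address whether the relation transfers to real galaxies.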

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the fidelity of the CAMELS hydrodynamical simulations and the assumption that symbolic regression identifies a generalizable rather than simulation-specific relation; limited information is available from the abstract alone.

free parameters (1)

coefficients in the analytic formula
Small number of coefficients that require modest recalibration across simulation suites.

axioms (1)

domain assumption: Hydrodynamical simulations accurately capture the baryonic processes that encode cosmological information in galaxy properties.
The relation is derived from and validated within these simulations.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage: "We present the first analytic ... Ω_m directly to intrinsic galaxy-scale observables"

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: "ln[σ(Z⋆ Score / f⋆ k) + c] − a R_compact"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.