pith. machine review for the scientific record.

arXiv: 2604.20941 · v1 · submitted 2026-04-22 · 🌌 astro-ph.CO · astro-ph.HE · gr-qc

Recognition: unknown

Interpretable Analytic Formulae for GWTC-4 Binary Black Hole Population Properties via Symbolic Regression


Pith reviewed 2026-05-09 22:41 UTC · model grok-4.3

classification 🌌 astro-ph.CO · astro-ph.HE · gr-qc
keywords binary black holes · gravitational waves · symbolic regression · merger rate · effective spin · mass ratio distribution · GWTC-4

The pith

Symbolic regression on GWTC-4 posteriors yields closed-form analytic expressions for binary black hole merger rates and spin-mass correlations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper applies symbolic regression directly to the posterior inference products from the GWTC-4 catalog to extract simple mathematical formulas that describe four main features of the binary black hole population. These include the change in merger rate with redshift, how the effective spin distribution depends on mass ratio and on redshift, and the mass ratio distributions that accompany the known peaks at 10 and 35 solar masses. The resulting expressions turn complex numerical models into transparent, differentiable laws that recover the low-redshift merger rate slope without any prior power-law assumption. A reader would care because the formulas supply exact derivatives and compact summaries that can be used for rapid calculations of rates and backgrounds without repeating the full numerical inference.

Core claim

Symbolic regression discovers compact closed-form analytic expressions for the merger-rate evolution with redshift, the mass-ratio dependence of the effective-spin distribution, the redshift evolution of the effective-spin distribution, and the conditional mass-ratio distributions associated with the 10 solar mass and 35 solar mass primary mass peaks. The method dynamically recovers a consistent low-redshift merger-rate slope without assuming an a priori power-law form. The exact analytic derivatives show that the mass-ratio–effective-spin and redshift–effective-spin correlations are driven by broadening of the posterior widths rather than shifts in the mean. Qualitatively distinct functional forms are identified for the mass-ratio distributions conditioned on the 10 solar mass and 35 solar mass primary mass peaks.
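The width-versus-mean statement can be made concrete with a short worked equation. This is only a sketch: it assumes the effective-spin distribution at fixed mass ratio is summarized by a mean and a width, as plotted in the paper's Figure 3. The Gaussian form and the bare symbols $\mu$ and $\sigma$ are illustrative assumptions, not the paper's stated model.

```latex
p(\chi_{\rm eff} \mid q) = \mathcal{N}\!\left(\chi_{\rm eff};\ \mu(q),\ \sigma(q)^2\right),
\qquad
\frac{d\mu}{dq} \approx 0,
\qquad
\frac{d\ln\sigma}{dq} = \frac{1}{\sigma(q)}\,\frac{d\sigma}{dq} \neq 0 .
```

Under this reading, a correlation "driven by broadening" means the mean gradient is consistent with zero while the logarithmic width gradient is not. Replacing $q$ with $z$ gives the corresponding statement for the redshift case.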

What carries the argument

Symbolic regression: a search for compact mathematical expressions that fit the posterior inference products of the GWTC-4 catalog.
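As a rough illustration of the idea, and not of the PySR search used in the paper, the sketch below enumerates a tiny hand-written library of candidate forms, fits two coefficients per candidate by linear least squares, and ranks candidates by error plus a complexity penalty. The candidate library, the complexity scores, the penalty weight, and the synthetic target are all assumptions made for this example.

```python
import math

# Hypothetical candidate library: (label, basis function g, complexity score).
# Every candidate has the form y = a + b * g(x), so a and b follow from
# ordinary linear least squares. Real symbolic regression (e.g. PySR) searches
# a far larger expression space with evolutionary operators; this is a toy.
CANDIDATES = [
    ("a + b*x", lambda x: x, 3),
    ("a + b*x**2", lambda x: x * x, 4),
    ("a + b*log(1+x)", lambda x: math.log1p(x), 5),
    ("a + b*exp(x)", lambda x: math.exp(x), 5),
]

def fit_linear(gs, ys):
    """Closed-form least squares for y = a + b*g."""
    n = len(gs)
    g_mean = sum(gs) / n
    y_mean = sum(ys) / n
    sgg = sum((g - g_mean) ** 2 for g in gs)
    sgy = sum((g - g_mean) * (y - y_mean) for g, y in zip(gs, ys))
    b = sgy / sgg
    a = y_mean - b * g_mean
    return a, b

def toy_symbolic_regression(xs, ys, penalty=1e-3):
    """Return (score, label, a, b) for the candidate with the lowest score."""
    results = []
    for label, g, complexity in CANDIDATES:
        gs = [g(x) for x in xs]
        a, b = fit_linear(gs, ys)
        mse = sum((a + b * gi - y) ** 2 for gi, y in zip(gs, ys)) / len(ys)
        # Score trades accuracy against expression complexity.
        results.append((mse + penalty * complexity, label, a, b))
    return min(results)

# Synthetic, noise-free target (an assumption, not a result from the paper).
xs = [0.05 * i for i in range(31)]            # grid on [0, 1.5]
ys = [2.0 + 3.0 * math.log1p(x) for x in xs]

score, label, a, b = toy_symbolic_regression(xs, ys)
print(label, round(a, 6), round(b, 6))
```

The complexity penalty is what keeps such a search from always preferring the most flexible form. In a real tool the trade-off is typically explored as a Pareto front of accuracy against complexity rather than through a single fixed weight.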

If this is right

  • The closed-form expressions supply exact analytic gradients for diagnostics of the population inference.
  • They serve as compact surrogate summaries for flexible numerical posteriors that lack low-dimensional analytic form.
  • The formulae enable rapid downstream calculations for rate forecasting, formation channel comparison, and stochastic background estimation.
  • Distinct functional forms appear for mass-ratio distributions conditioned on the 10 solar mass versus 35 solar mass primary mass peaks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • These analytic expressions could be compared directly against predictions from astrophysical formation simulations to test which channels dominate at different masses.
  • The same regression approach could be applied to future larger catalogs to track how the discovered functional forms evolve with improved statistics.
  • If the distinct mass-ratio forms at different mass peaks persist, they would suggest separate formation pathways for the two populations.

Load-bearing premise

The expressions found by symbolic regression reflect genuine underlying population properties rather than artifacts from the particular models or sampling choices used to generate the GWTC-4 posteriors.

What would settle it

New independent analyses or future catalogs that produce merger-rate slopes, spin-mass correlations, or conditional mass-ratio distributions that deviate from the derived analytic forms at high statistical significance would falsify the claim.

Figures

Figures reproduced from arXiv: 2604.20941 by Chayan Chatterjee.

Figure 1: BBH comoving merger rate $R(z)$ as a function of redshift. Shaded bands show the 90% credible intervals from the GWTC-4 PowerLawRedshift (blue) and BSplineIID (green) models. Dashed lines show the corresponding PySR symbolic regression median fits (orange), with PySR 90% credible bands derived from 200 draw-by-draw fits. The symbolic expressions faithfully capture both the median and the uncertainty structur…
Figure 2: Distributions of the peak redshift $z_{\rm peak}$ (left) and the low-z logarithmic slope $\gamma_0$ (right), extracted from 200 draw-by-draw PySR fits for the PowerLawRedshift (blue) and BSplineIID (orange) models. Because the PowerLawRedshift symbolic surrogates are overwhelmingly monotonic, they lack a true mathematical turnover. Consequently, their $z_{\rm peak}$ values pile up at the upper boundary of the evaluation grid (z = …
Figure 3: Effective spin distribution parameters as a function of mass ratio $q$. Top left: Mean effective spin $\mu_{\chi_{\rm eff}}(q)$ for both the Spline (blue solid) and Linear (orange dashed) GWTC-4 models, with PySR symbolic fits overlaid. Top right: Width $\sigma_{\chi_{\rm eff}}(q)$ of the spin distribution. Bottom left: Analytic gradient $d\mu_{\chi_{\rm eff}}/dq$ computed from the PySR symbolic expressions. Bottom right: Gradient $d\ln\sigma_{\chi_{\rm eff}}/dq$. Shaded band…
Figure 4: Effective spin distribution parameters as a function of redshift $z$. Top left: Mean effective spin $\mu_{\chi_{\rm eff}}(z)$ for the Spline and Linear GWTC-4 models with PySR overlays. Top right: Width $\sigma_{\chi_{\rm eff}}(z)$. Bottom left: Analytic gradient $d\mu_{\chi_{\rm eff}}/dz$. Bottom right: Gradient $d\ln\sigma_{\chi_{\rm eff}}/dz$.
$\mu_{\chi_{\rm eff}}(z) \simeq 0.05 - 0.02z \quad (8)$
The corresponding gradient curves are effectively flat and pinned near zero, yielding a median slope o…
Figure 5: Conditional mass-ratio distributions $p(q)$ for the low-mass peak (left) and high-mass peak (right) of the BBH primary mass distribution. Blue shaded bands show the GWTC-4 90% credible intervals, orange bands show the PySR 90% CI from draw-by-draw fits, and the solid (GWTC-4) and dashed (PySR) lines show the medians. The low-mass peak is well-described by a simpler expression (Equation 12), while the high-ma…
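The draw-by-draw summaries behind Figures 1 and 2 can be sketched as follows. This is a minimal illustration, assuming each posterior draw has already been reduced to a closed-form curve $R(z)$. The two toy functional forms, the grid limit `Z_MAX`, the mock distribution of slopes, and the definition of the low-z slope as $d\ln R / d\ln(1+z)$ at $z = 0$ are assumptions for illustration, not the paper's fitted expressions or stated definitions.

```python
import math
import random

Z_MAX = 1.5                                   # hypothetical upper edge of the grid
GRID = [Z_MAX * i / 300 for i in range(301)]

def monotonic_rate(gamma):
    """Toy power-law-like curve: rises everywhere, so it has no turnover."""
    return lambda z: (1.0 + z) ** gamma

def peaked_rate(gamma, z_turn):
    """Toy curve that rises as (1+z)^gamma at low z and is damped above z_turn."""
    return lambda z: (1.0 + z) ** gamma * math.exp(-((z / z_turn) ** 2))

def z_peak(rate):
    """Grid location of the maximum; equals Z_MAX for a monotonic curve."""
    return max(GRID, key=rate)

def low_z_slope(rate, h=1e-5):
    """Approximate d ln R / d ln(1+z) at z = 0 with a central difference."""
    d_ln_rate = math.log(rate(h)) - math.log(rate(-h))
    d_ln_one_plus_z = math.log1p(h) - math.log1p(-h)
    return d_ln_rate / d_ln_one_plus_z

def percentile(values, p):
    """Rough percentile: nearest element of the sorted list."""
    ordered = sorted(values)
    return ordered[round(p / 100 * (len(ordered) - 1))]

rng = random.Random(0)
gammas = [rng.gauss(3.0, 0.5) for _ in range(200)]    # 200 mock draws

mono_peaks = [z_peak(monotonic_rate(g)) for g in gammas]
turn_peaks = [z_peak(peaked_rate(g, 1.0)) for g in gammas]
slopes = [low_z_slope(peaked_rate(g, 1.0)) for g in gammas]

print("distinct z_peak values for monotonic draws:", set(mono_peaks))
print("z_peak median for peaked draws:", percentile(turn_peaks, 50))
print("low-z slope median:", percentile(slopes, 50))
print("low-z slope 90% interval:", percentile(slopes, 5), percentile(slopes, 95))
```

For the monotonic toy curves every draw returns the grid edge as its peak, which is the pile-up behaviour described in the Figure 2 caption. The peaked toy curves give an interior maximum, while their low-z slope still equals the power-law index.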
Original abstract

Recent LIGO-Virgo-KAGRA (LVK) analyses have revealed complex structure in the binary black hole (BBH) population, including distinct features in the primary mass spectrum and nontrivial spin-mass correlations. However, the phenomenological models used to capture these features often lack analytic transparency, making it difficult to isolate robust physical laws from modeling artifacts. To address this, symbolic regression is applied to the posterior inference products of the GWTC-4 catalog, discovering compact, closed-form analytic expressions for four key population relationships: (i) the merger-rate evolution with redshift; (ii) the mass-ratio dependence of the effective-spin distribution; (iii) the redshift evolution of the effective-spin distribution; and (iv) the conditional mass-ratio distributions associated with the 10 solar mass and 35 solar mass primary mass peaks. This framework successfully compresses both rigid and highly flexible models into differentiable phenomenological laws, dynamically recovering a consistent low-redshift merger-rate slope without assuming an a priori power-law form. The exact analytic derivatives provided by symbolic regression show that the mass ratio–effective spin and redshift–effective spin correlations are robustly driven by broadening of the posterior widths rather than shifts in the mean. Furthermore, qualitatively distinct functional forms for the mass-ratio distributions conditioned on the 10 solar mass and 35 solar mass primary mass peaks are identified. These closed-form expressions enable exact analytic gradient diagnostics and compact surrogate summaries, particularly for flexible numerical posteriors that are not otherwise available in low-dimensional analytic form. They also facilitate rapid downstream calculations for rate forecasting, formation channel comparison, and stochastic background estimation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript applies symbolic regression to the posterior samples from the GWTC-4 catalog to derive compact closed-form analytic expressions for four BBH population relations: merger-rate evolution with redshift, mass-ratio dependence of the effective-spin distribution, redshift evolution of the effective-spin distribution, and conditional mass-ratio distributions at the 10 M⊙ and 35 M⊙ primary-mass peaks. It claims these expressions compress both rigid and flexible phenomenological models into differentiable laws, recover a consistent low-redshift merger-rate slope without assuming a power-law form a priori, and demonstrate that observed spin-mass and spin-redshift correlations arise from posterior-width broadening rather than mean shifts.

Significance. If validated, the approach would offer useful interpretable and analytically differentiable surrogate models for complex GWTC-4 posteriors, enabling exact gradient diagnostics, rapid rate forecasting, formation-channel comparisons, and stochastic-background calculations. The data-driven compression of flexible numerical results into closed-form expressions is a promising direction for making hierarchical-inference outputs more transparent and reusable.

major comments (2)
  1. [Methods and Results sections] The central claim that the SR-derived expressions capture genuine population properties (rather than artifacts of the GWTC-4 inference pipeline, selection effects, or posterior sampling) requires end-to-end validation that is not present. The paper should inject known analytic population relations into mock catalogs, run the identical hierarchical inference plus SR pipeline, and demonstrate recovery of the injected forms; without this, the outputs remain data-driven reparameterizations of the same posteriors used as input.
  2. [Results on correlations] The attribution of mass-ratio–effective-spin and redshift–effective-spin correlations to posterior broadening (rather than mean shifts) is supported only by the analytic derivatives obtained from SR. No quantitative comparison is provided to alternative explanations, such as residual model-induced features from the rigid or flexible phenomenological models underlying GWTC-4, nor are uncertainty estimates on the derivatives reported.
minor comments (2)
  1. [Abstract] The abstract states that a 'consistent low-redshift merger-rate slope' is recovered but does not quote the numerical value or compare it directly to existing power-law fits in the literature.
  2. [Notation and equations] Notation for effective spin (χ_eff) and mass ratio (q) should be defined once at first use and used consistently in all equations and figure captions.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their constructive and detailed comments, which have prompted us to clarify the scope of our analysis and strengthen the presentation of its limitations. We respond to each major comment below and indicate the corresponding revisions to the manuscript.

Point-by-point responses
  1. Referee: [Methods and Results sections] The central claim that the SR-derived expressions capture genuine population properties (rather than artifacts of the GWTC-4 inference pipeline, selection effects, or posterior sampling) requires end-to-end validation that is not present. The paper should inject known analytic population relations into mock catalogs, run the identical hierarchical inference plus SR pipeline, and demonstrate recovery of the injected forms; without this, the outputs remain data-driven reparameterizations of the same posteriors used as input.

    Authors: We agree that injecting known analytic population relations into mock catalogs, performing the full hierarchical inference, and then applying the identical SR pipeline would constitute the most rigorous validation that the recovered expressions reflect true population properties rather than pipeline artifacts. Such an end-to-end test lies beyond the computational resources and primary focus of the present work, which applies symbolic regression as a post-processing step to the publicly released GWTC-4 posterior samples. In the revised manuscript we have added a dedicated paragraph in the Discussion section that explicitly acknowledges this limitation, outlines the practical barriers, and identifies the mock-injection validation as a high-priority extension for future studies. We also emphasize that the SR expressions are derived directly from the data-driven posteriors and recover features (such as the low-redshift merger-rate slope) that are consistent with the original GWTC-4 phenomenological results, thereby providing compact, differentiable summaries of those inferences.
    Revision: partial

  2. Referee: [Results on correlations] The attribution of mass-ratio–effective-spin and redshift–effective-spin correlations to posterior broadening (rather than mean shifts) is supported only by the analytic derivatives obtained from SR. No quantitative comparison is provided to alternative explanations, such as residual model-induced features from the rigid or flexible phenomenological models underlying GWTC-4, nor are uncertainty estimates on the derivatives reported.

    Authors: The exact analytic derivatives furnished by the SR expressions allow us to isolate the contribution of distribution width versus location. To address the request for quantitative comparison, the revised manuscript now includes a direct side-by-side evaluation of the SR-derived derivatives against the corresponding derivatives computed from the original rigid and flexible GWTC-4 models; the trends remain consistent, supporting that the broadening signal is not an artifact of the SR step alone. In addition, we have implemented bootstrap resampling across the posterior samples to obtain uncertainty estimates on the SR coefficients and the resulting derivatives; these uncertainties are now reported in the updated Results section and associated figures.
    Revision: yes
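The bootstrap step mentioned in the response can be illustrated generically. The sketch below is a toy under stated assumptions, not the authors' implementation: it resamples mock (z, µ) points with replacement, refits a straight line each time, and reports a percentile interval on the slope, which for a linear form is also the derivative. The mock data reuse the coefficients quoted in Equation 8 (0.05 and −0.02) with an arbitrary noise level; the grid, sample count, and function names are hypothetical.

```python
import random

def fit_line(points):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(points)
    x_mean = sum(x for x, _ in points) / n
    y_mean = sum(y for _, y in points) / n
    sxx = sum((x - x_mean) ** 2 for x, _ in points)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in points)
    b = sxy / sxx
    return y_mean - b * x_mean, b

def bootstrap_slopes(points, n_boot, rng):
    """Refit the slope on n_boot resamples drawn with replacement."""
    slopes = []
    for _ in range(n_boot):
        resample = [rng.choice(points) for _ in points]
        slopes.append(fit_line(resample)[1])
    return sorted(slopes)

rng = random.Random(1)

# Mock data: the linear trend of Eq. (8) plus arbitrary Gaussian scatter.
# The scatter (0.01) and the grid are assumptions made for this example.
zs = [0.01 * i for i in range(151)]
points = [(z, 0.05 - 0.02 * z + rng.gauss(0.0, 0.01)) for z in zs]

slopes = bootstrap_slopes(points, n_boot=500, rng=rng)

# Approximate 5th and 95th percentiles of the 500 sorted bootstrap slopes.
lo, hi = slopes[25], slopes[474]
print("bootstrap 90% interval on d(mu)/dz:", lo, hi)
```

For a nonlinear symbolic expression the same loop would refit the expression's coefficients on each resample and evaluate its analytic derivative, so the interval reflects coefficient uncertainty at a fixed functional form. It does not capture uncertainty about which functional form the search selects.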

standing simulated objections not resolved
  • Full end-to-end validation, by injecting known analytic population relations into mock catalogs and re-running the hierarchical inference and SR pipeline, remains outstanding, owing to the prohibitive computational cost of repeating the full GWTC-4 analysis on large simulated datasets.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based solely on the abstract, no explicit free parameters, axioms, or invented entities are identifiable. Symbolic regression inherently fits functional forms and coefficients to the input posteriors, but the specific fitted values and any background assumptions are not stated.

pith-pipeline@v0.9.0 · 5594 in / 1268 out tokens · 160119 ms · 2026-05-09T22:41:49.298796+00:00 · methodology

