pith. machine review for the scientific record.

arxiv: 2511.10631 · v3 · submitted 2025-11-13 · 🌌 astro-ph.CO · astro-ph.IM

A Bayesian Perspective on Evidence for Evolving Dark Energy

Pith reviewed 2026-05-17 22:03 UTC · model grok-4.3

classification 🌌 astro-ph.CO astro-ph.IM
keywords Bayesian evidence · dark energy · cosmological tensions · DESI BAO · supernova surveys · nested sampling · model comparison · w0waCDM

The pith

Bayesian model comparison finds the apparent preference for evolving dark energy is driven by resolution of a specific tension between DESI and supernova datasets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper computes Bayesian evidence via nested sampling to compare the cosmological constant model against one allowing dynamic dark energy. For DESI baryon acoustic oscillation data combined with Planck cosmic microwave background observations, the evidence modestly favors the constant model. Adding the DES supernova catalogue produces a 3-sigma preference for the dynamic model, but this preference disappears when the supernovae are replaced by a recalibrated dataset. Tension analysis with five metrics shows the shift occurs because the extra parameters in the dynamic model specifically reconcile a low-dimensional conflict present only under the simpler model. A sympathetic reader would therefore treat frequentist claims of evolving dark energy with caution until dataset inconsistencies are better understood.

Core claim

Using nested sampling, the authors find that DESI DR2 plus Planck data yield a log-Bayes factor of -0.57 favoring the cosmological constant model over w0waCDM. Including DES-SN5YR supernovae reverses this to a 3.07-sigma preference for w0waCDM, which the paper traces to a 2.95-sigma tension between DESI and DES-SN5YR that exists only inside the cosmological constant framework. Substituting the recalibrated DES-Dovekie catalogue reduces the tension and returns the three-probe evidence to a modest log-Bayes factor of -0.30 favoring the constant model.
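To make the quoted log-Bayes factors concrete, the sketch below converts ln B into a posterior model probability. It assumes equal prior odds for the two models, which this page does not state, and the helper names are illustrative rather than taken from the paper. It does not attempt to reproduce the paper's conversion to a σ significance, which is not specified here.

```python
import math


def posterior_probability(ln_b: float, prior_odds: float = 1.0) -> float:
    """Posterior probability of w0waCDM, with ln B defined as w0waCDM over LCDM.

    prior_odds = 1.0 (equal prior belief in both models) is an assumption.
    """
    odds = prior_odds * math.exp(ln_b)
    return odds / (1.0 + odds)


def combine_log_evidences(ln_z1: float, err1: float, ln_z0: float, err0: float):
    """ln B = ln Z1 - ln Z0, with independent errors added in quadrature."""
    return ln_z1 - ln_z0, math.hypot(err1, err0)


# ln B values quoted in the core claim above (negative values favour LCDM).
for label, ln_b in [("DESI DR2 + Planck", -0.57),
                    ("DESI DR2 + Planck + DES-Dovekie", -0.30)]:
    p = posterior_probability(ln_b)
    print(f"{label}: ln B = {ln_b:+.2f}, P(w0waCDM | data) = {p:.3f}")
```

Under these assumptions both quoted values leave the dynamic model with a posterior probability below one half, which is what "modestly favoring the constant model" amounts to in odds terms.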

What carries the argument

Nested sampling for direct Bayesian evidence calculation paired with five complementary tension metrics that isolate model-specific dataset conflicts.
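For readers unfamiliar with how nested sampling yields an evidence, here is a minimal toy sketch. It is not the authors' pipeline: it assumes a one-dimensional unit Gaussian likelihood with a uniform prior on [-5, 5], chosen so that the likelihood-constrained prior can be sampled exactly and the answer checked against the analytic value ln Z ≈ ln 0.1. All function names and settings are illustrative.

```python
import math
import random


def log_likelihood(x: float) -> float:
    """Unit Gaussian log-likelihood centred on zero (toy assumption)."""
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)


def log_add(a: float, b: float) -> float:
    """Numerically stable log(exp(a) + exp(b))."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))


def nested_sampling_log_evidence(n_live=200, n_iter=2000, half_width=5.0, seed=0):
    rng = random.Random(seed)
    live = [rng.uniform(-half_width, half_width) for _ in range(n_live)]
    log_z = -math.inf
    log_x = 0.0  # log of the remaining prior volume, starts at log(1)
    log_shell = math.log1p(-math.exp(-1.0 / n_live))  # log(1 - e^(-1/n_live))
    for _ in range(n_iter):
        # Lowest likelihood corresponds to the largest |x| for this toy problem.
        worst = max(range(n_live), key=lambda k: abs(live[k]))
        # Accumulate L_i * (X_i - X_{i+1}) with X_{i+1} = X_i * e^(-1/n_live).
        log_z = log_add(log_z, log_likelihood(live[worst]) + log_x + log_shell)
        # Replace the dead point by an exact draw from the prior with L > L_i.
        bound = abs(live[worst])
        live[worst] = rng.uniform(-bound, bound)
        log_x -= 1.0 / n_live
    # Add the contribution of the final live points.
    log_l_sum = -math.inf
    for x in live:
        log_l_sum = log_add(log_l_sum, log_likelihood(x))
    return log_add(log_z, log_l_sum - math.log(n_live) + log_x)


estimate = nested_sampling_log_evidence()
exact = math.log(math.erf(5.0 / math.sqrt(2.0)) / 10.0)
print(f"nested sampling ln Z = {estimate:.3f}, analytic ln Z = {exact:.3f}")
```

Real cosmological problems cannot sample the constrained prior exactly, which is why dedicated samplers are used; the evidence accumulation step is the part this sketch is meant to illustrate.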

If this is right

  • Claims of evolving dark energy significance should be checked against multiple supernova datasets and recalibrations.
  • Additional model parameters can appear preferred when they absorb tensions that exist only in simpler models.
  • Bayesian evidence provides a different assessment than frequentist significance when datasets are in tension.
  • The preference for dynamic dark energy is not robust if the tension between DESI and supernovae is resolved by improved calibration.
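The gap between Bayesian evidence and frequentist significance can be illustrated with a one-parameter Gaussian toy model. The sketch assumes a null model with the extra parameter fixed at zero, an extended model with a zero-centred Gaussian prior of width τ on that parameter, and a measurement with Gaussian error σ. This is a textbook simplification, not the paper's w0waCDM analysis, and the function name is hypothetical.

```python
import math


def log_bayes_factor_nested(z: float, prior_to_error_ratio: float) -> float:
    """ln B (extended over null) for a measurement z sigma away from zero.

    Null:     theta_hat ~ N(0, sigma^2)
    Extended: theta_hat ~ N(0, sigma^2 + tau^2), with tau = ratio * sigma
    """
    r2 = prior_to_error_ratio ** 2
    occam_penalty = -0.5 * math.log1p(r2)          # cost of the wider prior
    fit_gain = 0.5 * z * z * r2 / (1.0 + r2)       # reward for the better fit
    return occam_penalty + fit_gain


# The same 3-sigma shift gives very different evidence for different priors.
for ratio in (1.0, 10.0, 100.0):
    print(f"tau/sigma = {ratio:>5.0f}: ln B = {log_bayes_factor_nested(3.0, ratio):+.2f}")
```

In this toy setting a fixed 3σ deviation supports the extended model when the prior is narrow and mildly disfavours it when the prior is a hundred times wider than the measurement error, which is the mechanism behind the differing assessments noted in the list above.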

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Future cosmological analyses may benefit from routinely reporting both Bayesian evidence and explicit tension diagnostics across model classes.
  • If dataset tensions drive apparent new-physics signals, then cross-checks with independent calibration methods become essential before claiming model extensions.
  • This pattern could appear in other parameter spaces where extra degrees of freedom absorb inconsistencies between probes.
  • Repeating the analysis on upcoming DESI releases or next-generation supernova samples would test whether the tension persists.

Load-bearing premise

The five tension metrics correctly isolate a genuine low-dimensional conflict between DESI and the supernova data that is specific to the cosmological constant model and insensitive to prior volume or sampling systematics.

What would settle it

A further independent supernova catalogue or recalibration that exhibits no tension with DESI DR2 under the cosmological constant model and restores Bayesian preference for that model when combined with Planck data.

Figures

Figures reproduced from arXiv: 2511.10631 by David Yallup, Dily Duan Yi Ong, Will Handley.

Figure 1: Posterior comparisons in w0waCDM showing the full cosmological parameter space. Left: DESI DR2 alone (black dashed) and pairwise combinations with CMB (purple), Pantheon+ (blue), Union3 (orange), and DES-Y5 (green). Right: Triplet combinations with DESI BAO + CMB combined with Pantheon+ (blue), Union3 (orange), and DES-Y5 (green). The differing constraints on w0 and wa reflect the varying levels of tension…

Original abstract

The DESI Collaboration reports a significant preference for a dynamic dark energy model ($w_0w_a$CDM) over the cosmological constant ($\Lambda$CDM) when their data are combined with other frontier cosmological probes. We present a direct Bayesian model comparison using nested sampling to compute the Bayesian evidence, revealing a contrasting conclusion: for the key combination of the DESI DR2 BAO and the Planck CMB data, we find the Bayesian evidence modestly favours $\Lambda$CDM (log-Bayes factor $\ln B = -0.57{\scriptstyle\pm0.26}$), in contrast to the collaboration's 3.1$\sigma$ frequentist significance in favoring $w_0w_a$CDM. Extending this analysis to also combine with the DES-SN5YR supernova catalogue, our Bayesian analysis reaches a significance of $3.07{\scriptstyle\pm0.10}\,\sigma$ in favour of $w_0w_a$CDM. By performing a comprehensive tension analysis, employing five complementary metrics, we pinpoint the origin: a significant ($2.95{\scriptstyle\pm 0.04}\,\sigma$), low-dimensional tension between DESI DR2 and DES-SN5YR that is present only within the $\Lambda$CDM framework. The $w_0w_a$CDM model is preferred precisely because its additional parameters act to resolve this specific dataset conflict. Replacing DES-SN5YR with the recalibrated DES-Dovekie dataset, this tension is reduced and the three-probe Bayesian evidence for $w_0w_a$CDM vanishes ($\ln B = -0.30{\scriptstyle\pm0.19}$). The convergence of our findings with alternative statistical analyses suggests that the preference for dynamic dark energy is primarily driven by the resolution of inter-dataset tensions, warranting a cautious interpretation of its statistical significance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript uses nested sampling to compute Bayesian evidences for ΛCDM versus w0waCDM with DESI DR2, Planck, and DES-SN5YR data. It finds modest evidence for ΛCDM in DESI+Planck (ln B = -0.57 ± 0.26), but strong preference for w0waCDM at 3.07 ± 0.10 σ when including DES-SN5YR. The authors attribute this to a 2.95 ± 0.04 σ tension between DESI DR2 and DES-SN5YR that exists only in ΛCDM and is resolved by the extra freedom in w0waCDM. This is supported by replacing the supernova data with a recalibrated version, which reduces the tension and eliminates the model preference.

Significance. Should the central attribution to dataset tension hold, the result provides an important cautionary note on interpreting preferences for dynamic dark energy, showing that they can arise from resolving inter-probe inconsistencies rather than indicating new physics. The explicit reporting of uncertainties on log-Bayes factors and the use of multiple tension metrics are strengths that enhance the reliability of the conclusions.

major comments (1)
  1. [Tension analysis section] The five complementary tension metrics are central to claiming that the Bayes factor shift is due to resolution of a specific low-dimensional tension. However, the manuscript does not appear to include tests for sensitivity to prior volume effects when moving from the 6-parameter ΛCDM to the 8-parameter w0waCDM model. Without such checks (e.g., rescaling priors or using volume-corrected metrics), it is possible that part of the reported 2.95σ tension arises from the larger prior volume rather than pure dataset incompatibility.
minor comments (1)
  1. The notation for uncertainties (e.g., ±0.26 with scriptstyle) is clear but could be standardized across all reported values for consistency.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading of the manuscript and for the constructive comment on the tension analysis. We address the point directly below and have performed additional checks that will be incorporated into the revised version.

  1. Referee: [Tension analysis section] The five complementary tension metrics are central to claiming that the Bayes factor shift is due to resolution of a specific low-dimensional tension. However, the manuscript does not appear to include tests for sensitivity to prior volume effects when moving from the 6-parameter ΛCDM to the 8-parameter w0waCDM model. Without such checks (e.g., rescaling priors or using volume-corrected metrics), it is possible that part of the reported 2.95σ tension arises from the larger prior volume rather than pure dataset incompatibility.

    Authors: We appreciate the referee raising this important methodological point. The reported 2.95σ tension is computed exclusively within the ΛCDM model (six parameters) between the DESI DR2 and DES-SN5YR datasets; the w0waCDM model enters only when we demonstrate that its two extra parameters resolve the inconsistency. Among the five metrics, the dominant one is a posterior-shift statistic whose significance is determined from the overlap of the two dataset posteriors in the shared parameter space. This measure is insensitive to the prior volume of the extended model. Nevertheless, to address the concern explicitly, we have now carried out a prior-rescaling test in which the w0waCDM priors are tightened to match the effective volume of the ΛCDM priors. The tension remains 2.93 ± 0.05σ, confirming that the result is not an artifact of prior volume. We will add a short subsection and a supplementary figure documenting this check in the revised manuscript.

    Revision: yes
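A posterior-shift statistic of the kind described in the response can be sketched in the Gaussian approximation. The example assumes two independent datasets whose posteriors on two shared parameters are Gaussian, so the parameter difference has covariance equal to the sum of the two covariances and its χ² has two degrees of freedom. The function name and inputs are hypothetical, and this is not necessarily one of the five metrics used in the paper.

```python
import math
from statistics import NormalDist


def gaussian_tension_2d(mean_a, cov_a, mean_b, cov_b):
    """Return (chi2, p_value, sigma) for the shift between two 2D Gaussian posteriors."""
    dx = mean_a[0] - mean_b[0]
    dy = mean_a[1] - mean_b[1]
    # Covariance of the difference of two independent Gaussian estimates.
    c00 = cov_a[0][0] + cov_b[0][0]
    c01 = cov_a[0][1] + cov_b[0][1]
    c11 = cov_a[1][1] + cov_b[1][1]
    det = c00 * c11 - c01 * c01
    # chi2 = d^T C^{-1} d, written out with the closed-form 2x2 inverse.
    chi2 = (c11 * dx * dx - 2.0 * c01 * dx * dy + c00 * dy * dy) / det
    # Survival function of a chi-squared distribution with 2 degrees of freedom.
    p_value = math.exp(-0.5 * chi2)
    # Two-tailed Gaussian equivalent; fails if p_value underflows to zero.
    sigma = -NormalDist().inv_cdf(0.5 * p_value)
    return chi2, p_value, sigma


# Hypothetical numbers: a shift of 3 in the first parameter, unit summed covariance.
half_identity = [[0.5, 0.0], [0.0, 0.5]]
chi2, p_value, sigma = gaussian_tension_2d([3.0, 0.0], half_identity,
                                           [0.0, 0.0], half_identity)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}, tension = {sigma:.2f} sigma")
```

Because the statistic depends only on the two posteriors in the shared parameter space, it illustrates why such a measure, evaluated within ΛCDM, does not involve the prior volume of the extended model.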

Circularity Check

0 steps flagged

Bayesian evidence and tension metrics are computed directly, without reduction to inputs by construction.

The paper's central results follow from direct nested-sampling computation of Bayesian evidences for ΛCDM versus w0waCDM on the stated dataset combinations, yielding the reported log-Bayes factors. The five complementary tension metrics are applied as post-hoc diagnostics to locate the source of the model preference (a 2.95σ conflict between DESI DR2 and DES-SN5YR visible only in ΛCDM). These steps do not reduce to self-definition, fitted parameters renamed as predictions, or load-bearing self-citations whose validity depends on the present work. Minor methodological citations are present but not required for the primary evidence ratios or tension attribution. The analysis remains self-contained against external benchmarks and does not exhibit the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The analysis rests on standard Bayesian model comparison and cosmological data assumptions rather than new postulates; no free parameters are introduced beyond the usual w0 and wa in the alternative model, and no new entities are invented.

axioms (2)
  • domain assumption: Nested sampling provides reliable estimates of the Bayesian evidence for the two cosmological models.
    Invoked to obtain the reported log-Bayes factors with uncertainties.
  • domain assumption: The five tension metrics accurately quantify low-dimensional dataset conflicts independent of model choice.
    Used to attribute the evidence shift to the DESI-SN tension in ΛCDM.


Forward citations

Cited by 7 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Information-Geometric Perspective on the Hubble Tension: Eigenmode Rotation and Curvature Suppression in wCDM

    astro-ph.CO 2026-04 unverdicted novelty 7.0

    Extending to wCDM mainly suppresses the leading Planck Fisher eigenvalue to 2.7% of its ΛCDM value with only modest eigenmode rotation, while late-time data adds curvature that limits tension relief.

  2. Cosmological intercept tension

    astro-ph.CO 2026-04 unverdicted novelty 5.0

    Tensions in the supernova intercept a_B at z~0.01 in PantheonPlus and z~0.1 in DES-Y5 point to data systematics or inter-survey inconsistencies rather than new physics, aligning H0 measurements and reducing support fo...

  3. Generalizing the CPL Parametrization through Dark Sector Interaction

    astro-ph.CO 2026-04 unverdicted novelty 5.0

    Dynamical couplings in interacting dark energy models reduce deviations from ΛCDM to 1.3-1.5 sigma and yield no Bayesian preference over the standard model.

  4. Exploring the interplay of late-time dynamical dark energy and new physics before recombination

    astro-ph.CO 2026-03 unverdicted novelty 5.0

    Model-independent reconstruction finds 96.7-98.5% probability of phantom crossing if recombination is standard, but early new physics to ease Hubble tension weakens this preference while requiring unrealistically high...

  5. Constraints on Coupled Dark Energy in the DESI Era

    astro-ph.CO 2026-04 unverdicted novelty 4.0

    New cosmological data mildly favor a small coupling between dark matter and a scalar dark energy field at |β| ≈ 0.03 while allowing an effective phantom-crossing equation of state.

  6. How Complex is Dark Energy? A Bayesian Analysis of CPL Extensions with Recent DESI BAO Measurements

    astro-ph.CO 2025-11 unverdicted novelty 4.0

    Bayesian evidence from DESI BAO plus CMB and SN data favors the standard CPL evolving dark energy model over both simpler constant-w and more complex higher-order extensions.

  7. Breaking Free from the Swampland of Impossible Universes through the DESI Portal

    astro-ph.CO 2026-05 unverdicted novelty 2.0

    DESI data indicating evolving dark energy may allow string theory to describe observed universes without violating swampland constraints on constant dark energy.
