
arxiv: 2606.02696 · v1 · pith:J6KCXQFJnew · submitted 2026-06-01 · 🌌 astro-ph.EP · astro-ph.IM

HERMES: HiERarchical Modelling for Exoplanet Science

Pith reviewed 2026-06-28 12:23 UTC · model grok-4.3

classification 🌌 astro-ph.EP astro-ph.IM
keywords: exoplanet atmospheres · Bayesian hierarchical modeling · metallicity correlations · population trends · survey simulation · atmospheric characterization · Ariel mission

The pith

HERMES recovers the correlation between stellar and planetary metallicity from Ariel surveys of at least 400 planets despite 1.2 dex scatter.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces HERMES as a multidimensional Bayesian framework to extract population-level correlations from exoplanet atmospheric data. It tests the approach on simulated surveys built from the Ariel candidate sample by injecting known trends among stellar metallicity, planetary mass, and atmospheric metallicity, then attempting to recover those trends. Recovery of the stellar-planetary metallicity link succeeds for samples of 400 or more planets even when intrinsic scatter reaches 1.2 dex and measurement noise is included. This result is useful because upcoming large surveys will produce complex datasets, and the framework shows how to isolate meaningful trends amid the scatter. The tests also confirm that survey leverage continues to indicate how precisely trends can be measured when working in multiple dimensions at once.

Core claim

HERMES is a multidimensional Bayesian framework for probing population-level correlations across multiple axes of diversity. Starting from the Ariel Mission Candidate Sample, the authors select planets with known masses and stellar metallicities, inject plausible multidimensional trends, and generate simulated surveys with varying leverage, sample size, intrinsic astrophysical scatter, and measurement noise. By fitting independent Bayesian models to each survey they show that a Tier 2 transit survey of at least 400 planets allows robust recovery of the correlation between stellar and planetary metallicity despite intrinsic scatter in planetary abundances as large as 1.2 dex.
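The inject-and-recover loop described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it swaps the hierarchical Bayesian fit for ordinary least squares, and the slope values, the spreads in mass and stellar metallicity, and the function names `simulate_survey` and `fit_two_slopes` are all assumptions made for the example. Only the 1.2 dex scatter, the 0.2 dex abundance uncertainty, and the 400-planet sample size come from the page.

```python
import random


def simulate_survey(n, beta_mass, beta_feh, scatter, noise, rng):
    """Return n rows of (centred log mass, centred [Fe/H], observed log abundance)."""
    rows = []
    for _ in range(n):
        log_mass = rng.gauss(0.0, 0.5)  # hypothetical spread in log10(M/M_J)
        feh = rng.gauss(0.0, 0.2)       # hypothetical spread in stellar [Fe/H]
        # Injected trend plus intrinsic astrophysical scatter.
        true_abundance = beta_mass * log_mass + beta_feh * feh + rng.gauss(0.0, scatter)
        # Measurement noise on the retrieved abundance.
        rows.append((log_mass, feh, true_abundance + rng.gauss(0.0, noise)))
    return rows


def fit_two_slopes(rows):
    """Least-squares slopes of y on (x1, x2), solved from the centred normal equations."""
    n = len(rows)
    m1 = sum(r[0] for r in rows) / n
    m2 = sum(r[1] for r in rows) / n
    my = sum(r[2] for r in rows) / n
    s11 = sum((r[0] - m1) ** 2 for r in rows)
    s22 = sum((r[1] - m2) ** 2 for r in rows)
    s12 = sum((r[0] - m1) * (r[1] - m2) for r in rows)
    s1y = sum((r[0] - m1) * (r[2] - my) for r in rows)
    s2y = sum((r[1] - m2) * (r[2] - my) for r in rows)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical injected slopes; scatter, noise, and N follow the values quoted on this page.
    survey = simulate_survey(400, beta_mass=-0.5, beta_feh=1.0, scatter=1.2, noise=0.2, rng=rng)
    print(fit_two_slopes(survey))
```

With scatter and noise set to zero the fit returns the injected slopes exactly, which is a quick sanity check before turning the noise back on.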

What carries the argument

HERMES, the hierarchical Bayesian framework that jointly models multidimensional population trends while accounting for measurement noise and intrinsic scatter.
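One generic way to write down a model of this kind is shown here. It is a sketch under stated assumptions, not the paper's actual specification: the linear form, the Gaussian distributions, and every symbol ($\gamma$, $\beta_M$, $\beta_\star$, $\sigma_{\mathrm{int}}$, $\mu_z$, $\tau_z$, $s_{z,i}$, $s_{y,i}$) are chosen for illustration. Here $m_i$ is the centred log planetary mass, $z_i$ the latent true stellar metallicity, and $y_i$ the latent true atmospheric abundance.

```latex
\begin{aligned}
z_i &\sim \mathcal{N}\!\left(\mu_z,\ \tau_z^2\right)
  && \text{population of true stellar metallicities} \\
z_i^{\mathrm{obs}} &\sim \mathcal{N}\!\left(z_i,\ s_{z,i}^2\right)
  && \text{measured stellar metallicity} \\
y_i &\sim \mathcal{N}\!\left(\gamma + \beta_M m_i + \beta_\star z_i,\ \sigma_{\mathrm{int}}^2\right)
  && \text{trend plus intrinsic scatter} \\
y_i^{\mathrm{obs}} &\sim \mathcal{N}\!\left(y_i,\ s_{y,i}^2\right)
  && \text{measured atmospheric abundance}
\end{aligned}
```

Keeping $z_i$ and $y_i$ as latent quantities is what lets a model of this shape separate the intrinsic scatter $\sigma_{\mathrm{int}}$ from the per-planet measurement uncertainties.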

If this is right

  • A sample of at least 400 planets suffices to recover the stellar-planetary metallicity correlation in the presence of 1.2 dex scatter.
  • Survey leverage remains a reliable predictor of trend precision even when multiple dimensions and intrinsic scatter are present.
  • The framework can be used for survey design and science yield forecasting ahead of large atmospheric characterization missions.
  • Recovery of injected trends holds across a variety of sample sizes and leverage values when realistic noise is included.
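A toy Monte Carlo shows how a recovery fraction can depend on sample size and intrinsic scatter. It is a sketch only: the one-dimensional least-squares fit stands in for the Bayesian model, the "slope exceeds twice its standard error" criterion is an assumed definition of recovery, and the injected slope and metallicity spread are hypothetical. Its numbers should not be read as reproducing the paper's thresholds.

```python
import math
import random


def slope_and_error(xs, ys):
    """Least-squares slope of y on x and its standard error."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    resid = sum((y - my - slope * (x - mx)) ** 2 for x, y in zip(xs, ys))
    return slope, math.sqrt(resid / (n - 2) / sxx)


def recovery_fraction(n, scatter, slope=1.0, noise=0.2, x_sd=0.2, trials=200, seed=0):
    """Fraction of mock surveys whose fitted slope exceeds zero by two standard errors."""
    rng = random.Random(seed)
    total_sd = math.hypot(scatter, noise)  # intrinsic scatter and noise add in quadrature
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(0.0, x_sd) for _ in range(n)]  # hypothetical [Fe/H] spread
        ys = [slope * x + rng.gauss(0.0, total_sd) for x in xs]
        fitted, err = slope_and_error(xs, ys)
        if fitted - 2.0 * err > 0.0:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for n in (20, 100, 400):
        print(n, recovery_fraction(n, scatter=1.2))
```

Sweeping `n` and `scatter` over a grid gives a table of the same shape as a recovery heatmap, under the assumptions above.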

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If applied to real observations, the same approach could test whether stellar composition directly shapes planetary atmospheric enrichment beyond what formation models predict.
  • Extending HERMES to additional variables such as orbital distance or host-star type could isolate which factors most influence atmospheric diversity.
  • The method might be adapted to other upcoming surveys to forecast the minimum sample size needed to detect weaker correlations.
  • Real data tests would reveal whether unmodeled selection biases alter the apparent strength of recovered trends.

Load-bearing premise

The multidimensional trends injected into the simulated surveys accurately capture the statistical structure and selection effects present in real exoplanet observations.

What would settle it

Applying HERMES to actual Ariel Tier 2 data and recovering a stellar-planetary metallicity correlation strength that deviates significantly from the value obtained in matching simulated surveys of the same size.

Figures

Figures reproduced from arXiv: 2606.02696 by Nicolas B. Cowan, Wasi M. F. Naqvi.

Figure 1
Figure 1: Nested mass-class scheme S1–S4. Each successive class removes the lowest-mass quartile, reducing mass leverage while preserving the high-mass tail. Surveys are drawn from each class independently.
Figure 2
Figure 2: Survey design space: the standard deviation in planetary mass, $\sigma_M$, versus sample size, $N$. Each point represents one survey, with colours indicating mass class (S1–S4). Contours show curves of constant mass leverage, $L_{\mathrm{mass}} \propto \sqrt{N}\,\sigma_M$. As expected, S1 surveys span the widest mass range and achieve the highest leverage at every $N$.
… deviations around the sample mean, $\bar{x}$: $L = \sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2}$ (2). Since we …
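Reading the leverage as the root-sum-square of deviations about the sample mean, it equals $\sqrt{N}$ times the population standard deviation, which matches the contour scaling in the caption. A small numerical check, with hypothetical mass values and a helper name chosen for the example:

```python
import math
import statistics


def leverage(xs):
    """Root-sum-square of deviations about the sample mean."""
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs))


if __name__ == "__main__":
    log_masses = [-1.2, -0.4, 0.0, 0.3, 0.9]  # hypothetical centred log10 masses
    # The two printed numbers agree: L = sqrt(N) * sigma (population standard deviation).
    print(leverage(log_masses))
    print(math.sqrt(len(log_masses)) * statistics.pstdev(log_masses))
```

The identity explains why leverage can be raised either by observing more planets or by spreading them over a wider range of the covariate.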
Figure 3
Figure 3: The left panel shows planetary atmospheric abundance $\log X_{\mathrm{H_2O}}$ versus centered $\log(M/M_J)$, and the right panel shows $\log X_{\mathrm{H_2O}}$ versus centered stellar metallicity [Fe/H]⋆. In both cases, the posterior median trend and its 68% credible band are shown. The error bars correspond to 0.2 dex for $\log X_{\mathrm{H_2O}}$ and the true measurement uncertainty on stellar metallicity from the MCS.
At fixed sample size, σε and αp show li…
Figure 4
Figure 4: Posterior uncertainty in the mass–metallicity slope as a function of planetary-mass leverage (left panels) and host-star-metallicity leverage (right panels), shown for surveys with 80 and 250 planets. Each point represents one mock survey. Dashed curves show power-law fits, with the shaded regions indicating the corresponding prediction bands. As sample size increases, the fit approaches the inverse-levera…
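For a plain least-squares line with Gaussian noise of standard deviation $\sigma$, the slope uncertainty is $\sigma / L$, which is the inverse-leverage scaling the caption refers to. The sketch below checks this by Monte Carlo. It is an illustration with ordinary least squares, not the paper's Bayesian fits, and the covariate grid and function names are assumptions.

```python
import math
import random
import statistics


def fitted_slope(xs, ys):
    """Least-squares slope of y on x."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx


def slope_scatter(xs, sigma, trials=2000, seed=0):
    """Empirical standard deviation of the fitted slope over many noisy realisations."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(trials):
        # True slope is zero; only the noise matters for the spread of the estimate.
        ys = [rng.gauss(0.0, sigma) for _ in xs]
        slopes.append(fitted_slope(xs, ys))
    return statistics.pstdev(slopes)


if __name__ == "__main__":
    xs = [i / 10 for i in range(50)]  # hypothetical covariate values
    mean = sum(xs) / len(xs)
    lev = math.sqrt(sum((x - mean) ** 2 for x in xs))
    print("predicted sigma/L:", 1.0 / lev)
    print("empirical scatter:", slope_scatter(xs, sigma=1.0))
```

Doubling the leverage, whether through sample size or covariate spread, halves the expected slope uncertainty in this simplified setting.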
Figure 5
Figure 5: The heatmap shows the practical recovery threshold for the stellar–planetary metallicity correlation: small and moderate surveys lose sensitivity once intrinsic scatter becomes large, whereas Ariel-scale samples maintain high recovery fractions across the tested scatter range.
… for [Fe/H]⋆ into the posterior and avoiding the attenuation bias that would arise from treating observed stellar metallicities as …
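The attenuation bias mentioned in the fragment above is the classical errors-in-variables effect: regressing on a noisy covariate as if it were exact shrinks the slope by the factor $\sigma_x^2 / (\sigma_x^2 + \sigma_u^2)$, where $\sigma_x$ is the true spread and $\sigma_u$ the measurement error. The simulation below illustrates it with hypothetical values for both spreads and for the injected slope; it does not use the catalogue uncertainties.

```python
import random


def naive_slope(xs, ys):
    """Least-squares slope that treats xs as error-free."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx


def attenuated_fit(n, true_slope, x_sd, x_err, y_sd, seed=0):
    """Fit y against noisy measurements of x and return the naive slope."""
    rng = random.Random(seed)
    true_x = [rng.gauss(0.0, x_sd) for _ in range(n)]
    obs_x = [x + rng.gauss(0.0, x_err) for x in true_x]  # noisy stellar metallicity
    ys = [true_slope * x + rng.gauss(0.0, y_sd) for x in true_x]
    return naive_slope(obs_x, ys)


if __name__ == "__main__":
    x_sd, x_err = 0.2, 0.2  # hypothetical true spread and measurement error (dex)
    expected_factor = x_sd ** 2 / (x_sd ** 2 + x_err ** 2)
    print("expected attenuation factor:", expected_factor)
    print("naive slope for true slope 1:", attenuated_fit(20000, 1.0, x_sd, x_err, 0.1))
```

Modelling the true covariate as a latent quantity, rather than plugging in the measured value, is the usual way to avoid this shrinkage.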
Original abstract

ESA's Ariel Space Mission will characterise the atmospheres of approximately 1000 exoplanets to quantify population-level trends. We present HERMES (HiERarchical Modelling for Exoplanet Science), a multidimensional Bayesian framework that probes population-level correlations across multiple axes of diversity. The specific use case we present is the multidimensional relation between stellar metallicity, planetary mass, and atmospheric metallicity. Starting from the Ariel Mission Candidate Sample (Edwards & Tinetti 2022), we select confirmed planets with available masses and stellar metallicity, inject plausible multidimensional trends and demonstrate successful parameter recovery. Simulated surveys are generated with a variety of leverage and sample size, in the presence of intrinsic astrophysical scatter and measurement noise. By fitting independent Bayesian models to each survey, we confirm that survey leverage remains a reliable predictor of trend precision even in multiple dimensions and in the presence of intrinsic astrophysical scatter. For an Ariel Tier 2 transit survey of at least 400 planets, HERMES robustly recovers the correlation between stellar and planetary metallicity despite intrinsic scatter in planetary abundances as large as 1.2 dex. These results establish HERMES as a practical tool for survey design and science yield forecasting in preparation for Ariel and other surveys probing population-level trends.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces HERMES, a multidimensional hierarchical Bayesian framework for probing population-level correlations in exoplanet data, with a focus on the joint relation between stellar metallicity, planetary mass, and atmospheric metallicity. Starting from the Ariel Mission Candidate Sample, the authors select planets with masses and stellar metallicities, inject plausible multidimensional trends into simulated surveys of varying size and leverage, add intrinsic scatter and measurement noise, and demonstrate parameter recovery using independent Bayesian fits to each simulated survey. The central result is that an Ariel Tier 2 transit survey of at least 400 planets allows robust recovery of the stellar-planetary metallicity correlation even with 1.2 dex scatter; the work positions HERMES as a tool for survey design and science-yield forecasting.

Significance. If the simulation assumptions hold, the framework offers a controlled way to forecast the detectability of multidimensional trends ahead of Ariel and similar missions, with the explicit demonstration that leverage remains predictive of precision even in the presence of scatter and multiple dimensions. The simulation-based validation with injected trends and noise provides a reproducible testbed for method performance, which is a constructive contribution to population-level exoplanet studies.

major comments (2)
  1. [Abstract] Abstract: the headline claim that HERMES 'robustly recovers the correlation between stellar and planetary metallicity' for an Ariel Tier 2 survey of at least 400 planets is demonstrated exclusively on synthetic surveys in which the authors themselves inject the multidimensional trends; no quantitative validation is provided that the injected joint distributions of mass, metallicity, and the survey selection function match the statistical structure or selection biases expected from real Ariel Tier 2 observations.
  2. [Abstract] Abstract (simulation description): the manuscript states that 'plausible multidimensional trends' are injected and that recovery succeeds, but does not report any sensitivity tests in which the fitted model is deliberately misspecified relative to the injection (e.g., different functional forms or additional unmodeled covariances); such tests are load-bearing for the claim of robustness when the true population may deviate from the authors' injection assumptions.
minor comments (1)
  1. [Abstract] The abstract cites Edwards & Tinetti 2022 for the Mission Candidate Sample but does not indicate whether any updates to that catalog or additional selection cuts are applied; a brief clarification would improve reproducibility.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. The work is a controlled simulation study using the Ariel Mission Candidate Sample to demonstrate HERMES recovery performance ahead of real observations. We address the two major comments below and will revise the manuscript accordingly.

  1. Referee: [Abstract] Abstract: the headline claim that HERMES 'robustly recovers the correlation between stellar and planetary metallicity' for an Ariel Tier 2 survey of at least 400 planets is demonstrated exclusively on synthetic surveys in which the authors themselves inject the multidimensional trends; no quantitative validation is provided that the injected joint distributions of mass, metallicity, and the survey selection function match the statistical structure or selection biases expected from real Ariel Tier 2 observations.

    Authors: We agree the demonstration is simulation-based. The injected trends and selection are drawn from the Ariel Mission Candidate Sample (Edwards & Tinetti 2022) combined with literature-informed relations for mass-metallicity trends. Because Ariel Tier 2 data do not yet exist, direct quantitative matching to real observations is not possible. We will revise the abstract and introduction to explicitly state that results are for synthetic surveys with injected trends and to frame the work as a forecasting tool rather than a claim of real-data validation. revision: yes

  2. Referee: [Abstract] Abstract (simulation description): the manuscript states that 'plausible multidimensional trends' are injected and that recovery succeeds, but does not report any sensitivity tests in which the fitted model is deliberately misspecified relative to the injection (e.g., different functional forms or additional unmodeled covariances); such tests are load-bearing for the claim of robustness when the true population may deviate from the authors' injection assumptions.

    Authors: We acknowledge that misspecification tests would strengthen the robustness assessment. In the revised manuscript we will add a dedicated section (or appendix) containing recovery experiments under deliberate misspecification, including (i) fitting a linear relation when data were generated with a power-law form and (ii) omitting an injected covariance term. These will quantify bias and precision degradation under realistic model mismatch. revision: yes

Circularity Check

0 steps flagged

No significant circularity; recovery demonstrated via forward simulation of injected trends


The paper describes selecting planets from the Ariel Mission Candidate Sample, injecting plausible multidimensional trends (including stellar-planetary metallicity correlation), generating simulated surveys with varying sample sizes and scatter, and then fitting independent Bayesian models to recover the injected parameters. This is a standard forward-modeling validation exercise to test survey leverage and recovery fidelity; the reported success (e.g., robust recovery for ≥400 planets despite 1.2 dex scatter) is not equivalent to the inputs by construction, nor does it rely on self-citation chains, fitted parameters renamed as predictions, or ansatzes smuggled via prior work. The derivation chain is self-contained against external benchmarks because the injected trends serve as known ground truth for testing, not as the claimed scientific output itself.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are described. The framework implicitly assumes that Bayesian hierarchical models can separate measurement noise from intrinsic astrophysical scatter in multidimensional exoplanet data.

pith-pipeline@v0.9.1-grok · 5749 in / 1135 out tokens · 11777 ms · 2026-06-28T12:23:29.691220+00:00 · methodology


Reference graph

Works this paper leans on

37 extracted references · 15 canonical work pages · 2 internal anchors

  1. Edwards, Billy; Tinetti, Giovanna. AJ.
  2. Mugnai, Lorenzo V.; Pascale, Enzo; Edwards, Billy; Papageorgiou, Andreas; Sarkar, Subhajit; and others. Experimental Astronomy.
  3. 2025.
  4. Welbanks, Luis; Madhusudhan, Nikku; Allard, Nicole F.; Hubeny, Ivan; Spiegelman, Fernand; Leininger, Thierry. ApJL.
  5. Cowan, Nicolas B.; Coull-Neveu, Ben. The Open Journal of Astrophysics.
  6. A Comprehensive Analysis Spitzer 4.5 μm Phase Curve of Hot Jupiters. 2025, eprint.
  7. Keating, Dylan; Cowan, Nicolas B. MNRAS.
  8. Turrini, Diego. The Compositional Dimension of Planet Formation. doi:10.1142/9781800613140_0001
  9. Watanabe, Sumio. Journal of Machine Learning Research.
  10. Gelman, Andrew; Hwang, Jessica; Vehtari, Aki. Statistics and Computing.
  11. Sun, Qinghui; Wang, Sharon Xuesong; Welbanks, Luis; Teske, Johanna; Buchner, Johannes. AJ.
  12. D'Aoust, Lina; Coull-Neveu, Ben; Lee, Eve J.; Cowan, Nicolas B. arXiv e-prints. arXiv:2507.13446. doi:10.48550/arXiv.2507.13446
  13. Burt, Jennifer A.; Zellem, Robert T.; Ciardi, David R.; Kanodia, Shubham; Bryden, Geoffrey; Kataria, Tiffany; Pearson, Kyle A.; Christiansen, Jessie L.; Beichman, Charles; Fulton, B. J.; Swain, Mark; and others. arXiv e-prints. arXiv:2508.03801. doi:10.48550/arXiv.2508.03801
  14. Chachan, Yayaati; Fortney, Jonathan J.; Ohno, Kazumasa; Thorngren, Daniel; Murray-Clay, Ruth. arXiv e-prints. arXiv:2509.20428. doi:10.48550/arXiv.2509.20428
  15. Charbonneau, David; Brown, Timothy M.; Noyes, Robert W.; Gilliland, Ronald L. ApJ.
  16. Charbonneau, David; Allen, Lori E.; Megeath, S. Thomas; Torres, Guillermo; Alonso, Roi; Brown, Timothy M.; Gilliland, Ronald L.; Latham, David W.; Mandushev, Georgi; O'Donovan, Francis T.; Sozzetti, Alessandro. The Astrophysical Journal. doi:10.1086/429991
  17. Cowan, Nicolas B.; Agol, Eric. The Astrophysical Journal. doi:10.1088/0004-637x/729/1/54
  18. Madhusudhan, Nikku; Harrington, Joseph; Stevenson, Kevin B.; Nymeyer, Sarah; Campo, Christopher J.; Wheatley, Peter J.; Deming, Drake; Blecic, Jasmina; Hardy, Ryan A.; Lust, Nate B.; Anderson, David R.; Collier-Cameron, Andrew; Britt, Christopher B. T.; Bowman, William C.; Hebb, Leslie; Hellier, Coel; Maxted...
  19. Madhusudhan, Nikku and Ag. Exoplanetary Atmospheres: Chemistry, Formation Conditions, and Habitability. 2016.
  20. Sing, David K.; Fortney, Jonathan J.; Nikolov, Nikolay; Wakeford, Hannah R.; Kataria, Tiffany; Evans, Thomas M.; Aigrain, Suzanne; Ballester, Gilda E.; Burrows, Adam S.; Deming, Drake; D. A continuum from clear to cloudy hot-Jupiter exoplanets without primordial water depletion. 2016. doi:10.1038/nature...
  21. Zingales, Tiziano; Tinetti, Giovanna; Pillitteri, Ignazio; Leconte, J. The. Experimental Astronomy.
  22. Tinetti, Giovanna; Drossart, Pierre; Eccleston, Paul; Hartogh, Paul; Heske, Astrid; Leconte, J. A chemical survey of exoplanets with. Experimental Astronomy.
  23. Tinetti, Giovanna; Eccleston, Paul; Haswell, Carole; Lagage, Pierre-Olivier; Leconte, J. Ariel: Enabling planetary science across light-years. arXiv:2104.04824
  24. Ikoma, Masahiro; Kobayashi, Hiroshi. Annual Review of Earth and Planetary Sciences. arXiv:2504.04090. doi:10.48550/arXiv.2504.04090
  25. Radica, Michael; Cowan, Nicolas B.; Cloutier, Ryan; Wang, Leo Yang. On the Information Content of Ariel Transmission Spectra: Reassessing the Tier System. arXiv e-prints. arXiv:2604.07598. doi:10.48550/arXiv.2604.07598
  26. Tremonti, Christy A.; Heckman, Timothy M.; Kauffmann, Guinevere; Brinchmann, Jarle; Charlot, Stephane; White, Simon D. M.; Seibert, Mark; Peng, Eric W.; Schlegel, David J.; Uomoto, Alan; Fukugita, Masataka; Brinkmann, Jon. ApJ.
  27. Guillot, Tristan. Annual Review of Earth and Planetary Sciences. 2005.
  28. Kreidberg, Laura; Bean, Jacob L.; D. A Precise Water Abundance Measurement for the Hot Jupiter WASP-43b. 2014.
  29. MacDonald, Ryan J; Madhusudhan, Nikku. The metal-rich atmosphere of the exo-Neptune HAT-P-26b. Monthly Notices of the Royal Astronomical Society. doi:10.1093/mnras/stz789
  30. Kelly, Brandon C. Some Aspects of Measurement Error in Linear Regression of Astronomical Data. The Astrophysical Journal. doi:10.1086/519947
  31. Data analysis recipes: Fitting a model to data. 2010, eprint.
  32. A correlation between the heavy element content of transiting extrasolar planets and the metallicity of their parent stars. Astronomy & Astrophysics. 2006. doi:10.1051/0004-6361:20065476
  33. Thorngren, Daniel P.; Fortney, Jonathan J.; Murray-Clay, Ruth A.; Lopez, Eric D. THE MASS–METALLICITY RELATION FOR GIANT PLANETS. The Astrophysical Journal. doi:10.3847/0004-637x/831/1/64
  34. Lustig-Yaeger, Jacob; Sotzen, Kristin S.; Stevenson, Kevin B.; Luger, Rodrigo; May, Erin M.; Mayorga, L. C.; Mandt, Kathleen; Izenberg, Noam R. Hierarchical Bayesian Atmospheric Retrieval Modeling for Population Studies of Exoplanet Atmospheres: A Case Study on the Habitable Zone. The Astronomical Journal.
  35. Swain, Mark R.; Hasegawa, Yasuhiro; Thorngren, Daniel P.; Roudier, Gaël M. Planet Mass and Metallicity: The Exoplanets and Solar System Connection. Space Science Reviews. doi:10.1007/s11214-024-01098-7
  36. Panek, Emilie; Roman, Alexander; Matcheva, Katia; Matchev, Konstantin T.; Cowan, Nicolas B. 2026, eprint.
  37. The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo. 2011, eprint.