pith. the verified trust layer for science.

arxiv: 2509.24401 · v2 · submitted 2025-09-29 · 🌌 astro-ph.EP · astro-ph.GA · astro-ph.IM

Technique-agnostic exoplanet demography for the Roman era -- I. Testing a demography retrieval framework using simulated Kepler-like transit datasets

Pith reviewed 2026-05-18 12:52 UTC · model grok-4.3

classification 🌌 astro-ph.EP · astro-ph.GA · astro-ph.IM
keywords exoplanet demography · Roman Space Telescope · transit detection · microlensing · stellar population synthesis · parameter optimization · Kepler mission · differential evolution

The pith

The TAED framework enables internally consistent exoplanet demographic forecasts for all detection methods by embedding planetary systems in a galactic stellar population model.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a technique-agnostic exoplanet demography framework called TAED for use with the Roman Space Telescope's large haul of planets discovered by microlensing and transits. It uses parameterized demographic distributions placed inside a model of the Galaxy's stellar populations. This setup lets the same model generate predictions for different detection techniques that depend on the positions and motions of stars and planets. Tests on simulated data similar to Kepler transits show that differential evolution can optimize the parameters efficiently and recover them accurately.

Core claim

The TAED forward modelling and retrieval framework uses parameterised model exoplanet demographic distributions to embed planetary systems within a stellar population synthesis model of the Galaxy, enabling internally consistent forecasts to be made for all detection methods that are based on spatio-kinematic system properties. In this paper, as a first test of the TAED framework, we apply it to simulated transit datasets based on the Kepler Data Release 25 to assess parameter recovery accuracy and method scalability for a single large homogeneous dataset. We find that optimisation using differential evolution provides a computationally scalable framework that gives a good balance between computational efficiency and accuracy of parameter recovery.

What carries the argument

TAED forward modelling and retrieval framework that embeds parameterized exoplanet demographic distributions within a stellar population synthesis model of the Galaxy
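
As a rough illustration of what embedding a parameterised planet distribution in a stellar catalogue can look like, the sketch below draws one planet per star from a power law in semi-major axis and weights it by the geometric transit probability R_star / a for a circular orbit. The toy catalogue, the single power-law form, the bounds, and every function name are assumptions made for illustration; none of them is taken from the TAED code or from the stellar catalogues used in the paper.

```python
import random

R_SUN_AU = 0.00465047  # solar radius expressed in astronomical units


def draw_power_law(rng, index, lo, hi):
    """Inverse-CDF draw from p(a) proportional to a**index on [lo, hi] (hypothetical model)."""
    u = rng.random()
    if abs(index + 1.0) < 1e-12:
        return lo * (hi / lo) ** u  # log-uniform special case
    k = index + 1.0
    return (lo**k + u * (hi**k - lo**k)) ** (1.0 / k)


def transit_probability(r_star_rsun, a_au):
    """Geometric transit probability for a circular orbit, capped at 1."""
    return min(1.0, r_star_rsun * R_SUN_AU / a_au)


def expected_transiting(stellar_radii_rsun, index, a_lo=0.02, a_hi=2.0, seed=1):
    """Sum of transit probabilities over one simulated planet per star."""
    rng = random.Random(seed)
    total = 0.0
    for r_star in stellar_radii_rsun:
        a = draw_power_law(rng, index, a_lo, a_hi)
        total += transit_probability(r_star, a)
    return total


# Toy "catalogue": stellar radii in solar units (made-up values).
catalogue = [0.6, 0.8, 1.0, 1.2, 1.5] * 2000
steep = expected_transiting(catalogue, index=-2.0)  # favours close-in planets
flat = expected_transiting(catalogue, index=0.0)    # favours wide orbits
print(f"expected transiting planets: steep={steep:.1f}, flat={flat:.1f}")
```

The point of the sketch is only that the same simulated systems carry the geometry needed by a detection model, so changing the demographic parameter (`index` here) changes the predicted yield.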

Load-bearing premise

The parameterized model exoplanet demographic distributions and the stellar population synthesis model of the Galaxy accurately capture the underlying distributions and selection effects without significant mismatch to real Galactic structure or planet occurrence rates.

What would settle it

A test where the parameters recovered from the simulated Kepler-like datasets significantly deviate from the input values used to create those simulations would show that the retrieval is not accurate.
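
One hedged way to make "significantly deviate" concrete is to express each recovered parameter's offset from its injected value in units of the reported uncertainty. The parameter names, the numbers, and the 3-sigma threshold below are illustrative assumptions, not values from the paper.

```python
def pulls(injected, recovered, sigma):
    """Return (recovered - injected) / sigma for each parameter."""
    return {name: (recovered[name] - injected[name]) / sigma[name] for name in injected}


def failed_parameters(injected, recovered, sigma, threshold=3.0):
    """Names of parameters whose absolute pull exceeds the threshold."""
    all_pulls = pulls(injected, recovered, sigma)
    return sorted(name for name, p in all_pulls.items() if abs(p) > threshold)


# Hypothetical injected and recovered demographic parameters.
injected = {"slope_a": -0.5, "slope_q": -1.0, "log_norm": 0.3}
recovered = {"slope_a": -0.46, "slope_q": -1.45, "log_norm": 0.31}
sigma = {"slope_a": 0.05, "slope_q": 0.10, "log_norm": 0.02}

print(failed_parameters(injected, recovered, sigma))
```

A non-empty list for data simulated from the same model would be the kind of outcome that counts against the retrieval.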

Figures

Figures reproduced from arXiv: 2509.24401 by Akshay Priyadarshi, Alexander P. Stephan, Ali Crisp, Alison Duck, Amber Malpas, Aparna Bhattacharya, Arjun Murlidhar, Casey Lam, Chas Beichman, Chris Brandon, David P. Bennett, Eamonn Kerins, Elisa Quintana, Etienne Bachelet, Farzaneh Zohrabi, Greg Olmschenk, Himanshu Verma, Jay Anderson, Jessica R. Lu, Jessie Christiansen, Jon Hulberg, Jorge Martinez-Palomera, Katarzyna Kruszyńska, Keivan G. Stassun, Kelsey Hoffman, Kylee Carden, Macy J. Huston, Matthew Penny, Michael D. Albrow, Nestor Espinoza, Rachel A. Street, Robert Wilson, Scott Gaudi, Sean Carey, Sean K. Terry, Sebastiano Calchi Novati, Somayeh Khakpash, Stela Ishitani Silva, Susan Mullally, Takahiro Sumi, Valerio Bozza, Weicheng Zang, William DeRocco, Xavier Lesley-Saldaña.

Figure 1. The Kepler focal plane array showing the arrangement of the 21 detector modules (black outlines). The red dots show the locations used for our BGM synthetic stellar catalogues.
Figure 2. The distribution of planet-to-star mass ratio and semi-major axis in the different injected sets listed in …
Figure 3. Cornerplot showing 1-D marginals (diagonals) and 2-D joint posteriors (off-diagonals) of samples of parameters while recovering parameters for Set 1 (broad 𝑎, peaked 𝑞) using Dynesty nested sampling (NS). The multi-dimensional best-fit value is overplotted in blue colour, while the injected parameters are overplotted in green colour. The dashed lines show the upper and lower errors on the best-fit value. H…
Figure 4. Cornerplot showing 1-D marginals (diagonals) and 2-D joint posteriors (off-diagonals) of samples of parameters while recovering parameters for Set 1 using the random uniform sampling method (UR). The multi-dimensional best-fit value is overplotted in blue colour, while the injected parameters are overplotted in green colour. The dashed grey lines show the upper-error and the lower-error on the best-fit val…
Figure 5. Figure showing the comparison of predicted likelihood using the 2-stage machine learning model for Set 1. The column of points on the left-most edge (false positives) of the original likelihood axis and a row of points on the bottom-most edge (false negatives) of the predicted likelihood axis correspond to 2.2% of all data points. These points originate from misclassification in the first stage of the mac…
Figure 6. Cornerplot showing 1-D marginals (diagonals) and 2-D joint posteriors (off-diagonals) of samples of parameters while recovering parameters for Set 1, using the 2-stage machine learning model to predict log-likelihoods. The multi-dimensional best-fit value is overplotted in blue colour, while the injected parameters are overplotted in green colour. The dashed grey lines show the upper-error and the lower-er…
Figure 7. The figure shows the distribution of the injected dataset of exoplanets for Set 1 on the left, and the corresponding recovered planets using differential evolution in the middle and on the right. In the middle panel, we only plot the bins where the injected distribution had non-zero planets. In the rightmost panel, we plot all the bins from the recovered population. The colourmap shows the number of planets on th…
Figure 8. Cornerplot showing 1-D marginals (diagonals) and 2-D joint posteriors (off-diagonals) of samples of parameters while recovering parameters for Set 1 using differential evolution. The multi-dimensional best-fit value is overplotted in blue colour, while the injected parameters are overplotted in green colour. The dashed grey lines show the upper-error and the lower-error on the best-fit value. We have exami…
Figure 9. Cornerplot showing 1-D marginals (diagonals) and 2-D joint posteriors (off-diagonals) of samples while fitting for Kepler population using differential evolution. The multi-dimensional best-fit value is overplotted in blue colour. The dashed grey lines show the upper-error and the lower-error on the best-fit value.
Figure 10. The figure shows the distribution of the projected Kepler observations (KeplerPORTs-corrected counts per bin) on the left, and the corresponding recovered planets using the 7-parameter model in the middle and on the right. In the middle panel, we only plot the bins where the projected distribution had non-zero planets. In the rightmost panel, we plot all the bins from the recovered population. The colourmap shows…
Figure 11. Cornerplot showing 1-D marginals (diagonals) and 2-D joint posteriors (off-diagonals) of samples while fitting for Kepler population on 9-parameter model as defined in Section 6, using differential evolution. The multi-dimensional best-fit value is overplotted in blue colour. The dashed grey lines show the upper-error and the lower-error on the best-fit value.
Figure 12. Cornerplot showing 1-D marginals (diagonals) and 2-D joint posteriors (off-diagonals) of samples while fitting for Kepler population on 11-parameter model as defined in Section 6, using differential evolution. The multi-dimensional best-fit value is overplotted in blue colour. The dashed grey lines show the upper-error and the lower-error on the best-fit value.
Figure 13. The figure shows the distribution of the projected Kepler observations (KeplerPORTs-corrected counts per bin) on the left, and the corresponding recovered planets using the 9-parameter model in the middle and on the right. In the middle panel, we only plot the bins where the projected distribution had non-zero planets. In the rightmost panel, we plot all the bins from the recovered population. The colourmap shows…
Figure 14. The figure shows the distribution of the projected Kepler observations (KeplerPORTs-corrected counts per bin) on the left, and the corresponding recovered planets using the 11-parameter model in the middle and on the right. In the middle panel, we only plot the bins where the projected distribution had non-zero planets. In the rightmost panel, we plot all the bins from the recovered population. The colourmap show…
Figure 15. The figure shows the distribution of predicted 𝑎_min vs observed 𝑎 for all Kepler exoplanets in orange dots, and Solar system planets in purple triangles, calculated using Equation (13). The green squares show the 𝑎_min calculated using Equation (11) for Solar system planets. The red line shows the 𝑦 = 𝑥 line. Ideally, we would want all the orange dots to lie close to, and below the red line.
Figure 16. Cornerplot showing 1-D marginals (diagonals) and 2-D joint posteriors (off-diagonals) of samples while fitting for Kepler population on 7-parameter CPL model using differential evolution. The multi-dimensional best-fit value is overplotted in blue colour. The dashed grey lines show the upper-error and the lower-error on the best-fit value. MNRAS 000, 1–16 (2025)
Figure 17. The figure shows the distribution of the projected Kepler observations on the left, and the corresponding recovered planets using the CPL model (Section 6.1) in the middle and on the right. In the middle panel, we only plot the bins where the projected distribution had non-zero planets. In the rightmost panel, we plot all the bins from the recovered population. The colourmap shows the number of planets on the per…

Original abstract

The Nancy Grace Roman Space Telescope (Roman) will unveil for the first time the full architecture of planetary systems across Galactic distances through the discovery of up to 200,000 cool and hot exoplanets using microlensing and transit detection methods. Roman's huge exoplanet haul, and Galactic reach, will require new methods to leverage the full exoplanet demographic content of the combined microlensing and transit samples, given the different sensitivity bias of the techniques to planet and host properties and Galactic location. We present a framework for technique-agnostic exoplanet demography (TAED) that can allow large, multi-technique exoplanet samples distributed over Galactic distance scales to be combined for demographic studies. Our TAED forward modelling and retrieval framework uses parameterised model exoplanet demographic distributions to embed planetary systems within a stellar population synthesis model of the Galaxy, enabling internally consistent forecasts to be made for all detection methods that are based on spatio-kinematic system properties. In this paper, as a first test of the TAED framework, we apply it to simulated transit datasets based on the Kepler Data Release 25 to assess parameter recovery accuracy and method scalability for a single large homogeneous dataset. We find that optimisation using differential evolution provides a computationally scalable framework that gives a good balance between computational efficiency and accuracy of parameter recovery.
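
To give a feel for the optimisation step the abstract refers to, here is a minimal pure-Python sketch of a rand/1/bin differential evolution loop recovering two parameters of a toy binned power-law model from noiseless counts. The toy likelihood, bin centres, bounds, control settings (`f`, `cr`, population size), and all names are assumptions chosen for illustration; this is not the TAED implementation, and the paper's actual model and optimiser configuration may differ.

```python
import math
import random

BIN_CENTRES = [0.05, 0.1, 0.2, 0.4, 0.8, 1.6]   # hypothetical bins (e.g. in au)
BOUNDS = [(0.0, 4.0), (-3.0, 1.0)]               # (log10 normalisation, slope)
TRUE_PARAMS = (2.0, -1.2)                        # injected values for the toy test


def expected_counts(params):
    """Expected planets per bin for a single power law (toy model)."""
    log_norm, slope = params
    return [10.0**log_norm * x**slope for x in BIN_CENTRES]


def neg_log_like(params, counts):
    """Poisson negative log-likelihood, dropping the parameter-independent term."""
    return sum(m - n * math.log(m) for m, n in zip(expected_counts(params), counts))


def differential_evolution(objective, bounds, pop_size=20, f=0.7, cr=0.9,
                           generations=150, seed=0):
    """Minimise `objective` with a basic rand/1/bin scheme and greedy selection."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(p) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct members, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one mutated coordinate
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if d == forced or rng.random() < cr:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    value = min(hi, max(lo, value))  # clip to the search box
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_cost = objective(trial)
            if trial_cost <= cost[i]:
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


observed = expected_counts(TRUE_PARAMS)  # noiseless "data" from the injected model
best_params, best_cost = differential_evolution(
    lambda p: neg_log_like(p, observed), BOUNDS)
print("recovered:", [round(v, 3) for v in best_params], "injected:", TRUE_PARAMS)
```

Each generation costs one likelihood evaluation per population member, and those evaluations are independent of one another, which is one reason a population-based optimiser can be attractive when each forward model call is expensive.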

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper introduces the Technique-Agnostic Exoplanet Demography (TAED) framework, which embeds parameterized exoplanet demographic distributions within a stellar population synthesis model of the Galaxy. This enables internally consistent forecasts for exoplanet detection methods (transit, microlensing) based on spatio-kinematic properties, with the goal of supporting combined demographic analyses of Roman Space Telescope data. As a first test, the authors apply the framework to simulated Kepler DR25-like transit datasets and report that differential evolution optimization recovers the input parameters with good accuracy while offering a favorable balance of computational efficiency and scalability.

Significance. If the recovery results hold under the reported conditions, the TAED framework would offer a practical route to unified demographic inference across detection techniques with differing selection biases and Galactic reach. The forward-modeling approach with known-truth simulations provides a clear test of internal consistency, and the adoption of differential evolution is a pragmatic strength for scalability on large datasets. This is a useful foundation for subsequent papers extending the method to microlensing and real Roman data.

major comments (1)
  1. [Abstract] Abstract: the claim that differential evolution 'gives a good balance between computational efficiency and accuracy of parameter recovery' is not accompanied by quantitative metrics (e.g., fractional bias, credible-interval coverage, or runtime comparisons against alternative optimizers such as MCMC). Without these numbers or a results table/figure showing them, the central claim of the test remains difficult to evaluate.
minor comments (2)
  1. The manuscript should explicitly state the functional forms and free parameters of the exoplanet demographic distributions (e.g., occurrence-rate slopes, period-radius joint distribution) used in the forward model, ideally with a dedicated methods subsection or table.
  2. A short forward-looking paragraph on how the same framework will be applied to simulated microlensing datasets would strengthen the technique-agnostic framing, even if detailed results are reserved for Paper II.
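
The metrics named in the major comment, fractional bias and credible-interval coverage, can be computed from repeated injection-recovery runs along the following lines. This is a sketch only: the "retrievals" are stand-in Gaussian draws about an assumed true value, not results from the paper, and the function names are hypothetical.

```python
import random
import statistics


def fractional_bias(truth, estimates):
    """Mean of (estimate - truth) / |truth| over repeated retrievals."""
    return statistics.fmean((e - truth) / abs(truth) for e in estimates)


def interval_coverage(truth, intervals):
    """Fraction of (low, high) intervals that contain the true value."""
    return sum(lo <= truth <= hi for lo, hi in intervals) / len(intervals)


# Stand-in for repeated retrievals: Gaussian scatter about the truth (assumed).
rng = random.Random(42)
truth, sigma = -1.2, 0.05
estimates = [rng.gauss(truth, sigma) for _ in range(2000)]
intervals = [(e - sigma, e + sigma) for e in estimates]  # nominal 1-sigma intervals

print(f"fractional bias: {fractional_bias(truth, estimates):+.4f}")
print(f"1-sigma coverage: {interval_coverage(truth, intervals):.3f}")
```

For well-calibrated uncertainties the nominal 1-sigma intervals should contain the injected value roughly 68% of the time; a much lower fraction would suggest the reported errors are too small.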

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their positive assessment of our manuscript and for the constructive comment, which helps us improve the clarity of our central claims. We address the point below and will incorporate revisions in the next version.

  1. Referee: [Abstract] Abstract: the claim that differential evolution 'gives a good balance between computational efficiency and accuracy of parameter recovery' is not accompanied by quantitative metrics (e.g., fractional bias, credible-interval coverage, or runtime comparisons against alternative optimizers such as MCMC). Without these numbers or a results table/figure showing them, the central claim of the test remains difficult to evaluate.

    Authors: We thank the referee for this observation. The main text (Sections 3 and 4) presents the quantitative results of the differential evolution retrieval on the simulated Kepler DR25-like datasets, including direct comparisons of recovered demographic parameters to the known input values and discussion of the optimization's performance on large samples. We agree that the abstract would benefit from explicit metrics to support the stated balance. In the revised manuscript we will update the abstract to include concise quantitative indicators drawn from the existing results (e.g., typical recovery accuracy and computational scaling), while retaining the paper's focus on framework validation rather than a full optimizer benchmark. Direct runtime comparisons against MCMC are not part of the current analysis, as our goal was to demonstrate internal consistency of the TAED forward model with differential evolution; we can expand the methods discussion to justify this choice if the referee considers it helpful.

    revision: yes

Circularity Check

0 steps flagged

No circularity: framework validated via independent simulation benchmarks

The paper describes a TAED forward-modeling framework that embeds parameterized exoplanet demographics inside a Galactic stellar population synthesis model to generate technique-agnostic forecasts. Its central test applies the retrieval pipeline to simulated Kepler DR25-like transit datasets whose inputs are known by construction of the simulation. Parameter recovery accuracy is then measured against those known inputs using differential evolution. This is a standard external benchmark test rather than a self-referential loop; the reported success demonstrates optimizer performance when generative assumptions match exactly, without any quoted equation or claim reducing a prediction to a fitted quantity by definition. No load-bearing self-citations, uniqueness theorems, or ansatzes are invoked in the provided text that would collapse the derivation chain.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The framework rests on parameterized demographic distributions and a Galactic stellar population synthesis model whose accuracy is assumed rather than independently validated in the abstract.

free parameters (1)
  • parameters of the exoplanet demographic distributions
    The model uses parameterized distributions whose specific functional forms and values are fitted or chosen to match simulated data.
axioms (1)
  • domain assumption: The stellar population synthesis model accurately represents the spatial, kinematic, and stellar properties of the Galaxy.
    Invoked to embed planetary systems consistently for all detection methods.

pith-pipeline@v0.9.0 · 6002 in / 1260 out tokens · 25137 ms · 2026-05-18T12:52:28.121228+00:00 · methodology


