Nested Sampling for ARIMA Model Selection in Astronomical Time-Series Analysis
Pith reviewed 2026-05-17 02:18 UTC · model grok-4.3
The pith
Nested sampling computes Bayesian evidence to select optimal ARIMA orders for astronomical time series.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Integrating ARIMA models with nested sampling produces Bayesian evidences for model comparison across AR and MA orders while automatically incorporating an Occam's penalty for extra complexity, and the resulting framework enables both reliable order selection and parameter inference for astronomical time series.
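For reference, the evidence and the Occam penalty named in this claim can be written in standard Bayesian notation. The symbols used here ($\mathcal{Z}$, $\mathcal{L}$, $\pi$, $\theta$, $M_{p,q}$, $D$) are notation assumed for this sketch and are not taken from the paper:

```latex
\mathcal{Z}_{p,q} = \int \mathcal{L}(D \mid \theta, M_{p,q}) \, \pi(\theta \mid M_{p,q}) \, \mathrm{d}\theta,
\qquad
\ln B_{12} = \ln \mathcal{Z}_{1} - \ln \mathcal{Z}_{2}
```

Because the evidence averages the likelihood over the prior, additional AR or MA coefficients that do not improve the fit spread prior mass over low-likelihood regions and lower $\mathcal{Z}$. That averaging is the usual source of the Occam penalty in evidence-based model comparison.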
What carries the argument
Nested sampling algorithm used to evaluate Bayesian evidence for ARIMA likelihoods over grids of model orders.
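To make the mechanism concrete, here is a minimal sketch of nested sampling on a toy one-dimensional problem, not the paper's vectorized GPU implementation. It assumes a uniform prior on [-5, 5] and a unit Gaussian likelihood, so the true evidence is close to 0.1. The function names, the rejection-sampling replacement step, and all default settings are illustrative assumptions.

```python
import math
import random


def log_likelihood(x):
    # Toy likelihood (assumption): standard normal density evaluated at x.
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)


def logaddexp(a, b):
    # Numerically stable log(exp(a) + exp(b)).
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def nested_sampling_log_evidence(n_live=100, n_iter=800, lo=-5.0, hi=5.0, seed=0):
    rng = random.Random(seed)
    # Draw the initial live points from the uniform prior.
    live = [rng.uniform(lo, hi) for _ in range(n_live)]
    live_logl = [log_likelihood(x) for x in live]
    log_z = -math.inf
    log_x_prev = 0.0  # log of the enclosed prior volume, starting at X = 1
    for i in range(1, n_iter + 1):
        worst = min(range(n_live), key=live_logl.__getitem__)
        logl_min = live_logl[worst]
        # Expected shrinkage of the prior volume: X_i = exp(-i / n_live).
        log_x = -i / n_live
        # Shell weight: log(X_prev - X_i).
        log_w = log_x_prev + math.log1p(-math.exp(log_x - log_x_prev))
        log_z = logaddexp(log_z, logl_min + log_w)
        log_x_prev = log_x
        # Replace the worst point by rejection sampling from the prior
        # subject to the hard constraint L > L_min.
        while True:
            x = rng.uniform(lo, hi)
            if log_likelihood(x) > logl_min:
                break
        live[worst] = x
        live_logl[worst] = log_likelihood(x)
    # Add the contribution of the remaining live points.
    log_w_live = log_x_prev - math.log(n_live)
    for ll in live_logl:
        log_z = logaddexp(log_z, ll + log_w_live)
    return log_z
```

With these settings the estimate should land near ln(0.1) ≈ -2.30, up to the usual sampling scatter. Rejection sampling from the prior is only practical in toy problems like this one. Production samplers use smarter constrained-sampling moves, which this sketch does not attempt to reproduce.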
Load-bearing premise
Astronomical time series are adequately described by linear ARIMA processes whose orders can be reliably distinguished by Bayesian evidence from nested sampling without significant model misspecification or sampling failures.
What would settle it
Generating simulated time series from a known ARIMA order and showing that the method repeatedly selects a different order or fails to recover the parameters would falsify the recovery claim.
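The kind of test described above starts from a series with a known generating order. The following is a minimal sketch of such a generator using only the Python standard library. The function name `simulate_arima`, the burn-in length, and the sign convention for the MA terms are assumptions made for illustration, and the caller is responsible for passing stationary AR coefficients.

```python
import random


def simulate_arima(phi, theta, d=0, n=500, sigma=1.0, seed=0, burn_in=200):
    """Simulate y_t = sum_i phi[i]*y_{t-1-i} + sum_j theta[j]*e_{t-1-j} + e_t,
    then integrate (cumulative sum) d times to obtain an ARIMA(p, d, q) series."""
    rng = random.Random(seed)
    p, q = len(phi), len(theta)
    y, eps = [], []
    for t in range(n + burn_in):
        e = rng.gauss(0.0, sigma)  # Gaussian innovation
        ar = sum(phi[i] * y[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma = sum(theta[j] * eps[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        y.append(ar + ma + e)
        eps.append(e)
    series = y[burn_in:]  # discard the start-up transient
    for _ in range(d):
        acc, integrated = 0.0, []
        for v in series:
            acc += v
            integrated.append(acc)
        series = integrated
    return series
```

Running an order-selection method on many such series, each drawn with a different seed, and counting how often the known (p, d, q) is selected gives the falsification test a concrete form.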
Original abstract
The upcoming era of large-scale, high-cadence astronomical surveys demands efficient and robust methods for time-series analysis. ARIMA models provide a versatile parametric description of stochastic variability in this context. However, their practical use is limited by the challenge of selecting optimal model orders while avoiding overfitting. We present a novel solution to this problem using a Bayesian framework for time-series modelling in astronomy by combining Autoregressive Integrated Moving Average (ARIMA) models with the Nested Sampling algorithm. Our method yields Bayesian evidences for model comparison and also incorporates an intrinsic Occam's penalty for unnecessary model complexity. A vectorized ARIMA-Nested Sampling framework with GPU-acceleration support is implemented, allowing us to perform model selection across grids of Autoregressive (AR) and Moving Average (MA) orders, with efficient inference of selected model parameters. We validate the approach using simulated time series with known ground-truth parameters and demonstrate accurate recovery of both model order and parameters. We then apply the method to several astronomical datasets, including the historical sunspot number record, stellar light curves of KIC 12008916 and Kepler 17 from the Kepler mission, and quasar light curves of 3C 273 and S4 0954+65 from the TESS mission. In all cases, the ARIMA models selected by this method were able to accurately model the stochastic variability in the time series data. Our results demonstrate that nested sampling offers a rigorous and computationally tractable alternative to autoregressive model selection in astronomical time-series analysis.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes combining ARIMA models with nested sampling to compute Bayesian evidences for selecting optimal orders (p, d, q) in astronomical time-series analysis. It implements a vectorized, GPU-accelerated framework that yields model evidences incorporating an Occam's penalty, validates the approach on simulated data with known ground-truth parameters, and applies it to real datasets including the sunspot record, Kepler light curves (KIC 12008916, Kepler 17), and TESS quasar light curves (3C 273, S4 0954+65), claiming accurate recovery of orders/parameters in simulations and plausible fits on observations.
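As a sketch of how evidence-based order selection over a grid might work once the log-evidences are in hand, the helper below picks the order with the largest ln Z and converts the grid into posterior model probabilities under equal prior odds. The function name `select_order` and the numbers in the usage example are made up for illustration and do not come from the paper.

```python
import math


def select_order(log_evidence):
    """log_evidence maps an order tuple such as (p, q) to its ln Z.

    Returns the best order, posterior model probabilities under equal
    prior odds, and ln Bayes factors of the best model against each model."""
    best = max(log_evidence, key=log_evidence.get)
    top = log_evidence[best]
    # Subtract the maximum before exponentiating to avoid underflow.
    norm = sum(math.exp(v - top) for v in log_evidence.values())
    probs = {k: math.exp(v - top) / norm for k, v in log_evidence.items()}
    log_bayes = {k: top - v for k, v in log_evidence.items()}
    return best, probs, log_bayes


# Hypothetical log-evidences for three candidate (p, q) orders.
grid = {(1, 0): -105.2, (1, 1): -101.0, (2, 1): -102.3}
best, probs, log_bayes = select_order(grid)
```

In this invented example the (1, 1) model wins, and the extra AR term in (2, 1) costs about 1.3 in ln Z despite being a superset of the winning model.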
Significance. If the central claim holds, the work would provide a computationally tractable Bayesian tool for ARIMA order selection in high-cadence surveys, unifying evidence-based model comparison with parameter inference. The GPU support and application to diverse real datasets are strengths; however, the absence of quantitative validation metrics and independent evidence cross-checks reduces the assessed significance.
major comments (2)
- [§4] §4 (simulation validation): the claim of 'accurate recovery of both model order and parameters' supplies no quantitative metrics (e.g., recovery fraction, RMSE on parameters, or evidence error bars across realizations), leaving the support for the central claim of reliable order distinction unquantified.
- [§3] §3 (nested sampling implementation): the reported Bayesian evidences for higher-order ARMA models are not cross-validated against an independent estimator such as bridge sampling or thermodynamic integration on the same series; given the high-dimensional, correlated parameter spaces of ARIMA models, this verification is load-bearing for trusting the evidence-based order selection.
minor comments (2)
- [§2] The abstract and method sections would benefit from explicit reference to standard ARIMA likelihood formulations (e.g., the recursive residual or Toeplitz covariance approaches) to clarify how the nested sampling likelihood is constructed.
- [§5] Figure captions for the real-data fits should include the selected (p,d,q) orders and the corresponding log-evidence values for direct comparison.
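The first minor comment mentions recursive-residual likelihood formulations. As one hedged illustration, the sketch below computes a conditional sum-of-squares Gaussian log-likelihood for an ARIMA(p, d, q) model, with pre-sample values and residuals set to zero. This is a common approximation and not necessarily the likelihood the paper uses. The function names are assumptions.

```python
import math


def difference(y, d):
    # Apply first differencing d times.
    for _ in range(d):
        y = [b - a for a, b in zip(y, y[1:])]
    return list(y)


def arima_css_loglike(y, phi, theta, sigma, d=0):
    """Conditional Gaussian log-likelihood from recursive residuals
    e_t = z_t - sum_i phi[i]*z_{t-1-i} - sum_j theta[j]*e_{t-1-j},
    where z is the d-times differenced series and pre-sample terms are zero."""
    z = difference(y, d)
    p, q = len(phi), len(theta)
    resid = []
    for t in range(len(z)):
        ar = sum(phi[i] * z[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma = sum(theta[j] * resid[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        resid.append(z[t] - ar - ma)
    n = len(resid)
    sse = sum(e * e for e in resid)
    return -0.5 * n * math.log(2.0 * math.pi * sigma ** 2) - 0.5 * sse / sigma ** 2
```

An exact likelihood would instead use the full Toeplitz covariance of the stationary process or a state-space recursion. The conditional form trades a small start-up bias for a simple loop that is easy to vectorize.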
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments on our manuscript. We have carefully considered each point and revised the paper accordingly to strengthen the presentation of our results.
- Referee: [§4] §4 (simulation validation): the claim of 'accurate recovery of both model order and parameters' supplies no quantitative metrics (e.g., recovery fraction, RMSE on parameters, or evidence error bars across realizations), leaving the support for the central claim of reliable order distinction unquantified.
  Authors: We agree that quantitative metrics are necessary to rigorously support the claim of accurate recovery. In the revised manuscript we have expanded §4 to include the fraction of simulations in which the ground-truth order (p, d, q) is correctly recovered, the root-mean-square error on the inferred AR and MA coefficients across realizations, and the standard deviation of the log-evidence values computed from repeated nested-sampling runs. These additions provide a clearer, numerical demonstration of the method’s reliability.
  Revision: yes
- Referee: [§3] §3 (nested sampling implementation): the reported Bayesian evidences for higher-order ARMA models are not cross-validated against an independent estimator such as bridge sampling or thermodynamic integration on the same series; given the high-dimensional, correlated parameter spaces of ARIMA models, this verification is load-bearing for trusting the evidence-based order selection.
  Authors: We acknowledge that independent cross-validation of the evidence estimates would increase confidence, especially in the correlated, high-dimensional parameter spaces of ARIMA models. Performing bridge sampling or thermodynamic integration on the same series would require substantial additional implementation and compute time that lies outside the scope of the present work. We have therefore added an explicit discussion of this limitation in the revised text, while retaining the simulation-based validation (recovery of known ground-truth parameters and orders) as the primary empirical support for the reliability of the nested-sampling evidences. We believe this approach is sufficient for the current contribution but agree that future studies could usefully include such cross-checks.
  Revision: partial
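The three metrics named in the first response (recovery fraction, coefficient RMSE, and log-evidence scatter across repeated runs) are simple to compute once per-realization results are collected. The following is a minimal sketch. The function names and input layouts are assumptions, not the authors' code.

```python
import math
import statistics


def recovery_fraction(true_order, selected_orders):
    # Fraction of realizations in which the selected order equals the truth.
    hits = sum(order == true_order for order in selected_orders)
    return hits / len(selected_orders)


def coefficient_rmse(true_coeffs, estimates):
    # Root-mean-square error pooled over all coefficients and realizations.
    # Each entry of `estimates` is one realization's coefficient list.
    squared = [
        (est[i] - true_coeffs[i]) ** 2
        for est in estimates
        for i in range(len(true_coeffs))
    ]
    return math.sqrt(sum(squared) / len(squared))


def log_evidence_scatter(log_z_runs):
    # Sample standard deviation of ln Z over repeated runs on the same series.
    return statistics.stdev(log_z_runs)
```

The RMSE is only meaningful for realizations where the selected order matches the true one, since otherwise the coefficient vectors differ in length. Filtering on that condition before calling `coefficient_rmse` is left to the caller.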
Circularity Check
No circularity: standard nested sampling applied to standard ARIMA likelihoods with external validation on simulations
The paper combines established ARIMA likelihoods with nested sampling to compute Bayesian evidences for model-order selection. Validation proceeds by generating simulated time series with known ground-truth orders and parameters, then recovering both via the method; this constitutes an independent check against external benchmarks rather than any self-referential reduction. No equations are presented that define the evidence or selected orders in terms of the fitted parameters themselves, nor does any load-bearing step rely on a self-citation chain that is itself unverified. The derivation is therefore self-contained: the inputs are the standard ARIMA model class and the standard nested-sampling algorithm, while the outputs (evidences and order selections) are computed quantities that can be falsified by the simulation recovery tests.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Astronomical time series can be adequately modeled as ARIMA processes after appropriate differencing
- standard math: Nested sampling correctly computes the Bayesian evidence for ARIMA models of varying orders
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation · washburn_uniqueness_aczel
  Tag: unclear (relation between the paper passage and the cited Recognition theorem)
  Paper passage: "Our method yields Bayesian evidences for model comparison and also incorporates an intrinsic Occam’s penalty for unnecessary model complexity."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.