
arxiv: 1907.09762 · v1 · pith:FBAMIZS2new · submitted 2019-07-23 · 🧮 math.ST · stat.TH

Consistent model selection criteria and goodness-of-fit test for affine causal processes

Pith reviewed 2026-05-24 17:17 UTC · model grok-4.3

classification 🧮 math.ST stat.TH
keywords model selection · quasi-likelihood · causal time series · consistency · BIC · portmanteau test · affine processes · GARCH

The pith

Sufficient conditions on the penalty ensure consistent model selection by quasi-likelihood for affine causal processes, but BIC does not always work.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a penalized quasi-likelihood method for choosing among a broad family of causal time series models that includes ARMA, GARCH, APARCH, and their combinations. It supplies growth conditions on the penalty that make the selection procedure consistent and that make the quasi-maximum likelihood estimator in the chosen model both consistent and asymptotically normal. The same conditions imply that the usual BIC penalty fails to deliver consistency in some members of the class, for example autoregressive models driven by infinite-order ARCH noise. A portmanteau statistic is introduced to test the fit of the selected model. Simulations and an application to the FTSE index confirm the theoretical claims, including the observed inconsistency of BIC.

Core claim

We provide sufficient conditions for the penalty term to ensure the consistency of the proposed procedure as well as the consistency and the asymptotic normality of the quasi-maximum likelihood estimator of the chosen model. It appears from these conditions that the Bayesian Information Criterion (BIC) does not always guarantee consistency. We also propose a tool for diagnosing the goodness-of-fit of the chosen model based on the portmanteau test.

What carries the argument

Penalized quasi-likelihood contrast whose penalty term must satisfy explicit growth conditions to guarantee consistency of selection and of the resulting estimator.
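
As a rough sketch of what such a contrast looks like, the selection rule can be written as below. The notation is an assumption introduced here for illustration and does not appear in the text above: $\widehat{L}_n$ is the quasi-log-likelihood of the sample, $\widehat{\theta}_m$ the quasi-maximum likelihood estimator in candidate model $m$, $|m|$ the number of parameters of $m$, $\mathcal{M}$ the finite family of candidates, and $\kappa_n$ the penalty level.

```latex
\widehat{m} \;=\; \operatorname*{arg\,min}_{m \in \mathcal{M}}
  \Big\{ -2\,\widehat{L}_n\big(\widehat{\theta}_m\big) \;+\; |m|\,\kappa_n \Big\},
\qquad
\text{BIC: } \kappa_n = \log n, \quad \text{AIC: } \kappa_n = 2 .
```

In this form the choice of criterion reduces to the choice of the sequence $\kappa_n$. The precise growth conditions on $\kappa_n$ are those stated in the paper and are not reproduced here.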

If this is right

  • Model selection is consistent whenever the penalty meets the stated growth conditions.
  • The quasi-maximum likelihood estimator computed on the selected model is consistent and asymptotically normal.
  • BIC can produce inconsistent order selection for AR models with ARCH(∞) errors.
  • The portmanteau statistic supplies an asymptotic test of goodness-of-fit after selection.
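
To make the last point concrete, here is a minimal sketch of a portmanteau-type diagnostic, assuming a plain Box-Pierce-style statistic computed on squared standardized residuals. The function name `portmanteau_squared` and this particular form are illustrative assumptions; the statistic defined in the paper, and its limiting distribution after estimation and selection, may differ.

```python
import numpy as np

def portmanteau_squared(residuals, max_lag):
    """Box-Pierce-type statistic on squared standardized residuals (illustrative)."""
    e2 = np.asarray(residuals, dtype=float) ** 2
    n = e2.size
    centered = e2 - e2.mean()
    denom = np.sum(centered ** 2)
    # Sample autocorrelations of the squared residuals at lags 1..max_lag.
    rho = np.array([
        np.sum(centered[k:] * centered[:-k]) / denom
        for k in range(1, max_lag + 1)
    ])
    # n times the sum of squared autocorrelations.
    stat = n * np.sum(rho ** 2)
    return stat, rho

# Usage on simulated i.i.d. noise, standing in for residuals of a well-specified model.
rng = np.random.default_rng(0)
stat, rho = portmanteau_squared(rng.standard_normal(500), max_lag=5)
```

Large values of `stat` point to leftover dependence in the squared residuals. For residuals computed from estimated parameters, the reference distribution generally needs a correction, so comparing `stat` to a chi-square quantile with `max_lag` degrees of freedom is only a rough guide.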

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Practitioners fitting financial series should check that their chosen penalty grows faster than log n when error structures are rich.
  • The same penalty conditions could be checked for other contrasts such as least squares or robust likelihoods.
  • The non-consistency result for BIC suggests re-examining default software choices for GARCH-type models on real data.

Load-bearing premise

The data must be generated by an affine causal process for which a quasi-likelihood contrast is well-defined and the penalty satisfies the paper's growth conditions.
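
For orientation, the affine causal class is usually written as follows. This is a sketch in assumed notation (none of these symbols appear in the text above): $(\xi_t)$ is an i.i.d. centered noise with unit variance, and $M_\theta$, $f_\theta$ are known functions of the past indexed by a parameter $\theta$.

```latex
X_t \;=\; M_\theta\big((X_{t-k})_{k \ge 1}\big)\,\xi_t \;+\; f_\theta\big((X_{t-k})_{k \ge 1}\big),
\qquad t \in \mathbb{Z}.

% Two members of the class, for illustration:
% AR(p):        f_\theta = \sum_{i=1}^{p} \phi_i X_{t-i}, \qquad M_\theta = \sigma
% ARCH(\infty): f_\theta = 0, \qquad M_\theta = \Big(a_0 + \sum_{k \ge 1} a_k X_{t-k}^2\Big)^{1/2}
```

The conditional mean is carried by $f_\theta$ and the conditional scale by $M_\theta$, which is why ARMA, GARCH and their combinations fit in the same framework.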

What would settle it

A sequence of samples from an AR(p) process with ARCH(∞) errors in which the BIC-selected order fails to converge in probability to the true order as sample size tends to infinity.
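
A minimal simulation sketch of such an experiment is given below, under several loudly stated assumptions: the ARCH(∞) noise is replaced by a finite-order ARCH(1) stand-in, the candidate AR(p) models are fitted by ordinary least squares, and the criterion is the homoscedastic Gaussian form n·log(RSS/n) + p·κ rather than the paper's quasi-likelihood contrast. The function names `simulate_ar_arch` and `select_ar_order`, the parameter values, and the alternative penalty √n are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate_ar_arch(n, phi, a0, a1, rng, burn=500):
    """AR(1) driven by ARCH(1) noise: a finite-order stand-in for ARCH(inf)."""
    x = np.zeros(n + burn)
    eps_prev = 0.0
    for t in range(1, n + burn):
        # Conditional scale depends on the previous innovation.
        eps = np.sqrt(a0 + a1 * eps_prev ** 2) * rng.standard_normal()
        x[t] = phi * x[t - 1] + eps
        eps_prev = eps
    return x[burn:]  # drop the burn-in segment

def select_ar_order(x, max_order, kappa):
    """Minimise n_eff*log(RSS/n_eff) + p*kappa over AR orders p = 0..max_order."""
    y = x[max_order:]  # common estimation sample for every candidate order
    n_eff = y.size
    crit = []
    for p in range(max_order + 1):
        if p == 0:
            resid = y
        else:
            # Columns are the lagged series x_{t-1}, ..., x_{t-p}.
            lags = np.column_stack(
                [x[max_order - j: x.size - j] for j in range(1, p + 1)]
            )
            coef, *_ = np.linalg.lstsq(lags, y, rcond=None)
            resid = y - lags @ coef
        crit.append(n_eff * np.log(np.mean(resid ** 2)) + p * kappa)
    return int(np.argmin(crit))

rng = np.random.default_rng(0)
n = 2000
x = simulate_ar_arch(n, phi=0.5, a0=0.2, a1=0.4, rng=rng)
p_bic = select_ar_order(x, max_order=6, kappa=np.log(n))   # BIC-type penalty
p_alt = select_ar_order(x, max_order=6, kappa=np.sqrt(n))  # heavier penalty (assumed)
```

Repeating the last block over many replications and increasing `n` gives the empirical frequency with which each penalty recovers the true order. A single run such as this one says nothing about consistency either way.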

Figures

Figures reproduced from arXiv: 1907.09762 by Jean-Marc Bardet (SAMM), Kare Kamila (SAMM), William Kengne (THEMA).

Figure 1: Daily closing FTSE 100 index (January 4th, 2010 to December 31st, 2018).
Original abstract

This paper studies the model selection problem in a large class of causal time series models, which includes ARMA or AR($\infty$) processes as well as GARCH or ARCH($\infty$), APARCH, ARMA-GARCH and many other processes. To tackle this issue, we consider a penalized contrast based on the quasi-likelihood of the model. We provide sufficient conditions for the penalty term to ensure the consistency of the proposed procedure as well as the consistency and the asymptotic normality of the quasi-maximum likelihood estimator of the chosen model. It appears from these conditions that the Bayesian Information Criterion (BIC) does not always guarantee consistency. We also propose a tool for diagnosing the goodness-of-fit of the chosen model based on the portmanteau test. Numerical simulations and an illustrative example on the FTSE index are performed to highlight the obtained asymptotic results, including numerical evidence of the non-consistency of the usual BIC penalty for order selection of AR(p) models with ARCH($\infty$) errors.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript develops a penalized quasi-likelihood procedure for model selection in the broad class of affine causal processes (encompassing ARMA, GARCH, APARCH, ARMA-GARCH and related models). It states sufficient conditions on the penalty term that ensure consistency of the selection procedure together with consistency and asymptotic normality of the post-selection quasi-maximum likelihood estimator. The authors show that the BIC penalty fails to satisfy these conditions in identifiable cases (e.g., AR(p) with ARCH(∞) errors), supply numerical confirmation, and propose a portmanteau test for goodness-of-fit of the selected model, illustrated by simulations and an FTSE-index example.

Significance. If the stated sufficient conditions on the penalty are correctly derived and the BIC counter-example holds, the work supplies a usable theoretical framework for consistent selection and post-selection inference in a large family of dependent processes where standard criteria can fail. The explicit growth conditions, the portmanteau diagnostic, and the reproducible numerical evidence constitute concrete strengths.

minor comments (3)
  1. [Abstract] The abstract and introduction would benefit from an explicit statement (even a high-level one) of the growth-rate conditions imposed on the penalty term, rather than only the claim that such conditions exist.
  2. Notation for the affine causal process and the quasi-likelihood contrast should be introduced once, with a single reference to the defining equation, to avoid repeated re-definition across sections.
  3. The numerical section would be strengthened by reporting the exact sample sizes and replication counts used in the BIC counter-example simulations.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive and constructive report, which correctly summarizes the main contributions of the paper. The recommendation for minor revision is noted; we will make the corresponding editorial adjustments in the revised manuscript.

Circularity Check

0 steps flagged

No significant circularity

The paper derives explicit sufficient conditions on the penalty term that guarantee consistency of the penalized quasi-likelihood procedure and asymptotic normality of the post-selection QMLE for affine causal processes. These conditions are stated directly in the theorems, shown analytically to be violated by BIC in specific cases such as AR(p) with ARCH(∞) errors, and supported by separate numerical simulations. The weakest assumption (the data-generating process belongs to the class where the quasi-likelihood contrast is well-defined) is precisely the setting in which the results are proved, with no reduction of any claimed prediction or consistency result to a fitted parameter, self-citation chain, or definitional equivalence. The portmanteau goodness-of-fit tool is likewise derived from standard residuals without circular input.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the data belonging to the affine causal class and on the quasi-likelihood being a valid contrast; no free parameters or invented entities are introduced in the abstract.

axioms (2)
  • domain assumption The observed process belongs to the class of affine causal time series models
    The abstract opens by restricting attention to this class that includes ARMA, GARCH, APARCH and hybrids.
  • domain assumption A quasi-likelihood contrast can be defined for every candidate model in the class
    The penalized contrast is built directly on the quasi-likelihood of the model.

pith-pipeline@v0.9.0 · 5717 in / 1439 out tokens · 32968 ms · 2026-05-24T17:17:10.581041+00:00 · methodology
