Consistent model selection criteria and goodness-of-fit test for affine causal processes
Pith reviewed 2026-05-24 17:17 UTC · model grok-4.3
The pith
Sufficient conditions on the penalty ensure consistent model selection by quasi-likelihood for affine causal processes, but BIC does not always work.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We provide sufficient conditions for the penalty term to ensure the consistency of the proposed procedure as well as the consistency and the asymptotic normality of the quasi-maximum likelihood estimator of the chosen model. It appears from these conditions that the Bayesian Information Criterion (BIC) does not always guarantee consistency. We also propose a tool for diagnosing the goodness-of-fit of the chosen model based on the portmanteau test.
What carries the argument
Penalized quasi-likelihood contrast whose penalty term must satisfy explicit growth conditions to guarantee consistency of selection and of the resulting estimator.
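As a rough illustration of the kind of criterion described here, a penalized quasi-likelihood selection rule can be written in the generic form below. The notation is assumed for exposition and may differ from the paper's: $\mathcal{M}$ is a finite family of candidate models, $\widehat{L}_n$ the quasi-log-likelihood, $\widehat{\theta}(m)$ the quasi-maximum likelihood estimator on model $m$, $|m|$ the model dimension, and $\kappa_n$ the penalty sequence.

```latex
\[
\widehat{m} \;=\; \operatorname*{arg\,min}_{m \in \mathcal{M}}
\Big\{ -2\,\widehat{L}_n\big(\widehat{\theta}(m)\big) \;+\; |m|\,\kappa_n \Big\},
\qquad
\text{BIC: } \kappa_n = \log n,
\quad
\text{AIC: } \kappa_n = 2 .
\]
```

In this form, the paper's sufficient conditions are constraints on how $\kappa_n$ may grow with $n$. The conditions themselves are not reproduced here.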
If this is right
- Model selection is consistent whenever the penalty meets the stated growth conditions.
- The quasi-maximum likelihood estimator computed on the selected model is consistent and asymptotically normal.
- BIC can produce inconsistent order selection for AR(p) models with ARCH(∞) errors.
- The portmanteau statistic supplies an asymptotic test of goodness-of-fit after selection.
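The selection step in the points above can be sketched in a deliberately simplified setting: choosing the order of an AR(p) model by a penalized Gaussian quasi-likelihood with a user-supplied penalty `kappa`. This is a minimal sketch, not the paper's procedure. The function names (`ar_contrast`, `select_ar_order`, `simulate_ar`) are hypothetical, the errors are treated as homoscedastic, and only the AR order is selected.

```python
import numpy as np


def ar_contrast(x, p, p_max):
    """-2 x Gaussian quasi-log-likelihood (up to a constant) of a least-squares AR(p) fit.

    Every order is fitted on the same observations x[p_max:], so the
    contrasts of nested models are directly comparable.
    """
    x = np.asarray(x, dtype=float)
    y = x[p_max:]
    n_eff = y.size
    # Design matrix: intercept plus lags 1..p of the series.
    columns = [np.ones(n_eff)]
    columns += [x[p_max - k : x.size - k] for k in range(1, p + 1)]
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    sigma2 = np.mean((y - design @ coef) ** 2)
    return n_eff * (np.log(sigma2) + 1.0)


def select_ar_order(x, p_max, kappa):
    """Return (selected order, criterion values) for a penalty kappa per AR coefficient."""
    contrasts = np.array([ar_contrast(x, p, p_max) for p in range(p_max + 1)])
    # The intercept and variance are common to all candidates, so only p is penalized.
    criteria = contrasts + kappa * np.arange(p_max + 1)
    return int(np.argmin(criteria)), criteria


def simulate_ar(n, phi, burn=200, seed=0):
    """Simulate a Gaussian AR(len(phi)) path (illustrative data only)."""
    rng = np.random.default_rng(seed)
    phi = np.asarray(phi, dtype=float)
    x = np.zeros(n + burn)
    noise = rng.standard_normal(n + burn)
    for t in range(n + burn):
        past = x[max(0, t - phi.size) : t][::-1]  # x[t-1], x[t-2], ...
        x[t] = phi[: past.size] @ past + noise[t]
    return x[burn:]
```

For example, `select_ar_order(x, 6, np.log(len(x)))` applies a BIC-type penalty, while a faster-growing choice such as `np.sqrt(len(x))` penalizes extra lags more heavily. Which growth rates are admissible is determined by the paper's conditions, not by this sketch.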
Where Pith is reading between the lines
- Practitioners fitting financial series should check that their chosen penalty grows faster than log n when error structures are rich.
- The same penalty conditions could be checked for other contrasts such as least squares or robust likelihoods.
- The non-consistency result for BIC suggests re-examining default software choices for GARCH-type models on real data.
Load-bearing premise
The data must be generated by an affine causal process for which a quasi-likelihood contrast is well-defined and the penalty satisfies the paper's growth conditions.
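For orientation, the class of affine causal processes and the associated Gaussian quasi-likelihood are commonly written as below. This is a sketch in assumed notation ($M_\theta$, $f_\theta$, $H_\theta^t$, $\xi_t$), following the usual definition in the literature on such processes rather than quoting the paper.

```latex
\[
X_t \;=\; M_\theta\big((X_{t-k})_{k \ge 1}\big)\,\xi_t \;+\; f_\theta\big((X_{t-k})_{k \ge 1}\big),
\qquad t \in \mathbb{Z},
\]
% (\xi_t) is an i.i.d. sequence with E[\xi_0] = 0 and E[\xi_0^2] = 1.
\[
L_n(\theta) \;=\; -\frac{1}{2} \sum_{t=1}^{n}
\left[ \frac{\big(X_t - f_\theta^t\big)^2}{H_\theta^t} + \log H_\theta^t \right],
\qquad
H_\theta^t = \big(M_\theta^t\big)^2 ,
\]
% where f_\theta^t and M_\theta^t denote f_\theta and M_\theta evaluated at the past (X_{t-k})_{k \ge 1}.
% Examples:
%   AR(p):        f_\theta^t = \sum_{k=1}^{p} \phi_k X_{t-k},  M_\theta^t = \sigma
%   ARCH(\infty): f_\theta^t = 0,  (M_\theta^t)^2 = a_0 + \sum_{k \ge 1} a_k X_{t-k}^2
```

In practice the infinite past is not observed, so the contrast is computed from a truncated version of $f_\theta^t$ and $M_\theta^t$ based on $X_1, \dots, X_{t-1}$ only.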
What would settle it
A sequence of samples from an AR(p) process with ARCH(∞) errors in which the BIC-selected order fails to converge in probability to the true order as sample size tends to infinity.
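A Monte Carlo check of this kind could start from a simulator like the one sketched here. Everything in it is an assumption made for illustration: the ARCH(∞) recursion is truncated at `q` lags, the ARCH weights decay polynomially with exponent `decay` and are rescaled to sum to `arch_sum`, and the names `simulate_ar_arch` and `selection_frequencies` are hypothetical. The `selector` argument is any user-supplied rule that maps a sample to an integer order between 0 and `p_max`.

```python
import numpy as np


def simulate_ar_arch(n, phi, a0=0.1, arch_sum=0.5, decay=2.0, q=50, burn=500, seed=0):
    """Simulate an AR(len(phi)) path whose errors follow a truncated ARCH(q) recursion.

    Assumed design: ARCH weights a_j proportional to j**(-decay), rescaled so
    that sum(a_j) = arch_sum < 1, which keeps the error variance finite.
    """
    rng = np.random.default_rng(seed)
    phi = np.asarray(phi, dtype=float)
    weights = np.arange(1, q + 1, dtype=float) ** (-decay)
    a = arch_sum * weights / weights.sum()
    total = n + burn
    xi = rng.standard_normal(total)
    e = np.zeros(total)
    x = np.zeros(total)
    for t in range(total):
        past_e2 = e[max(0, t - q) : t][::-1] ** 2  # e[t-1]^2, e[t-2]^2, ...
        sigma2 = a0 + a[: past_e2.size] @ past_e2
        e[t] = np.sqrt(sigma2) * xi[t]
        past_x = x[max(0, t - phi.size) : t][::-1]  # x[t-1], x[t-2], ...
        x[t] = phi[: past_x.size] @ past_x + e[t]
    return x[burn:]


def selection_frequencies(selector, n, phi, n_rep=100, p_max=6):
    """Empirical frequency of each selected order over n_rep independent samples."""
    counts = np.zeros(p_max + 1)
    for rep in range(n_rep):
        sample = simulate_ar_arch(n, phi, seed=rep)
        counts[selector(sample)] += 1
    return counts / n_rep
```

Running `selection_frequencies` with a BIC-based selector for increasing `n` and tracking the frequency assigned to the true order `len(phi)` is one way to probe the claim empirically. A frequency that fails to approach 1 would be in line with the reported non-consistency, although a finite simulation cannot establish it.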
Original abstract
This paper studies the model selection problem in a large class of causal time series models, which includes ARMA or AR($\infty$) processes as well as GARCH or ARCH($\infty$), APARCH, ARMA-GARCH and many other processes. To tackle this issue, we consider a penalized contrast based on the quasi-likelihood of the model. We provide sufficient conditions for the penalty term to ensure the consistency of the proposed procedure as well as the consistency and the asymptotic normality of the quasi-maximum likelihood estimator of the chosen model. It appears from these conditions that the Bayesian Information Criterion (BIC) does not always guarantee consistency. We also propose a tool for diagnosing the goodness-of-fit of the chosen model based on the portmanteau test. Numerical simulations and an illustrative example on the FTSE index are performed to highlight the obtained asymptotic results, including numerical evidence of the non-consistency of the usual BIC penalty for order selection of AR(p) models with ARCH($\infty$) errors.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops a penalized quasi-likelihood procedure for model selection in the broad class of affine causal processes (encompassing ARMA, GARCH, APARCH, ARMA-GARCH and related models). It states sufficient conditions on the penalty term that ensure consistency of the selection procedure together with consistency and asymptotic normality of the post-selection quasi-maximum likelihood estimator. The authors show that the BIC penalty fails to satisfy these conditions in identifiable cases (e.g., AR(p) with ARCH(∞) errors), supply numerical confirmation, and propose a portmanteau test for goodness-of-fit of the selected model, illustrated by simulations and an FTSE-index example.
Significance. If the stated sufficient conditions on the penalty are correctly derived and the BIC counter-example holds, the work supplies a usable theoretical framework for consistent selection and post-selection inference in a large family of dependent processes where standard criteria can fail. The explicit growth conditions, the portmanteau diagnostic, and the reproducible numerical evidence constitute concrete strengths.
minor comments (3)
- [Abstract] The abstract and introduction would benefit from an explicit statement (even a high-level one) of the growth-rate conditions imposed on the penalty term, rather than only the claim that such conditions exist.
- Notation for the affine causal process and the quasi-likelihood contrast should be introduced once, with a single reference to the defining equation, to avoid repeated re-definition across sections.
- The numerical section would be strengthened by reporting the exact sample sizes and replication counts used in the BIC counter-example simulations.
Simulated Author's Rebuttal
We thank the referee for the positive and constructive report, which correctly summarizes the main contributions of the paper. The recommendation for minor revision is noted; we will make the corresponding editorial adjustments in the revised manuscript.
Circularity Check
No significant circularity
The paper derives explicit sufficient conditions on the penalty term that guarantee consistency of the penalized quasi-likelihood procedure and asymptotic normality of the post-selection QMLE for affine causal processes. These conditions are stated directly in the theorems, shown analytically to be violated by BIC in specific cases such as AR(p) with ARCH(∞) errors, and supported by separate numerical simulations. The weakest assumption (DGP belongs to the class where the quasi-likelihood contrast is well-defined) is precisely the setting in which the results are proved, with no reduction of any claimed prediction or consistency result to a fitted parameter, self-citation chain, or definitional equivalence. The portmanteau goodness-of-fit tool is likewise derived from standard residuals without circular input.
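To make the residual-based diagnostic concrete, the sketch below computes the classical Ljung-Box portmanteau statistic from a vector of residuals. It is a stand-in chosen for illustration: the paper's statistic for affine causal processes, and its limiting distribution, may differ. The function name `ljung_box` is hypothetical.

```python
import numpy as np


def ljung_box(residuals, max_lag):
    """Classical Ljung-Box statistic Q and the sample autocorrelations at lags 1..max_lag.

    Q = n (n + 2) * sum_k rho_k^2 / (n - k), where rho_k is the lag-k sample
    autocorrelation of the centered residuals.
    """
    r = np.asarray(residuals, dtype=float)
    n = r.size
    centered = r - r.mean()
    denom = np.sum(centered ** 2)
    lags = np.arange(1, max_lag + 1)
    rho = np.array([np.sum(centered[k:] * centered[:-k]) / denom for k in lags])
    q_stat = n * (n + 2) * np.sum(rho ** 2 / (n - lags))
    return q_stat, rho
```

For independent residuals, Q is usually compared with a chi-square quantile whose degrees of freedom depend on `max_lag` and, after model fitting, on the number of estimated parameters. Applying the same function to squared standardized residuals, as in `ljung_box(res ** 2, 10)`, gives a check aimed at remaining conditional heteroscedasticity.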
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: the observed process belongs to the class of affine causal time series models.
- Domain assumption: a quasi-likelihood contrast can be defined for every candidate model in the class.