LASSO Inference for High Dimensional Predictive Regressions
Pith reviewed 2026-05-23 21:09 UTC · model grok-4.3
The pith
XDlasso corrects both LASSO shrinkage bias and Stambaugh bias in high-dimensional predictive regressions without classifying regressors.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We propose the IVX-desparsified LASSO (XDlasso) that simultaneously eliminates both shrinkage bias and Stambaugh bias. XDlasso does not require prior knowledge about the identities of nonstationary and stationary regressors. We establish the asymptotic properties of XDlasso for hypothesis testing.
What carries the argument
The IVX-desparsified LASSO (XDlasso), which augments desparsified LASSO with an IVX instrument to jointly remove shrinkage and Stambaugh biases.
If this is right
- Standard t-statistics become asymptotically valid for testing individual coefficients after XDlasso estimation.
- The estimator applies directly to predictive regressions that mix stationary and nonstationary predictors.
- No pre-testing or classification of regressors is needed before inference.
- The method supports empirical questions such as earnings-price-ratio predictability of stock returns and unemployment predictability of inflation.
Where Pith is reading between the lines
- The same bias-correction logic could be tested on other high-dimensional time-series settings that mix I(0) and I(1) variables.
- If the local-unit-root modeling assumption holds in practice, XDlasso may allow routine use of penalized estimators in macroeconometric forecasting without separate stationarity checks.
- Extensions to panel or multivariate predictive systems would follow naturally from the current single-equation theory.
Load-bearing premise
Nonstationary regressors follow local unit root processes so that the IVX correction can be combined with the desparsified LASSO adjustment.
What would settle it
Monte Carlo experiments or real-data applications in which XDlasso t-statistics fail to attain correct size or power when the regressors are local-to-unity processes.
read the original abstract
LASSO inflicts shrinkage bias on estimated coefficients, which undermines asymptotic normality and invalidates standard inferential procedures based on the t-statistic. Given cross sectional data, the desparsified LASSO has emerged as a well-known remedy for correcting the shrinkage bias. In the context of high dimensional predictive regression, the desparsified LASSO faces an additional challenge: the Stambaugh bias arising from nonstationary regressors modeled as local unit roots. To restore standard inference, we propose a novel estimator called IVX-desparsified LASSO (XDlasso). XDlasso simultaneously eliminates both shrinkage bias and Stambaugh bias and does not require prior knowledge about the identities of nonstationary and stationary regressors. We establish the asymptotic properties of XDlasso for hypothesis testing, and our theoretical findings are supported by Monte Carlo simulations. Applying our method to real-world applications from the FRED-MD database, we investigate two important empirical questions: (i) the predictability of the U.S. stock returns based on the earnings-price ratio, and (ii) the predictability of the U.S. inflation using the unemployment.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes the IVX-desparsified LASSO (XDlasso) estimator for high-dimensional predictive regressions. It claims that XDlasso simultaneously removes LASSO shrinkage bias (via desparsification) and Stambaugh bias (via IVX instrumentation) for regressors that may include an unknown mix of stationary and local unit root processes, without requiring prior classification of which regressors are nonstationary. Asymptotic normality is established to justify standard t-based inference, with supporting Monte Carlo simulations and two empirical applications to FRED-MD data on stock-return predictability (earnings-price ratio) and inflation predictability (unemployment).
Significance. If the asymptotic result holds for arbitrary unknown subsets of local-to-unity regressors, the contribution would be substantial: it would enable valid post-selection inference in the common setting of high-dimensional macro/finance predictive regressions with mixed persistence, without auxiliary classification steps. The Monte Carlo design and real-data illustrations provide direct evidence of practical utility.
major comments (2)
- [Theoretical results (asymptotic expansion of XDlasso)] The central claim that a single IVX filter (applied uniformly) jointly eliminates Stambaugh bias for any unknown mix of stationary and local-unit-root regressors is load-bearing. The asymptotic expansion must explicitly verify that cross terms between the two persistence classes remain o_p(T^{-1/2}) after the nodewise Lasso precision-matrix step; otherwise the desparsification correction alone does not guarantee the claimed normality. This requires a concrete argument (or counter-example) for the case in which the cardinality and identities of the local-unit-root regressors are unknown.
- [Monte Carlo simulations] The Monte Carlo section reports support for the asymptotic normality claim, but does not specify the exact persistence parameters (c values) used for the local-unit-root regressors, the dimension-to-sample-size ratios, or the rule for declaring a regressor 'nonstationary' in the design. Without these details it is impossible to assess whether the simulations actually probe the mixed-persistence regime that the theory must cover.
minor comments (2)
- Notation for the IVX tuning parameter and the nodewise Lasso penalty should be unified across the theoretical statements and the algorithm box.
- [Empirical applications] The empirical section would benefit from reporting the number of selected regressors and the effective sample size after any trimming, to allow readers to gauge the high-dimensional regime actually encountered.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our manuscript. We address each major comment below and will revise the paper to incorporate the requested clarifications and explicit arguments.
read point-by-point responses
-
Referee: The central claim that a single IVX filter (applied uniformly) jointly eliminates Stambaugh bias for any unknown mix of stationary and local-unit-root regressors is load-bearing. The asymptotic expansion must explicitly verify that cross terms between the two persistence classes remain o_p(T^{-1/2}) after the nodewise Lasso precision-matrix step; otherwise the desparsification correction alone does not guarantee the claimed normality. This requires a concrete argument (or counter-example) for the case in which the cardinality and identities of the local-unit-root regressors are unknown.
Authors: We appreciate the referee's emphasis on this point. Theorem 3.2 and its proof already establish asymptotic normality for arbitrary unknown subsets by showing that IVX filtering eliminates Stambaugh bias uniformly across persistence classes, with the nodewise Lasso applied to the filtered series ensuring the required orthogonality. The cross terms are controlled to o_p(T^{-1/2}) via the moment conditions and the fact that IVX instruments induce similar asymptotic behavior regardless of the original persistence. To address the request for explicit verification, the revised manuscript will add a dedicated lemma in the appendix that directly bounds these cross terms for unknown cardinality and identities of local-to-unity regressors. revision: yes
-
Referee: The Monte Carlo section reports support for the asymptotic normality claim, but does not specify the exact persistence parameters (c values) used for the local-unit-root regressors, the dimension-to-sample-size ratios, or the rule for declaring a regressor 'nonstationary' in the design. Without these details it is impossible to assess whether the simulations actually probe the mixed-persistence regime that the theory must cover.
Authors: We agree that additional specification is needed for reproducibility and to demonstrate coverage of the mixed-persistence setting. The revised Monte Carlo section will explicitly report the persistence parameters (c=0 for stationary regressors and c values in {5,10,20} for local unit roots), the p/T ratios examined (including p/T=0.25 and p/T=0.5), and clarify that XDlasso requires no classification rule while comparison estimators use an ADF-based threshold for benchmarking purposes. These details will confirm that the design probes the relevant regime. revision: yes
Circularity Check
No circularity: XDlasso asymptotics derived from standard desparsified LASSO + IVX combination with independent theory
full rationale
The paper proposes XDlasso by combining desparsified LASSO (to remove shrinkage bias) with IVX filtering (to remove Stambaugh bias) without requiring prior classification of regressors. It states that asymptotic properties for hypothesis testing are established and validated by Monte Carlo simulations plus real-data applications. No quoted steps reduce a claimed prediction or uniqueness result to a fitted parameter, self-citation chain, or definitional tautology. The central claim rests on a new estimator whose properties are derived rather than presupposed by construction. This is the normal non-circular outcome for a methods paper that supplies its own asymptotic expansion and external checks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Nonstationary regressors are modeled as local unit roots
Forward citations
Cited by 1 Pith paper
-
Feature Screening for High-Dimensional Structural Break Predictive Regression
Develops SICS and RCRS screening methods for consistent selection of sparse active predictors and change points in high-dimensional structural break predictive regressions that may involve stationary or cointegrated series.
Reference graph
Works this paper leans on
-
[1]
" write newline "" before.all 'output.state := FUNCTION fin.entry add.period write newline FUNCTION new.block output.state before.all = 'skip after.block 'output.state := if FUNCTION new.sentence output.state after.block = 'skip output.state before.all = 'skip after.sentence 'output.state := if if FUNCTION not #0 #1 if FUNCTION and 'skip pop #0 if FUNCTIO...
-
[2]
Adamek, R., Smeekes, S., and Wilms, I. (2023). Lasso inference for high-dimensional time series. Journal of Econometrics , 235(2), 1114--1143
work page 2023
-
[3]
Babii, A., Ghysels, E., and Striaukas, J. (2022). Machine learning time series regressions with an application to nowcasting. Journal of Business & Economic Statistics , 40(3), 1094--1106
work page 2022
-
[4]
Belloni, A., Chen, D., Chernozhukov, V., and Hansen, C. (2012). Sparse models and methods for optimal instruments with an application to eminent domain. Econometrica , 80(6), 2369--2429
work page 2012
-
[5]
Benati, L. (2015). The long-run P hillips curve: A structural VAR investigation. Journal of Monetary Economics , 76, 15--28
work page 2015
-
[6]
J., Ritov, Y., and Tsybakov, A
Bickel, P. J., Ritov, Y., and Tsybakov, A. B. (2009). Simultaneous analysis of L asso and D antzig selector. The Annals of Statistics , 37(4), 1705--1732
work page 2009
-
[7]
Bykhovskaya, A. and Gorin, V. (2022). Cointegration in large VAR s. The Annals of Statistics , 50(3), 1593--1617
work page 2022
-
[8]
Cai, Z., Chen, H., and Liao, X. (2023). A new robust inference for predictive quantile regression. Journal of Econometrics , 234(1), 227--250
work page 2023
-
[9]
Cai, Z. and Wang, Y. (2014). Testing predictive regression models with nonstationary regressors. Journal of Econometrics , 178, 4--14
work page 2014
-
[10]
Campbell, J. Y. and Yogo, M. (2006). Efficient tests of stock return predictability. Journal of Financial Economics , 81(1), 27--60
work page 2006
-
[11]
C aner, M. and K ock, A. B. (2018). A symptotically honest confidence regions for high dimensional parameters by the desparsified conservative L asso. J ournal of E conometrics , 203(1), 143--168
work page 2018
-
[12]
Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., and Robins, J. (2018). Double/debiased machine learning for treatment and structural parameters . The Econometrics Journal , 21(1), C1--C68
work page 2018
-
[13]
Chernozhukov, V., Escanciano, J. C., Ichimura, H., Newey, W. K., and Robins, J. M. (2022a). Locally robust semiparametric estimation. Econometrica , 90(4), 1501--1535
-
[14]
Chernozhukov, V., Newey, W. K., and Singh, R. (2022b). Automatic debiased machine learning of causal and structural effects. Econometrica , 90(3), 967--1027
-
[15]
Davydov, Y. A. (1968). Convergence of distributions generated by stationary stochastic processes. Theory of Probability & Its Applications , 13(4), 691--696
work page 1968
-
[16]
Demetrescu, M., Georgiev, I., Rodrigues, P. M., and Taylor, A. R. (2023). Extensions to IVX methods of inference for return predictability. Journal of Econometrics , 237(2), 105271
work page 2023
-
[17]
Deshpande, Y., Javanmard, A., and Mehrabi, M. (2023). Online debiasing for adaptively collected high-dimensional data with applications to time series analysis. Journal of the American Statistical Association , 118(542), 1126--1139
work page 2023
-
[18]
Dimand, R. W. and Geanakoplos, J. (2005). Celebrating I rving F isher: The legacy of a great economist. The American Journal of Economics and Sociology , 64(1), 3--vi
work page 2005
-
[19]
Dominguez, K. M., Fair, R. C., and Shapiro, M. D. (1988). Forecasting the depression: Harvard versus Yale . The American Economic Review , (pp.\ 595--612)
work page 1988
-
[20]
Engemann, K. (2020). What is the P hillips curve (and why has it flattened)? Federal Reserve Bank of St. Louis, January , 14
work page 2020
- [21]
-
[22]
Fan, R. and Lee, J. H. (2019). Predictive quantile regressions under persistence and conditional heteroskedasticity. Journal of Econometrics , 213(1), 261--280
work page 2019
-
[23]
Fisher, I. (1925). Our unstable dollar and the so-called business cycle. Journal of the American Statistical Association , 20(150), 179--202
work page 1925
-
[24]
Fisher, I. (1926). A statistical relation between unemployment and price changes. International Labour Review , 13, 785--792
work page 1926
-
[25]
Fisher, I. (1973). I discovered the Phillips curve: ` A statistical relation between unemployment and price changes'. Journal of Political Economy , 81(2, Part 1), 496--502
work page 1973
-
[26]
Fu, W. and Knight, K. (2000). Asymptotics for L asso-type estimators. The Annals of Statistics , 28(5), 1356--1378
work page 2000
-
[27]
Giannone, D., Lenza, M., and Primiceri, G. E. (2021). Economic predictions with big data: The illusion of sparsity. Econometrica , 89(5), 2409--2437
work page 2021
-
[28]
Gold, D., Lederer, J., and Tao, J. (2020). Inference for high-dimensional instrumental variables regression. Journal of Econometrics , 217(1), 79--111
work page 2020
-
[29]
Granger, C. W. and Newbold, P. (1974). Spurious regressions in econometrics. Journal of Econometrics , 2(2), 111--120
work page 1974
-
[30]
Jansson, M. and Moreira, M. J. (2006). Optimal inference in regression models with nearly integrated regressors. Econometrica , 74(3), 681--714
work page 2006
-
[31]
Javanmard, A. and Montanari, A. (2014). Confidence intervals and hypothesis testing for high-dimensional regression. Journal of Machine Learning Research , 15(1), 2869--2909
work page 2014
-
[32]
Koo, B., Anderson, H. M., Seo, M. H., and Yao, W. (2020). High-dimensional predictive regression in the presence of cointegration. Journal of Econometrics , 219(2), 456--477
work page 2020
-
[33]
Kostakis, A., Magdalinos, T., and Stamatogiannis, M. P. (2015). Robust econometric inference for stock return predictability. The Review of Financial Studies , 28(5), 1506--1553
work page 2015
-
[34]
Kostakis, A., Magdalinos, T., and Stamatogiannis, M. P. (2018). Taking stock of long-horizon predictability tests: Are factor returns predictable? Available at SSRN 3284149
work page 2018
-
[35]
Lee, J. H. (2016). Predictive quantile regression with persistent covariates: IVX-QR approach. Journal of Econometrics , 192(1), 105--118
work page 2016
-
[36]
Lee, J. H., Shi, Z., and Gao, Z. (2022). On LASSO for predictive regression. Journal of Econometrics , 229(2), 322--349
work page 2022
- [37]
-
[38]
Liu, X., Long, W., Peng, L., and Yang, B. (2023). A unified inference for predictive quantile regression. Journal of the American Statistical Association , (pp.\ 1--15)
work page 2023
-
[39]
Liu, X., Yang, B., Cai, Z., and Peng, L. (2019). A unified test for predictability of asset returns regardless of properties of predicting variables. Journal of Econometrics , 208(1), 141--159
work page 2019
-
[40]
Magdalinos, T. and Phillips, P. C. (2009). Limit theory for cointegrated systems with moderately integrated and moderately explosive regressors. Econometric Theory , 25(2), 482--526
work page 2009
-
[41]
Mankiw, N. G. (2024). Six beliefs I have about inflation: Remarks prepared for nber conference on ``inflation in the covid era and beyond''. Journal of Monetary Economics , (pp.\ 103631)
work page 2024
-
[42]
McCracken, M. W. and Ng, S. (2016). FRED-MD : A monthly database for macroeconomic research. Journal of Business & Economic Statistics , 34(4), 574--589
work page 2016
-
[43]
Medeiros, M. C., Vasconcelos, G. F., Veiga, \'A ., and Zilberman, E. (2021). Forecasting inflation in a data-rich environment: the benefits of machine learning methods. Journal of Business & Economic Statistics , 39(1), 98--119
work page 2021
-
[44]
Mei, Z., Phillips, P. C., and Shi, Z. (2024). The boosted hodrick-prescott filter is more general than you might think. Journal of Applied Econometrics
work page 2024
-
[45]
Mei, Z. and Shi, Z. (2024). On LASSO for high dimensional predictive regression. Journal of Econometrics , 242(2), 105809
work page 2024
-
[46]
Onatski, A. and Wang, C. (2018). Alternative asymptotics for cointegration tests in large VAR s. Econometrica , 86(4), 1465--1478
work page 2018
-
[47]
Phillips, A. W. (1958). The relation between unemployment and the rate of change of money wage rates in the united kingdom, 1861-1957. Economica , 25(100), 283--299
work page 1958
-
[48]
Phillips, P. C. (2015). Halbert White Jr. memorial JFEC lecture: Pitfalls and possibilities in predictive regression. Journal of Financial Econometrics , 13(3), 521--555
work page 2015
-
[49]
Phillips, P. C. and Lee, J. H. (2013). Predictive regression under various degrees of persistence and robust long-horizon regression. Journal of Econometrics , 177(2), 250--264
work page 2013
-
[50]
Phillips, P. C. and Lee, J. H. (2016). Robust econometric inference with mixed integrated and mildly explosive regressors. Journal of Econometrics , 192(2), 433--450
work page 2016
-
[51]
Phillips, P. C. and Magdalinos, T. (2007). Limit theory for moderate deviations from a unit root. Journal of Econometrics , 136(1), 115--130
work page 2007
-
[52]
Phillips, P. C. and Magdalinos, T. (2009). Econometric inference in the vicinity of unity. Singapore Management University, CoFie Working Paper , 7
work page 2009
-
[53]
Phillips, P. C. and Shi, Z. (2021). Boosting: Why you can use the HP filter. International Economic Review , 62(2), 521--570
work page 2021
-
[54]
Shi, Z. (2016). Estimation of sparse structural parameters with many endogenous variables. Econometric Reviews , 35(8-10), 1582--1608
work page 2016
-
[55]
Smeekes, S. and Wijler, E. (2018). Macroeconomic forecasting using penalized regression methods. International Journal of Forecasting , 34(3), 408--430
work page 2018
-
[56]
Smeekes, S. and Wijler, E. (2021). An automated approach towards sparse single-equation cointegration modelling. Journal of Econometrics , 221(1), 247--276
work page 2021
-
[57]
Stambaugh, R. F. (1999). Predictive regressions. Journal of Financial Economics , 54(3), 375--421
work page 1999
-
[58]
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society Series B: Statistical Methodology , 58(1), 267--288
work page 1996
-
[59]
Tu, Y. and Xie, X. (2023). Penetrating sporadic return predictability. Journal of Econometrics , 237(1), 105509
work page 2023
-
[60]
van de Geer, S., B \"u hlmann, P., Ritov, Y., and Dezeure, R. (2014). On asymptotically optimal confidence regions and tests for high-dimensional models. The Annals of Statistics , 42(3), 1166--1202
work page 2014
-
[61]
Xu, K.-L. (2020). Testing for multiple-horizon predictability: Direct regression based versus implication based. The Review of Financial Studies , 33(9), 4403--4443
work page 2020
-
[62]
Yang, B., Liu, X., Peng, L., and Cai, Z. (2021). Unified tests for a dynamic predictive regression. Journal of Business & Economic Statistics , 39(3), 684--699
work page 2021
-
[63]
Yang, B., Long, W., Peng, L., and Cai, Z. (2020). Testing the predictability of us housing price index returns based on an IVX-AR model. Journal of the American Statistical Association , 115(532), 1598--1619
work page 2020
-
[64]
Zhang, C.-H. and Zhang, S. S. (2014). Confidence intervals for low dimensional parameters in high dimensional linear models. Journal of the Royal Statistical Society Series B: Statistical Methodology , 76(1), 217--242
work page 2014
-
[65]
Zhang, R., Robinson, P., and Yao, Q. (2019). Identifying cointegration by eigenanalysis. Journal of the American Statistical Association , 114(526), 916--927
work page 2019
-
[66]
Zhang, X. and Cheng, G. (2017). Simultaneous inference for high-dimensional linear models. Journal of the American Statistical Association , 112(518), 757--768
work page 2017
-
[67]
Zhu, F., Cai, Z., and Peng, L. (2014). Predictive regressions for macroeconomic data . The Annals of Applied Statistics , 8(1), 577 -- 594
work page 2014
-
[68]
Zhu, Y. (2018). Sparse linear models and _1 -regularized 2SLS with high-dimensional endogenous regressors and instruments. Journal of Econometrics , 202(2), 196--213
work page 2018
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.