arXiv: 2604.09821 · v2 · submitted 2026-04-10 · econ.EM · q-fin.PM · q-fin.ST

Global Persistence, Local Residual Structure: Forecasting Heterogeneous Investment Panels

Oleg Roshka

Pith reviewed 2026-05-10 15:49 UTC · model grok-4.3

classification econ.EM · q-fin.PM · q-fin.ST

keywords panel forecasting · investment dynamics · global-local models · AR(1) persistence · heterogeneous panels · out-of-sample evaluation · residual structure · forecast accuracy

The pith

A global AR(1) model combined with block-specific local residuals raises out-of-sample R-squared on investment panel forecasts from 0.630 to 0.677.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether a two-stage model can better forecast investment behavior when panels mix macro indicators, institutional data, and firm-level ratios. It applies a shared global AR(1) term to capture common persistence across actors and then models the remaining dynamics separately inside data-type or sector blocks. This architecture lifts full-panel out-of-sample R² from 0.630 to 0.677 on a 93-actor US panel, with the gain appearing in all ten test windows and surviving placebo checks. The improvement also holds on a UK/EU panel and a combined sample, but only when cross-sectional variation in autoregressive behavior is present.

Core claim

The paper claims that a two-stage architecture consisting of a global pooled AR(1) for shared persistence and block-specific local models for residual dynamics improves full-panel out-of-sample R² from 0.630 to 0.677 on heterogeneous investment panels, with the gain confirmed in held-out decade testing, cross-regime replication on UK/EU and combined panels, and stratified placebo permutations that isolate the role of data-type partitions.

What carries the argument

The two-stage forecasting architecture that first applies a global pooled AR(1) to capture common persistence and then fits block-specific local models to the remaining residuals.
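A minimal sketch of the two-stage idea, under stated assumptions: the local stage is taken here to be a block-level AR(1) on the stage-one residuals, which is only one possible reading (the Figure 4 caption labels the paper's mixture variant "PCA+ridge"). The function names `fit_ar1`, `fit_two_stage`, and `forecast_next` are hypothetical and do not come from the paper.

```python
import numpy as np

def fit_ar1(y_prev, y_next):
    """OLS fit of y_next = a + rho * y_prev; returns (a, rho)."""
    X = np.column_stack([np.ones_like(y_prev), y_prev])
    coef, *_ = np.linalg.lstsq(X, y_next, rcond=None)
    return coef[0], coef[1]

def fit_two_stage(panel, blocks):
    """panel: (T, N) array of actor series; blocks: length-N array of block labels."""
    # Stage 1: one AR(1) pooled over every actor and period.
    a, rho = fit_ar1(panel[:-1].ravel(), panel[1:].ravel())
    resid = panel[1:] - (a + rho * panel[:-1])  # shape (T-1, N)
    # Stage 2 (assumed form): a separate AR(1) on the residuals of each block.
    local = {}
    for b in np.unique(blocks):
        r = resid[:, blocks == b]
        local[b] = fit_ar1(r[:-1].ravel(), r[1:].ravel())
    return (a, rho), local

def forecast_next(panel, blocks, global_params, local_params):
    """One-step-ahead forecast for every actor from the last two observations."""
    a, rho = global_params
    last_resid = panel[-1] - (a + rho * panel[-2])
    out = a + rho * panel[-1]                   # global persistence term
    for b, (c, phi) in local_params.items():
        m = blocks == b
        out[m] += c + phi * last_resid[m]       # block-specific correction
    return out
```

The global coefficients are estimated once on the stacked panel, so every actor shares the same persistence term; only the residual correction differs by block.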

If this is right

Full-panel out-of-sample R² rises from 0.630 to 0.677 on the main US panel, a gain of +0.047 with a confidence interval of [+0.036, +0.058].
The gain persists when the block partition is fixed on 2005-2014 data and evaluated on unseen 2015-2024 windows.
Similar improvements appear on a UK/EU panel of 109 actors and a joint US+UK/EU panel of 202 actors.
The performance edge requires cross-sectional dispersion in autoregressive coefficients, which data-type partitions reliably produce.
A 146-firm robustness check using CapEx/Assets ratios shows the condition can hold in firm-only samples under suitable ratio choices.
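The scope condition in the list above (gains require cross-sectional dispersion in autoregressive coefficients) can be checked before fitting anything more elaborate. The sketch below estimates a per-actor AR(1) slope and reports how much those slopes vary overall and between blocks. Using the standard deviation as the dispersion measure is an assumption made here for illustration, and the function names are hypothetical.

```python
import numpy as np

def ar1_coefficients(panel):
    """Per-actor AR(1) slope from OLS with intercept; panel is a (T, N) array."""
    x, y = panel[:-1], panel[1:]
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    return (xc * yc).sum(axis=0) / (xc ** 2).sum(axis=0)

def ar_dispersion(panel, blocks):
    """Standard deviation of AR(1) slopes across actors and across block means."""
    rho = ar1_coefficients(panel)
    block_means = np.array([rho[blocks == b].mean() for b in np.unique(blocks)])
    return rho.std(), block_means.std()
```

A between-block figure close to zero would suggest that a block partition has little residual structure to exploit, whatever the labels are.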

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same global-plus-local separation could be applied to other economic panels that combine aggregate and micro data, such as consumption or trade flows.
Because the blocks are defined by observable categories, practitioners can implement the method with standard regression tools rather than complex algorithms.
Alternative block definitions, such as grouping firms by size or age, might further isolate residual heterogeneity in firm-only panels.
More accurate subgroup forecasts could support targeted analysis of investment responses to policy changes across sectors or firm types.

Load-bearing premise

That the chosen blocks correctly isolate the residual heterogeneity, and that local models fitted on those blocks do not overfit when sample sizes per block are modest.

What would settle it

A test in which the proposed data-type or sector blocks are replaced by random assignments and the out-of-sample R² improvement disappears.
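A minimal sketch of such a random-assignment test, assuming the ΔR² for each partition has already been produced by some evaluation routine that is not shown. `random_partition` and `placebo_summary` are hypothetical names, and the p-value convention (adding one to numerator and denominator) is a common Monte Carlo choice rather than something the paper is known to use.

```python
import numpy as np

def random_partition(blocks, rng):
    """Shuffle block labels across actors, keeping every block size unchanged."""
    return rng.permutation(blocks)

def placebo_summary(real_gain, placebo_gains):
    """z-score and Monte Carlo p-value of the real gain against placebo gains."""
    placebo_gains = np.asarray(placebo_gains, dtype=float)
    z = (real_gain - placebo_gains.mean()) / placebo_gains.std(ddof=1)
    # The +1 terms count the real partition itself, so p is never exactly 0.
    p = (1 + np.sum(placebo_gains >= real_gain)) / (1 + placebo_gains.size)
    return z, p
```

Under this convention, 1,000 random partitions with none matching the real gain give p = 1/1001, which is just under 0.001.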

Figures

Figures reproduced from arXiv: 2604.09821 by Oleg Roshka.

**Figure 2.** Architecture comparison. (a) The global pipeline applies one residual model to all …

**Figure 3.** Out-of-sample R² by test year for global always-on augmentation (G1) and the mixture architecture (M2). M2 exceeds G1 in all 10 windows. Horizontal reference lines show pooled-only and per-actor AR(1) baselines.

**Figure 4.** Per-block R² under three architectures: pooled-only (G0), global augmentation (G1), and mixture PCA+ridge (M2). The diversified sector is degraded by global augmentation (G1 < G0); the tech/health block benefits most from local treatment.

**Figure 5.** Distribution of ΔR² (mixture − global) across 1,000 random block partitions of identical sizes (grey histogram) versus the real economic partition (vertical red line at +0.047). Random blocks produce a mean gain of −0.004; no random partition exceeds the real gain. The real gain exceeds the placebo distribution by z = 7.82 (Monte Carlo p < 0.001). The 270-actor result reflects the ratio-type distinction: …

Original abstract

On a 93-actor quarterly panel mixing macro indicators, institutional data, and firm-level investment ratios, global factor augmentation degrades prediction for actor subgroups whose dynamics are misrepresented by the shared basis. A two-stage architecture -- global pooled AR(1) for shared persistence, block-specific local models for residual dynamics -- improves full-panel out-of-sample $R^2$ from 0.630 to 0.677 ($\Delta = +0.047$, CI $[+0.036, +0.058]$, 10/10 windows, placebo $p \leq 0.001$). A held-out decade test (block partition frozen on 2005--2014 data, evaluated on unseen 2015--2024 windows) confirms the gain ($\Delta = +0.050$, 10/10), and a stratified placebo that fixes the macro/firm data-type split and permutes only firm-sector assignments corroborates ($z = 7.25$, $p \leq 0.001$). Cross-regime replication on a 109-actor UK/EU heterogeneous panel ($\Delta = +0.017$, 8/8 windows) and a combined US + UK/EU panel of 202 actors ($\Delta = +0.030$, placebo $z = 9.68$ -- exceeding the original US-only $z = 7.82$) confirms the architecture transfers across regimes. A 146-firm CapEx/Assets robustness check refines the scope condition: the gain depends on cross-sectional dispersion in autoregressive structure, which data-type heterogeneity reliably produces but which is also present in firm-only panels under suitable ratio choices.
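The stratified placebo described in the abstract holds the macro/firm data-type split fixed and permutes only firm-sector assignments. A minimal sketch of that label shuffle follows; the label value `"firm"` and the function name are assumptions made for illustration, and the evaluation of each permuted partition is not shown.

```python
import numpy as np

def stratified_permutation(data_type, sector, rng, firm_label="firm"):
    """Permute sector labels among firm actors only; other actors keep their block."""
    data_type = np.asarray(data_type)
    permuted = np.asarray(sector).copy()   # copy so the input labels are untouched
    is_firm = data_type == firm_label
    permuted[is_firm] = rng.permutation(permuted[is_firm])
    return permuted
```

Because non-firm actors never move, any gain that survives this shuffle cannot be credited to sector labels, which is what lets the test separate the data-type split from the sector split.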

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The two-stage global AR(1) plus local residuals model gives modest, consistent out-of-sample R² gains on heterogeneous investment panels, with validation that mostly addresses overfitting worries.

The paper's main claim is that a pooled AR(1) across the full panel followed by block-specific models on the residuals improves forecast accuracy on mixed macro and firm investment data. On the 93-actor US quarterly panel the out-of-sample R² rises from 0.630 to 0.677, the gain appears in every rolling window, and a placebo that keeps the macro/firm split but shuffles sectors produces a clear z-statistic. A held-out decade test with the partition fixed on earlier data and replications on UK/EU and combined panels show similar though smaller lifts.

Those checks are more thorough than the usual single hold-out in this literature, and the numbers come with intervals rather than point estimates alone. The robustness note on CapEx ratios also clarifies when the pattern appears, which is useful. The architecture itself is not brand new, but the targeted application to investment panels plus the layered validation is the concrete addition.

The soft spot is still the local stage. Some blocks will have limited observations once the 93 actors are split by data type or sector, so even low-parameter local models can fit transient noise. The placebo and time-split tests push against pure overfitting, yet they cannot fully eliminate the possibility that the block definitions were refined until the gains looked stable. If the partitions were chosen after looking at the full sample, the reported deltas would shrink on truly fresh data.

The paper is for econometricians who forecast panels that mix aggregate and disaggregated series and who already worry about subgroup dynamics. A reader working on practical panel forecasting will find the specification and the test sequence easy to adapt or criticize. It is not a broad methodological advance, but the evidence is specific and falsifiable enough to justify sending it to referees rather than rejecting at the desk.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a two-stage architecture for forecasting heterogeneous investment panels: a global pooled AR(1) capturing shared persistence across all actors, followed by block-specific local models fitted to residuals. On a 93-actor quarterly US panel mixing macro, institutional, and firm-level data, the approach raises full-panel out-of-sample R² from 0.630 to 0.677 (Δ = +0.047, CI [+0.036, +0.058]). Supporting evidence includes 10/10 rolling windows, a stratified placebo (permuting firm-sector labels), a held-out decade test with frozen blocks, and replications on a 109-actor UK/EU panel and a combined 202-actor panel. The gain is scoped to settings with cross-sectional dispersion in autoregressive coefficients, which data-type heterogeneity reliably induces.

Significance. If the reported gains survive scrutiny of block construction and local-model complexity, the paper supplies a practical, interpretable alternative to global factor models that can degrade subgroup performance. The consistent out-of-sample improvements, placebo statistics, held-out validation, and cross-regime replications constitute a reasonably strong empirical case. The explicit scope condition tying gains to dispersion in AR structure is a constructive contribution that could guide application in macro-finance panels.

major comments (2)

[Methods (block construction)] The block partitions (data-type or sector) are load-bearing for the local stage. The abstract states that the partition is frozen on 2005–2014 data for the held-out test, but does not indicate whether the initial choice of partitions was pre-specified on economic grounds or selected after inspecting in-sample fit; without this, the ΔR² = +0.047 could partly reflect data-driven block selection rather than genuine residual heterogeneity.
[Two-stage architecture (local stage)] The local models’ specification is not described in sufficient detail to evaluate overfitting risk. With 93 actors partitioned into data-type or sector blocks, some blocks necessarily contain modest time-series length; if the local stage includes multiple lags, intercepts, or covariates, the reported improvement may capture block-specific noise that fails to generalize. The placebo and held-out tests mitigate but do not fully address this, because they evaluate models whose functional form was chosen on the same data regime.

minor comments (2)

[Abstract] The abstract reports '10/10 windows' and '8/8 windows' without defining window length, overlap, or how the 10/10 count is constructed; a brief clarification in the methods or a footnote would improve reproducibility.
[Throughout] Notation for R², confidence intervals, and placebo z-statistics should be standardized across text, tables, and figures to avoid ambiguity about whether intervals are bootstrap or analytic.
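One way a count such as "10/10 windows" could be constructed, assuming one test window per calendar year (Figure 3 reports R² by test year) and a pooled R² measured against the mean of the realized values in each window. Both choices are assumptions, since the window definition is exactly what the first minor comment asks the authors to state; the function names are hypothetical.

```python
import numpy as np

def oos_r2(actual, predicted):
    """Pooled out-of-sample R^2: 1 - SSE / SST, with SST about the mean of actual."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    sse = np.sum((actual - predicted) ** 2)
    sst = np.sum((actual - actual.mean()) ** 2)
    return 1.0 - sse / sst

def window_wins(actual, pred_base, pred_alt, years):
    """Per-year R^2 gain of pred_alt over pred_base, plus the number of winning years."""
    actual, pred_base, pred_alt = map(np.asarray, (actual, pred_base, pred_alt))
    years = np.asarray(years)
    gains = {}
    for yr in np.unique(years):
        m = years == yr
        gains[int(yr)] = oos_r2(actual[m], pred_alt[m]) - oos_r2(actual[m], pred_base[m])
    wins = int(sum(g > 0 for g in gains.values()))
    return gains, wins
```

Reporting the per-year gains alongside the win count makes it visible whether the windows overlap and how large the smallest gain is.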

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. We address the two major comments point by point below, clarifying the pre-specification of blocks and committing to expanded methodological detail on the local stage. All revisions will be incorporated in the next version of the manuscript.

Referee: [Methods (block construction)] The block partitions (data-type or sector) are load-bearing for the local stage. The abstract states that the partition is frozen on 2005–2014 data for the held-out test, but does not indicate whether the initial choice of partitions was pre-specified on economic grounds or selected after inspecting in-sample fit; without this, the ΔR² = +0.047 could partly reflect data-driven block selection rather than genuine residual heterogeneity.

Authors: The data-type partitions (macro indicators, institutional series, and firm-level investment ratios) and sector classifications are pre-specified on substantive economic and data-construction grounds, as laid out in the data section prior to any estimation. These groupings reflect the documented heterogeneity in the panel and are fixed before model fitting or in-sample diagnostics. The held-out decade test freezes the already-chosen partitions using only 2005–2014 information for block assignment, but the initial selection itself is not the result of post-hoc inspection of forecasting fit. We will revise the abstract and Section 3 to state this pre-specification explicitly and to reference the relevant data-section discussion. revision: yes
Referee: [Two-stage architecture (local stage)] The local models’ specification is not described in sufficient detail to evaluate overfitting risk. With 93 actors partitioned into data-type or sector blocks, some blocks necessarily contain modest time-series length; if the local stage includes multiple lags, intercepts, or covariates, the reported improvement may capture block-specific noise that fails to generalize. The placebo and held-out tests mitigate but do not fully address this, because they evaluate models whose functional form was chosen on the same data regime.

Authors: We agree that fuller specification of the local stage is needed for readers to assess degrees of freedom and generalization. The local stage applies a simple block-specific AR(1) to the residuals of the global pooled AR(1), with intercept but no further lags or covariates. This parsimonious form is chosen to isolate residual autocorrelation at the block level. The held-out test (blocks frozen on 2005–2014, evaluated on 2015–2024) and the stratified placebo (which holds the functional form fixed while permuting only labels) provide direct evidence that the gains are not driven by in-sample noise. Nevertheless, we will expand the methods section with the exact local-model equation, report the number of observations and parameters per block, and add a brief robustness check on local-model complexity to address the concern fully. revision: yes
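Written out, the specification described in this simulated response would take roughly the following form. The symbols are assumptions introduced here for illustration: $i$ indexes actors, $t$ quarters, and $b(i)$ is the block containing actor $i$. The Figure 4 caption labels the mixture variant "PCA+ridge", so the paper's actual local stage may be richer than this.

```latex
% Stage 1: global pooled AR(1), shared by all actors
y_{i,t} = \alpha + \rho\, y_{i,t-1} + e_{i,t}

% Stage 2: block-specific AR(1) with intercept on the stage-1 residuals
e_{i,t} = c_{b(i)} + \phi_{b(i)}\, e_{i,t-1} + u_{i,t}

% Combined one-step-ahead forecast
\hat{y}_{i,t+1} = \hat{\alpha} + \hat{\rho}\, y_{i,t}
                + \hat{c}_{b(i)} + \hat{\phi}_{b(i)}\, \hat{e}_{i,t}
```

Under this form the local stage adds two parameters per block, which is the quantity the referee's overfitting concern turns on.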

Circularity Check

0 steps flagged

No significant circularity in the two-stage forecasting architecture

The paper presents an empirical two-stage model (global pooled AR(1) plus block-specific local residuals) whose claimed value is an out-of-sample R² lift, confirmed on held-out decades, stratified placebos that permute only sector labels, and cross-regime replications. These performance metrics are computed on data partitions never used for block definition or parameter fitting, so the reported gains do not reduce by construction to the fitted parameters or to any self-citation. No load-bearing step equates a prediction to its own input via definition, renaming, or imported uniqueness theorem.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The architecture rests on the assumption that investment ratios contain a common AR(1) persistence component plus block-specific residuals; block partitions are treated as given by data-type or sector labels. No new entities are postulated.

free parameters (1)

block partition
The division of the panel into blocks for local modeling is chosen from data-type or sector information and is not derived from the model itself.

axioms (1)

domain assumption: Investment ratios follow a linear AR(1) process with a shared persistence parameter across the full panel.
Invoked to justify the global pooled AR(1) stage.
