On estimation of the effect lag of predictors and prediction in functional linear model
Pith reviewed 2026-05-24 17:18 UTC · model grok-4.3
The pith
A penalized basis-expansion functional linear model estimates effect lags of predictors via grid search on prediction error.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The coefficient functions are expanded in a fixed basis system, and the basis coefficients are estimated by penalization. The time lags are then determined by a simultaneous search over a pre-specified grid mesh that minimizes a prediction-error criterion. A single procedure therefore both estimates the effect lags and produces response predictions, and the paper studies mathematical properties of the estimated parameters and predicted responses.
What carries the argument
penalized basis-expansion functional linear model with simultaneous grid search over predictor lags based on prediction-error minimization
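The machinery named above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: the ridge penalty, the single train/test split standing in for the prediction-error criterion, the integer lags on a common unit-spaced sampling grid, and every name used here (design, ridge_fit, select_lags, window, lam) are hypothetical choices made for the example.

```python
import itertools
import numpy as np

def design(X_list, lags, basis, window):
    """Basis scores of each predictor over its lagged window.

    X_list : list of (n, T) arrays sampled on a common unit-spaced time grid
    lags   : one integer lag (in grid steps) per predictor
    basis  : (K, window) array of basis functions evaluated on the window
    """
    T = X_list[0].shape[1]
    blocks = []
    for X, lag in zip(X_list, lags):
        stop = T - lag                    # window ends `lag` steps before the last time point
        seg = X[:, stop - window:stop]    # (n, window) slice of the predictor
        blocks.append(seg @ basis.T)      # Riemann-sum approximation of the integrals
    return np.hstack(blocks)              # (n, K * number of predictors)

def ridge_fit(Z, y, lam):
    """Penalized least squares with a ridge penalty (an assumed penalty form)."""
    p = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ y)

def select_lags(X_list, y, grid, basis, window, lam, train_idx, test_idx):
    """Joint grid search: keep the lag combination with the smallest held-out MSE."""
    best = None
    for lags in itertools.product(grid, repeat=len(X_list)):
        Z = design(X_list, lags, basis, window)
        coef = ridge_fit(Z[train_idx], y[train_idx], lam)
        err = float(np.mean((y[test_idx] - Z[test_idx] @ coef) ** 2))
        if best is None or err < best[1]:
            best = (lags, err, coef)
    return best  # (selected lags, held-out MSE, basis coefficients)
```

The search visits every combination of candidate lags, so its cost grows as the grid size raised to the number of predictors; that is the price of selecting lags jointly rather than predictor by predictor.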
If this is right
- The estimated parameters and predicted responses satisfy the mathematical properties derived in the paper.
- Multiple functional and longitudinal predictors can be handled simultaneously under the same penalized criterion.
- Lag selection is performed jointly rather than predictor by predictor.
- Performance can be assessed directly by how well the method recovers known lags and forecasts in simulation settings.
Where Pith is reading between the lines
- The grid-search strategy trades lag resolution for computational tractability, and may require denser meshes or adaptive refinement when lags are expected to be fine-scale.
- Because lag selection is driven by prediction error, the procedure could be sensitive to the choice of basis or penalty when the functional linear assumption is only approximately true.
- The framework naturally extends to settings with irregularly spaced longitudinal observations by incorporating appropriate basis representations for each predictor.
- Real-data applications in longitudinal studies would benefit from comparing the grid-selected lags against domain-knowledge windows to check consistency.
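The adaptive refinement mentioned in the first point above could take the form of a coarse-to-fine search. The sketch below is an assumption of this reading, not something the paper describes: the function name coarse_to_fine, its parameters, and the toy loss are hypothetical, and the scheme is only reliable when the prediction-error profile has a single minimum inside the search interval.

```python
import numpy as np

def coarse_to_fine(loss, lo, hi, n_points=5, n_rounds=3):
    """Evaluate `loss` on an n_points grid, then zoom in around the best point."""
    best = lo
    for _ in range(n_rounds):
        grid = np.linspace(lo, hi, n_points)
        values = np.array([loss(g) for g in grid])
        best = float(grid[np.argmin(values)])
        step = grid[1] - grid[0]
        # Shrink the interval to one grid step on either side of the current best.
        lo, hi = max(lo, best - step), min(hi, best + step)
    return best

# Toy prediction-error profile with its minimum at lag 3.3 (hypothetical).
estimate = coarse_to_fine(lambda d: (d - 3.3) ** 2, 0.0, 10.0)
```

Each round costs only n_points evaluations, so three rounds on five points reach a finer resolution than a single 15-point mesh over the same interval.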
Load-bearing premise
Searching lags on a pre-specified finite grid mesh and selecting by prediction error will recover the true lags or sufficiently close values without the grid being too coarse or the criterion being misled by model misspecification or multiple local minima.
What would settle it
A controlled simulation in which the true lag lies between grid points or outside the mesh and the procedure returns a substantially different lag that produces visibly worse out-of-sample prediction error than the oracle lag.
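A toy version of such a simulation is sketched below. Everything in it is an assumption chosen for illustration (white-noise predictor, cosine basis, ridge penalty, noise level, sample sizes, and all variable names); it is not taken from the paper's simulation study. The candidate grid deliberately omits the true lag, and the held-out error at the grid-selected lag is compared with the error at the oracle lag.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T, window, K, true_lag = 300, 40, 10, 3, 3
u = np.linspace(0.0, 1.0, window)
basis = np.vstack([np.cos(k * np.pi * u) for k in range(K)])  # (K, window)
X = rng.normal(size=(n, T))                                   # one white-noise predictor

def scores(lag):
    """Basis scores of the predictor over the window ending `lag` steps back."""
    stop = T - lag
    return X[:, stop - window:stop] @ basis.T

# Response generated at the true lag, plus small Gaussian noise.
y = scores(true_lag) @ np.array([1.0, -0.5, 0.3]) + 0.1 * rng.normal(size=n)
train_idx, test_idx = np.arange(200), np.arange(200, n)

def heldout_mse(lag, lam=1e-3):
    """Ridge fit on the training rows, mean squared error on the held-out rows."""
    Z = scores(lag)
    A = Z[train_idx].T @ Z[train_idx] + lam * np.eye(K)
    coef = np.linalg.solve(A, Z[train_idx].T @ y[train_idx])
    return float(np.mean((y[test_idx] - Z[test_idx] @ coef) ** 2))

coarse_grid = [0, 2, 4, 6]                      # deliberately omits the true lag 3
grid_lag = min(coarse_grid, key=heldout_mse)    # lag chosen by the grid search
grid_mse = heldout_mse(grid_lag)
oracle_mse = heldout_mse(true_lag)
print(grid_lag, grid_mse, oracle_mse)
```

With a white-noise predictor a one-step lag error is costly, because the misplaced window drops an observation that the response depends on. A smooth predictor would make neighbouring lags nearly interchangeable, so the gap between grid and oracle error depends strongly on the predictor's autocorrelation.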
Original abstract
We propose a functional linear model to predict a response using multiple functional and longitudinal predictors and to estimate the effect lags of predictors. The coefficient functions are written as the expansion of a basis system (e.g. functional principal components, splines), and the coefficients of the fixed basis functions are estimated via optimizing a penalization criterion. Then time lags are determined by simultaneously searching on a prior grid mesh based on minimization of prediction error criterion. Moreover, mathematical properties of the estimated parameters and predicted responses are studied and performance of the method is evaluated by extensive simulations.
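The three steps in the abstract can be written out symbolically. The display below is one plausible formalization, offered as a sketch: the scalar response, the window length \delta_j, the generic penalty P, the criterion PE, and all other symbols are assumptions of this reading, since the abstract does not state the model equation.

```latex
% Assumed model: scalar response, p predictors, lag d_j and window \delta_j per predictor
Y_i = \alpha + \sum_{j=1}^{p} \int_{0}^{\delta_j} \beta_j(s)\, X_{ij}(t_i - d_j - s)\, ds + \varepsilon_i

% Step 1: basis expansion of each coefficient function
\beta_j(s) = \sum_{k=1}^{K_j} b_{jk}\, \phi_{jk}(s)

% Step 2: penalized estimation of the basis coefficients for fixed lags d = (d_1, \dots, d_p)
\hat{b}(d) = \arg\min_{b} \sum_{i=1}^{n} \bigl( Y_i - \hat{Y}_i(b; d) \bigr)^2 + \lambda\, P(b)

% Step 3: joint lag selection over a pre-specified grid mesh G
\hat{d} = \arg\min_{d \in G} \mathrm{PE}\bigl( \hat{b}(d), d \bigr)
```

Written this way, the lag estimate depends on the data through both the inner penalized fit and the outer discrete minimization, which is why properties proved for fixed d do not automatically carry over to the selected lag.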
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a functional linear model for predicting a scalar response from multiple functional/longitudinal predictors. Coefficient functions are expanded in a basis (e.g., FPCs or splines), coefficients are obtained by penalized least squares, and predictor lags are selected simultaneously by grid search over a pre-specified mesh that minimizes a prediction-error criterion. Mathematical properties of the resulting estimators and predictions are derived, and performance is assessed via simulations.
Significance. If the theoretical properties can be shown to hold after data-driven lag selection and if the simulation design covers realistic misspecification, the procedure would supply a practical, simultaneously estimated lag-and-prediction method for multivariate functional regression. The penalized basis-expansion framework itself is standard and the explicit treatment of lags is a useful extension, but the significance is tempered by the absence of supporting theory for the combined estimator.
major comments (3)
- [Mathematical properties section] The consistency and rate results are stated for the penalized basis estimator with fixed lags; once lags are chosen by discrete minimization of prediction error over a finite grid, the data-dependent selection step is not shown to preserve the same rates or to converge to the true lags at a rate compatible with the basis asymptotics.
- [Lag-selection procedure] The claim that simultaneous grid search recovers (near-)true lags rests on the unproven assumption that the prediction-error criterion is immune to local minima and to bias induced by coarse grids or correlated predictors; no supporting argument or additional regularity condition is supplied.
- [Simulation design] The reported experiments do not include the regimes highlighted as weakest (coarse grid mesh, correlated predictors, or model misspecification), so they do not directly test whether the prediction criterion reliably recovers the lags under the conditions where the method is most likely to fail.
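The local-minima concern in the second comment can be checked empirically by inspecting the prediction-error profile over the lag grid for one predictor. The helper below is a hypothetical diagnostic, not part of the paper: it reports the grid positions where the error is strictly lower than at both neighbours, so more than one index signals a multimodal profile.

```python
import numpy as np

def local_minima(errors):
    """Indices where a 1-D error profile is strictly below both neighbours.

    The profile is padded with +inf so that boundary points can qualify.
    """
    e = np.asarray(errors, dtype=float)
    padded = np.concatenate(([np.inf], e, [np.inf]))
    mask = (padded[1:-1] < padded[:-2]) & (padded[1:-1] < padded[2:])
    return np.flatnonzero(mask).tolist()

# Hypothetical held-out errors at five candidate lags.
profile = [3.0, 1.0, 2.0, 0.5, 4.0]
minima = local_minima(profile)
```

When several minima have similar error values, the selected lag is likely to be unstable across resamples, which is worth reporting alongside the point estimate.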
minor comments (2)
- Notation for the penalty parameter and the grid mesh should be introduced once and used consistently; currently the same symbols appear with different meanings in the estimation and selection steps.
- The abstract states that 'mathematical properties … are studied,' yet the manuscript provides only sketches rather than complete proofs; moving the full derivations to an appendix would improve readability.
Simulated Author's Rebuttal
We thank the referee for the detailed and insightful comments on our manuscript. We appreciate the recognition of the practical utility of the proposed method for simultaneous lag estimation and prediction in multivariate functional regression. Below, we provide point-by-point responses to the major comments. We acknowledge several limitations in the current theoretical and simulation results and will revise the manuscript accordingly to clarify these points and strengthen the presentation.
Point-by-point responses
- Referee: [Mathematical properties section] The consistency and rate results are stated for the penalized basis estimator with fixed lags; once lags are chosen by discrete minimization of prediction error over a finite grid, the data-dependent selection step is not shown to preserve the same rates or to converge to the true lags at a rate compatible with the basis asymptotics.
  Authors: We agree with this observation. The mathematical properties, including consistency and convergence rates, are derived assuming the lags are fixed and known. The lag selection via grid search is a practical step, but we do not provide theoretical guarantees for the data-dependent case. In the revised manuscript, we will explicitly note that the theoretical results apply to fixed lags and add a discussion on the potential impact of lag selection on the rates, including conditions (such as a sufficiently fine grid and a unique minimum) under which the rates are expected to remain valid. However, a full proof for the combined estimator is beyond the scope of this work.
  Revision: partial
- Referee: [Lag-selection procedure] The claim that simultaneous grid search recovers (near-)true lags rests on the unproven assumption that the prediction-error criterion is immune to local minima and to bias induced by coarse grids or correlated predictors; no supporting argument or additional regularity condition is supplied.
  Authors: The manuscript does not make a strong claim of always recovering the true lags; it describes the procedure as determining lags by grid search minimizing the prediction error. We recognize that the prediction-error criterion may have local minima or be affected by grid coarseness and predictor correlations, and no regularity conditions are provided to ensure global optimality. In the revision, we will include additional discussion on these potential issues, suggest ways to mitigate them (e.g., multiple starting points or finer grids), and note this as a limitation of the current approach.
  Revision: yes
- Referee: [Simulation design] The reported experiments do not include the regimes highlighted as weakest (coarse grid mesh, correlated predictors, or model misspecification), so they do not directly test whether the prediction criterion reliably recovers the lags under the conditions where the method is most likely to fail.
  Authors: The current simulations evaluate performance under several settings but do not specifically include challenging cases such as coarse grids, highly correlated predictors, or model misspecification. We will expand the simulation section in the revision to incorporate these regimes, providing a more comprehensive assessment of the method's robustness and the reliability of the lag selection procedure.
  Revision: yes
- The provision of complete theoretical results establishing consistency and rates for the estimator after data-driven lag selection, as this would require substantial new theoretical development.
Circularity Check
No circularity: estimation procedure and lag search are algorithmic, not self-referential.
The paper describes a penalized basis-expansion estimator for the functional linear model followed by discrete grid search over lags chosen to minimize a prediction-error criterion. Mathematical properties of the resulting estimates are stated to be studied. No equation or claim in the abstract reduces a derived quantity (prediction or property) to a fitted input by construction, nor invokes a self-citation chain, uniqueness theorem, or ansatz that would make the result tautological. The lag-selection step is an explicit optimization over a pre-specified mesh; its outputs are not defined in terms of themselves. This is a standard estimation algorithm whose claimed properties, if derived, would be external to the fitting procedure itself.
Axiom & Free-Parameter Ledger
free parameters (2)
- penalty parameter
- lag grid mesh
axioms (1)
- domain assumption: Coefficient functions admit expansion in a chosen basis system (FPCs or splines).