On estimation of the effect lag of predictors and prediction in functional linear model

Georgios Aivaliotis; Haiyan Liu; Jeanine Houwing-Duistermaat

arxiv: 1907.09808 · v1 · pith:E6CRBMXZnew · submitted 2019-07-23 · 📊 stat.ME · stat.CO

On estimation of the effect lag of predictors and prediction in functional linear model

Haiyan Liu , Georgios Aivaliotis , Jeanine Houwing-Duistermaat This is my paper

Pith reviewed 2026-05-24 17:18 UTC · model grok-4.3

classification 📊 stat.ME stat.CO

keywords functional linear modeleffect lag estimationbasis expansionpenalized estimationprediction errorgrid searchlongitudinal predictors

0 comments

The pith

A penalized basis-expansion functional linear model estimates effect lags of predictors via grid search on prediction error.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a functional linear model to predict a scalar response from multiple functional and longitudinal predictors while also recovering the time lags at which each predictor influences the response. Coefficient functions are expressed as expansions in a chosen basis system such as functional principal components or splines; the basis coefficients are obtained by minimizing a penalized criterion. Time lags are then located by simultaneously evaluating a discrete grid mesh and retaining the combination that yields the lowest prediction error. Mathematical properties of the resulting parameter estimates and predictions are derived, and the procedure is tested in extensive simulations.

Core claim

By expanding coefficient functions in a fixed basis system, estimating the basis coefficients via penalization, and determining time lags through simultaneous search on a prior grid mesh that minimizes a prediction-error criterion, the model simultaneously estimates effect lags and produces response predictions, with the estimated parameters and predicted responses satisfying studied mathematical properties.

What carries the argument

penalized basis-expansion functional linear model with simultaneous grid search over predictor lags based on prediction-error minimization

If this is right

The estimated parameters and predicted responses satisfy the mathematical properties derived in the paper.
Multiple functional and longitudinal predictors can be handled simultaneously under the same penalized criterion.
Lag selection is performed jointly rather than predictor by predictor.
Performance can be assessed directly by how well the method recovers known lags and forecasts in simulation settings.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The grid-search strategy trades off exhaustive search for computational tractability but may require denser meshes or adaptive refinement when lags are expected to be fine-scale.
Because lag selection is driven by prediction error, the procedure could be sensitive to the choice of basis or penalty when the functional linear assumption is only approximately true.
The framework naturally extends to settings with irregularly spaced longitudinal observations by incorporating appropriate basis representations for each predictor.
Real-data applications in longitudinal studies would benefit from comparing the grid-selected lags against domain-knowledge windows to check consistency.

Load-bearing premise

Searching lags on a pre-specified finite grid mesh and selecting by prediction error will recover the true lags or sufficiently close values without the grid being too coarse or the criterion being misled by model misspecification or multiple local minima.

What would settle it

A controlled simulation in which the true lag lies between grid points or outside the mesh and the procedure returns a substantially different lag that produces visibly worse out-of-sample prediction error than the oracle lag.

Figures

Figures reproduced from arXiv: 1907.09808 by Georgios Aivaliotis, Haiyan Liu, Jeanine Houwing-Duistermaat.

**Figure 2.** Figure 2: One simulation result: The first left above plot is the true [PITH_FULL_IMAGE:figures/full_fig_p015_2.png] view at source ↗

read the original abstract

We propose a functional linear model to predict a response using multiple functional and longitudinal predictors and to estimate the effect lags of predictors. The coefficient functions are written as the expansion of a basis system (e.g. functional principal components, splines), and the coefficients of the fixed basis functions are estimated via optimizing a penalization criterion. Then time lags are determined by simultaneously searching on a prior grid mesh based on minimization of prediction error criterion. Moreover, mathematical properties of the estimated parameters and predicted responses are studied and performance of the method is evaluated by extensive simulations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives a workable grid-search procedure for picking lags inside a multi-predictor penalized functional linear model, but the claimed mathematical properties are unlikely to hold once the lags are chosen from the data.

read the letter

The core contribution is a method that expands the coefficient functions in a basis, fits them with penalization, and then chooses the lags for several predictors at once by minimizing a prediction-error criterion over a fixed grid. The joint search across predictors is the part that is not already standard in the functional data literature. The simulations are described as extensive, which is the usual way these papers check practical behavior when the theory is incomplete. That combination of basis expansion, penalization, and simultaneous lag search is what the work actually adds. The abstract says mathematical properties are derived, but the stress-test concern is on target: standard convergence results for penalized functional linear models assume the lags are fixed in advance. Once the lags are picked by discrete search on prediction error, those results do not transfer without extra arguments showing that the selected lags are close enough to the truth at a suitable rate and that the criterion avoids bad local minima when predictors are correlated. No such argument is visible in the given information, so the properties of the final estimator are probably conditional on the lags being known. The grid mesh and the prediction criterion are also free parameters that can affect the outcome, yet the abstract gives no detail on how sensitive the results are to those choices. This paper is aimed at applied statisticians working with lagged functional predictors in longitudinal settings. A reader who needs a concrete algorithm for that task could extract the procedure and test it on their own data. It is coherent on its own terms and shows honest engagement with the functional data literature, so it is worth sending to a referee who can check whether the full derivations address the data-dependent lag issue or whether the claims need to be scaled back to the fixed-lag case. I would not cite it yet, but I would bring it to a reading group if someone is actively working on lag estimation in functional models.

Referee Report

3 major / 2 minor

Summary. The paper proposes a functional linear model for predicting a scalar response from multiple functional/longitudinal predictors. Coefficient functions are expanded in a basis (e.g., FPCs or splines), coefficients are obtained by penalized least squares, and predictor lags are selected simultaneously by grid search over a pre-specified mesh that minimizes a prediction-error criterion. Mathematical properties of the resulting estimators and predictions are derived, and performance is assessed via simulations.

Significance. If the theoretical properties can be shown to hold after data-driven lag selection and if the simulation design covers realistic misspecification, the procedure would supply a practical, simultaneously estimated lag-and-prediction method for multivariate functional regression. The penalized basis-expansion framework itself is standard and the explicit treatment of lags is a useful extension, but the significance is tempered by the absence of supporting theory for the combined estimator.

major comments (3)

[Mathematical properties section] Mathematical properties section: the consistency and rate results are stated for the penalized basis estimator with fixed lags; once lags are chosen by discrete minimization of prediction error over a finite grid, the data-dependent selection step is not shown to preserve the same rates or to converge to the true lags at a rate compatible with the basis asymptotics.
[Lag-selection procedure] Lag-selection procedure: the claim that simultaneous grid search recovers (near-)true lags rests on the unproven assumption that the prediction-error criterion is immune to local minima and to bias induced by coarse grids or correlated predictors; no supporting argument or additional regularity condition is supplied.
[Simulation design] Simulation design: the reported experiments do not include the regimes highlighted as weakest (coarse grid mesh, correlated predictors, or model misspecification), so they do not directly test whether the prediction criterion reliably recovers the lags under the conditions where the method is most likely to fail.

minor comments (2)

Notation for the penalty parameter and the grid mesh should be introduced once and used consistently; currently the same symbols appear with different meanings in the estimation and selection steps.
The abstract states that 'mathematical properties … are studied,' yet the manuscript provides only sketches rather than complete proofs; moving the full derivations to an appendix would improve readability.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the detailed and insightful comments on our manuscript. We appreciate the recognition of the practical utility of the proposed method for simultaneous lag estimation and prediction in multivariate functional regression. Below, we provide point-by-point responses to the major comments. We acknowledge several limitations in the current theoretical and simulation results and will revise the manuscript accordingly to clarify these points and strengthen the presentation.

read point-by-point responses

Referee: [Mathematical properties section] Mathematical properties section: the consistency and rate results are stated for the penalized basis estimator with fixed lags; once lags are chosen by discrete minimization of prediction error over a finite grid, the data-dependent selection step is not shown to preserve the same rates or to converge to the true lags at a rate compatible with the basis asymptotics.

Authors: We agree with this observation. The mathematical properties, including consistency and convergence rates, are derived assuming the lags are fixed and known. The lag selection via grid search is a practical step, but we do not provide theoretical guarantees for the data-dependent case. In the revised manuscript, we will explicitly note that the theoretical results apply to fixed lags and add a discussion on the potential impact of lag selection on the rates, including conditions (such as sufficiently fine grid and unique minimum) under which the rates are expected to remain valid. However, a full proof for the combined estimator is beyond the scope of this work. revision: partial
Referee: [Lag-selection procedure] Lag-selection procedure: the claim that simultaneous grid search recovers (near-)true lags rests on the unproven assumption that the prediction-error criterion is immune to local minima and to bias induced by coarse grids or correlated predictors; no supporting argument or additional regularity condition is supplied.

Authors: The manuscript does not make a strong claim of always recovering the true lags; it describes the procedure as determining lags by grid search minimizing the prediction error. We recognize that the prediction-error criterion may have local minima or be affected by grid coarseness and predictor correlations, and no regularity conditions are provided to ensure global optimality. In the revision, we will include additional discussion on these potential issues, suggest ways to mitigate them (e.g., multiple starting points or finer grids), and note this as a limitation of the current approach. revision: yes
Referee: [Simulation design] Simulation design: the reported experiments do not include the regimes highlighted as weakest (coarse grid mesh, correlated predictors, or model misspecification), so they do not directly test whether the prediction criterion reliably recovers the lags under the conditions where the method is most likely to fail.

Authors: The current simulations evaluate performance under several settings but do not specifically include challenging cases such as coarse grids, highly correlated predictors, or model misspecification. We will expand the simulation section in the revision to incorporate these regimes, providing a more comprehensive assessment of the method's robustness and the reliability of the lag selection procedure. revision: yes

standing simulated objections not resolved

The provision of complete theoretical results establishing consistency and rates for the estimator after data-driven lag selection, as this would require substantial new theoretical development.

Circularity Check

0 steps flagged

No circularity: estimation procedure and lag search are algorithmic, not self-referential.

full rationale

The paper describes a penalized basis-expansion estimator for the functional linear model followed by discrete grid search over lags chosen to minimize a prediction-error criterion. Mathematical properties of the resulting estimates are stated to be studied. No equation or claim in the abstract reduces a derived quantity (prediction or property) to a fitted input by construction, nor invokes a self-citation chain, uniqueness theorem, or ansatz that would make the result tautological. The lag-selection step is an explicit optimization over a pre-specified mesh; its outputs are not defined in terms of themselves. This is a standard estimation algorithm whose claimed properties, if derived, would be external to the fitting procedure itself.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axioms · 0 invented entities

Review performed on abstract only; full paper text unavailable, so ledger entries are inferred at the level of stated method components.

free parameters (2)

penalty parameter
Controls smoothness in the penalization criterion for basis coefficients; value chosen during optimization.
lag grid mesh
Finite set of candidate time lags searched exhaustively by prediction error.

axioms (1)

domain assumption Coefficient functions admit expansion in a chosen basis system (FPCs or splines).
Explicitly invoked as the representation step for the unknown coefficient functions.

pith-pipeline@v0.9.0 · 5623 in / 1341 out tokens · 24141 ms · 2026-05-24T17:18:13.570710+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

17 extracted references · 17 canonical work pages · 2 internal anchors

[1]

and Liu , H

Beran, J. and Liu , H. (2014). On estimation of mean and covariance functions in repeated time series with long-memory errors. Lithuanian Mathematical Journal, 54(1), 8-34. 15 Table 1: NPEs based on correct lags n 50 100 150 200 NPE×100 2.08 1.95 1.86 1.79

work page 2014
[2]

and Gijbels, I

Fan, J. and Gijbels, I. (1996). Local polynomial modeling and its applica- tions. CRC Press

work page 1996
[3]

A., Laird, N

Harezlak, J., Coull, B. A., Laird, N. M., Magari, S. R., and Christiani, D. C. (2007). Penalized solutions to functional regression problems. Compu- tational statistics and data analysis , 51(10), 4911-4925

work page 2007
[4]

and Kokoszka, P

Horvath, L. and Kokoszka, P. (2012). Inference for functional data with applications. Springer Science and Business Media

work page 2012
[5]

Kim, K., Sent¨ urk, D., and Li, R. (2011). Recent history functional linear models for sparse longitudinal data. Journal of statistical planning and inference, 141(4), 1554-1566

work page 2011
[6]

On trend and its derivatives estimation in repeated time series with subordinated long-range dependent errors

Liu, H. and Houwing-Duistermaat, J. (2018). On trend and its deriva- tive estimation in repeated unevenly spaced time series with long-range dependent errors. arXiv:1803.05411

work page internal anchor Pith review Pith/arXiv arXiv 2018
[7]

and Houwing-Duistermaat, J

Liu, H., Del Galdo, F. and Houwing-Duistermaat, J. (2018). Functional principal component analysis in predicting Scleroderma disease based on patients historical data

work page 2018
[8]

T. E. Harris and branching processes

Lopez-Gonzalez, G., Lewis, S.L., Burkitt, M. and Phillips, O.L. (2011). ForestPlots.net: a web application and research tool to manage and anal- yse tropical forest plot data. Journal of Vegetation Science 22: 610613. doi: 10.1111/j.1654-1103.2011.01312.x

work page internal anchor Pith review Pith/arXiv arXiv doi:10.1111/j.1654-1103.2011.01312.x 2011
[9]

and Phillips, O.L

Lopez-Gonzalez, G., Lewis, S.L., Burkitt, M., Baker T.R. and Phillips, O.L. (2009). ForestPlots.net Database.www.forestplots.net. Date of ex- traction [03,01,19]

work page 2009
[10]

V., Kent, J

Mardia, K. V., Kent, J. T., and Bibby, J. M. (1979). Multivariate Anal- ysis. Academic Press

work page 1979
[11]

and Ramsay, J

Malfait, N. and Ramsay, J. O. (2003). The historical functional linear model. Canadian Journal of Statistics , 31(2), 115-128. 16

work page 2003
[12]

and Paul, D

Peng, J. and Paul, D. (2009). A geometric approach to maximum likeli- hood estimation of the functional principal components from sparse lon- gitudinal data. Journal of Computational and Graphical Statistics , 18(4), 995-1015

work page 2009
[13]

M., Staicu, A

Pomann, G. M., Staicu, A. M., Lobaton, E. J., Mejia, A. F., Dewey, B. E., Reich, D. S., ... and Shinohara, R. T. (2016). A lag functional linear model for prediction of magnetization transfer ratio in multiple sclerosis lesions. The Annals of Applied Statistics , 10(4), 2325-2348

work page 2016
[14]

Ramsay, J. O. and Dalzell, C. J. (1991). Some tools for functional data analysis. Journal of the Royal Statistical Society. Series B (Methodologi- cal), 539-572

work page 1991
[15]

and Silverman, B.W

Ramsay, J.O. and Silverman, B.W. (2005). Functional Data Analysis (Second Edition)

work page 2005
[16]

Yao, F., M¨ uller, H. G. and Wang, J. L. (2005a). Functional data analysis for sparse longitudinal data. Journal of the American Statistical Associ- ation, 100(470), 577-590

work page
[17]

G., and Wang, J

Yao, F., M¨ uller, H. G., and Wang, J. L. (2005b). Functional linear regression analysis for longitudinal data. The Annals of Statistics , 33(6), 2873-2903. 17

work page

[1] [1]

and Liu , H

Beran, J. and Liu , H. (2014). On estimation of mean and covariance functions in repeated time series with long-memory errors. Lithuanian Mathematical Journal, 54(1), 8-34. 15 Table 1: NPEs based on correct lags n 50 100 150 200 NPE×100 2.08 1.95 1.86 1.79

work page 2014

[2] [2]

and Gijbels, I

Fan, J. and Gijbels, I. (1996). Local polynomial modeling and its applica- tions. CRC Press

work page 1996

[3] [3]

A., Laird, N

Harezlak, J., Coull, B. A., Laird, N. M., Magari, S. R., and Christiani, D. C. (2007). Penalized solutions to functional regression problems. Compu- tational statistics and data analysis , 51(10), 4911-4925

work page 2007

[4] [4]

and Kokoszka, P

Horvath, L. and Kokoszka, P. (2012). Inference for functional data with applications. Springer Science and Business Media

work page 2012

[5] [5]

Kim, K., Sent¨ urk, D., and Li, R. (2011). Recent history functional linear models for sparse longitudinal data. Journal of statistical planning and inference, 141(4), 1554-1566

work page 2011

[6] [6]

On trend and its derivatives estimation in repeated time series with subordinated long-range dependent errors

Liu, H. and Houwing-Duistermaat, J. (2018). On trend and its deriva- tive estimation in repeated unevenly spaced time series with long-range dependent errors. arXiv:1803.05411

work page internal anchor Pith review Pith/arXiv arXiv 2018

[7] [7]

and Houwing-Duistermaat, J

Liu, H., Del Galdo, F. and Houwing-Duistermaat, J. (2018). Functional principal component analysis in predicting Scleroderma disease based on patients historical data

work page 2018

[8] [8]

T. E. Harris and branching processes

Lopez-Gonzalez, G., Lewis, S.L., Burkitt, M. and Phillips, O.L. (2011). ForestPlots.net: a web application and research tool to manage and anal- yse tropical forest plot data. Journal of Vegetation Science 22: 610613. doi: 10.1111/j.1654-1103.2011.01312.x

work page internal anchor Pith review Pith/arXiv arXiv doi:10.1111/j.1654-1103.2011.01312.x 2011

[9] [9]

and Phillips, O.L

Lopez-Gonzalez, G., Lewis, S.L., Burkitt, M., Baker T.R. and Phillips, O.L. (2009). ForestPlots.net Database.www.forestplots.net. Date of ex- traction [03,01,19]

work page 2009

[10] [10]

V., Kent, J

Mardia, K. V., Kent, J. T., and Bibby, J. M. (1979). Multivariate Anal- ysis. Academic Press

work page 1979

[11] [11]

and Ramsay, J

Malfait, N. and Ramsay, J. O. (2003). The historical functional linear model. Canadian Journal of Statistics , 31(2), 115-128. 16

work page 2003

[12] [12]

and Paul, D

Peng, J. and Paul, D. (2009). A geometric approach to maximum likeli- hood estimation of the functional principal components from sparse lon- gitudinal data. Journal of Computational and Graphical Statistics , 18(4), 995-1015

work page 2009

[13] [13]

M., Staicu, A

Pomann, G. M., Staicu, A. M., Lobaton, E. J., Mejia, A. F., Dewey, B. E., Reich, D. S., ... and Shinohara, R. T. (2016). A lag functional linear model for prediction of magnetization transfer ratio in multiple sclerosis lesions. The Annals of Applied Statistics , 10(4), 2325-2348

work page 2016

[14] [14]

Ramsay, J. O. and Dalzell, C. J. (1991). Some tools for functional data analysis. Journal of the Royal Statistical Society. Series B (Methodologi- cal), 539-572

work page 1991

[15] [15]

and Silverman, B.W

Ramsay, J.O. and Silverman, B.W. (2005). Functional Data Analysis (Second Edition)

work page 2005

[16] [16]

Yao, F., M¨ uller, H. G. and Wang, J. L. (2005a). Functional data analysis for sparse longitudinal data. Journal of the American Statistical Associ- ation, 100(470), 577-590

work page

[17] [17]

G., and Wang, J

Yao, F., M¨ uller, H. G., and Wang, J. L. (2005b). Functional linear regression analysis for longitudinal data. The Annals of Statistics , 33(6), 2873-2903. 17

work page