
arxiv: 2605.00765 · v1 · submitted 2026-05-01 · 📊 stat.ME

Efficient Longitudinal Function-on-Function Regression

Pith reviewed 2026-05-09 19:05 UTC · model grok-4.3

classification 📊 stat.ME
keywords longitudinal function-on-function regression · functional data analysis · wearable sensor data · physical activity intervention · efficient inference · pointwise regression · cluster bootstrap

The pith

A marginal three-step procedure performs efficient estimation and inference for longitudinal function-on-function regression.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a method for longitudinal data in which both responses and predictors are functions observed repeatedly over time, as occurs with daily activity curves recorded across multiple study visits. It replaces a single joint model with three sequential steps: massive pointwise scalar-on-function regressions, bivariate smoothing of the coefficient surfaces, and construction of confidence bands by either analytic Gaussian formulas or cluster bootstrap. The approach is illustrated on wearable-sensor data from an older-adult physical-activity trial, where it identifies morning increases under interpersonal but not intrapersonal intervention strategies. Simulation results indicate that the procedure recovers the functional coefficients accurately and produces intervals with correct coverage while requiring far less computation than existing joint-model alternatives.

Core claim

The authors propose a marginal three-step approach for longitudinal function-on-function regression consisting of fitting massive pointwise longitudinal scalar-on-function regression models, smoothing the resulting estimates along the bivariate functional domain, and computing confidence bands using either an analytic approach for Gaussian data or a cluster bootstrap for Gaussian or non-Gaussian data. This procedure achieves accurate estimation and valid inference while substantially reducing computational burden compared to existing approaches, as demonstrated in simulation studies and an application to physical activity intervention data.

What carries the argument

The marginal three-step approach: pointwise model fitting followed by bivariate smoothing of coefficient surfaces and either analytic Gaussian or cluster-bootstrap inference.
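
The first two steps can be illustrated with a minimal Python sketch. It is not the authors' implementation (which the abstract says is an R package) and it makes several simplifying assumptions that are not in the source: observations are treated under working independence with no random effects, each pointwise fit is a ridge regression with a hypothetical penalty `ridge`, the `u` grid is equally spaced on [0, 1], and the bivariate smoothing is a separable second-difference (Whittaker-type) smoother applied in sandwich form. The names `whittaker_smoother` and `pointwise_then_smooth` are illustrative only.

```python
import numpy as np


def whittaker_smoother(n, lam):
    """Smoother matrix S = (I + lam * D'D)^{-1}, with D the second-difference operator."""
    D = np.diff(np.eye(n), n=2, axis=0)  # shape (n - 2, n)
    return np.linalg.inv(np.eye(n) + lam * D.T @ D)


def pointwise_then_smooth(Y, X, ridge=1e-3, lam_s=5.0, lam_u=5.0):
    """Y: (n_obs, S) responses on the s grid; X: (n_obs, U) predictors on the u grid.

    Returns the raw (S, U) coefficient surface and its smoothed version.
    """
    U = X.shape[1]
    Z = X / U  # Riemann weights (assumption: equally spaced u grid on [0, 1])
    Zc = Z - Z.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Step 1: one ridge regression per s. The design is shared across s,
    # so a single solve returns every pointwise fit (one column per s).
    raw = np.linalg.solve(Zc.T @ Zc + ridge * np.eye(U), Zc.T @ Yc).T
    # Step 2: smooth the raw surface along s (left) and along u (right).
    smooth = whittaker_smoother(Y.shape[1], lam_s) @ raw @ whittaker_smoother(U, lam_u).T
    return raw, smooth
```

Because the pointwise fits never share parameters across `s`, step 1 is embarrassingly parallel; smoothing is what borrows strength across neighbouring grid points afterwards.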

If this is right

  • Accurate recovery of the bivariate functional coefficient surface is obtained even when the full joint model is intractable.
  • Valid pointwise and simultaneous inference holds for both Gaussian and non-Gaussian longitudinal responses.
  • The procedure scales to the high-dimensional wearable-sensor data typical of modern intervention trials.
  • Time-of-day specific intervention effects, such as morning activity increases, become detectable in practical run times.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same pointwise-then-smooth strategy could be tested on other high-dimensional repeated functional observations, such as continuous glucose monitoring or gait analysis.
  • Extensions that accommodate irregular visit times or missing sensor readings would widen applicability to real-world longitudinal studies.
  • Direct runtime comparisons with fully joint Bayesian models would quantify the precise speed-accuracy trade-off for different data sizes.

Load-bearing premise

The smoothing step along the bivariate functional domain, combined with either analytic Gaussian bands or cluster bootstrap, produces valid pointwise and simultaneous inference without introducing bias from the marginal three-step approximation.
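
To make the premise concrete, here is a hedged Python sketch of a subject-level cluster bootstrap. It resamples whole subjects with replacement so that all visits from a drawn subject stay together, recomputes a user-supplied estimator, and returns percentile pointwise intervals. The simultaneous band uses a common max-deviation construction; whether the paper builds its simultaneous bands this way is an assumption, as are the function name `cluster_bootstrap_bands` and its arguments.

```python
import numpy as np


def cluster_bootstrap_bands(subject_ids, curves, estimator, n_boot=200, level=0.95, seed=0):
    """subject_ids: (n_rows,) labels; curves: (n_rows, ...) visit-level data.

    estimator maps a resampled `curves` array to an array of estimates.
    """
    rng = np.random.default_rng(seed)
    ids = np.unique(subject_ids)
    rows_by_id = {i: np.flatnonzero(subject_ids == i) for i in ids}
    estimate = estimator(curves)
    boots = np.empty((n_boot,) + estimate.shape)
    for b in range(n_boot):
        # Draw subjects, not rows, so within-subject dependence is preserved.
        drawn = rng.choice(ids, size=ids.size, replace=True)
        rows = np.concatenate([rows_by_id[i] for i in drawn])
        boots[b] = estimator(curves[rows])
    alpha = 1.0 - level
    lower = np.quantile(boots, alpha / 2, axis=0)
    upper = np.quantile(boots, 1 - alpha / 2, axis=0)
    # Simultaneous band (assumed construction): quantile of the maximum
    # standardized deviation across the whole domain.
    se = boots.std(axis=0, ddof=1)
    max_dev = np.abs((boots - boots.mean(axis=0)) / se).reshape(n_boot, -1).max(axis=1)
    q = np.quantile(max_dev, level)
    return {
        "estimate": estimate,
        "pointwise": (lower, upper),
        "simultaneous": (estimate - q * se, estimate + q * se),
    }
```

In a full pipeline the `estimator` argument would wrap the pointwise fitting and smoothing steps, so each bootstrap replicate repeats the entire procedure on the resampled subjects.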

What would settle it

A simulation with known true functional coefficients in which the constructed confidence bands achieve coverage rates materially below nominal levels across repeated samples.
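
The check described here reduces to a simple tally once the bands from each simulated data set are stored. The sketch below is an illustration, not code from the paper; `empirical_coverage` is a hypothetical helper. It computes the pointwise coverage at every grid cell and the simultaneous coverage, meaning the share of replicates whose band contains the true surface everywhere.

```python
import numpy as np


def empirical_coverage(true_surface, lowers, uppers):
    """true_surface: (S, U); lowers, uppers: (n_rep, S, U) band limits per replicate.

    Returns (pointwise coverage map of shape (S, U), simultaneous coverage as a float).
    """
    inside = (lowers <= true_surface) & (true_surface <= uppers)
    pointwise = inside.mean(axis=0)
    # A replicate counts for simultaneous coverage only if every cell is covered.
    simultaneous = inside.reshape(inside.shape[0], -1).all(axis=1).mean()
    return pointwise, simultaneous
```

Pointwise rates well below the nominal level at many cells, or a simultaneous rate well below nominal, would be the kind of evidence that undermines the inference claim.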

Figures

Figures reproduced from arXiv: 2605.00765 by Erjia Cui, Leif Verace, Siobhan McMahon.

Figure 1: Physical activity (PA) trajectories for four study participants in different treatment …
Figure 2: Step counts by assessment across treatment groups in the RS3 data. Curves are …
Figure 3: Comparison of estimation accuracy, confidence band coverage, and computing …
Figure 4: Estimated scalar coefficients from RS3 data. Smoothed coefficient estimates are …
Figure 5: Estimated functional coefficients $\hat{\Gamma}_1(s, u), \ldots, \hat{\Gamma}_4(s, u)$ and contrasts between interpersonal treatment groups and control group from RS3 data. The coefficient surface is shown via heat map, while pointwise 95% confidence intervals are contained within contours. Significantly positive pointwise regions are shown in red, while significantly negative pointwise regions are shown in blue.
Original abstract

We propose a computationally efficient inferential procedure for longitudinal function-on-function regression. The method follows a marginal three-step approach: (1) fit massive pointwise longitudinal scalar-on-function regression models, (2) smooth the resulting estimates along the bivariate functional domain, and (3) compute confidence bands using either an analytic approach for Gaussian data or a cluster bootstrap for Gaussian or non-Gaussian data. Simulation studies demonstrate that the proposed method achieves accurate estimation and valid inference, while substantially reducing computational burden compared to existing approaches. Methods are motivated by a physical activity intervention trial in older adults where high-dimensional wearable data were collected longitudinally across multiple visits. Our applications reveal significant increases in physical activity in the morning using interpersonal intervention strategies, but not intrapersonal strategies. The proposed methods are implemented in an R package.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes a computationally efficient marginal three-step procedure for longitudinal function-on-function regression: (1) fitting independent pointwise longitudinal scalar-on-function models, (2) smoothing the resulting coefficient surfaces over the bivariate functional domain, and (3) constructing pointwise and simultaneous confidence bands via either analytic Gaussian methods or subject-level cluster bootstrap. Simulations are claimed to show accurate estimation, valid inference, and substantial computational savings relative to existing methods. The approach is motivated by and applied to high-dimensional wearable physical activity data from an older-adult intervention trial, where interpersonal (but not intrapersonal) strategies are found to increase morning activity. An R package implementation is provided.

Significance. If the central claims hold, the work offers a practical, scalable tool for inference in high-dimensional longitudinal functional data settings that are increasingly common in wearable and sensor studies. The computational reduction and R-package release are concrete strengths that could facilitate broader adoption. The application provides a real-data illustration of detecting time-of-day specific intervention effects.

major comments (2)
  1. [Abstract and Simulation Studies] The claim that 'simulation studies demonstrate ... accurate estimation and valid inference' is not backed by quantitative detail on design parameters (e.g., longitudinal correlation strength, grid density, error distributions), coverage rates, or direct comparisons to joint-model baselines. Without these, the evidence for the weakest assumption (that post-hoc smoothing plus cluster bootstrap yields asymptotically valid bands) cannot be assessed.
  2. [Method (three-step procedure)] No theoretical argument or asymptotic result is supplied showing that the marginal approximation (independent pointwise fits followed by bivariate smoothing) preserves the dependence structure that the cluster bootstrap is intended to capture, or that smoothing does not distort the variability used for simultaneous bands. Validity therefore rests entirely on the (undetailed) simulation regimes, which makes this gap load-bearing for the inference claim.
minor comments (2)
  1. [Abstract/Introduction] The abstract and introduction would benefit from a brief statement of the precise functional data model (e.g., the form of the coefficient surface and the longitudinal dependence structure) to orient readers before the algorithmic description.
  2. [Figures/Tables] Figure captions and table legends should explicitly state the simulation settings (sample size, number of visits, grid size) so that the reported performance metrics can be interpreted without returning to the text.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their thoughtful comments, which help us improve the clarity and support for our methodological claims. We address the major comments point by point below, indicating where revisions will be made to the manuscript.

Point-by-point responses
  1. Referee: [Abstract and Simulation Studies] The claim that 'simulation studies demonstrate ... accurate estimation and valid inference' is not backed by quantitative detail on design parameters (e.g., longitudinal correlation strength, grid density, error distributions), coverage rates, or direct comparisons to joint-model baselines. Without these, the evidence for the weakest assumption (that post-hoc smoothing plus cluster bootstrap yields asymptotically valid bands) cannot be assessed.

    Authors: We appreciate this observation. While the full simulation studies section in the manuscript does provide details on the data-generating processes, including varying levels of longitudinal correlation and functional grid densities, we agree that a concise summary of quantitative results such as coverage probabilities and computational times would strengthen the abstract and the presentation. In the revision, we will add a summary table in the simulation section reporting empirical coverage rates for both pointwise and simultaneous confidence bands across different scenarios, along with comparisons to a joint modeling baseline where computationally feasible. This will provide direct quantitative support for the claims. revision: yes

  2. Referee: [Method (three-step procedure)] No theoretical argument or asymptotic result is supplied showing that the marginal approximation (independent pointwise fits followed by bivariate smoothing) preserves the dependence structure that the cluster bootstrap is intended to capture, or that smoothing does not distort the variability used for simultaneous bands. Validity therefore rests entirely on the (undetailed) simulation regimes, which makes this gap load-bearing for the inference claim.

    Authors: The proposed method is designed as a marginal approximation to enable scalability for high-dimensional longitudinal functional data, where full joint modeling is often intractable. The cluster bootstrap operates at the subject level after the pointwise fits and smoothing to empirically capture the dependence. We do not provide a formal asymptotic theory in the current manuscript, as deriving such results for the composite procedure is technically challenging and beyond the scope of this applied methodological paper. However, the simulation studies are constructed to evaluate performance under realistic dependence structures. We will revise the discussion section to explicitly state the reliance on simulations for validating the inference procedure and note this as a limitation. Additionally, we will provide more detailed quantitative results from the simulations as requested in the previous comment. revision: partial

standing simulated objections not resolved
  • Deriving a full asymptotic theory justifying the validity of the post-smoothing cluster bootstrap in the marginal three-step procedure

Circularity Check

0 steps flagged

No significant circularity; method is a sequence of independent statistical operations

full rationale

The paper defines its contribution as a marginal three-step procedure (pointwise scalar-on-function fits, bivariate smoothing of coefficient surfaces, then analytic Gaussian bands or subject-level cluster bootstrap) whose validity is assessed via simulation coverage rather than algebraic identity. No equation reduces a claimed prediction to a fitted input by construction, no self-citation supplies a uniqueness theorem or ansatz that the current work merely renames, and the central claims of accurate estimation plus valid inference are not forced by the definition of the steps themselves. The procedure therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The abstract supplies insufficient detail to enumerate free parameters or invented entities; the method appears to rely on standard domain assumptions for functional regression and bootstrap validity.

axioms (2)
  • domain assumption: Pointwise scalar-on-function regression models can be fitted independently at each location and then smoothed without invalidating downstream inference.
    This is the core of steps (1) and (2) in the marginal approach.
  • domain assumption: Either Gaussian analytic bands or cluster bootstrap produce valid confidence bands after smoothing.
    Invoked for the step (3) inference.

pith-pipeline@v0.9.0 · 5428 in / 1293 out tokens · 35417 ms · 2026-05-09T19:05:10.609436+00:00 · methodology

