Fixed-Effects Models for Causal Inference in Longitudinal Cluster Randomized and Quasi-Experimental Trials

Fan Li; Kenneth M. Lee

arxiv: 2604.07756 · v1 · submitted 2026-04-09 · 📊 stat.ME

Pith reviewed 2026-05-10 18:25 UTC · model grok-4.3

classification 📊 stat.ME

keywords: fixed-effects models, longitudinal cluster trials, causal inference, model robustness, stepped-wedge designs, M-estimation, treatment effect estimation, quasi-experimental trials

The pith

Fixed-effects models with correctly specified treatment effects yield consistent estimators for marginal treatment effects in longitudinal cluster trials even if other model components are misspecified.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that fixed-effects models can target super-population marginal estimands in longitudinal cluster randomized and quasi-experimental trials through an M-estimation framework. It proves that linear and log-link versions remain consistent and asymptotically normal for nonparametrically defined treatment effects when only the treatment effect structure is correct. This holds even with arbitrary misspecification elsewhere and clarifies that these models are not limited to conditional estimands. The result positions fixed-effects models as a robust alternative to mixed-effects models for designs like stepped-wedge and crossover trials.

Core claim

Linear and log-link fixed-effects models with correctly specified treatment effect structures yield consistent and asymptotically normal estimators for nonparametrically defined treatment effect estimands in longitudinal cluster randomized trials (CRTs), even under arbitrary misspecification of other model components. The constant treatment effect estimator targets the period-average treatment effect for the overlap population (P-ATO), and fixed-effects models can maintain consistency by adjusting for cluster-level and individual-level time-invariant confounding in longitudinal cluster quasi-experimental trials (CQTs).
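For orientation, working models of this kind are commonly written as below. This is a generic sketch, not the paper's own notation: the symbols $Y_{ijk}$ (outcome of individual $k$ in cluster $i$ at period $j$), $\alpha_i$ (cluster fixed effect), $\beta_j$ (period fixed effect), $Z_{ij}$ (treatment indicator) and $\delta$ (constant treatment effect) are assumptions made for illustration.

```latex
% Linear fixed-effects working model with a constant treatment effect
E[Y_{ijk} \mid Z_{ij}] = \alpha_i + \beta_j + \delta Z_{ij}

% Log-link fixed-effects working model with a constant treatment effect
\log E[Y_{ijk} \mid Z_{ij}] = \alpha_i + \beta_j + \delta Z_{ij}
```

In this notation, the "treatment effect structure" is the term $\delta Z_{ij}$. A richer structure would replace it with, for example, period-specific or exposure-time-specific coefficients.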

What carries the argument

M-estimation framework applied to linear and log-link fixed-effects models, which targets marginal estimands and provides robustness outside the treatment effect structure.
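As a reminder of what an M-estimation argument typically delivers, the textbook form is sketched below. The symbols $O_i$ (data from cluster $i$), $\psi$ (estimating function), $\theta_0$ (target parameter) and $I$ (number of clusters) are generic assumptions; how the paper treats the cluster-specific intercepts within this framework is not reproduced here.

```latex
% Estimator defined as the root of stacked estimating equations
\sum_{i=1}^{I} \psi(O_i; \hat{\theta}) = 0

% Under standard regularity conditions, with E[\psi(O_i; \theta_0)] = 0
\sqrt{I}\,(\hat{\theta} - \theta_0) \xrightarrow{d}
  N\!\left(0,\; A^{-1} B A^{-\top}\right),
\quad
A = E\!\left[-\frac{\partial \psi(O_i; \theta_0)}{\partial \theta^{\top}}\right],
\quad
B = E\!\left[\psi(O_i; \theta_0)\,\psi(O_i; \theta_0)^{\top}\right]
```

The sandwich form $A^{-1} B A^{-\top}$ is what allows valid inference when the working model is not the true data-generating model.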

If this is right

Some CRT designs achieve model-robustness without correct specification of the treatment effect structure.
Fixed-effects models can adjust for both cluster-level and individual-level time-invariant confounding in longitudinal CQTs.
Fixed-effects models serve as a robust and potentially preferable alternative to mixed-effects models for longitudinal cluster trial (CT) analysis.
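The second point above, on time-invariant confounding, can be illustrated with a small simulation. The sketch below is not taken from the paper: the data-generating process, the function names (`simulate_cqt`, `one_hot`, `treatment_coef`) and all parameter values are hypothetical. It simulates cluster-period means from a staggered-adoption design in which clusters with larger intercepts adopt treatment earlier, then compares a model with period effects only against one that adds cluster fixed effects.

```python
import numpy as np

def simulate_cqt(n_clusters=200, n_periods=5, delta=1.0, noise_sd=0.1, seed=0):
    """Cluster-period means from a hypothetical staggered-adoption CQT.

    Adoption timing depends on the cluster intercept, so the intercept is a
    time-invariant confounder of treatment and outcome.
    """
    rng = np.random.default_rng(seed)
    alpha = rng.normal(0.0, 1.0, n_clusters)           # cluster-level confounder
    ranks = np.argsort(np.argsort(-alpha))             # 0 = largest alpha
    # Larger alpha -> earlier start; start periods run from 1 to n_periods - 1.
    start = 1 + (ranks * (n_periods - 1)) // n_clusters
    cluster = np.repeat(np.arange(n_clusters), n_periods)
    period = np.tile(np.arange(n_periods), n_clusters)
    z = (period >= start[cluster]).astype(float)       # treatment indicator
    beta = 0.2 * np.arange(n_periods)                  # secular trend
    noise = rng.normal(0.0, noise_sd, cluster.size)
    y = alpha[cluster] + beta[period] + delta * z + noise
    return y, z, cluster, period

def one_hot(labels, drop_first=False):
    """Indicator matrix for a label vector, optionally dropping the first level."""
    levels = np.unique(labels)
    mat = (labels[:, None] == levels[None, :]).astype(float)
    return mat[:, 1:] if drop_first else mat

def treatment_coef(y, z, *blocks):
    """Least-squares coefficient on z after adjusting for the given dummy blocks."""
    X = np.column_stack([z, *blocks])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0]

y, z, cluster, period = simulate_cqt()
# Period effects only: ignores the cluster-level confounder.
pooled = treatment_coef(y, z, one_hot(period))
# Cluster and period fixed effects: absorbs any time-invariant cluster-level term.
fe = treatment_coef(y, z, one_hot(cluster), one_hot(period, drop_first=True))
print(f"true effect 1.0 | period-only {pooled:.2f} | fixed-effects {fe:.2f}")
```

Under these assumed settings the period-only fit is pulled well away from the true effect, while the fixed-effects fit stays close to it, because the cluster dummies absorb the intercept that drives adoption timing.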

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Practitioners may find fixed-effects models simpler to implement and interpret than mixed-effects models in large longitudinal datasets.
The overlap population estimand suggests results apply most directly to units with complete period coverage across treatment conditions.
Similar M-estimation robustness arguments could be tested for other link functions or outcome types beyond linear and log-link.

Load-bearing premise

The treatment effect structure must be correctly specified in the fixed-effects model.

What would settle it

A simulation study or re-analysis of longitudinal cluster trial data where the treatment effect structure is deliberately misspecified while other components are correct, checking whether the estimator becomes inconsistent or biased.
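A check of this kind could be sketched as follows. This is not the paper's simulation: the design (a balanced stepped-wedge layout with one all-control baseline period), the exposure-time effects, and the names `simulate_sw_crt`, `dummies` and `fit_fe` are all assumptions made for illustration. The true effect grows with exposure time; one fit imposes a constant treatment effect (misspecified structure), the other uses exposure-time indicators (correct structure).

```python
import numpy as np

def simulate_sw_crt(n_per_seq=20, effects=(0.5, 1.0, 1.5, 2.0), noise_sd=0.05, seed=1):
    """Cluster-period means from a hypothetical balanced stepped-wedge design.

    Sequence s (1..S) starts treatment in period s; period 0 is all-control.
    The true effect depends on exposure time e = period - start + 1.
    """
    rng = np.random.default_rng(seed)
    effects = np.asarray(effects, dtype=float)
    n_seq = effects.size
    n_periods = n_seq + 1
    start = np.repeat(np.arange(1, n_seq + 1), n_per_seq)    # start period per cluster
    n_clusters = start.size
    cluster = np.repeat(np.arange(n_clusters), n_periods)
    period = np.tile(np.arange(n_periods), n_clusters)
    exposure = np.clip(period - start[cluster] + 1, 0, None)  # 0 = untreated
    alpha = rng.normal(0.0, 1.0, n_clusters)                  # cluster intercepts
    beta = 0.3 * np.arange(n_periods)                         # secular trend
    true_effect = np.where(exposure > 0, effects[np.maximum(exposure, 1) - 1], 0.0)
    noise = rng.normal(0.0, noise_sd, cluster.size)
    y = alpha[cluster] + beta[period] + true_effect + noise
    return y, exposure, cluster, period

def dummies(labels, drop_first=False):
    """Indicator matrix for a label vector, optionally dropping the first level."""
    levels = np.unique(labels)
    mat = (labels[:, None] == levels[None, :]).astype(float)
    return mat[:, 1:] if drop_first else mat

def fit_fe(y, treatment_cols, cluster, period):
    """OLS with cluster and period fixed effects; returns the treatment coefficients."""
    T = np.asarray(treatment_cols, dtype=float).reshape(len(y), -1)
    X = np.column_stack([T, dummies(cluster), dummies(period, drop_first=True)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[: T.shape[1]]

y, exposure, cluster, period = simulate_sw_crt()
# Misspecified structure: a single constant treatment effect.
constant = fit_fe(y, (exposure > 0).astype(float), cluster, period)[0]
# Correct structure: one coefficient per exposure time (e = 1, 2, 3, 4).
by_exposure = fit_fe(y, dummies(exposure, drop_first=True), cluster, period)
print("constant-effect estimate:", round(float(constant), 3))
print("exposure-time estimates: ", np.round(by_exposure, 3))
```

In this particular design the constant-effect coefficient lands near 0.25, below all four true exposure-time effects, while the exposure-time fit recovers them closely. Comparing against the simple average of the exposure-time effects is a choice made here and need not match the estimand the paper defines.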

Figures

Figures reproduced from arXiv: 2604.07756 by Fan Li, Kenneth M. Lee.

**Figure 2.** Analysis results from simulation scenario 1 of a SW-CRT with binary outcomes

**Figure 3.** Analysis results from simulation scenario 2 of a PB-CQT with continuous outcomes

**Figure 4.** Results from the re-analysis of binary outcomes using the linear fixed-effects model

**Figure 5.** Analysis results from simulation scenario 3 of a CRXO with continuous outcomes

**Figure 6.** Analysis results from simulation scenario 1 of a SW-CRT with ...

**Figure 7.** Analysis results from simulation scenario 2 of a PB-CQT with ...

**Figure 8.** Analysis results from simulation scenario 3 of a CRXO with ...

**Figure 9.** Analysis results from simulation scenario 4 of a SW-CRT with ...

**Figure 10.** Results from the re-analysis of binary outcomes using the linear fixed-effects ...

Original abstract

This article investigates the model-robustness of fixed-effects models for analyzing a broad class of longitudinal cluster trials (CTs) such as stepped-wedge, parallel-with-baseline and crossover designs, encompassing both randomized (CRTs) and quasi-experimental (CQTs) designs. We clarify a longstanding misconception in biostatistics, demonstrating that fixed-effects models, traditionally perceived as targeting only finite-sample conditional estimands, can effectively target super-population marginal estimands through an M-estimation framework. We comprehensively prove that linear and log-link fixed-effects models with correctly specified treatment effect structures can broadly yield consistent and asymptotically normal estimators for nonparametrically defined treatment effect estimands in longitudinal CRTs, even under arbitrary misspecification of other model components. We identify that the constant treatment effect estimator generally targets the period-average treatment effect for the overlap population (P-ATO); accordingly, some CRT designs don't even require correct specification of the treatment effect structure for model-robustness. We further characterize conditions where fixed-effects models can maintain consistency by adjusting for both cluster-level and individual-level time-invariant confounding in longitudinal CQTs. Altogether, supported by simulation and a case study re-analysis, we establish fixed-effects models as a robust and potentially preferable alternative to mixed-effects models for longitudinal CT analysis.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Fixed-effects models can consistently estimate marginal causal effects in longitudinal cluster trials when the treatment effect is correctly specified, per the new proofs.

This paper shows that fixed-effects models can target super-population marginal estimands in longitudinal cluster trials via M-estimation, provided the treatment effect structure is correct. It does a good job clarifying the longstanding mix-up between finite-sample and super-population targets. The consistency proofs for linear and log-link fixed-effects models are the main contribution: they cover marginal treatment effects in stepped-wedge, parallel-with-baseline, and crossover designs even when other model parts are misspecified. The authors also note that the constant effect estimator targets the period-average treatment effect on the overlap population, which means some designs are robust even without a correctly specified treatment effect structure. The extension to quasi-experimental trials, with conditions for handling time-invariant confounding, adds breadth. Simulations and a case study back it up.

The approach avoids circularity by relying on external M-estimation and nonparametric tools. The main soft spot is the dependence on correct treatment effect specification. The paper states this clearly, but more on sensitivity to small errors there would strengthen it. For the quasi-experimental side, the confounding adjustment conditions are laid out but might need more practical guidance on checking them. Overall the math looks solid and the citation pattern builds on established tools without gaps.

This paper is for biostatisticians and epidemiologists analyzing longitudinal cluster data who want a simpler fixed-effects option instead of mixed models. Readers dealing with public health trials would find the marginal estimand focus valuable. It deserves a serious referee to check the derivations in detail and see how the results hold up under review. I recommend sending it for peer review. The new proofs address a real gap and the evidence provided makes it worth the time.

Referee Report

0 major / 3 minor

Summary. The paper claims to resolve a misconception in biostatistics by showing that fixed-effects models in longitudinal cluster trials can target super-population marginal treatment effect estimands using an M-estimation approach. It provides comprehensive proofs that linear and log-link fixed-effects models, when the treatment effect structure is correctly specified, produce consistent and asymptotically normal estimators for nonparametric estimands despite misspecification of other components. The constant treatment effect is shown to target the period-average treatment effect for the overlap population (P-ATO). The work also addresses conditions for consistency in quasi-experimental designs with confounding and is backed by simulations and a case study re-analysis.

Significance. If the central claims hold, this manuscript makes a substantial contribution to causal inference methods for cluster randomized and quasi-experimental trials. It provides a theoretically grounded alternative to mixed-effects models that is robust to misspecification. The explicit identification of the P-ATO target and the conditions for model-robustness are valuable for practitioners. The inclusion of proofs, simulation studies, and a real-world case study adds credibility and practical utility to the findings.

minor comments (3)

Consider defining 'P-ATO' at its first mention for readers unfamiliar with the term.
The discussion of the longstanding misconception could benefit from citing specific prior works that hold the view being challenged.
The re-analysis results would be more informative if compared directly to mixed-effects model estimates in a table.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary, recognition of the manuscript's contributions to causal inference methods for longitudinal cluster trials, and recommendation of minor revision. No specific major comments were provided in the report.

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained via external M-estimation

The paper establishes consistency of linear and log-link fixed-effects models for nonparametric marginal treatment effect estimands in longitudinal CRTs/CQTs by invoking the standard M-estimation framework, with the key condition of correct treatment-effect structure specification made explicit. No load-bearing step reduces a claimed prediction to a fitted parameter by construction, nor does any derivation rely on self-citation chains or imported uniqueness theorems. The argument is supported by stated proofs, simulations, and a case study re-analysis, rendering it independent of the target results and externally falsifiable.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on the domain assumption that the treatment effect structure is correctly specified and on standard regularity conditions from M-estimation theory; no free parameters or new invented entities are introduced in the abstract.

axioms (2)

domain assumption: Correct specification of the treatment effect structure
The abstract states that consistency holds when this structure is correct even under arbitrary misspecification of other components.
standard math: Standard regularity conditions for M-estimation and asymptotic normality
Invoked to guarantee consistency and asymptotic normality of the estimators.

Reference graph

Works this paper leans on

18 extracted references · 18 canonical work pages

[1] Cameron, A. C. and Trivedi, P. K. (2005). Microeconometrics: Methods and Applications. Cambridge University Press.

[2] Chen, X. and Li, F. (2025). Model-assisted analysis of covariance estimators for stepped wedge cluster randomized experiments. Scandinavian Journal of Statistics, 52(1):416–446. doi:10.1111/sjos.12755

[3] Gardiner, J. C., Luo, Z., and Roman, L. A. (2009). Fixed effects, random effects and GEE: What are the differences? Statistics in Medicine, 28(2):221–239. doi:10.1002/sim.3478

[4] Hausman, J. A. (1978). Specification Tests in Econometrics. Econometrica, 46(6):1251–1271.

[5] Kiefer, N. M. (1980). Estimation of fixed effect models for time series of cross-sections with arbitrary intertemporal covariance. Journal of Econometrics, 14(2):195–202.

[6] Lee, K. M. and Cheung, Y. B. (2024). The fixed-effects model for robust analysis of stepped-wedge cluster trials with a small number of clusters and continuous outcomes: a simulation study. Trials, 25(1):718.

[7] Lee, K. M., Turner, E. L., and Kenny, A. (2025). Analysis of Stepped-Wedge Cluster Randomized Trials When Treatment Effects Vary by Exposure Time or Calendar Time. Statistics in Medicine, 44(20-22):e70256. doi:10.1002/sim.70256

[8] Li, F., Hughes, J. P., Hemming, K., Taljaard, M., Melnick, E. R., and Heagerty, P. J. (2021). Mixed-effects models for the design and analysis of stepped wedge cluster randomized trials: An overview. Statistical Methods in Medical Research, 30(2):612–639.

[9] Li, F. and Li, F. (2019). Propensity score weighting for causal inference with multiple treatments. The Annals of Applied Statistics, 13(4):2389–2415.

[10] Li, F., Morgan, K. L., and Zaslavsky, A. M. (2018). Balancing Covariates via Propensity Score Weighting. Journal of the American Statistical Association, 113(521):390–400.

[11] Liang, K.-Y. and Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika, 73(1):13–22.

[12] Neyman, J. and Scott, E. L. (1948). Consistent Estimates Based on Partially Consistent Observations. Econometrica, 16(1):1

[13] Ross, R. K., Zivich, P. N., Stringer, J. S. A., and Cole, S. R. (2024). M-estimation for common epidemiological measures: introduction and applied examples. International Journal of Epidemiology, 53(2):dyae030.

[14] Hudgens, M. G., Liu, C., Huang, W., Liu, A., Zhang, Y., Smith, M. K., Mitchell, K. M., Ong, J. J., Fu, H., Vickerman, P., Yang, L., Wang, C., Zheng, H., Yang, B., and Tucker, J. D. (2018). Crowdsourcing to expand HIV testing among men who have sex with men in China: A closed cohort stepped wedge cluster randomized controlled trial. PLOS Medicine, 15(8):e1002645.

[15] Tsiatis, A. A. (2006). Semiparametric Theory and Missing Data. Springer Series in Statistics. Springer New York, New York, NY.

[16] van der Vaart, A. W. (1998). Asymptotic statistics. Cambridge series in statistical and probabilistic mathematics. Cambridge University Press, Cambridge, UK; New York, NY, USA.

[17] Wang, B., Wang, X., and Li, F. (2024). How to achieve model-robust inference in stepped wedge trials with model-based methods? Biometrics, 80(4):ujae123.

[18] Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data. MIT Press, Cambridge, Mass., 2nd edition.
