arxiv: 2511.09215 · v2 · submitted 2025-11-12 · 📊 stat.ME

Principled analysis of crossover designs: causal effects, efficient estimation, and robust inference

Pith reviewed 2026-05-17 22:38 UTC · model grok-4.3

classification 📊 stat.ME
keywords: crossover designs, causal inference, potential outcomes, least squares estimation, robust variance estimation, carryover effects, design-based inference, treatment assignment mechanism

The pith

Least squares regression unifies analysis of crossover designs by delivering consistent and efficient causal effect estimates with valid variance estimates even under model misspecification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes a design-based framework for analyzing crossover designs using potential outcomes to define causal estimands for instantaneous and carryover effects. It shows how the treatment assignment mechanism plays a central role in identification and estimation for any given design. The key result is that a unified least squares procedure, incorporating coefficient restrictions and unit weights, produces reliable point estimates and variance estimates without requiring the regression model to be correctly specified. A sympathetic reader would care because crossover designs are common in biomedical studies and digital platforms, yet standard analyses risk invalid inferences from parametric assumptions.

Core claim

For general crossover designs, the causal estimands can be identified from the treatment assignment mechanism and assumptions on potential outcomes. The analysis is unified through least squares estimation with restrictions on coefficients and weights on units, which yields consistent and efficient point estimates along with valid variance estimates even when the working regression model is misspecified.

What carries the argument

A least squares estimator with restrictions on coefficients and weights on units carries the argument: it serves as a single tool for assessing identifiability, constructing efficient estimators, and obtaining robust variance estimates.
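
The generic computation behind such an estimator can be sketched as follows. This is a minimal illustration of textbook restricted weighted least squares, not the paper's recommended specification: the helper name `restricted_wls`, the toy data, and the particular restriction are all assumptions made here for illustration.

```python
import numpy as np

def restricted_wls(X, y, w, R, r):
    """Minimize sum_i w_i * (y_i - x_i' b)^2 subject to R b = r.

    Hypothetical helper using the textbook closed form: start from the
    unrestricted weighted fit, then correct it to satisfy the restriction.
    """
    X, y, w = np.asarray(X, float), np.asarray(y, float), np.asarray(w, float)
    R = np.atleast_2d(np.asarray(R, float))
    r = np.atleast_1d(np.asarray(r, float))
    XtW = X.T * w                          # scales observation i by w_i
    A = XtW @ X                            # X' W X
    b_unres = np.linalg.solve(A, XtW @ y)  # unrestricted weighted fit
    A_inv_Rt = np.linalg.solve(A, R.T)
    lam = np.linalg.solve(R @ A_inv_Rt, R @ b_unres - r)
    return b_unres - A_inv_Rt @ lam

# Toy data (made up): three regressors, restriction b[1] == b[2].
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=(50, 2))])
y = X @ np.array([1.0, 2.0, 2.0]) + rng.normal(size=50)
w = rng.uniform(0.5, 1.5, size=50)
R, r = np.array([[0.0, 1.0, -1.0]]), np.array([0.0])
b_res = restricted_wls(X, y, w, R, r)
print(b_res)
```

With an equality restriction of this kind, the result should coincide with an unrestricted weighted fit on the reparameterized design (here, the intercept plus the sum of the two tied regressors), which gives a simple sanity check.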

If this is right

  • The procedure identifies both instantaneous effects of current treatments and carryover effects from past treatments.
  • The choice of regression function, weighting scheme, and coefficient restrictions can be used to assess identifiability for any crossover design.
  • Efficient estimators can be constructed while inference remains valid under misspecification of the working model.
  • The same least squares output supplies both point estimates and variance estimates that remain consistent for the design-based target.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Analysts could safely use parsimonious regressions in sequential trials without separate robustness checks for variance.
  • The explicit role given to the assignment mechanism suggests treating randomization details as first-class inputs when planning crossover studies.
  • Similar least-squares unification might apply to other within-unit sequential designs where carryover is a concern.

Load-bearing premise

The assumptions on the potential outcomes and the data-generating process that allow identification of the target causal estimands given the crossover design and treatment assignment mechanism.

What would settle it

A Monte Carlo simulation in which the regression model is deliberately misspecified yet the least squares variance estimator still achieves nominal coverage for the true causal effects across repeated randomizations.
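
A check of this kind might look like the sketch below. It is not the paper's simulation: it assumes a hypothetical two-period AB/BA design with no carryover, a made-up finite population with unit effects and heterogeneous treatment effects, and a pooled working regression that ignores both. The variance estimate is a unit-clustered sandwich computed by hand. All names (`fit`, `tau_i`, `y0`) and numeric settings are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, reps = 100, 2000                      # units, repeated randomizations

# Fixed finite population of potential outcomes (all values hypothetical).
alpha = rng.normal(size=N)                              # unit effects
y0 = alpha[:, None] + np.array([0.0, 0.5]) + 0.3 * rng.normal(size=(N, 2))
tau_i = 1.0 + 0.3 * alpha**2                            # heterogeneous effects
tau = tau_i.mean()                                      # design-based target

period2 = np.tile([0.0, 1.0], N)                        # rows ordered unit by unit

def fit(z):
    """Pooled OLS of outcome on (1, period-2 dummy, treatment).

    The working model ignores unit effects and effect heterogeneity, so it
    is misspecified. Returns the treatment coefficient and a unit-clustered
    sandwich standard error.
    """
    y = (y0 + z * tau_i[:, None]).ravel()               # observed outcomes
    X = np.column_stack([np.ones(2 * N), period2, z.ravel()])
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ (X.T @ y)
    resid = y - X @ beta
    # Sum the score contributions x_it * e_it within each unit.
    scores = (X * resid[:, None]).reshape(N, 2, 3).sum(axis=1)
    V = bread @ (scores.T @ scores) @ bread
    return beta[2], np.sqrt(V[2, 2])

est = np.empty(reps)
cover = np.empty(reps, dtype=bool)
for k in range(reps):
    first = rng.permutation(N) < N // 2                 # half get sequence (1, 0)
    z = np.where(first[:, None], [1.0, 0.0], [0.0, 1.0])
    est[k], se = fit(z)
    cover[k] = abs(est[k] - tau) <= 1.96 * se

print("average estimate minus target:", est.mean() - tau)
print("coverage of 95% intervals:", cover.mean())
```

Only the assignment is redrawn across replications; the potential outcomes stay fixed, which is what makes the check design-based. Sandwich-type variance estimates are often conservative when effects are heterogeneous, so coverage above the nominal level would not by itself contradict the claim.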

Figures

Figures reproduced from arXiv: 2511.09215 by Peng Ding, Zhichao Jiang.

Figure 1: Bias of the regression-based estimator under two crossover designs. Each panel shows …

Original abstract

Crossover designs randomly assign each unit to receive a sequence of treatments. By comparing outcomes within the same unit, these designs can effectively eliminate between-unit variation and facilitate the identification of both instantaneous effects of current treatments and carryover effects from past treatments. They are widely used in traditional biomedical studies and are increasingly adopted in modern digital platforms. However, standard analyses of crossover designs often rely on strong parametric models, making inference vulnerable to model misspecification. This paper adopts a design-based framework to analyze general crossover designs. We make two main contributions. First, we use potential outcomes to formally define the causal estimands and assumptions on the data-generating process. For any given type of crossover design and assumptions on potential outcomes, we outline a procedure for identification and estimation, emphasizing the central role of the treatment assignment mechanism in design-based inference. Second, we unify the analysis of crossover designs using least squares, with restrictions on the coefficients and weights on the units. Based on the theory, we recommend the specification of the regression function, weighting scheme, and coefficient restrictions to assess identifiability, construct efficient estimators, and estimate variances in a unified fashion. Crucially, the least squares procedure is simple to implement, and yields not only consistent and efficient point estimates but also valid variance estimates even when the working regression model is misspecified.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The manuscript develops a design-based causal framework for general crossover designs. It defines instantaneous and carryover causal estimands via potential outcomes, identifies them from the known treatment assignment mechanism, and unifies estimation and inference through a restricted weighted least-squares procedure whose sandwich variance remains consistent for the randomization distribution even under working-model misspecification.

Significance. If the derivations are correct, the work is significant because it supplies a simple, implementable procedure that simultaneously achieves identification, efficiency, and robust inference without relying on strong parametric outcome models. This is practically valuable for biomedical and digital-platform experiments that routinely employ crossover designs, and it extends design-based regression results to this sequential setting in a unified way.

major comments (2)
  1. §4.2, the statement that the sandwich variance is valid under arbitrary misspecification: the proof sketch appears to invoke general results for design-based regression, but it is not shown that the specific weighting and coefficient restrictions required for crossover identifiability preserve the necessary orthogonality or that the finite-population correction remains valid when carryover effects are present; a self-contained derivation or counter-example would strengthen the central robustness claim.
  2. §3.1, identification of carryover effects: the maintained assumptions on the potential-outcome process (e.g., no higher-order interactions and additive, time-invariant unit effects) are load-bearing for the target estimands, yet the paper does not provide a formal sensitivity analysis or a concrete design in which these assumptions fail while the assignment mechanism remains known; this limits the scope of the “principled” guarantee.
minor comments (3)
  1. Notation for the restricted coefficient vector and the weighting matrix is introduced in §2 but used without explicit cross-reference in the estimation section; adding a short table that maps each restriction to the corresponding causal contrast would improve readability.
  2. The simulation section reports coverage probabilities but does not tabulate the realized variance of the point estimator relative to the oracle randomization variance; including this comparison would make the efficiency claim more transparent.
  3. A few typographical inconsistencies appear in the display of the weighted least-squares objective (e.g., the placement of the restriction matrix in Eq. (12) versus the text description).

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. The comments help clarify the presentation of our design-based framework for crossover designs. We address each major comment below and indicate the revisions we will make.

  1. Referee: §4.2, the statement that the sandwich variance is valid under arbitrary misspecification: the proof sketch appears to invoke general results for design-based regression, but it is not shown that the specific weighting and coefficient restrictions required for crossover identifiability preserve the necessary orthogonality or that the finite-population correction remains valid when carryover effects are present; a self-contained derivation or counter-example would strengthen the central robustness claim.

    Authors: We agree that the manuscript would benefit from an explicit verification rather than relying solely on the general theory. The weighting scheme and coefficient restrictions are determined solely by the known treatment assignment mechanism and the identifiability conditions; they do not depend on the outcome values. Consequently the key orthogonality between the score and the estimation error continues to hold under the randomization distribution. In the revision we will add a self-contained derivation in the appendix that directly verifies the required conditions for the restricted weighted least-squares estimator in the presence of carryover effects and confirms that the finite-population correction remains valid. revision: yes

  2. Referee: §3.1, identification of carryover effects: the maintained assumptions on the potential-outcome process (e.g., no higher-order interactions and additive, time-invariant unit effects) are load-bearing for the target estimands, yet the paper does not provide a formal sensitivity analysis or a concrete design in which these assumptions fail while the assignment mechanism remains known; this limits the scope of the “principled” guarantee.

    Authors: The assumptions listed in §3.1 (no higher-order interactions, additive time-invariant unit effects) are explicitly stated as necessary for point identification of the carryover estimands. While a comprehensive sensitivity analysis lies outside the scope of the present paper, we will insert a brief remark with a simple numerical illustration showing how violation of the no-higher-order-carryover assumption biases the target estimand even when the assignment mechanism is known. This will better delineate the limits of the current guarantees without altering the main identification and estimation results. revision: partial
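
A toy calculation, not taken from the paper, indicates the kind of bias at stake. It assumes a hypothetical two-period AB/BA design with complete randomization of units to the sequences (1, 0) and (0, 1), a constant instantaneous effect \tau, and an additive first-order carryover \gamma that the analyst ignores. This is the analogous failure one order lower than the higher-order case discussed above, and the symbols \tau, \gamma, \Delta_i, and D_i are introduced here only for illustration.

```latex
% Assumed potential-outcome structure (illustrative only):
%   Y_{i1}(z_1)      = Y_{i1}(0) + \tau z_1
%   Y_{i2}(z_1, z_2) = Y_{i2}(0,0) + \tau z_2 + \gamma z_1
\begin{align*}
D_i &= Y_{i1} - Y_{i2}, \qquad \Delta_i = Y_{i1}(0) - Y_{i2}(0,0), \\
D_i &= \Delta_i + \tau - \gamma \quad \text{for sequence } (1,0), \\
D_i &= \Delta_i - \tau \quad \text{for sequence } (0,1), \\
\hat\tau &= \tfrac{1}{2}\bigl(\bar D_{(1,0)} - \bar D_{(0,1)}\bigr), \\
E(\hat\tau) &= \tfrac{1}{2}\bigl\{(\bar\Delta + \tau - \gamma) - (\bar\Delta - \tau)\bigr\}
             = \tau - \tfrac{\gamma}{2}.
\end{align*}
```

Under these assumptions the randomization is fully known and the estimator is the usual difference of within-unit differences, yet it targets \tau - \gamma/2 rather than \tau whenever \gamma is nonzero.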

Circularity Check

0 steps flagged

No significant circularity: design-based identification relies on known randomization mechanism

The paper anchors identification and estimation in the known treatment assignment mechanism of the crossover design, using potential outcomes to define instantaneous and carryover effects. The recommended restricted weighted least-squares procedure with sandwich variance is presented as consistent for the randomization distribution even under working-model misspecification, which follows directly from standard design-based results (e.g., Horvitz-Thompson-type estimators) without reducing to fitted parameters or self-referential definitions. No load-bearing step equates a prediction to its own input by construction, and the framework remains self-contained against external benchmarks of randomization-based inference.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper relies on the standard potential-outcomes framework and design-based assumptions without introducing new free parameters or invented entities.

axioms (1)
  • domain assumption: assumptions on potential outcomes and the data-generating process that enable identification for the given crossover design
    Invoked to define causal estimands and to guarantee that the treatment assignment mechanism identifies the target quantities.


