
arxiv: 2303.05443 · v1 · submitted 2023-03-09 · 📊 stat.ME

Likelihood-based Inference for Skewed Responses in a Crossover Trial Setup

Pith reviewed 2026-05-24 09:10 UTC · model grok-4.3

classification 📊 stat.ME
keywords: crossover design, mixed effect models, skew-normal distribution, EM algorithm, gene expression, asymmetric responses, 3x3 trial

The pith

Linear mixed effect models with skew-normal random effects or skew-normal errors capture asymmetric responses in crossover trials, with all parameters estimated by the EM algorithm.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a linear mixed effect model for crossover trials in which multiple responses are measured per period, allowing either the random effect or the random error to follow a skew-normal distribution to capture asymmetry. The setup is motivated by a 3x3 crossover trial in which expression levels of ten genes were recorded from asthma patients after different drug doses across periods. The EM algorithm computes the maximum likelihood estimates of all parameters under either skew-normal specification. Simulations and the gene expression dataset illustrate how the models operate in practice. A reader would care because many biological responses deviate from symmetry, so mixed models that assume normality can produce misleading inferences in such repeated-measures designs.

Core claim

The paper claims that a linear mixed effect model with a skew-normal random effect or a skew-normal random error term adequately represents the asymmetric responses arising in crossover trials. Maximum likelihood estimates are obtained through the EM algorithm in both cases, and the resulting fitted models are shown to be applicable to the 3x3 design with gene expression measurements.

What carries the argument

Linear mixed effect model with skew-normal random effect or skew-normal random error, estimated by the EM algorithm for maximum likelihood.
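For orientation, the univariate skew-normal density and its stochastic representation can be written as below. These are standard textbook forms (Azzalini, 1985); the symbols \xi, \omega, \lambda and \delta are notation assumed here and need not match the paper's own parametrization.

```latex
% Skew-normal density with location \xi, scale \omega > 0 and shape \lambda;
% \phi and \Phi are the standard normal density and distribution function.
f(y;\xi,\omega,\lambda)
  = \frac{2}{\omega}\,
    \phi\!\left(\frac{y-\xi}{\omega}\right)
    \Phi\!\left(\lambda\,\frac{y-\xi}{\omega}\right)

% Stochastic representation with T_0, T_1 independent N(0,1)
% and \delta = \lambda / \sqrt{1+\lambda^2}.
Y = \xi + \omega\left(\delta\,|T_0| + \sqrt{1-\delta^{2}}\;T_1\right)
```

Setting \lambda = 0 recovers the normal density. Conditional on |T_0| the variable Y is normal, which is the property that usually makes an EM treatment of skew-normal models tractable.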

If this is right

  • Maximum likelihood estimates remain computable when responses exhibit skewness in crossover designs.
  • The method directly handles the 3x3 layout with multiple gene measurements per subject and period.
  • Simulations recover the true parameters under both skew-normal random-effect and random-error variants.
  • Application to the asthma gene-expression trial demonstrates practical estimation and model use.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same modeling strategy could be tested on other repeated-measures clinical trials that produce skewed outcomes.
  • Direct comparison of fit statistics between the skew-normal versions and conventional normal mixed models on the same dataset would quantify the gain from asymmetry.
  • Extension to four or more periods or to additional covariates such as baseline gene levels could be examined without altering the core estimation procedure.

Load-bearing premise

The skew-normal distribution captures the observed asymmetry in gene expression responses across periods without needing another asymmetric family or a transformation.

What would settle it

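Fitting the proposed skew-normal mixed model to the gene expression data and finding that residuals remain asymmetric, or that the maximized likelihood improves on a conventional normal mixed model by no more than the extra skewness parameter can justify (for example, under a likelihood-ratio or AIC comparison), would undermine the central claim. Because the normal model is the zero-skewness special case of the skew-normal one, its maximized likelihood cannot be higher, so the relevant question is whether the gain is negligible.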
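A rough illustration of such a likelihood comparison is sketched below. It is deliberately simplified and is not the paper's procedure: it fits univariate normal and skew-normal distributions to a simulated sample with `scipy.stats`, with no random effects and no crossover structure. The function name `compare_normal_vs_skew_normal`, the sample size and the shape value 5 are all illustrative assumptions.

```python
import numpy as np
from scipy import stats


def compare_normal_vs_skew_normal(y):
    """Fit both distributions by maximum likelihood; return log-likelihoods and AICs."""
    y = np.asarray(y, dtype=float)

    # Normal fit: closed-form MLEs of the mean and standard deviation.
    mu, sigma = stats.norm.fit(y)
    loglik_normal = stats.norm.logpdf(y, mu, sigma).sum()

    # Skew-normal fit: numerical MLE of (shape, location, scale).
    shape, loc, scale = stats.skewnorm.fit(y)
    loglik_sn = stats.skewnorm.logpdf(y, shape, loc, scale).sum()

    # AIC = 2k - 2 log L, with k = 2 (normal) and k = 3 (skew-normal).
    return {
        "loglik_normal": loglik_normal,
        "loglik_sn": loglik_sn,
        "aic_normal": 2 * 2 - 2 * loglik_normal,
        "aic_sn": 2 * 3 - 2 * loglik_sn,
        "shape": shape,
    }


rng = np.random.default_rng(0)
# Hypothetical skewed sample; shape = 5 is an arbitrary illustrative value.
y = stats.skewnorm.rvs(5.0, loc=0.0, scale=1.0, size=500, random_state=rng)
result = compare_normal_vs_skew_normal(y)
print(result)
```

On data that are truly skewed, the skew-normal AIC should come out lower. On symmetric data the two log-likelihoods would be nearly equal and the normal model would be preferred on parsimony.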

Figures

Figures reproduced from arXiv: 2303.05443 by Kalyan Das, Savita Pareek, Siuli Mukhopadhyay.

Figure 1. Density and normal Q-Q plot comparisons: observed raw responses
Figure 2. Interaction plots: gene interactions by time period, treatment, …
Figure 3. Validation graphs for normal model fitting based on estimated …
Figure 4. Assessing model adequacy: a comparison of SN model (…
Figure 5. Assessing model adequacy: a comparison of SN model (…
Figure 6. Model fit comparison for Case 3: Mahalanobis distances
Original abstract

This work proposes a statistical model for crossover trials with multiple skewed responses measured in each period. A 3 $\times$ 3 crossover trial data where different drug doses were administered to subjects with a history of seasonal asthma rhinitis to grass pollen is used for motivation. In each period, gene expression values for ten genes were measured from each subject. It considers a linear mixed effect model with skew normally distributed random effect or random error term to model the asymmetric responses in the crossover trials. The paper examines cases (i) when a random effect follows a skew-normal distribution, as well as (ii) when a random error follows a skew-normal distribution. The EM algorithm is used in both cases to compute maximum likelihood estimates of parameters. Simulations and crossover data from the gene expression study illustrate the proposed approach. Keywords: Crossover design, Mixed effect models, Skew-normal distribution, EM algorithm.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The paper proposes a linear mixed-effects model for analyzing skewed responses in 3×3 crossover trials, specifically motivated by gene-expression data from an asthma study. It considers two cases: skew-normal random effects or skew-normal errors, derives EM algorithms to obtain maximum likelihood estimates in each case, and illustrates the methods with simulations and real-data analysis of ten genes across periods.

Significance. If the EM derivations are correct and the skew-normal specification adequately captures the observed asymmetry, the work supplies a practical likelihood-based tool for crossover designs with non-normal responses, an area relevant to clinical and genomic studies. The dual-case treatment (random effect vs. error) and the combination of simulation and real-data examples are strengths.

major comments (2)
  1. [§3.2] §3.2 (EM algorithm for skew-normal random effects): the E-step expressions for the conditional expectations involving the skew-normal latent variables are stated without an explicit derivation or reference to the standard skew-normal stochastic representation; this step is load-bearing for verifying that the subsequent M-step yields true MLEs.
  2. [Table 4] Table 4 (real-data parameter estimates): the reported standard errors for the skewness parameter under the two model variants are obtained from the observed information matrix, but no check is provided that the matrix is positive definite or that the estimates are interior to the parameter space; this affects the reliability of inference for the gene-expression application.
minor comments (3)
  1. [Abstract] The abstract writes 'skew normally distributed' without a hyphen while using 'skew-normal' elsewhere; consistent hyphenation ('skew-normally') would improve readability.
  2. [§2.1] Section 2.1 defines the crossover design but does not cite the original trial protocol or the source of the gene-expression measurements; adding this reference would aid reproducibility.
  3. [§4] In the simulation section the number of Monte Carlo replications is stated as 500, yet the reported coverage probabilities in Table 3 appear to be based on a different count; clarify the exact replication number used for each metric.
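To make the first major comment concrete, the sketch below shows the kind of E-step that the stochastic representation yields in the simplest possible setting: a univariate skew-normal sample with no random effects and no crossover structure. It is an illustration under stated assumptions, not the paper's algorithm. The working parametrization (`xi`, `big_delta`, `tau2`), with omega^2 = tau2 + Delta^2 and lambda = Delta / tau, and all function names are choices made here. The conditional moments are those of a normal distribution truncated to the positive half-line.

```python
import math
import random

SQRT2 = math.sqrt(2.0)
LOG_SQRT_2PI = 0.5 * math.log(2.0 * math.pi)


def log_phi(x):
    """Log of the standard normal density."""
    return -0.5 * x * x - LOG_SQRT_2PI


def log_Phi(x):
    """Log of the standard normal CDF, with a tail approximation for x << 0."""
    if x < -30.0:
        return log_phi(x) - math.log(-x)
    return math.log(0.5 * math.erfc(-x / SQRT2))


def simulate(n, xi, omega, lam, rng):
    """Draw via Y = xi + omega * (delta * |T0| + sqrt(1 - delta^2) * T1)."""
    delta = lam / math.sqrt(1.0 + lam * lam)
    root = math.sqrt(1.0 - delta * delta)
    return [xi + omega * (delta * abs(rng.gauss(0, 1)) + root * rng.gauss(0, 1))
            for _ in range(n)]


def loglik(y, xi, big_delta, tau2):
    """Observed-data log-likelihood in the (xi, Delta, tau^2) parametrization."""
    omega = math.sqrt(tau2 + big_delta ** 2)
    lam = big_delta / math.sqrt(tau2)
    return sum(math.log(2.0 / omega) + log_phi((v - xi) / omega)
               + log_Phi(lam * (v - xi) / omega) for v in y)


def em_skew_normal(y, n_iter=200):
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n
    third = sum((v - mean) ** 3 for v in y) / n
    # Start away from Delta = 0, which is a fixed point of these updates.
    xi, big_delta, tau2 = mean, math.copysign(0.5 * math.sqrt(var), third), 0.75 * var
    history = [loglik(y, xi, big_delta, tau2)]
    for _ in range(n_iter):
        # E-step: T_i | y_i is N(mu_i, m2) truncated to (0, inf).
        m2 = tau2 / (tau2 + big_delta ** 2)
        m = math.sqrt(m2)
        e1, e2 = [], []
        for v in y:
            mu = big_delta * (v - xi) / (tau2 + big_delta ** 2)
            w = math.exp(log_phi(mu / m) - log_Phi(mu / m))  # phi / Phi ratio
            e1.append(mu + m * w)                 # E[T_i | y_i]
            e2.append(mu * mu + m2 + m * mu * w)  # E[T_i^2 | y_i]
        # M-step: solve the 2x2 normal equations for (xi, Delta), then update tau^2.
        s_t, s_t2, s_y = sum(e1), sum(e2), sum(y)
        s_yt = sum(v * t for v, t in zip(y, e1))
        det = n * s_t2 - s_t ** 2
        xi = (s_y * s_t2 - s_t * s_yt) / det
        big_delta = (n * s_yt - s_t * s_y) / det
        tau2 = sum((v - xi) ** 2 - 2 * big_delta * (v - xi) * t + big_delta ** 2 * t2
                   for v, t, t2 in zip(y, e1, e2)) / n
        history.append(loglik(y, xi, big_delta, tau2))
    return xi, big_delta, tau2, history


rng = random.Random(1)
y = simulate(1000, xi=0.0, omega=1.0, lam=3.0, rng=rng)  # illustrative true values
xi_hat, delta_hat, tau2_hat, history = em_skew_normal(y)
lam_hat = delta_hat / math.sqrt(tau2_hat)
omega_hat = math.sqrt(tau2_hat + delta_hat ** 2)
print(f"xi={xi_hat:.3f} omega={omega_hat:.3f} lambda={lam_hat:.3f}")
```

The `history` list records the observed-data log-likelihood after each iteration. Checking that it never decreases is a quick sanity test of the E-step and M-step formulas, since an exact EM iteration cannot lower the likelihood.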

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful review and constructive comments on our manuscript. We address each major comment below and indicate the revisions we plan to make.

Point-by-point responses
  1. Referee: [§3.2] §3.2 (EM algorithm for skew-normal random effects): the E-step expressions for the conditional expectations involving the skew-normal latent variables are stated without an explicit derivation or reference to the standard skew-normal stochastic representation; this step is load-bearing for verifying that the subsequent M-step yields true MLEs.

    Authors: We agree that an explicit derivation or reference would improve verifiability of the EM algorithm. In the revised manuscript we will add a concise derivation of the relevant conditional expectations in the E-step, drawing on the standard stochastic representation of the skew-normal distribution (Azzalini and Capitanio, 2014). Revision: yes.

  2. Referee: [Table 4] Table 4 (real-data parameter estimates): the reported standard errors for the skewness parameter under the two model variants are obtained from the observed information matrix, but no check is provided that the matrix is positive definite or that the estimates are interior to the parameter space; this affects the reliability of inference for the gene-expression application.

    Authors: We acknowledge the importance of these checks for the reported standard errors. In the revision we will add a brief statement confirming that the observed information matrix is positive definite at the reported MLEs and that the skewness-parameter estimates lie in the interior of the parameter space for the genes analyzed. Revision: yes.
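The positive-definiteness check discussed in the second exchange could look roughly like the following. This is a hedged sketch on a univariate skew-normal sample, not on the paper's mixed model. The observed information is approximated by a finite-difference Hessian of the negative log-likelihood, and the helper names (`neg_loglik`, `numerical_hessian`, `check_information`), the step size `h` and the simulated data are assumptions made for illustration.

```python
import numpy as np
from scipy import stats


def neg_loglik(theta, y):
    """Negative skew-normal log-likelihood; theta = (shape, location, scale)."""
    shape, loc, scale = theta
    if scale <= 0:
        return np.inf
    return -stats.skewnorm.logpdf(y, shape, loc, scale).sum()


def numerical_hessian(f, theta, h=1e-4):
    """Central finite-difference Hessian of f at theta, symmetrized."""
    theta = np.asarray(theta, dtype=float)
    k = theta.size
    steps = np.eye(k) * h
    hess = np.empty((k, k))
    for i in range(k):
        for j in range(k):
            hess[i, j] = (
                f(theta + steps[i] + steps[j]) - f(theta + steps[i] - steps[j])
                - f(theta - steps[i] + steps[j]) + f(theta - steps[i] - steps[j])
            ) / (4.0 * h * h)
    return 0.5 * (hess + hess.T)


def check_information(y):
    """Fit by ML, then test the observed information for positive definiteness."""
    theta_hat = np.array(stats.skewnorm.fit(y))
    info = numerical_hessian(lambda t: neg_loglik(t, y), theta_hat)
    eigenvalues = np.linalg.eigvalsh(info)
    positive_definite = bool(np.all(eigenvalues > 0))
    # Standard errors are only meaningful when the information matrix is invertible.
    std_errors = np.sqrt(np.diag(np.linalg.inv(info))) if positive_definite else None
    # Interior check: the scale must be positive and the shape estimate finite.
    interior = bool(theta_hat[2] > 0 and np.isfinite(theta_hat[0]))
    return theta_hat, eigenvalues, positive_definite, std_errors, interior


rng = np.random.default_rng(2)
# Hypothetical data; shape = 3 is an arbitrary illustrative value.
y = stats.skewnorm.rvs(3.0, loc=0.0, scale=1.0, size=1000, random_state=rng)
theta_hat, eigenvalues, positive_definite, std_errors, interior = check_information(y)
print(theta_hat, eigenvalues, positive_definite, std_errors, interior)
```

A smallest eigenvalue close to zero, or a shape estimate that keeps growing as the optimizer runs, would be the warning signs that the reported standard errors are unreliable.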

Circularity Check

0 steps flagged

No significant circularity; model definition precedes estimation

Full rationale

The paper defines a linear mixed-effects model with skew-normal random effect or error term, then applies the EM algorithm to obtain MLEs from data. No equation reduces a claimed prediction or fitted quantity to an input by construction, no self-citation chain supplies a load-bearing uniqueness result, and the skew-normal choice is presented as a modeling decision rather than derived from prior self-work. Simulations and real-data analysis are downstream of the model specification, with no renaming of known results or smuggling of ansatzes via citation. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that skew-normal components capture the observed asymmetry and on the standard regularity conditions needed for the EM algorithm to converge to the MLE in this mixed-model setting.

free parameters (2)
  • skewness parameter
    The extra parameter that governs asymmetry in the skew-normal density is estimated from the trial data rather than fixed a priori.
  • variance components and fixed effects
    All location, scale, and covariance parameters of the linear mixed model are estimated by maximum likelihood from the observed gene-expression vectors.
axioms (1)
  • domain assumption: responses follow a linear mixed-effects structure whose random effects or random errors belong to the skew-normal family
    This modeling choice replaces the usual normality assumption and is required for the likelihood and EM derivation to be valid.


Reference graph

Works this paper leans on

37 extracted references · 37 canonical work pages

  1. R.B. Arellano-Valle and A. Azzalini. The centred parametrization for the multivariate skew-normal distribution. Journal of Multivariate Analysis, 100(4):816, 2009.
  2. R.B. Arellano-Valle and Marc G. Genton. On fundamental skew distributions. Journal of Multivariate Analysis, 96(1):93–116, 2005.
  3. R.B. Arellano-Valle, H. Bolfarine, and V.H. Lachos. Skew-normal Linear Mixed Models. Journal of Data Science, 3:415–438, 2005.
  4. A. Azzalini. A Class of Distributions Which Includes the Normal Ones. Scandinavian Journal of Statistics, 12(2):171–178, 1985.
  5. A. Azzalini and A. Capitanio. Statistical applications of the multivariate skew normal distribution. Journal of the Royal Statistical Society. Series B: Statistical Methodology, 61(3):579–602, 1999.
  6. A. Azzalini and A. Capitanio. The Skew-Normal and Related Families. Cambridge, 2014.
  7. A. Azzalini and A. Dalla Valle. The multivariate skew-normal distribution. Biometrika, 83(4):715–726, 1996.
  8. Márcia D. Branco and Dipak K. Dey. A general class of multivariate skew-elliptical distributions. Journal of Multivariate Analysis, 79(1):99–113, 2001.
  9. Divan Aristo Burger, Robert Schall, and Sean van der Merwe. A robust method for the assessment of average bioequivalence in the presence of outliers and skewness. Pharmaceutical Research, 38(10):1697–1709, 2021.
  10. Vernon M. Chinchilli and James D. Esinhart. Design and analysis of intra-subject variability in cross-over experiments. Statistics in Medicine, 15:1619–1634, 1996.
  11. Emily Clough and Tanya Barrett. The Gene Expression Omnibus database. Methods in Molecular Biology, 1418:93–110, 2016.
  12. Roberto Molina de Souza, Jorge Alberto Achcar, Edson Zangiacomi Martinez, and Josmar Mazucheli. The use of asymmetric distributions in average bioequivalence. Statistics in Medicine, 35(15):2525–2542, 2016.
  13. A.P. Dempster, N.M. Laird, and D.B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39:1–38, 1977.
  14. Reza Drikvandi. Nonlinear mixed-effects models with misspecified random-effects distribution. Pharmaceutical Statistics, 19(3):187–201, 2019.
  15. Reza Drikvandi, Geert Verbeke, and Geert Molenberghs. Diagnosing misspecification of the random-effects distribution in mixed models. Biometrics, 73(1):63–71, 2016.
  16. Changyong Feng, Hongyue Wang, Naiji Lu, Tian Chen, Hua He, Ying Lu, and Xin M. Tu. Log-transformation and its implications for data analysis. Shanghai Archives of Psychiatry, 26(2):105–109, 2014.
  17. Wendimagegn Ghidey, Emmanuel Lesaffre, and Paul Eilers. Smooth Random Effects Distribution in a Linear Mixed Model. Biometrics, 60(4):945–953, 2004.
  18. J.G. Glosup and M.C. Axelrod. Use of the AIC with the EM Algorithm: A Demonstration of a Probability Model Selection Technique. In Joint Statistical Meeting, 1994.
  19. Julie M. Grender and William D. Johnson. Analysis of crossover designs with multivariate response. Statistics in Medicine, 12(1):69–89, 1993.
  20. M.J.R. Healy. Multivariate Normal Plotting. Journal of the Royal Statistical Society. Series C (Applied Statistics), 17(2):157–161, 1968.
  21. W.D. Johnson and D.E. Mercante. Analyzing multivariate data in crossover designs using permutation tests. Journal of Biopharmaceutical Statistics, 6(3):327–342, 1996.
  22. Byron Jones and Michael G. Kenward. Design and Analysis of Cross-Over Trials. Chapman & Hall/CRC, second edition, 2003.
  23. V.H. Lachos, Pulak Ghosh, and R.B. Arellano-Valle. Likelihood based inference for skew-normal independent linear mixed models. Statistica Sinica, 20(1):303–322, 2010.
  24. Nan M. Laird and James H. Ware. Random-Effects Models for Longitudinal Data. Biometrics, 38(4):963–974, 1982.
  25. B.R. Leaker, V.A. Malkov, R. Mogg, M.K. Ruddy, G.C. Nicholson, A.J. Tan, C. Tribouley, and G. Chen. The nasal mucosal late allergic reaction to grass pollen involves type 2 inflammation (IL-5 and IL-13), the inflammasome (IL-1β), and complement. Nature, 10(2):408–420, 2016.
  26. Frank J. Massey. The Kolmogorov-Smirnov Test for Goodness of Fit. Journal of the American Statistical Association, 46(253):68–78, 1951.
  27. Savita Pareek, Kalyan Das, and Siuli Mukhopadhyay. Likelihood-based missing data analysis in multivariate crossover trials. arXiv pre-print, 2021.
  28. Marcos Antonio Alves Pereira and Cibele Maria Russo. Nonlinear mixed-effects models with scale mixture of skew-normal distributions. Journal of Applied Statistics, 46(9):1602–1620, 2019.
  29. Mary Putt and Vernon M. Chinchilli. A mixed effects model for the analysis of repeated measures cross-over studies. Statistics in Medicine, 18(22):3037–3058, 1999.
  30. R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2022. URL https://www.R-project.org/
  31. Fernanda L. Schumacher, V.H. Lachos, and Larissa A. Matos. Scale mixture of skew-normal linear mixed models with within-subject serial dependence. Statistics in Medicine, 40(7):1790–1810, 2021.
  32. Stephen Senn. Cross-Over Trials in Clinical Research. John Wiley & Sons, Ltd., 2002.
  33. S.S. Shapiro and M.B. Wilk. An Analysis of Variance Test for Normality (Complete Samples). Biometrika, 52(3):591–611, 1965.
  34. Gail E. Tudor, Gary G. Koch, and Diane Catellier. Statistical methods for crossover designs in bioenvironmental and public health studies. Handbook of Statistics, 18:571–614, 2000.
  35. Geert Verbeke and Geert Molenberghs. Linear Mixed Models for Longitudinal Data. Springer-Verlag New York, Inc., 2000.
  36. C.F. Jeff Wu. On the convergence properties of the EM algorithm. Annals of Statistics, 11(1):95–103, 1983.
  37. Daowen Zhang and Marie Davidian. Linear Mixed Models with Flexible Distributions of Random Effects for Longitudinal Data. Biometrics, 57:795–802, 2001.