CARhy: Comprehensive Analyses of Circadian Rhythms in Transcriptomic Experiments with Multiple Conditions

Jerome S. Menet; Samiran Sinha; Weiyi Huang

arxiv: 2604.26765 · v1 · submitted 2026-04-29 · 📊 stat.ME


Pith reviewed 2026-05-07 12:34 UTC · model grok-4.3

classification 📊 stat.ME

keywords circadian rhythms · transcriptomics · Fourier regression · rhythmicity testing · multi-condition analysis · heteroscedastic models · gene expression · statistical framework


The pith

CARhy uses first-harmonic Fourier regression to test for rhythmicity and differences in amplitude, phase, and baseline across multiple conditions in transcriptomic data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces CARhy as a unified statistical framework for analyzing circadian gene-expression rhythms in experiments with more than two conditions. It relies on first-harmonic Fourier regression to deliver formal tests for the presence of rhythmicity and for condition differences in rhythmicity, amplitude, phase, and baseline level. The model incorporates condition-specific variances and handles unbalanced sampling, keeping type-I error and false-discovery rates controlled under heteroscedastic noise. Existing methods often restrict users to pairwise comparisons or to one rhythm feature at a time, so CARhy fills a gap for studies that examine how genotypes, treatments, or exposures jointly alter daily cycles. Simulations and a mouse-liver application show the framework maintains power while remaining practical for realistic experimental designs.

Core claim

CARhy is a unified statistical framework for transcriptomic data collected under more than two conditions. Based on first-harmonic Fourier regression, CARhy provides formal tests for the presence of rhythmicity and for differences across conditions in rhythmicity, amplitude, phase, and baseline level. By allowing condition-specific variances and accommodating unbalanced designs, the framework remains reliable under heteroscedastic noise and realistic sampling constraints.

What carries the argument

A first-harmonic Fourier regression model assigns condition-specific parameters for amplitude, phase, baseline, and variance; likelihood-ratio or Wald-type tests then evaluate rhythm presence and inter-condition differences.
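For concreteness, a model of this kind can be written as below. This is a sketch in notation assumed here: the symbols $m_k$, $a_k$, $b_k$, $\sigma_k^2$ and the Gaussian error assumption are illustrative and are not taken from the paper. Index $k$ denotes the condition and $i$ the sample within it, with the period fixed at 24 h.

```latex
\begin{aligned}
y_{ki} &= m_k + a_k \cos\!\left(\frac{2\pi t_{ki}}{24}\right)
              + b_k \sin\!\left(\frac{2\pi t_{ki}}{24}\right) + \varepsilon_{ki},
\qquad \varepsilon_{ki} \sim N(0, \sigma_k^2), \\
       &= m_k + A_k \cos\!\left(\frac{2\pi t_{ki}}{24} - \phi_k\right) + \varepsilon_{ki},
\qquad A_k = \sqrt{a_k^2 + b_k^2}, \quad \phi_k = \operatorname{atan2}(b_k, a_k), \\
H_0^{\text{rhythm}} &: a_k = b_k = 0 \ \text{for all } k,
\qquad
H_0^{\text{diff}} : (a_1, b_1) = \dots = (a_K, b_K).
\end{aligned}
```

The second line follows from $A\cos(\omega t - \phi) = A\cos\phi\,\cos\omega t + A\sin\phi\,\sin\omega t$, so the model is linear in $(m_k, a_k, b_k)$ even though amplitude and phase are nonlinear functions of those coefficients.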

If this is right

A single model simultaneously tests rhythm presence and all major parameter differences instead of requiring separate pairwise runs.
Condition-specific variance estimates protect inference when noise levels differ across experimental groups.
Unbalanced sampling schedules do not invalidate the tests for amplitude or phase shifts.
Simulation results indicate type-I error remains near nominal levels while power exceeds that of existing pairwise or single-feature tools.
The R package implementation allows direct application to new multi-condition transcriptomic datasets.
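The points above can be illustrated with a minimal Python sketch. It is not the CARhy R package or its API: the function names `design` and `fit_condition`, the toy sampling schedules, and all numeric values are assumptions made for illustration. Because every parameter, including the variance, is condition-specific in this sketch, the fit separates into one ordinary least-squares problem per condition, and unequal sampling schedules pose no difficulty.

```python
import numpy as np

PERIOD = 24.0  # hours; treated as fixed, not estimated


def design(t):
    """Design matrix with columns: intercept, cos, sin at the 24 h frequency."""
    w = 2.0 * np.pi * np.asarray(t, dtype=float) / PERIOD
    return np.column_stack([np.ones_like(w), np.cos(w), np.sin(w)])


def fit_condition(t, y):
    """OLS fit for one condition.

    Returns (beta, sigma2, cov): coefficients [baseline, cos, sin],
    the condition-specific residual variance, and the coefficient covariance.
    """
    X = design(t)
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = float(resid @ resid) / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta, sigma2, cov


rng = np.random.default_rng(0)
# Hypothetical unbalanced design: different time grids and noise levels.
conditions = {
    "A": (np.arange(0, 48, 4), 0.2),
    "B": (np.arange(0, 48, 6), 0.5),
    "C": (np.arange(2, 48, 4), 1.0),
}
fits = {}
for name, (t, sd) in conditions.items():
    w = 2.0 * np.pi * t / PERIOD
    y = 5.0 + 1.0 * np.cos(w) + 0.5 * np.sin(w) + rng.normal(0.0, sd, size=t.size)
    fits[name] = fit_condition(t, y)
```

Each entry of `fits` carries its own variance estimate, which is what lets downstream comparisons weight noisy and quiet conditions differently.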

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same regression structure could be reused for other biological oscillations such as cell-cycle or ultradian rhythms once the period is known.
If non-sinusoidal waveforms prove common, pre-screening genes for harmonic content before applying CARhy would reduce false phase calls.
Clustering genes by their estimated amplitude and phase vectors after CARhy testing might reveal coordinated downstream regulatory modules.
Extension to longitudinal human cohorts with repeated measures under several interventions would test whether the framework scales to clinical rhythm-disorder studies.

Load-bearing premise

A pure first-harmonic sine wave is adequate to describe the rhythmic part of gene expression for every gene and every condition.

What would settle it

If time-series residuals for many genes show systematic patterns or if adding a second harmonic significantly improves model fit, the tests for amplitude and phase differences become unreliable.
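One way to run the check described above is a nested-model comparison between a one-harmonic and a two-harmonic fit. The sketch below is an illustration under stated assumptions, not a procedure from the paper: the function names are hypothetical, errors are assumed Gaussian with constant variance within a condition, and the simulated profiles are toy data. It uses the fact that an F distribution with 2 numerator degrees of freedom has a closed-form survival function, so no statistics library is needed.

```python
import numpy as np


def harmonic_design(t, n_harmonics, period=24.0):
    """Intercept plus cos/sin pairs for harmonics 1..n_harmonics of the period."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    for h in range(1, n_harmonics + 1):
        w = 2.0 * np.pi * h * t / period
        cols += [np.cos(w), np.sin(w)]
    return np.column_stack(cols)


def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)


def second_harmonic_test(t, y, period=24.0):
    """F test that both second-harmonic coefficients are zero."""
    y = np.asarray(y, dtype=float)
    rss1 = rss(harmonic_design(t, 1, period), y)
    rss2 = rss(harmonic_design(t, 2, period), y)
    dof = len(y) - 5  # residual df of the two-harmonic model
    f_stat = ((rss1 - rss2) / 2.0) / (rss2 / dof)
    # Survival function of F(2, dof): (1 + 2 f / dof) ** (-dof / 2).
    p_value = (1.0 + 2.0 * f_stat / dof) ** (-dof / 2.0)
    return f_stat, p_value


rng = np.random.default_rng(0)
t = np.arange(0, 48, 2.0)
w = 2.0 * np.pi * t / 24.0
pure = 5.0 + np.cos(w) + rng.normal(0.0, 0.2, t.size)  # first harmonic only
sharp = pure + 1.0 * np.cos(2.0 * w)                   # adds a 12 h component
f_pure, p_pure = second_harmonic_test(t, pure)
f_sharp, p_sharp = second_harmonic_test(t, sharp)
```

A small `p_value` for many genes would suggest that the first-harmonic amplitude and phase no longer summarize the waveform well for those genes.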

Figures

Figures reproduced from arXiv: 2604.26765 by Jerome S. Menet, Samiran Sinha, Weiyi Huang.

**Figure 1.** Type-I error rates for Cases 1-8 of Table …

**Figure 2.** Power for Cases 9-16 of Table …

**Figure 3.** F1 scores (upper panel) and FDR (lower panel) for different methods. For DODR …

**Figure 4.** Type-I error rate and power for testing rhythmicity. Black: CARhy-TR, Gray: …

**Figure 5.** CARhy’s logic diagram.

**Figure 6.** Rhythmic expression differences of DR and NDR genes identified by CARhy-TDR

**Figure 7.** Venn diagram showing overlaps among rhythmic genes detected by CARhy-TR

**Figure 8.** Coverage and Jaccard index of CARhy, DODR and dryR across thresholds.

Original abstract

Circadian rhythms are endogenous oscillations that regulate various physiological processes and their disruption has been linked to many diseases, making it important to determine how gene-expression rhythms are altered across genotypes, treatments, or environmental exposures. Existing approaches for circadian transcriptomic analysis are often limited to pairwise comparisons or to a single aspect of rhythmic behavior, making them inadequate for comprehensive inference in multi-condition experimental designs. We propose CARhy (Comprehensive Analysis of Rhythmicity), a unified statistical framework for transcriptomic data collected under more than two conditions. Based on first-harmonic Fourier regression, CARhy provides formal tests for the presence of rhythmicity and for differences across conditions in rhythmicity, amplitude, phase, and baseline level. By allowing condition-specific variances and accommodating unbalanced designs, the framework remains reliable under heteroscedastic noise and realistic sampling constraints. Simulations show that CARhy controls type I error and false discovery rates well while achieving higher power than existing approaches in challenging settings. In mouse liver transcriptomic data, CARhy offers an interpretable and practical tool for characterizing how circadian rhythms differ across multiple experimental conditions. CARhy is implemented as an R package and is publicly available at: https://github.com/DrHuang123/Comprehensive-Analyses-of-Circadian-Rhythms-CARhy.git.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

CARhy gives a single framework for testing rhythmicity plus condition differences in amplitude, phase, and baseline via first-harmonic Fourier regression, but its claims rest on how well that model fits real transcriptomic patterns.


The main thing to know is that CARhy supplies formal tests for whether a gene is rhythmic and for differences across conditions in amplitude, phase, and baseline level, all from one first-harmonic Fourier regression setup that allows condition-specific variances and unbalanced sampling. It is positioned as an improvement over tools that force pairwise comparisons or look at only one rhythm feature at a time. The R package is a practical plus for users who need to run this on mouse liver data or similar experiments. Simulations are said to show type I error control and higher power than alternatives, which is the kind of evidence that matters for a methods paper. The work is new in the sense that this exact combination of tests under heteroscedasticity is not already standard in the cited literature.

The soft spot is the modeling choice itself. Circadian gene expression frequently shows non-sinusoidal shapes or needs higher harmonics, and if those are present the derived amplitude and phase estimates become biased; the tests for condition differences then no longer target the biological quantities the authors intend. The abstract does not indicate whether the simulations included such non-sinusoidal generators, so the robustness claim under realistic conditions is not yet secured.

This paper is for chronobiologists and statistical genomics people who analyze multi-condition transcriptomics and want one coherent set of tests instead of stitching separate methods together. It deserves peer review because it targets a clear practical gap with a coherent extension of existing regression tools, even though reviewers will need to press on the sinusoidal assumption and the simulation coverage.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces CARhy, a unified statistical framework for transcriptomic data under multiple conditions. Based on first-harmonic Fourier regression (reparameterized via cos/sin terms), it supplies formal tests for rhythmicity presence and for condition differences in rhythmicity, amplitude (sqrt(β_cos² + β_sin²)), phase (atan2), and baseline. The approach allows condition-specific variances and unbalanced designs. Simulations are reported to control type I error and FDR while achieving higher power than existing methods; the framework is applied to mouse liver data and released as an R package.
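The reparameterization mentioned in the summary can be made concrete with a short sketch. The helper names `amplitude_phase` and `phase_difference_hours` are hypothetical, and the convention that phase is reported as peak time in hours is an assumption here, not something the summary states. The second helper is included because phase is circular: a peak at 23 h and a peak at 1 h differ by 2 h, not 22 h.

```python
import math


def amplitude_phase(beta_cos, beta_sin, period=24.0):
    """Convert cos/sin coefficients to (amplitude, peak time in hours)."""
    amplitude = math.hypot(beta_cos, beta_sin)  # sqrt(beta_cos^2 + beta_sin^2)
    phase_rad = math.atan2(beta_sin, beta_cos)  # in (-pi, pi]
    peak_time = (phase_rad % (2.0 * math.pi)) * period / (2.0 * math.pi)
    return amplitude, peak_time


def phase_difference_hours(peak1, peak2, period=24.0):
    """Signed shift from peak1 to peak2, wrapped to (-period/2, period/2]."""
    d = (peak2 - peak1) % period
    return d - period if d > period / 2.0 else d


amp, peak = amplitude_phase(0.0, 1.0)      # pure sine: peaks a quarter period in
shift = phase_difference_hours(23.0, 1.0)  # wraps across midnight
```

Because amplitude and phase are nonlinear in the coefficients, tests stated in terms of them generally need either a transformation argument or an equivalent hypothesis on the cos/sin coefficients themselves.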

Significance. If the first-harmonic assumption is adequate and the simulation coverage is complete, CARhy would supply a practical, interpretable tool for multi-condition circadian transcriptomics that overcomes the pairwise or single-aspect limitations of prior methods. The public R package is a clear strength for reproducibility.

major comments (2)

[Methods (model definition)] Methods (first-harmonic Fourier regression model): the central claim that the reparameterized linear cos/sin model supplies valid tests for amplitude and phase differences rests on the assumption that the 24 h component dominates and higher harmonics or non-sinusoidal shapes are negligible. When this does not hold, the derived amplitude and phase estimates are biased and the likelihood-ratio/Wald tests no longer target the intended biological quantities. The simulations section must explicitly state whether non-sinusoidal generators were included; without that coverage the robustness claim under realistic transcriptomic patterns remains unsecured.
[Simulations] Simulations: the abstract asserts type-I and FDR control plus higher power, yet provides no numerical details on the exact data-generating processes, sample sizes, variance structures, or data-exclusion rules. This prevents verification that post-hoc modeling choices do not inflate the reported performance advantages.

minor comments (1)

[Abstract] Abstract: the phrase 'formal tests' is used without naming the test statistic (likelihood-ratio versus Wald) or the multiple-testing procedure employed for FDR control.
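To make the multiple-testing question concrete, the sketch below implements the Benjamini-Hochberg step-up adjustment, a common choice for FDR control across genes. That CARhy uses this particular procedure is an assumption; the abstract does not name one. The function name and the example p-values are illustrative.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values from a rhythmicity test.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [q <= 0.05 for q in adj]
```

Genes whose adjusted value falls at or below the target level (0.05 in this toy example) would be declared significant at that FDR.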

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major point below and will revise the manuscript to provide the requested clarifications and details.


Referee: [Methods (model definition)] Methods (first-harmonic Fourier regression model): the central claim that the reparameterized linear cos/sin model supplies valid tests for amplitude and phase differences rests on the assumption that the 24 h component dominates and higher harmonics or non-sinusoidal shapes are negligible. When this does not hold, the derived amplitude and phase estimates are biased and the likelihood-ratio/Wald tests no longer target the intended biological quantities. The simulations section must explicitly state whether non-sinusoidal generators were included; without that coverage the robustness claim under realistic transcriptomic patterns remains unsecured.

Authors: We acknowledge that the validity of amplitude and phase tests in CARhy relies on the first-harmonic assumption. Our simulations were generated exclusively from first-harmonic sinusoidal signals plus noise (with condition-specific variances) to evaluate the method when the model is correctly specified. We agree the Simulations section must be explicit on this point. In revision we will add a dedicated paragraph describing the exact data-generating processes and will qualify the robustness claims to scenarios where the 24 h component dominates. We do not claim performance under strongly non-sinusoidal patterns, as those fall outside the method's intended scope.

revision: yes

Referee: [Simulations] Simulations: the abstract asserts type-I and FDR control plus higher power, yet provides no numerical details on the exact data-generating processes, sample sizes, variance structures, or data-exclusion rules. This prevents verification that post-hoc modeling choices do not inflate the reported performance advantages.

Authors: We agree that the Simulations section currently lacks the numerical detail required for independent verification. In the revised manuscript we will expand this section to report the precise simulation parameters: sample sizes per condition and time point, exact variance structures (including heteroscedasticity levels), amplitude/phase/baseline values used in data generation, number of Monte Carlo replicates, and any filtering or exclusion rules applied to simulated profiles. These additions will allow readers to confirm the reported type-I error, FDR control, and power comparisons.

revision: yes
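A simulation report of the kind promised here could be accompanied by a script in which every setting is explicit. The skeleton below is a hypothetical illustration, not the authors' simulation: the sampling grids, noise levels, replicate count, and the function name `rhythmicity_pvalues` are all made up for this sketch. It estimates the type-I error of a per-condition rhythmicity F test on non-rhythmic Gaussian data, using the closed-form p-value available when the numerator has 2 degrees of freedom.

```python
import numpy as np


def rhythmicity_pvalues(t, Y, period=24.0):
    """F-test p-values for H0: no 24 h component, one per column of Y."""
    w = 2.0 * np.pi * np.asarray(t, dtype=float) / period
    X = np.column_stack([np.ones_like(w), np.cos(w), np.sin(w)])
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    rss1 = np.sum((Y - X @ beta) ** 2, axis=0)         # first-harmonic model
    rss0 = np.sum((Y - Y.mean(axis=0)) ** 2, axis=0)   # intercept-only model
    dof = Y.shape[0] - 3
    # With F = ((rss0 - rss1) / 2) / (rss1 / dof), P(F(2, dof) > F) equals:
    return (rss1 / rss0) ** (dof / 2.0)


# Every setting is stated up front (all values are illustrative).
settings = {
    "n_rep": 2000,     # Monte Carlo replicates per condition
    "alpha": 0.05,     # nominal level
    "baseline": 5.0,   # flat expression level under the null
    "conditions": {    # name: (sampling times in hours, noise sd)
        "low_noise": (np.arange(0, 48, 4), 0.2),
        "sparse": (np.arange(0, 48, 6), 0.5),
        "high_noise": (np.arange(2, 48, 4), 1.0),
    },
}

rng = np.random.default_rng(1)
type1 = {}
for name, (t, sd) in settings["conditions"].items():
    Y = settings["baseline"] + rng.normal(0.0, sd, size=(t.size, settings["n_rep"]))
    type1[name] = float(np.mean(rhythmicity_pvalues(t, Y) < settings["alpha"]))
```

Keeping the settings in a single dictionary makes it straightforward to publish them alongside the results, which is the verification gap the referee raised.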

Circularity Check

0 steps flagged

No significant circularity in CARhy's derivation of tests from first-harmonic Fourier regression


The paper defines a linear regression model using reparameterized first-harmonic Fourier terms (cos and sin components) to represent 24-hour periodicity under multiple conditions, then derives formal test statistics (likelihood-ratio or Wald tests) for rhythmicity presence, amplitude, phase, and baseline differences directly from the estimated coefficients and their covariance structure. These steps follow standard linear model inference without any reduction where a fitted parameter is renamed or reused as a prediction of itself, without self-citation load-bearing for uniqueness theorems, and without smuggling ansatzes via prior work. The framework's handling of heteroscedasticity and unbalanced designs is achieved through explicit model specification rather than tautological redefinition. The only substantive assumption (adequacy of the first harmonic) is an external modeling choice, not a circularity in the derivation chain.
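As an illustration of how a test statistic can follow directly from estimated coefficients and their covariance, the sketch below computes a Wald statistic for equality of the cos/sin coefficients between two conditions, each with its own covariance matrix. This is a generic construction under stated assumptions, not necessarily the statistic CARhy uses: the function name is hypothetical, the two conditions are assumed independent, and the chi-square reference distribution with 2 degrees of freedom is an asymptotic approximation.

```python
import numpy as np


def wald_rhythm_difference(beta1, cov1, beta2, cov2):
    """Wald test of equal (cos, sin) coefficients in two independent conditions.

    beta*: length-3 arrays [baseline, cos, sin]; cov*: their 3x3 covariances.
    Returns (W, p) with p from the chi-square(2) survival function exp(-W/2).
    """
    d = np.asarray(beta1, dtype=float)[1:3] - np.asarray(beta2, dtype=float)[1:3]
    # Independent conditions: covariance of the difference is the sum.
    V = np.asarray(cov1, dtype=float)[1:3, 1:3] + np.asarray(cov2, dtype=float)[1:3, 1:3]
    W = float(d @ np.linalg.solve(V, d))
    return W, float(np.exp(-W / 2.0))


# Hypothetical estimates for two conditions with different precision.
beta_a, cov_a = [5.0, 1.0, 0.0], 0.01 * np.eye(3)
beta_b, cov_b = [5.2, 0.2, 0.3], 0.04 * np.eye(3)
W, p = wald_rhythm_difference(beta_a, cov_a, beta_b, cov_b)
```

Because each condition contributes its own covariance, a noisy condition widens the uncertainty of the difference without requiring a pooled variance.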

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The framework rests on the standard linear-regression assumptions implicit in Fourier modeling (independent errors after accounting for the sine/cosine terms, correct specification of the 24-hour period) plus the modeling decision to use only the first harmonic. No new physical entities are postulated.

free parameters (1)

period
Fixed at 24 hours by domain convention; not estimated from data in the described framework.

axioms (2)

domain assumption: Gene-expression time series can be adequately modeled by a first-harmonic Fourier regression with additive error.
Invoked when the method is introduced as 'based on first-harmonic Fourier regression'.
domain assumption: Condition-specific variances are sufficient to handle heteroscedasticity.
Stated as allowing reliable inference under heteroscedastic noise.


