Bayesian Copula Directional Dependence is Cross-Network Robust for Gene-Regulatory Pair Direction: A Benchmark Study on DREAM5

Clara Grazian; Xiaoying Wei

arxiv: 2606.29402 · v1 · pith:7ZKTQJIKnew · submitted 2026-06-28 · 📊 stat.AP

Pith reviewed 2026-06-30 02:03 UTC · model grok-4.3

keywords Bayesian inference · copula directional dependence · gene regulatory networks · direction inference · DREAM5 benchmark · network robustness · posterior distribution · directional dependence

The pith

Bayesian copula directional dependence is the only method that maintains accuracy above 60 percent, coverage above 88 percent, and AUROC above 0.6 across all three core DREAM5 gene networks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper embeds a copula measure of directional dependence inside a Bayesian model to infer whether one gene regulates another or the reverse. It tests this estimator against eight competing direction methods on the DREAM5 benchmarks that include one synthetic network and two real bacterial networks. The Bayesian version is the only approach that keeps called accuracy above 60 percent, coverage above 88 percent, and direction AUROC above 0.6 on every core network, while every other method drops to chance or below on at least one. The same estimator ranks first on the real-organism networks and stays stable when sample size shrinks, unlike bootstrap methods that collapse. This positions the method as a practical post-screening tool that also returns credible intervals and no-call decisions.

Core claim

Bayesian CDD is the only method whose called accuracy is always above 60%, whose coverage is always above 88%, and whose direction AUROC is always above 0.6 across the three core DREAM5 networks; every competing method falls to chance or below on at least one network. CDD ranks first on both real-organism networks, remains stable on the smallest-sample network where bootstrap-interval methods collapse, and is the only Bayesian method that is simultaneously above chance and high-coverage under a 95% posterior gate.
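
The three quantities in this claim can be made concrete with a small sketch. It assumes the definitions given in the simulated authors' rebuttal further down this page: called accuracy is the proportion of correctly directed pairs among those called, and coverage is the proportion of pairs for which a direction is called. The function names, the +1/-1/0 encoding of calls, and the toy data are illustrative assumptions, not the paper's code. The AUROC here is the standard rank-based estimate.

```python
def called_accuracy_and_coverage(calls, truth):
    """calls: +1 (X regulates Y), -1 (Y regulates X), 0 (no-call); truth: +1 or -1."""
    called = [(c, t) for c, t in zip(calls, truth) if c != 0]
    coverage = len(called) / len(calls)
    # Accuracy is taken only over pairs where a direction was actually called.
    accuracy = sum(c == t for c, t in called) / len(called) if called else float("nan")
    return accuracy, coverage


def direction_auroc(scores, truth):
    """Probability that a random +1 pair outscores a random -1 pair (ties count 0.5)."""
    pos = [s for s, t in zip(scores, truth) if t == 1]
    neg = [s for s, t in zip(scores, truth) if t == -1]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): five candidate pairs, one of them a no-call.
calls = [1, -1, 0, 1, -1]
truth = [1, -1, 1, -1, -1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1]  # higher = more support for the +1 direction

acc, cov = called_accuracy_and_coverage(calls, truth)
auc = direction_auroc(scores, truth)
print(acc, cov, auc)
```

Reporting accuracy and coverage together matters because a method can raise its called accuracy simply by declining to call hard pairs. The claim above requires both to stay high at once.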

What carries the argument

Bayesian embedding of a copula-based measure of directional dependence that returns, for each candidate pair, a posterior distribution over a directional contrast, a 95% credible interval, a posterior sign-support score, and a principled no-call.
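
A minimal sketch of how these four outputs could be read off a set of posterior draws. The paper's model, priors, and sampler are not described on this page, so the draws are treated as given. The equal-tailed interval, the definition of sign support as the posterior mass on the dominant sign, and the rule "call a direction only when the 95% interval excludes zero" are all assumptions made for illustration. `summarize_posterior` is a hypothetical name.

```python
import random


def summarize_posterior(draws, level=0.95):
    """Summarise posterior draws of a directional contrast (positive = X regulates Y)."""
    xs = sorted(draws)
    n = len(xs)
    alpha = (1.0 - level) / 2.0
    # Equal-tailed interval from empirical quantiles (indices clamped to the valid range).
    lo = xs[max(0, min(n - 1, int(alpha * n)))]
    hi = xs[max(0, min(n - 1, int((1.0 - alpha) * n) - 1))]
    # Sign support: share of posterior mass on the more probable sign.
    p_pos = sum(x > 0 for x in xs) / n
    sign_support = max(p_pos, 1.0 - p_pos)
    # Assumed gate: call a direction only if the interval excludes zero, else no-call.
    if lo > 0:
        call = 1
    elif hi < 0:
        call = -1
    else:
        call = 0
    return {"ci": (lo, hi), "sign_support": sign_support, "call": call}


# Simulated draws standing in for MCMC output (hypothetical values).
rng = random.Random(0)
clear = summarize_posterior([rng.gauss(1.0, 0.2) for _ in range(4000)])
ambiguous = summarize_posterior([rng.gauss(0.05, 1.0) for _ in range(4000)])
print(clear["call"], ambiguous["call"])
```

Under this assumed gate, the no-call is a direct consequence of posterior width: a pair whose contrast is poorly identified is left undirected rather than forced into one direction.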

If this is right

CDD ranks first on both real-organism networks in the benchmark.
The method remains stable on the smallest-sample network where bootstrap-interval methods collapse.
It is the only Bayesian method that stays above chance and high-coverage under a 95% posterior gate.
CDD can be used as a post-screening, uncertainty-aware direction-refinement tool for candidate regulatory pairs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the robustness pattern holds on additional networks, the method could be applied directly to post-process candidate pairs from other inference pipelines.
The credible-interval and sign-support outputs could be combined with edge-detection scores to produce full directed networks with uncertainty quantification.
Testing the same estimator on eukaryotic or larger mammalian datasets would reveal whether the cross-network stability generalizes beyond the bacterial and synthetic cases examined.

Load-bearing premise

The DREAM5 gold-standard direction labels are treated as accurate ground truth for scoring every method.

What would settle it

A new benchmark on an independent real-organism gene network with verified direction labels where Bayesian CDD drops below 60% called accuracy or 0.6 AUROC.

Figures

Figures reproduced from arXiv: 2606.29402 by Clara Grazian, Xiaoying Wei.

**Figure 1.** Cross-network called accuracy (%) across the six core network×benchmark conditions, shown as an annotated heatmap; cells are coloured diverging about the 50% chance line (green above, red below). Bayesian CDD (outlined top row) is the only method above 60% in every condition, whereas competing methods each drop to chance or below on at least one real-organism network.

**Figure 2.** Cross-network coverage (%) of the interval-capable methods, shown as an annotated heatmap (darker = higher coverage). Bayesian CDD (outlined top row) holds a uniformly high 88–94% across every condition, whereas the bootstrap-interval methods are lighter and uneven, swinging sharply across networks.

**Figure 3.** Cross-network direction AUROC (per-network bars; paired and independent averaged; dashed line = 0.5 no-signal baseline). Bayesian CDD is the top bar on every network.

**Figure 4.** Bayesian-method comparison. (A) Called accuracy: BCDAG and BiDAG sit at chance even when forced to call every pair; CDD is well above chance on N1–N3. (B) Coverage under the native 95% posterior gate: the DAG baselines produce zero confident calls everywhere, while CDD keeps 88–93% coverage. Values average the paired and independent benchmarks.

Original abstract

Inferring the direction of a gene-regulatory relationship is harder than inferring whether a relationship exists, and most direction-inference methods are validated mainly on a single in silico benchmark. We ask which method remains reliable as the data move from a synthetic network to real organisms and as sample size decreases. We embed a copula-based measure of directional dependence (CDD) in a Bayesian framework that returns, for each candidate pair, a posterior distribution over a directional contrast, a 95% credible interval, a posterior sign-support score, and a principled no-call. We benchmark this estimator against eight direction-inference methods, including two Bayesian DAG-posterior baselines, on the three core DREAM5 networks (in silico, S. aureus, and E. coli), with S. cerevisiae used as an out-of-domain eukaryotic stress test. Across the three core networks, Bayesian CDD is the only method whose called accuracy is always above 60%, whose coverage is always above 88%, and whose direction AUROC is always above 0.6; every competing method falls to chance or below on at least one network. CDD ranks first on both real-organism networks, remains stable on the smallest-sample network where bootstrap-interval methods collapse, and is the only Bayesian method that is simultaneously above chance and high-coverage under a 95% posterior gate. We position CDD as a post-screening, uncertainty-aware direction-refinement tool for candidate regulatory pairs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Bayesian CDD clears the three performance bars on all core DREAM5 networks where the other eight methods do not, but the results rest on treating the provided direction labels as ground truth.

The paper's central empirical claim is that its Bayesian copula directional dependence estimator is the only method among nine that stays above 60% called accuracy, 88% coverage, and 0.6 direction AUROC across the in-silico, S. aureus, and E. coli DREAM5 networks.

It takes an existing copula directional dependence measure and wraps it in a Bayesian model that supplies posterior distributions, 95% credible intervals, a sign-support score, and an explicit no-call rule. The benchmark then runs this estimator plus eight baselines on the three core networks plus an out-of-domain S. cerevisiae test. The multi-network design and the stability check on the smallest-sample network are the parts that add value; most direction methods are validated only on synthetic data.

The main soft spot is the lack of any sensitivity check on the DREAM5 direction labels. The real-organism labels are inferred or only partially validated, yet the paper scores every method against them without label-noise simulations or alternative gold-standard constructions. If those labels contain systematic bias, the claim that competitors fall to chance while Bayesian CDD does not will not transfer. The abstract also omits the Bayesian model derivation and the exact implementation of the baselines, which makes the reported numbers hard to reproduce from the text.

This is for computational biologists who already have candidate regulatory pairs and want an uncertainty-aware direction refinement step. Readers who run DREAM5-style benchmarks or work with real-organism expression data will find the comparative numbers useful. It deserves peer review because the benchmark is concrete and the method is practical, but referees should be asked to examine label robustness.

Referee Report

4 major / 1 minor

Summary. The paper embeds a copula-based directional dependence measure in a Bayesian framework to infer the direction of gene-regulatory relationships and benchmarks the resulting estimator (Bayesian CDD) against eight other direction-inference methods on the three core DREAM5 networks (in silico, S. aureus, E. coli) plus an out-of-domain S. cerevisiae test. It claims that Bayesian CDD is the only method whose called accuracy exceeds 60%, coverage exceeds 88%, and direction AUROC exceeds 0.6 on all three core networks, while every competitor falls to chance or below on at least one network; CDD is also positioned as stable on the smallest-sample network and as the only Bayesian method that is simultaneously above chance and high-coverage under a 95% posterior gate.

Significance. If the empirical rankings hold after the missing details are supplied, the work supplies concrete evidence that a posterior-aware copula directional measure can outperform both bootstrap-interval and DAG-posterior baselines when moving from synthetic to real-organism networks and when sample size shrinks. The multi-network design and explicit use of a no-call region are positive features for an applied-statistics audience.

major comments (4)

[Abstract] Abstract: the numerical thresholds (called accuracy >60%, coverage >88%, AUROC >0.6) and the 95% posterior gate are stated without definitions of these quantities or any derivation of the Bayesian model that produces the posterior over the directional contrast; without these the comparative claims cannot be evaluated.
[Methods] Methods (model specification): no explicit equations, priors, or sampling procedure are given for embedding the copula directional dependence into the Bayesian framework, so it is impossible to verify how the 95% credible interval, posterior sign-support score, or principled no-call are obtained or why this Bayesian method alone remains above chance with high coverage.
[Results] Results (benchmark tables): the headline claim that only Bayesian CDD meets the three thresholds on all three networks rests on treating the DREAM5 gold-standard direction labels as error-free ground truth; no label-noise simulation, alternative label construction, or sensitivity check is reported, even though the real-organism labels are themselves inferred.
[Results] Results (baseline comparison): the eight competing methods are listed but their implementations (hyper-parameter choices, software versions, handling of the 95% gate where applicable) are not described, preventing reproduction of the reported AUROC and accuracy rankings.

minor comments (1)

[Abstract] The abstract and title could state the sample sizes of each DREAM5 network to make the stability claim on the smallest-sample network immediately verifiable.

Simulated Author's Rebuttal

4 responses · 0 unresolved

We thank the referee for the thorough review and constructive comments. We address each major comment point-by-point below and indicate the revisions we will make to the manuscript.

Referee: [Abstract] Abstract: the numerical thresholds (called accuracy >60%, coverage >88%, AUROC >0.6) and the 95% posterior gate are stated without definitions of these quantities or any derivation of the Bayesian model that produces the posterior over the directional contrast; without these the comparative claims cannot be evaluated.

Authors: We agree that the abstract would benefit from brief definitions of the key performance metrics and the posterior gate. In the revised manuscript, we will expand the abstract to define called accuracy (proportion of correctly directed pairs among those called), coverage (proportion of pairs for which a direction is called), AUROC, and the 95% posterior gate (the credible interval threshold for calling a direction). The derivation of the Bayesian model is provided in the Methods section; we will ensure it is clearly cross-referenced in the abstract. revision: yes
Referee: [Methods] Methods (model specification): no explicit equations, priors, or sampling procedure are given for embedding the copula directional dependence into the Bayesian framework, so it is impossible to verify how the 95% credible interval, posterior sign-support score, or principled no-call are obtained or why this Bayesian method alone remains above chance with high coverage.

Authors: The original manuscript describes the Bayesian embedding at a high level. To fully address this, we will add explicit mathematical equations for the copula directional dependence measure, the Bayesian model specification including prior distributions (e.g., on the copula parameters), and the MCMC sampling procedure used to obtain the posterior. This will clarify the computation of the 95% credible interval, the posterior sign-support score, and the no-call rule based on the posterior. revision: yes
Referee: [Results] Results (benchmark tables): the headline claim that only Bayesian CDD meets the three thresholds on all three networks rests on treating the DREAM5 gold-standard direction labels as error-free ground truth; no label-noise simulation, alternative label construction, or sensitivity check is reported, even though the real-organism labels are themselves inferred.

Authors: We acknowledge that the DREAM5 gold standards for real organisms are inferred and may contain noise. While DREAM5 is the established benchmark in the field, we will add a discussion of this limitation in the revised manuscript, including a sensitivity analysis where we introduce controlled label noise and re-evaluate the methods' robustness. This will strengthen the claims. revision: yes
Referee: [Results] Results (baseline comparison): the eight competing methods are listed but their implementations (hyper-parameter choices, software versions, handling of the 95% gate where applicable) are not described, preventing reproduction of the reported AUROC and accuracy rankings.

Authors: We agree that detailed implementation information is essential for reproducibility. In the revised version, we will include a supplementary table or section detailing the software packages, versions, hyper-parameter settings, and any specific handling of thresholds or gates for each of the eight competing methods. revision: yes
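
The label-noise sensitivity analysis promised in the third response could take roughly the following shape. This is a sketch under stated assumptions, not the authors' procedure: each gold-standard direction is reversed independently with a fixed probability, the method's calls are held fixed, and called accuracy is re-scored against the noisy labels. `flip_labels` and `accuracy_under_noise` are hypothetical names.

```python
import random


def flip_labels(truth, noise_rate, rng):
    """Reverse each gold-standard direction independently with probability noise_rate."""
    return [-t if rng.random() < noise_rate else t for t in truth]


def accuracy_under_noise(calls, truth, noise_rates, n_rep=200, seed=0):
    """Mean called accuracy against noisy labels, for each assumed noise rate."""
    rng = random.Random(seed)
    called = [i for i, c in enumerate(calls) if c != 0]
    if not called:
        raise ValueError("no pairs were called")
    result = {}
    for rate in noise_rates:
        total = 0.0
        for _ in range(n_rep):
            noisy = flip_labels(truth, rate, rng)
            total += sum(calls[i] == noisy[i] for i in called) / len(called)
        result[rate] = total / n_rep
    return result


# Toy setting (hypothetical): 200 pairs, a method that is right on every pair.
truth = [1 if i % 2 == 0 else -1 for i in range(200)]
calls = list(truth)
res = accuracy_under_noise(calls, truth, noise_rates=[0.0, 0.2, 0.5])
print(res)
```

Under this symmetric flip model, a method with true accuracy a and noise rate r has expected measured accuracy a(1 - r) + (1 - a)r, so every method is pulled toward 50% as r grows. Systematic (non-symmetric) label bias, which the desk editor's note raises, would need a different noise model.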

Circularity Check

0 steps flagged

No circularity: empirical benchmark on external gold standards

The manuscript is a comparative benchmark study that evaluates Bayesian CDD and eight other methods by computing accuracy, coverage, and direction AUROC against the fixed DREAM5 gold-standard direction labels on the in-silico, S. aureus, and E. coli networks. These metrics are calculated directly from the external labels and the methods' outputs; no equation, posterior, or parameter fit inside the Bayesian CDD model is redefined or renamed to produce the reported performance numbers. No self-citation chain is invoked to justify uniqueness or to substitute for the benchmark results. The derivation chain therefore terminates at independent, externally supplied labels rather than looping back to the model's own inputs or prior publications by the same authors.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The abstract invokes standard Bayesian posterior inference and copula dependence modeling without listing explicit free parameters or new entities; the main modeling assumptions are treated as background.

axioms (2)

domain assumption: Copula models can represent directional dependence between gene-expression variables. Implicit in the use of CDD as the base measure.
domain assumption: DREAM5 network labels constitute reliable ground truth for direction. Required for all reported accuracy, coverage, and AUROC numbers.

Reference graph

Works this paper leans on

10 extracted references

[1] Kim, J.-M. and Kim, J. (2014). Copula directional dependence and its applications. Journal of Statistical Computation and Simulation.

[2] Lee, N., Kim, J.-M., et al. (2019). Copula-based directional dependence measures for gene-expression data.

[3] Huynh-Thu, V. A., Irrthum, A., Wehenkel, L. and Geurts, P. (2010). Inferring regulatory networks from expression data using tree-based methods. PLoS ONE, 5(9):e12776.

[4] Marbach, D. et al. (2012). Wisdom of crowds for robust gene network inference. Nature Methods, 9:796–804.

[5] Hoyer, P. O. et al. (2009). Nonlinear causal discovery with additive noise models. NeurIPS.

[6] Blöbaum, P. et al. (2018). Cause-effect inference by comparing regression errors (RECI). AISTATS.

[7] Shimizu, S. et al. (2006). A linear non-Gaussian acyclic model for causal discovery (LiNGAM). JMLR, 7:2003–2030.

[8] Spirtes, P., Glymour, C. and Scheines, R. (2000). Causation, Prediction, and Search. MIT Press.

[9] Kuipers, J. and Moffa, G. (2017). Partition MCMC for inference on acyclic digraphs / BiDAG. JASA / R package.

[10] Castelletti, F. et al. BCDAG: Bayesian structure and parameter learning for Gaussian DAGs. R package.
