
arXiv: 1406.2098 · v1 · pith:TKDCTGWWnew · submitted 2014-06-09 · stat.ML · stat.CO · stat.ME

Learning directed acyclic graphs via bootstrap aggregating

classification: stat.ML, stat.CO, stat.ME
keywords: bootstrap, dagbag, dags, graphical, learning, acyclic, aggregating, aggregation

Probabilistic graphical models are graphical representations of probability distributions. Graphical models have applications in many fields, including biology, the social sciences, linguistics, and neuroscience. In this paper, we propose learning directed acyclic graphs (DAGs) via bootstrap aggregating. The proposed procedure is named DAGBag. Specifically, an ensemble of DAGs is first learned from bootstrap resamples of the data, and then an aggregated DAG is derived by minimizing the overall distance to the entire ensemble. A family of metrics based on the structural Hamming distance is defined on the space of DAGs (over a given node set) and is used for aggregation. In the high-dimension, low-sample-size setting, the graph learned from a single data set often has an excessive number of false positive edges due to over-fitting of the noise. Aggregation overcomes over-fitting through variance reduction and thus greatly reduces false positives. We also develop an efficient implementation of the hill climbing search algorithm for DAG learning, which makes the proposed method computationally competitive in the high-dimensional regime. The DAGBag procedure is implemented in the R package dagbag.
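
The aggregation step described in the abstract can be illustrated with a minimal Python sketch. It is not the dagbag package's implementation. It assumes the ensemble has already been learned and is given as 0/1 adjacency matrices, and it uses the plain Hamming distance between adjacency matrices as one simple stand-in for the paper's family of metrics (under this choice a reversed edge counts as two disagreements). The function names `hamming_distance`, `reaches`, `aggregate_dags`, and `total_distance` are hypothetical. With this distance, adding an edge lowers the total distance to the ensemble only when more than half of the ensemble contains it, so the sketch greedily adds such edges in order of frequency and skips any edge that would close a directed cycle.

```python
from itertools import permutations


def hamming_distance(a, b):
    """Count ordered pairs (i, j) on which two adjacency matrices disagree."""
    return sum(x != y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def reaches(adj, start, target):
    """Return True if a directed path leads from start to target in adj."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(k for k, has_edge in enumerate(adj[node]) if has_edge)
    return False


def aggregate_dags(ensemble):
    """Greedy aggregation (illustrative assumption, not the paper's search)."""
    n_graphs = len(ensemble)
    p = len(ensemble[0])
    # How many ensemble members contain each directed edge i -> j.
    count = {(i, j): sum(g[i][j] for g in ensemble)
             for i, j in permutations(range(p), 2)}
    # Keep only edges present in more than half of the ensemble:
    # adding any other edge would not reduce the total Hamming distance.
    candidates = [edge for edge in count if 2 * count[edge] > n_graphs]
    candidates.sort(key=lambda edge: -count[edge])
    aggregated = [[0] * p for _ in range(p)]
    for i, j in candidates:
        # Adding i -> j creates a cycle exactly when j already reaches i.
        if not reaches(aggregated, j, i):
            aggregated[i][j] = 1
    return aggregated


def total_distance(dag, ensemble):
    """Sum of Hamming distances from one DAG to every ensemble member."""
    return sum(hamming_distance(dag, g) for g in ensemble)


# Toy ensemble on three nodes (made-up graphs, not learned from data).
d1 = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]  # 0 -> 1, 1 -> 2
d2 = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]  # 0 -> 1, 1 -> 2, 0 -> 2
d3 = [[0, 1, 0], [0, 0, 0], [0, 1, 0]]  # 0 -> 1, 2 -> 1
ensemble = [d1, d2, d3]

aggregated = aggregate_dags(ensemble)
print(aggregated)                            # [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
print(total_distance(aggregated, ensemble))  # 3
```

In the toy ensemble, the edge 0 -> 2 appears in only one of three graphs and is dropped, which mirrors the abstract's point that aggregation removes edges that are not stable across resamples. The greedy pass is a heuristic: it guarantees an acyclic result but not a global minimizer of the total distance.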


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Stable Causal Discovery via Directed Acyclic Graph Aggregation

    stat.ME 2026-05 unverdicted novelty 6.0

    DAGgr aggregates weighted candidate DAGs using out-of-sample predictive likelihood and an acyclicity-preserving threshold, with claimed finite-sample bounds and consistency, outperforming baselines in simulations and ...