A Unified Framework for Structure-Aware Clustering and Heterogeneous Causal Graph Learning

Honglin Du; Muxuan Liang; Xiang Zhong

arxiv: 2605.19313 · v1 · pith:GTZ5DP4Rnew · submitted 2026-05-19 · 📊 stat.ML · cs.LG· stat.ME

A Unified Framework for Structure-Aware Clustering and Heterogeneous Causal Graph Learning

Honglin Du , Muxuan Liang , Xiang Zhong This is my paper

Pith reviewed 2026-05-20 03:21 UTC · model grok-4.3

classification 📊 stat.ML cs.LGstat.ME

keywords causal discoveryDAG learningclusteringheterogeneous datastructural equation modelingADMM optimizationfusion penalty

0 comments

The pith

A new method jointly clusters subjects and learns their distinct causal dependency graphs from data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops DAG-DC-ADMM, a framework that assigns subjects to groups while estimating a separate directed acyclic graph for each group. It builds on structural equation models, adds a smooth acyclicity constraint, and applies a groupwise truncated Lasso fusion penalty to pull similar dependency structures together within clusters. An adapted ADMM solver handles the resulting nonconvex problem and guarantees convergence to a KKT point for certain graph forms such as upper triangular adjacency matrices. A sympathetic reader cares because many real systems contain hidden subpopulations whose interactions differ, and forcing a single graph on all subjects produces biased estimates of those interactions.

Core claim

The authors present DAG-DC-ADMM as a unified optimization framework that simultaneously learns cluster assignments and cluster-specific DAG structures. Acyclicity is enforced by a smooth constraint, structural similarity within clusters is promoted by the groupwise truncated Lasso fusion penalty, and the nonconvex program is solved via the augmented Lagrangian method with an ADMM scheme for difference-of-convex programs. Experiments show the method recovers the cluster-specific causal structures with high true positive rate and low false discovery rate, enabling discovery of heterogeneous dependencies when subpopulation labels are unknown.

What carries the argument

The groupwise truncated Lasso fusion penalty that encourages structural consensus inside clusters while permitting differences across clusters, paired with an ADMM solver adapted for the difference-of-convex program that also enforces acyclicity.

If this is right

The recovered graphs exhibit high true positive rate and low false discovery rate when cluster-specific structures are present.
Heterogeneous dependencies can be identified even when the researcher does not know the subpopulation labels in advance.
Bias from assuming a single dependency structure across all subjects is reduced.
The approach applies directly to multivariate systems whose interactions are encoded as DAGs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same joint clustering-plus-graph-learning idea could be tested on longitudinal data to track whether cluster membership or graph structure changes over time.
In applied domains such as gene regulatory networks, the method might reveal patient subgroups whose regulatory graphs respond differently to interventions.
Theoretical analysis of the fusion penalty could yield bounds on how many clusters can be reliably recovered as a function of sample size and graph sparsity.

Load-bearing premise

Subjects can be meaningfully grouped by how similar their dependency graphs are, and the observed data are generated by a structural equation model whose form matches the chosen acyclicity and fusion penalties.

What would settle it

Generate synthetic data with known cluster labels and known true DAGs for each cluster, run the algorithm, and check whether the recovered graphs match the true ones at the reported high true positive rate and low false discovery rate.

Figures

Figures reproduced from arXiv: 2605.19313 by Honglin Du, Muxuan Liang, Xiang Zhong.

**Figure 2.** Figure 2: Adjacency matrix comparison In contrast, the population-level DAG, built by pooling all cells, omits connections that are clearly present within specific subpopulations. This is a classic effect of population averaging: heterogeneous signaling states are smoothed together, and edges that are strong only in certain subsets fall below detection (Sachs et al., 2005). The identification of these biologically m… view at source ↗

read the original abstract

In complex multivariate systems, interactions among variables are defined by dependency structures, often encoded as directed acyclic graphs ($\text{DAGs}$). However, dependency structures can vary across subjects, and ignoring this structural heterogeneity introduces bias and obscures subpopulation-specific dependencies. To address this, we propose Directed Acyclic Graph-based Dependency Clustering via Alternating Direction Method of Multipliers (DAG-DC-ADMM), a unified framework built upon Structural Equation Modeling (SEM) that jointly learns cluster assignments and cluster-specific dependency structures. We encode acyclicity via a smooth constraint and integrate a groupwise truncated Lasso fusion penalty (gTLP) to cluster subjects based on their structural similarity. This yields a nonconvex optimization problem that incorporates sparsity, acyclicity, and structural consensus constraints. We address the nonconvexity by using the augmented Lagrangian method and solve it with an adapted version of the Alternating Direction Method of Multipliers (ADMM) for difference-of-convex programs. For certain graph structures, such as upper triangular adjacency matrices, our algorithm is guaranteed to converge to a Karush-Kuhn-Tucker (KKT) point. Experiments demonstrate that our method recovers cluster-specific causal dependency structures with a high true positive rate and a low false discovery rate. This capability enables the robust discovery of heterogeneous dependencies across subjects where the subpopulation label is unknown.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This unifies clustering and cluster-specific DAG learning in one ADMM problem, but the KKT convergence guarantee is limited to upper-triangular matrices.

read the letter

The one thing to know is that this paper gives a unified optimization method for clustering subjects by their causal graph structures while learning those graphs at the same time. It uses structural equation models, a groupwise truncated lasso penalty, and ADMM to handle the joint problem. The convergence guarantee, though, only applies to upper triangular adjacency matrices rather than general DAGs. They put the cluster assignments and the cluster-specific DAGs into one nonconvex program with sparsity, acyclicity, and consensus constraints. Solving it with an adapted ADMM for difference-of-convex programs is the technical contribution. This approach avoids the usual two-step process of first clustering then estimating graphs separately, which can introduce bias. The setup is clear and the penalty choice makes sense for encouraging structural similarity within clusters. If the experiments are as strong as the abstract suggests, with high true positive rates and low false discovery rates, it could be a practical addition for data with unknown subpopulations. The main concern is the limited convergence result. For arbitrary DAGs the algorithm may not reach a KKT point, which raises questions about whether the recovered structures are reliable or if the clusters are stable. The abstract mentions favorable recovery rates but gives little on the simulation setup, baselines, or how sensitive things are to the penalty parameters. That leaves the central performance claim on shaky ground until more details are provided. This is aimed at researchers working on causal discovery in heterogeneous settings, such as in genomics or economics. Someone building tools for joint clustering and graph learning would get value from the formulation. The paper engages honestly with the modeling assumptions and shows a coherent approach, so it merits a serious referee. I would recommend sending it for peer review, asking the authors to strengthen the convergence analysis for general cases and flesh out the experiments.

Referee Report

2 major / 1 minor

Summary. The paper proposes DAG-DC-ADMM, a unified SEM-based framework that jointly infers unknown cluster assignments and cluster-specific DAGs by combining a smooth acyclicity constraint with a groupwise truncated Lasso fusion (gTLP) penalty; the resulting nonconvex program is solved via an adapted ADMM procedure for difference-of-convex programs. The abstract states that the algorithm is guaranteed to reach a KKT point for certain structures such as upper-triangular adjacency matrices and reports that experiments recover cluster-specific causal structures with high true-positive rate and low false-discovery rate.

Significance. If the empirical recovery claims hold under general DAGs, the method would offer a practical route to structure-aware clustering and heterogeneous causal discovery in settings where subpopulation labels are unavailable. The integration of fusion penalties with acyclicity constraints is technically coherent, though the absence of machine-checked proofs or fully reproducible experimental pipelines limits the immediate strength of the contribution.

major comments (2)

[Abstract] Abstract: the convergence guarantee is explicitly limited to 'certain graph structures, such as upper triangular adjacency matrices.' Because the central claim concerns recovery of general cluster-specific DAGs via the nonconvex ADMM solver, the lack of a general convergence result for arbitrary DAGs directly undermines in the stability of the reported cluster assignments and edge recoveries.
[Abstract / Experimental section] Experiments (as summarized in the abstract): the claim of 'high true positive rate and low false discovery rate' is presented without any quantitative description of data-generating processes, baseline methods, hyper-parameter ranges, or sensitivity analyses. This absence leaves the performance assertions only weakly supported and makes it impossible to assess whether the reported TPR/FDR values are robust or merely artifacts of particular simulation choices.

minor comments (1)

[Methods] The notation for the groupwise truncated Lasso fusion penalty (gTLP) and its relation to the acyclicity constraint could be clarified with an explicit equation reference in the methods section.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their constructive and detailed comments on our manuscript. We address each major comment point by point below, indicating whether revisions have been made.

read point-by-point responses

Referee: [Abstract] Abstract: the convergence guarantee is explicitly limited to 'certain graph structures, such as upper triangular adjacency matrices.' Because the central claim concerns recovery of general cluster-specific DAGs via the nonconvex ADMM solver, the lack of a general convergence result for arbitrary DAGs directly undermines in the stability of the reported cluster assignments and edge recoveries.

Authors: We acknowledge that the convergence guarantee in the manuscript is restricted to certain graph structures, such as upper-triangular adjacency matrices, for which the ADMM iterates admit a direct KKT-point analysis. For arbitrary DAGs the nonconvexity arising from the difference-of-convex formulation and the smooth acyclicity constraint renders a general convergence proof technically difficult at present. We have revised the abstract and the theoretical analysis section to state this limitation more explicitly and to clarify that practical performance on general DAGs is supported by the empirical studies rather than by a universal theoretical guarantee. revision: yes
Referee: [Abstract / Experimental section] Experiments (as summarized in the abstract): the claim of 'high true positive rate and low false discovery rate' is presented without any quantitative description of data-generating processes, baseline methods, hyper-parameter ranges, or sensitivity analyses. This absence leaves the performance assertions only weakly supported and makes it impossible to assess whether the reported TPR/FDR values are robust or merely artifacts of particular simulation choices.

Authors: The abstract supplies only a high-level summary of the main empirical findings. Complete specifications of the data-generating processes (variable dimensions, sample sizes, cluster configurations, and noise models), the baseline algorithms, the hyper-parameter grids, and the sensitivity analyses appear in the Experimental Results section. In the revised version we have augmented the abstract with concise quantitative descriptors of the simulation regime (e.g., number of variables, sample sizes, and number of clusters) while preserving brevity, thereby making the performance claims more informative without duplicating the full experimental details. revision: yes

standing simulated objections not resolved

A general convergence guarantee for arbitrary DAGs under the nonconvex ADMM solver

Circularity Check

0 steps flagged

No significant circularity; optimization procedure is independent of reported recovery metrics

full rationale

The paper presents DAG-DC-ADMM as a joint optimization over cluster assignments and cluster-specific DAGs using SEM, a smooth acyclicity constraint, and groupwise truncated Lasso fusion penalty, solved via adapted ADMM for difference-of-convex programs. Experimental claims of high TPR and low FDR on recovering cluster-specific structures are presented as outcomes of this procedure on data, not as quantities forced by construction from the fitted parameters or model definition. No self-definitional steps, fitted inputs renamed as predictions, or load-bearing self-citations appear in the abstract or described framework. The limited convergence guarantee for upper-triangular cases is an algorithmic caveat but does not reduce the central derivation or empirical results to tautological inputs. The method remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The central claim rests on the structural equation modeling assumption, the smooth acyclicity encoding, and the existence of structural clusters; no new physical entities are postulated.

free parameters (1)

penalty coefficients for gTLP and acyclicity
Regularization strengths that control sparsity, fusion, and cycle avoidance must be chosen or tuned.

axioms (1)

domain assumption Observed data are generated by a linear or nonlinear structural equation model whose adjacency matrix satisfies an acyclicity constraint.
The entire framework is built upon SEM and encodes acyclicity via a smooth constraint as stated in the abstract.

pith-pipeline@v0.9.0 · 5773 in / 1193 out tokens · 38986 ms · 2026-05-20T03:21:08.700515+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We address the nonconvexity by using the augmented Lagrangian method and solve it with an adapted version of the Alternating Direction Method of Multipliers (ADMM) for difference-of-convex programs. ... groupwise truncated Lasso fusion penalty (gTLP)
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We encode acyclicity via a smooth constraint ... h(Wi) = tr(exp(Wi ∘ Wi)) − d = 0

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

73 extracted references · 73 canonical work pages · 1 internal anchor

[1]

Pediatric research , volume=

Directed acyclic graphs: a tool for causal studies in paediatrics , author=. Pediatric research , volume=. 2018 , publisher=

work page 2018
[2]

IEEE transactions on information theory , volume=

Least squares quantization in PCM , author=. IEEE transactions on information theory , volume=. 1982 , publisher=

work page 1982
[3]

Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery , volume=

Algorithms for hierarchical clustering: an overview , author=. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery , volume=. 2012 , publisher=

work page 2012
[4]

kdd , volume=

A density-based algorithm for discovering clusters in large spatial databases with noise , author=. kdd , volume=

work page
[5]

Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Statistics , volume=

Some methods for classification and analysis of multivariate observations , author=. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Statistics , volume=. 1967 , organization=

work page 1967
[6]

2013 , publisher=

Pattern recognition with fuzzy objective function algorithms , author=. 2013 , publisher=

work page 2013
[7]

Journal of the American Statistical Association , volume=

A framework for feature selection in clustering , author=. Journal of the American Statistical Association , volume=. 2010 , publisher=

work page 2010
[8]

PASCAL workshop on statistics and optimization of clustering workshop , volume=

Convex clustering shrinkage , author=. PASCAL workshop on statistics and optimization of clustering workshop , volume=

work page
[9]

The Journal of Machine Learning Research , volume=

Cluster analysis: unsupervised learning via supervised learning with a non-convex penalty , author=. The Journal of Machine Learning Research , volume=. 2013 , publisher=

work page 2013
[10]

Journal of Machine Learning Research , volume=

A new algorithm and theory for penalized regression-based clustering , author=. Journal of Machine Learning Research , volume=

work page
[11]

Causality: Statistical perspectives and applications , pages=

Causal inference in time series analysis , author=. Causality: Statistical perspectives and applications , pages=. 2012 , publisher=

work page 2012
[12]

2009 , publisher=

Causality , author=. 2009 , publisher=

work page 2009
[13]

Physical review letters , volume=

Kernel method for nonlinear Granger causality , author=. Physical review letters , volume=. 2008 , publisher=

work page 2008
[14]

0-penalized maximum likelihood for sparse directed acyclic graphs , author=

work page
[15]

arXiv preprint arXiv:1910.01075 , year=

Learning neural causal models from unknown interventions , author=. arXiv preprint arXiv:1910.01075 , year=

work page arXiv 1910
[16]

Advances in neural information processing systems , volume=

Dags with no tears: Continuous optimization for structure learning , author=. Advances in neural information processing systems , volume=

work page
[17]

Proceedings of the 28th International Conference on International Conference on Machine Learning , pages =

Hocking, Toby Dylan and Joulin, Armand and Bach, Francis and Vert, Jean-Philippe , title =. Proceedings of the 28th International Conference on International Conference on Machine Learning , pages =. 2011 , isbn =

work page 2011
[18]

, year =

Jain, Anil K. , year =. Data clustering: 50 years beyond K-means , volume =. Pattern Recognition Letters , publisher =. doi:10.1016/j.patrec.2009.09.011 , number =

work page doi:10.1016/j.patrec.2009.09.011 2009
[19]

IEEE Transactions on Neural Networks and Learning Systems , volume=

A review of convex clustering from multiple perspectives: models, optimizations, statistical properties, applications, and connections , author=. IEEE Transactions on Neural Networks and Learning Systems , volume=. 2023 , publisher=

work page 2023
[20]

28th international conference on machine learning , pages=

Clusterpath an algorithm for clustering using convex fusion penalties , author=. 28th international conference on machine learning , pages=

work page
[21]

Journal of Computational and Graphical Statistics , volume=

Splitting methods for convex clustering , author=. Journal of Computational and Graphical Statistics , volume=. 2015 , publisher=

work page 2015
[22]

Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining , pages=

Network lasso: Clustering and optimization in large graphs , author=. Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining , pages=

work page
[23]

Weighted total variation based convex clustering

Weighted total variation based convex clustering , author=. arXiv preprint arXiv:1808.09144 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[24]

Journal of Machine Learning Research , volume=

Provable convex co-clustering of tensors , author=. Journal of Machine Learning Research , volume=

work page
[25]

arXiv preprint arXiv:1906.09581 , year=

Robust convex clustering: How does fusion penalty enhance robustness? , author=. arXiv preprint arXiv:1906.09581 , year=

work page arXiv 1906
[26]

Advances in Data Analysis and Classification , volume=

Convex clustering for binary data , author=. Advances in Data Analysis and Classification , volume=. 2019 , publisher=

work page 2019
[27]

Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery , volume=

Algorithms for hierarchical clustering: an overview , author=. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery , volume=

work page
[28]

Advances in Neural Information Processing Systems , volume=

DAGs with No Fears: A closer look at continuous optimization for learning Bayesian networks , author=. Advances in Neural Information Processing Systems , volume=

work page
[29]

IEEE transactions on signal processing , volume=

Clustering of data with missing entries using non-convex fusion penalties , author=. IEEE transactions on signal processing , volume=. 2019 , publisher=

work page 2019
[30]

IEEE transactions on cybernetics , volume=

Nonsmooth Penalized Clustering via \_ \ p\ Regularized Sparse Regression , author=. IEEE transactions on cybernetics , volume=. 2016 , publisher=

work page 2016
[31]

Journal of the American Statistical Association , volume=

Likelihood-based selection and sharp parameter estimation , author=. Journal of the American Statistical Association , volume=. 2012 , publisher=

work page 2012
[32]

2019 , publisher=

Solution Path Clustering with Robust Loss and Concave Penalty , author=. 2019 , publisher=

work page 2019
[33]

Proceedings of the National Academy of Sciences , volume=

Robust continuous clustering , author=. Proceedings of the National Academy of Sciences , volume=. 2017 , publisher=

work page 2017
[34]

Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=

The joint graphical lasso for inverse covariance estimation across multiple classes , author=. Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=. 2014 , publisher=

work page 2014
[35]

Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence , pages=

Learning mixtures of DAG models , author=. Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence , pages=

work page
[36]

, year =

Muthén, Bengt O. , year =. Beyond SEM: General Latent Variable Modeling , volume =. Behaviormetrika , publisher =. doi:10.2333/bhmk.29.81 , number =

work page doi:10.2333/bhmk.29.81
[37]

Mathematical Programming , volume=

DC programming and DCA: thirty years of developments , author=. Mathematical Programming , volume=. 2018 , publisher=

work page 2018
[38]

Foundations and Trends

Distributed optimization and statistical learning via the alternating direction method of multipliers , author=. Foundations and Trends. 2011 , publisher=

work page 2011
[39]

Mathematical Programming , year=

Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods , author=. Mathematical Programming , year=

work page
[40]

Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the Kurdyka-

Attouch, H. Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the Kurdyka-. Mathematics of operations research , volume=. 2010 , publisher=

work page 2010
[41]

Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL) , pages=

V-measure: A conditional entropy-based external cluster evaluation measure , author=. Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL) , pages=

work page 2007
[42]

Journal of Classification , year=

Comparing partitions , author=. Journal of Classification , year=

work page
[43]

Science , volume=

Causal protein-signaling networks derived from multiparameter single-cell data , author=. Science , volume=. 2005 , publisher=

work page 2005
[44]

Biometrika , volume=

Identifiability of Gaussian structural equation models with equal error variances , author=. Biometrika , volume=. 2014 , publisher=

work page 2014
[45]

International conference on machine learning , pages=

DAG-GNN: DAG structure learning with graph neural networks , author=. International conference on machine learning , pages=. 2019 , organization=

work page 2019
[46]

Advances in Neural Information Processing Systems , volume=

Dagma: Learning dags via m-matrices and a log-determinant acyclicity characterization , author=. Advances in Neural Information Processing Systems , volume=

work page
[47]

The Annals of Statistics , pages=

CAM: Causal additive models, high-dimensional order search and penalized regression , author=. The Annals of Statistics , pages=

work page
[48]

International conference on artificial intelligence and statistics , pages=

Learning sparse nonparametric dags , author=. International conference on artificial intelligence and statistics , pages=

work page
[49]

International Conference on Machine Learning , pages=

Optimizing notears objectives via topological swaps , author=. International Conference on Machine Learning , pages=. 2023 , organization=

work page 2023
[50]

Principal Component Analysis , ISBN =

Jolliffe, Ian , year =. Principal Component Analysis , ISBN =. doi:10.1002/0470013192.bsa501 , journal =

work page doi:10.1002/0470013192.bsa501
[51]

Cluster Quality Analysis Using Silhouette Score , year=

Shahapure, Ketan Rajshekhar and Nicholas, Charles , booktitle=. Cluster Quality Analysis Using Silhouette Score , year=

work page
[52]

International Conference on Artificial Intelligence and Statistics , year=

DYNOTEARS: Structure Learning from Time-Series Data , author=. International Conference on Artificial Intelligence and Statistics , year=

work page
[53]

doi:10.48550/arxiv.1906.04477 , arxivId =

Causal discovery with reinforcement learning , author=. arXiv preprint arXiv:1906.04477 , year=

work page arXiv 1906
[54]

arXiv preprint arXiv:2007.12786 , year=

Model-based Clustering using Automatic Differentiation: Confronting Misspecification and High-Dimensional Data , author=. arXiv preprint arXiv:2007.12786 , year=

work page arXiv 2007
[55]

Frontiers in psychology , volume=

Structural equation modeling with many variables: A systematic review of issues and developments , author=. Frontiers in psychology , volume=. 2018 , publisher=

work page 2018
[56]

Nature Reviews Cancer , volume=

The PI3K--AKT network at the interface of oncogenic signalling and cancer metabolism , author=. Nature Reviews Cancer , volume=. 2020 , publisher=

work page 2020
[57]

and Rahman, Anisur and Mohan, Krithika and Elston, Timothy C

Nosbisch, Jamie L. and Rahman, Anisur and Mohan, Krithika and Elston, Timothy C. and Bear, James E. and Haugh, Jason M. , editor =. Mechanistic models of PLC/PKC signaling implicate phosphatidic acid as a key amplifier of chemotactic gradient sensing , volume =. PLOS Computational Biology , publisher =. 2020 , month = apr, pages =. doi:10.1371/journal.pcb...

work page doi:10.1371/journal.pcbi.1007708 2020
[58]

, journal=

Rui Xu and Wunsch, D. , journal=. Survey of clustering algorithms , year=

work page
[59]

Handbook of causal analysis for social research , pages=

Eight myths about causality and structural equation models , author=. Handbook of causal analysis for social research , pages=. 2013 , publisher=

work page 2013
[60]

International journal of computer applications , volume=

Clustering techniques and the similarity measures used in clustering: A survey , author=. International journal of computer applications , volume=. 2016 , publisher=

work page 2016
[61]

and Guo, Zhigao and Liu, Yang and Chobtham, Kiattikun , year =

Kitson, Neville Kenneth and Constantinou, Anthony C. and Guo, Zhigao and Liu, Yang and Chobtham, Kiattikun , year =. A survey of Bayesian Network structure learning , volume =. Artificial Intelligence Review , publisher =. doi:10.1007/s10462-022-10351-w , number =

work page doi:10.1007/s10462-022-10351-w
[62]

Learning from data: Artificial intelligence and statistics V , pages=

Learning Bayesian networks is NP-complete , author=. Learning from data: Artificial intelligence and statistics V , pages=. 1996 , publisher=

work page 1996
[63]

Journal of Machine Learning Research , volume=

Causal discovery from heterogeneous/nonstationary data , author=. Journal of Machine Learning Research , volume=

work page
[64]

Annual Review of Statistics and Its Application , volume=

Structure learning in graphical modeling , author=. Annual Review of Statistics and Its Application , volume=. 2017 , publisher=

work page 2017
[65]

Frontiers in genetics , volume=

Review of causal discovery methods based on graphical models , author=. Frontiers in genetics , volume=. 2019 , publisher=

work page 2019
[66]

ACM computing surveys (CSUR) , volume=

Data clustering: a review , author=. ACM computing surveys (CSUR) , volume=. 1999 , publisher=

work page 1999
[67]

, journal=

Ma, Xiuxiu and Ballal, Tarig and Chen, Hui and Aldayel, Omar and Al-Naffouri, Tareq Y. , journal=. A Maximum-Likelihood TDOA Localization Algorithm Using Difference-of-Convex Programming , year=

work page
[68]

Foundations and Trends in optimization , volume=

Proximal algorithms , author=. Foundations and Trends in optimization , volume=. 2014 , publisher=

work page 2014
[69]

SIAM Journal on scientific computing , volume=

A limited memory algorithm for bound constrained optimization , author=. SIAM Journal on scientific computing , volume=. 1995 , publisher=

work page 1995
[70]

Frontiers in genetics , volume=

Learning subject-specific directed acyclic graphs with mixed effects structural equation models from observational data , author=. Frontiers in genetics , volume=. 2018 , publisher=

work page 2018
[71]

Statistics in medicine , volume=

Bayesian graphical modeling for heterogeneous causal effects , author=. Statistics in medicine , volume=. 2023 , publisher=

work page 2023
[72]

Machine Learning , volume=

Causal discovery with a mixture of DAGs , author=. Machine Learning , volume=. 2023 , publisher=

work page 2023
[73]

Transactions on Machine Learning Research , year=

Separability analysis for causal discovery in mixture of DAGs , author=. Transactions on Machine Learning Research , year=

work page

[1] [1]

Pediatric research , volume=

Directed acyclic graphs: a tool for causal studies in paediatrics , author=. Pediatric research , volume=. 2018 , publisher=

work page 2018

[2] [2]

IEEE transactions on information theory , volume=

Least squares quantization in PCM , author=. IEEE transactions on information theory , volume=. 1982 , publisher=

work page 1982

[3] [3]

Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery , volume=

Algorithms for hierarchical clustering: an overview , author=. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery , volume=. 2012 , publisher=

work page 2012

[4] [4]

kdd , volume=

A density-based algorithm for discovering clusters in large spatial databases with noise , author=. kdd , volume=

work page

[5] [5]

Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Statistics , volume=

Some methods for classification and analysis of multivariate observations , author=. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Statistics , volume=. 1967 , organization=

work page 1967

[6] [6]

2013 , publisher=

Pattern recognition with fuzzy objective function algorithms , author=. 2013 , publisher=

work page 2013

[7] [7]

Journal of the American Statistical Association , volume=

A framework for feature selection in clustering , author=. Journal of the American Statistical Association , volume=. 2010 , publisher=

work page 2010

[8] [8]

PASCAL workshop on statistics and optimization of clustering workshop , volume=

Convex clustering shrinkage , author=. PASCAL workshop on statistics and optimization of clustering workshop , volume=

work page

[9] [9]

The Journal of Machine Learning Research , volume=

Cluster analysis: unsupervised learning via supervised learning with a non-convex penalty , author=. The Journal of Machine Learning Research , volume=. 2013 , publisher=

work page 2013

[10] [10]

Journal of Machine Learning Research , volume=

A new algorithm and theory for penalized regression-based clustering , author=. Journal of Machine Learning Research , volume=

work page

[11] [11]

Causality: Statistical perspectives and applications , pages=

Causal inference in time series analysis , author=. Causality: Statistical perspectives and applications , pages=. 2012 , publisher=

work page 2012

[12] [12]

2009 , publisher=

Causality , author=. 2009 , publisher=

work page 2009

[13] [13]

Physical review letters , volume=

Kernel method for nonlinear Granger causality , author=. Physical review letters , volume=. 2008 , publisher=

work page 2008

[14] [14]

0-penalized maximum likelihood for sparse directed acyclic graphs , author=

work page

[15] [15]

arXiv preprint arXiv:1910.01075 , year=

Learning neural causal models from unknown interventions , author=. arXiv preprint arXiv:1910.01075 , year=

work page arXiv 1910

[16] [16]

Advances in neural information processing systems , volume=

Dags with no tears: Continuous optimization for structure learning , author=. Advances in neural information processing systems , volume=

work page

[17] [17]

Proceedings of the 28th International Conference on International Conference on Machine Learning , pages =

Hocking, Toby Dylan and Joulin, Armand and Bach, Francis and Vert, Jean-Philippe , title =. Proceedings of the 28th International Conference on International Conference on Machine Learning , pages =. 2011 , isbn =

work page 2011

[18] [18]

, year =

Jain, Anil K. , year =. Data clustering: 50 years beyond K-means , volume =. Pattern Recognition Letters , publisher =. doi:10.1016/j.patrec.2009.09.011 , number =

work page doi:10.1016/j.patrec.2009.09.011 2009

[19] [19]

IEEE Transactions on Neural Networks and Learning Systems , volume=

A review of convex clustering from multiple perspectives: models, optimizations, statistical properties, applications, and connections , author=. IEEE Transactions on Neural Networks and Learning Systems , volume=. 2023 , publisher=

work page 2023

[20] [20]

28th international conference on machine learning , pages=

Clusterpath an algorithm for clustering using convex fusion penalties , author=. 28th international conference on machine learning , pages=

work page

[21] [21]

Journal of Computational and Graphical Statistics , volume=

Splitting methods for convex clustering , author=. Journal of Computational and Graphical Statistics , volume=. 2015 , publisher=

work page 2015

[22] [22]

Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining , pages=

Network lasso: Clustering and optimization in large graphs , author=. Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining , pages=

work page

[23] [23]

Weighted total variation based convex clustering

Weighted total variation based convex clustering , author=. arXiv preprint arXiv:1808.09144 , year=

work page internal anchor Pith review Pith/arXiv arXiv

[24] [24]

Journal of Machine Learning Research , volume=

Provable convex co-clustering of tensors , author=. Journal of Machine Learning Research , volume=

work page

[25] [25]

arXiv preprint arXiv:1906.09581 , year=

Robust convex clustering: How does fusion penalty enhance robustness? , author=. arXiv preprint arXiv:1906.09581 , year=

work page arXiv 1906

[26] [26]

Advances in Data Analysis and Classification , volume=

Convex clustering for binary data , author=. Advances in Data Analysis and Classification , volume=. 2019 , publisher=

work page 2019

[27] [27]

Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery , volume=

Algorithms for hierarchical clustering: an overview , author=. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery , volume=

work page

[28] [28]

Advances in Neural Information Processing Systems , volume=

DAGs with No Fears: A closer look at continuous optimization for learning Bayesian networks , author=. Advances in Neural Information Processing Systems , volume=

work page

[29] [29]

IEEE transactions on signal processing , volume=

Clustering of data with missing entries using non-convex fusion penalties , author=. IEEE transactions on signal processing , volume=. 2019 , publisher=

work page 2019

[30] [30]

IEEE transactions on cybernetics , volume=

Nonsmooth Penalized Clustering via \_ \ p\ Regularized Sparse Regression , author=. IEEE transactions on cybernetics , volume=. 2016 , publisher=

work page 2016

[31] [31]

Journal of the American Statistical Association , volume=

Likelihood-based selection and sharp parameter estimation , author=. Journal of the American Statistical Association , volume=. 2012 , publisher=

work page 2012

[32] [32]

2019 , publisher=

Solution Path Clustering with Robust Loss and Concave Penalty , author=. 2019 , publisher=

work page 2019

[33] [33]

Proceedings of the National Academy of Sciences , volume=

Robust continuous clustering , author=. Proceedings of the National Academy of Sciences , volume=. 2017 , publisher=

work page 2017

[34] [34]

Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=

The joint graphical lasso for inverse covariance estimation across multiple classes , author=. Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=. 2014 , publisher=

work page 2014

[35] [35]

Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence , pages=

Learning mixtures of DAG models , author=. Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence , pages=

work page

[36] [36]

, year =

Muthén, Bengt O. , year =. Beyond SEM: General Latent Variable Modeling , volume =. Behaviormetrika , publisher =. doi:10.2333/bhmk.29.81 , number =

work page doi:10.2333/bhmk.29.81

[37] [37]

Mathematical Programming , volume=

DC programming and DCA: thirty years of developments , author=. Mathematical Programming , volume=. 2018 , publisher=

work page 2018

[38] [38]

Foundations and Trends

Distributed optimization and statistical learning via the alternating direction method of multipliers , author=. Foundations and Trends. 2011 , publisher=

work page 2011

[39] [39]

Mathematical Programming , year=

Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods , author=. Mathematical Programming , year=

work page

[40] [40]

Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the Kurdyka-

Attouch, H. Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the Kurdyka-. Mathematics of operations research , volume=. 2010 , publisher=

work page 2010

[41] [41]

Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL) , pages=

V-measure: A conditional entropy-based external cluster evaluation measure , author=. Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL) , pages=

work page 2007

[42] [42]

Journal of Classification , year=

Comparing partitions , author=. Journal of Classification , year=

work page

[43] [43]

Science , volume=

Causal protein-signaling networks derived from multiparameter single-cell data , author=. Science , volume=. 2005 , publisher=

work page 2005

[44] [44]

Biometrika , volume=

Identifiability of Gaussian structural equation models with equal error variances , author=. Biometrika , volume=. 2014 , publisher=

work page 2014

[45] [45]

International conference on machine learning , pages=

DAG-GNN: DAG structure learning with graph neural networks , author=. International conference on machine learning , pages=. 2019 , organization=

work page 2019

[46] [46]

Advances in Neural Information Processing Systems , volume=

Dagma: Learning dags via m-matrices and a log-determinant acyclicity characterization , author=. Advances in Neural Information Processing Systems , volume=

work page

[47] [47]

The Annals of Statistics , pages=

CAM: Causal additive models, high-dimensional order search and penalized regression , author=. The Annals of Statistics , pages=

work page

[48] [48]

International conference on artificial intelligence and statistics , pages=

Learning sparse nonparametric dags , author=. International conference on artificial intelligence and statistics , pages=

work page

[49] [49]

International Conference on Machine Learning , pages=

Optimizing notears objectives via topological swaps , author=. International Conference on Machine Learning , pages=. 2023 , organization=

work page 2023

[50] [50]

Principal Component Analysis , ISBN =

Jolliffe, Ian , year =. Principal Component Analysis , ISBN =. doi:10.1002/0470013192.bsa501 , journal =

work page doi:10.1002/0470013192.bsa501

[51] [51]

Cluster Quality Analysis Using Silhouette Score , year=

Shahapure, Ketan Rajshekhar and Nicholas, Charles , booktitle=. Cluster Quality Analysis Using Silhouette Score , year=

work page

[52] [52]

International Conference on Artificial Intelligence and Statistics , year=

DYNOTEARS: Structure Learning from Time-Series Data , author=. International Conference on Artificial Intelligence and Statistics , year=

work page

[53] [53]

doi:10.48550/arxiv.1906.04477 , arxivId =

Causal discovery with reinforcement learning , author=. arXiv preprint arXiv:1906.04477 , year=

work page arXiv 1906

[54] [54]

arXiv preprint arXiv:2007.12786 , year=

Model-based Clustering using Automatic Differentiation: Confronting Misspecification and High-Dimensional Data , author=. arXiv preprint arXiv:2007.12786 , year=

work page arXiv 2007

[55] [55]

Frontiers in psychology , volume=

Structural equation modeling with many variables: A systematic review of issues and developments , author=. Frontiers in psychology , volume=. 2018 , publisher=

work page 2018

[56] [56]

Nature Reviews Cancer , volume=

The PI3K--AKT network at the interface of oncogenic signalling and cancer metabolism , author=. Nature Reviews Cancer , volume=. 2020 , publisher=

work page 2020

[57] [57]

and Rahman, Anisur and Mohan, Krithika and Elston, Timothy C

Nosbisch, Jamie L. and Rahman, Anisur and Mohan, Krithika and Elston, Timothy C. and Bear, James E. and Haugh, Jason M. , editor =. Mechanistic models of PLC/PKC signaling implicate phosphatidic acid as a key amplifier of chemotactic gradient sensing , volume =. PLOS Computational Biology , publisher =. 2020 , month = apr, pages =. doi:10.1371/journal.pcb...

work page doi:10.1371/journal.pcbi.1007708 2020

[58] [58]

, journal=

Rui Xu and Wunsch, D. , journal=. Survey of clustering algorithms , year=

work page

[59] [59]

Handbook of causal analysis for social research , pages=

Eight myths about causality and structural equation models , author=. Handbook of causal analysis for social research , pages=. 2013 , publisher=

work page 2013

[60] [60]

International journal of computer applications , volume=

Clustering techniques and the similarity measures used in clustering: A survey , author=. International journal of computer applications , volume=. 2016 , publisher=

work page 2016

[61] [61]

and Guo, Zhigao and Liu, Yang and Chobtham, Kiattikun , year =

Kitson, Neville Kenneth and Constantinou, Anthony C. and Guo, Zhigao and Liu, Yang and Chobtham, Kiattikun , year =. A survey of Bayesian Network structure learning , volume =. Artificial Intelligence Review , publisher =. doi:10.1007/s10462-022-10351-w , number =

work page doi:10.1007/s10462-022-10351-w

[62] [62]

Learning from data: Artificial intelligence and statistics V , pages=

Learning Bayesian networks is NP-complete , author=. Learning from data: Artificial intelligence and statistics V , pages=. 1996 , publisher=

work page 1996

[63] [63]

Journal of Machine Learning Research , volume=

Causal discovery from heterogeneous/nonstationary data , author=. Journal of Machine Learning Research , volume=

work page

[64] [64]

Annual Review of Statistics and Its Application , volume=

Structure learning in graphical modeling , author=. Annual Review of Statistics and Its Application , volume=. 2017 , publisher=

work page 2017

[65] [65]

Frontiers in genetics , volume=

Review of causal discovery methods based on graphical models , author=. Frontiers in genetics , volume=. 2019 , publisher=

work page 2019

[66] [66]

ACM computing surveys (CSUR) , volume=

Data clustering: a review , author=. ACM computing surveys (CSUR) , volume=. 1999 , publisher=

work page 1999

[67] [67]

, journal=

Ma, Xiuxiu and Ballal, Tarig and Chen, Hui and Aldayel, Omar and Al-Naffouri, Tareq Y. , journal=. A Maximum-Likelihood TDOA Localization Algorithm Using Difference-of-Convex Programming , year=

work page

[68] [68]

Foundations and Trends in optimization , volume=

Proximal algorithms , author=. Foundations and Trends in optimization , volume=. 2014 , publisher=

work page 2014

[69] [69]

SIAM Journal on scientific computing , volume=

A limited memory algorithm for bound constrained optimization , author=. SIAM Journal on scientific computing , volume=. 1995 , publisher=

work page 1995

[70] [70]

Frontiers in genetics , volume=

Learning subject-specific directed acyclic graphs with mixed effects structural equation models from observational data , author=. Frontiers in genetics , volume=. 2018 , publisher=

work page 2018

[71] [71]

Statistics in medicine , volume=

Bayesian graphical modeling for heterogeneous causal effects , author=. Statistics in medicine , volume=. 2023 , publisher=

work page 2023

[72] [72]

Machine Learning , volume=

Causal discovery with a mixture of DAGs , author=. Machine Learning , volume=. 2023 , publisher=

work page 2023

[73] [73]

Transactions on Machine Learning Research , year=

Separability analysis for causal discovery in mixture of DAGs , author=. Transactions on Machine Learning Research , year=

work page