Score-Based Causal Discovery of Latent Variable Causal Models

Biwei Huang; Haoyue Dai; Ignavier Ng; Kun Zhang; Peter Spirtes; Xinshuai Dong

arxiv: 2605.20396 · v1 · pith:D4TJJR75new · submitted 2026-05-19 · 💻 cs.LG · stat.ML

Score-Based Causal Discovery of Latent Variable Causal Models

Ignavier Ng , Xinshuai Dong , Haoyue Dai , Biwei Huang , Peter Spirtes , Kun Zhang This is my paper

Pith reviewed 2026-05-21 08:00 UTC · model grok-4.3

classification 💻 cs.LG stat.ML

keywords causal discoverylatent variablesscore-based methodsstructure learningcausal graphical modelsidentifiabilitydegrees of freedom

0 comments

The pith

A properly formulated scoring function achieves score equivalence and consistency for structure learning of latent variable causal models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops score-based methods for identifying causal structures that include causally related latent variables. It shows that properly designed scoring functions attain score equivalence and consistency by incorporating a characterization of the degrees of freedom in the marginal distribution over observed variables. This formulation addresses practical drawbacks of constraint-based approaches such as error propagation and testing-order dependency. The work also unifies several existing methods under different structural assumptions through both exact and continuous score variants.

Core claim

We show that a properly formulated scoring function can achieve score equivalence and consistency for structure learning of latent variable causal models. We further provide a characterization of the degrees of freedom for the marginal over the observed variables under multiple structural assumptions considered in the literature, and accordingly develop both exact and continuous score-based methods.

What carries the argument

Scoring function constructed from the degrees of freedom count of the marginal distribution over observed variables under latent variable structural assumptions.

If this is right

Score-based search procedures such as GES become applicable to causal models containing latent variables.
The resulting methods carry identifiability guarantees for structures involving causally related hidden variables.
Constraint-based methods relying on conditional independence or rank tests can be reinterpreted as special cases of this scoring framework.
Both discrete exact search and continuous optimization versions of the score can be used for structure recovery.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The degrees-of-freedom scores could be combined with modern gradient-based optimizers to scale to larger graphs.
Similar marginal-distribution characterizations may extend the approach to time-series or dynamic latent causal models.
The framework suggests a route to hybrid methods that mix score-based search with selected independence tests for efficiency.

Load-bearing premise

The characterization of the degrees of freedom for the marginal distribution over observed variables holds under the multiple structural assumptions considered.

What would settle it

A simulated or real dataset with known ground-truth latent variable causal structure where the score-based method selects an incorrect model when the degrees of freedom count is used.

Figures

Figures reproduced from arXiv: 2605.20396 by Biwei Huang, Haoyue Dai, Ignavier Ng, Kun Zhang, Peter Spirtes, Xinshuai Dong.

**Figure 2.** Figure 2: Example of latent hierarchical structure. [PITH_FULL_IMAGE:figures/full_fig_p007_2.png] view at source ↗

**Figure 3.** Figure 3: Example to illustrate graph operators Oatomic, Omin, and Oskeleton. 21 [PITH_FULL_IMAGE:figures/full_fig_p021_3.png] view at source ↗

**Figure 4.** Figure 4: Ground truths for 1-factor latent variable models. [PITH_FULL_IMAGE:figures/full_fig_p022_4.png] view at source ↗

**Figure 5.** Figure 5: Ground truths for latent hierarchical structures. [PITH_FULL_IMAGE:figures/full_fig_p022_5.png] view at source ↗

read the original abstract

Identifying latent variables and the causal structure involving them is essential across various scientific fields. While many existing works fall under the category of constraint-based methods (with e.g. conditional independence or rank deficiency tests), they may face empirical challenges such as testing-order dependency, error propagation, and choosing an appropriate significance level. These issues can potentially be mitigated by properly designed score-based methods, such as Greedy Equivalence Search (GES) (Chickering, 2002) in the specific setting without latent variables. Yet, formulating score-based methods with latent variables is highly challenging. In this work, we develop score-based methods that are capable of identifying causal structures containing causally-related latent variables with identifiability guarantees. Specifically, we show that a properly formulated scoring function can achieve score equivalence and consistency for structure learning of latent variable causal models. We further provide a characterization of the degrees of freedom for the marginal over the observed variables under multiple structural assumptions considered in the literature, and accordingly develop both exact and continuous score-based methods. This offers a unified view of several existing constraint-based methods with different structural assumptions. Experimental results validate the effectiveness of the proposed methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper extends score-based causal discovery to models with causally related latents via a degrees-of-freedom characterization for the observed marginal, but the whole construction stands or falls on whether that count is accurate.

read the letter

This paper develops score-based methods for causal discovery when latent variables are causally related to each other. The main contribution is a scoring function that aims for score equivalence and consistency, plus a characterization of the degrees of freedom in the marginal distribution over the observed variables under several structural assumptions from the literature. They turn that count into both exact and continuous scores and show how it unifies some existing constraint-based approaches. Experiments suggest the methods are effective and sidestep some practical headaches of conditional independence tests, such as order dependence and error propagation. The soft spot is the degrees-of-freedom count itself. If that characterization does not hold exactly for the linear or nonlinear cases or the different latent-to-observed topologies they consider, the penalty term will be wrong and the consistency claim will not follow. The abstract states the count as given, so the derivations in the full paper need direct verification. This work is for researchers already working on causal discovery algorithms who want score-based alternatives that handle latents. Readers interested in theoretical identifiability or in building more robust structure-learning tools will get the most out of it. It deserves a serious referee to check the proofs and the modeling assumptions.

Referee Report

2 major / 3 minor

Summary. The manuscript develops score-based methods for causal structure learning in latent variable models. It asserts that a properly formulated scoring function achieves score equivalence and consistency by first characterizing the degrees of freedom of the marginal distribution over observed variables under multiple structural assumptions drawn from the literature. Both exact and continuous scores are constructed from this characterization, and the approach is presented as unifying several existing constraint-based methods. Experimental results are reported to support effectiveness.

Significance. If the degrees-of-freedom characterization is correct and yields consistent scores, the work would provide a principled score-based alternative to constraint-based approaches for latent-variable causal discovery, potentially reducing issues such as testing-order dependency and error propagation. The unification perspective across different structural assumptions and the availability of both exact and continuous formulations are constructive contributions.

major comments (2)

[§4, Theorem 1] §4, Theorem 1 (degrees-of-freedom characterization): The marginal DoF count under the considered structural assumptions is used directly to define the penalty term in both the exact score (Eq. (5)) and the continuous score (Eq. (8)). The manuscript states the count but supplies only a high-level derivation sketch without explicit verification against known results for nonlinear or non-Gaussian cases, nor sensitivity analysis when latent-to-observed topologies deviate from the assumed forms. This count is load-bearing for the consistency claim in Theorem 2.
[§5.2] §5.2: The proof of score equivalence and consistency assumes the DoF characterization holds exactly for all model classes considered. No auxiliary result or simulation is given to bound the effect on consistency when the count is approximate (e.g., under mild nonlinearity), which directly affects whether the central claim that “a properly formulated scoring function can achieve … consistency” is established.

minor comments (3)

[Abstract] The abstract refers to “multiple structural assumptions considered in the literature” without naming the specific assumptions or citing the corresponding sections; a brief enumeration would improve readability.
[Figure 2] Figure 2: the y-axis label for the continuous-score optimization trajectory is missing units or scaling information, making direct comparison with the exact-score results difficult.
[§3] Notation for the latent-variable adjacency matrix is introduced in §3 but first used in §4; moving the notation paragraph earlier would reduce forward references.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments. We address the two major concerns point by point below, clarifying the basis of our degrees-of-freedom characterization and outlining planned revisions to strengthen the presentation of consistency results.

read point-by-point responses

Referee: [§4, Theorem 1] §4, Theorem 1 (degrees-of-freedom characterization): The marginal DoF count under the considered structural assumptions is used directly to define the penalty term in both the exact score (Eq. (5)) and the continuous score (Eq. (8)). The manuscript states the count but supplies only a high-level derivation sketch without explicit verification against known results for nonlinear or non-Gaussian cases, nor sensitivity analysis when latent-to-observed topologies deviate from the assumed forms. This count is load-bearing for the consistency claim in Theorem 2.

Authors: We agree that a more detailed derivation would improve clarity. Theorem 1 builds on standard results for linear-Gaussian and certain nonlinear cases from the literature on latent variable models (e.g., rank conditions and independence constraints under the assumed topologies). The high-level sketch condenses these extensions to the multiple structural assumptions considered. In the revision we will expand the appendix with explicit DoF calculations for representative nonlinear and non-Gaussian settings, including direct comparisons to known closed-form expressions. We will also add a short robustness discussion noting that the main consistency claims hold exactly under the stated structural assumptions and degrade gracefully for mild topology deviations; a full sensitivity analysis for arbitrary deviations lies outside the paper's scope but can be noted as future work. revision: yes
Referee: [§5.2] §5.2: The proof of score equivalence and consistency assumes the DoF characterization holds exactly for all model classes considered. No auxiliary result or simulation is given to bound the effect on consistency when the count is approximate (e.g., under mild nonlinearity), which directly affects whether the central claim that “a properly formulated scoring function can achieve … consistency” is established.

Authors: The proof in §5.2 is stated under the exact DoF characterization of Theorem 1. We acknowledge the absence of an auxiliary bound or simulation for approximate counts. In the revised manuscript we will insert a remark after Theorem 2 that quantifies the effect of small perturbations in the penalty term (via a continuity argument on the score) and add a brief simulation experiment in the experiments section that perturbs the DoF count under controlled nonlinearity and reports the resulting structure-recovery rates. These additions will make explicit the conditions under which the consistency claim remains valid. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected in derivation

full rationale

The paper states a theorem that a properly formulated scoring function achieves score equivalence and consistency for latent variable causal models. It separately provides a characterization of degrees of freedom for the marginal distribution under structural assumptions drawn from the literature, then builds exact and continuous scores from that count to set the penalty term. No quoted step reduces the central result to a fitted input, self-citation chain, or definitional equivalence; the consistency claim rests on the independent derivation of the degrees-of-freedom count rather than on renaming or smuggling prior results. The derivation is therefore self-contained against the stated modeling assumptions.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Review performed from abstract alone; no explicit free parameters, axioms, or invented entities are stated in the provided text. The degrees-of-freedom characterization is treated as a derived quantity rather than an ad-hoc postulate.

pith-pipeline@v0.9.0 · 5744 in / 1061 out tokens · 35721 ms · 2026-05-21T08:00:32.501916+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We further provide a characterization of the degrees of freedom for the marginal over the observed variables under multiple structural assumptions... develop both exact and continuous score-based methods.
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

scoredim(G,D) := dim(G) if G can generate S, ∞ otherwise... scoreBIC(G,D) := scoreL(G,D) + (log T / 2) dim(G)

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

90 extracted references · 90 canonical work pages · 4 internal anchors

[1]

Identification of partially observed linear causal models: Graphical conditions for the non-gaussian and heterogeneous cases

Adams, J., Hansen, N., and Zhang, K. Identification of partially observed linear causal models: Graphical conditions for the non-gaussian and heterogeneous cases. Advances in Neural Information Processing Systems, 34: 0 22822--22833, 2021

work page 2021
[2]

Recursive causal structure learning in the presence of latent variables and selection bias

Akbari, S., Mokhtarian, E., Ghassami, A., and Kiyavash, N. Recursive causal structure learning in the presence of latent variables and selection bias. Advances in Neural Information Processing Systems, 34: 0 10119--10130, 2021

work page 2021
[3]

Structure learning for cyclic linear causal models

Amendola, C., Dettling, P., Drton, M., Onori, F., and Wu, J. Structure learning for cyclic linear causal models. In Conference on Uncertainty in Artificial Intelligence, 2020

work page 2020
[4]

Third-order moment varieties of linear non-gaussian graphical models

Am \'e ndola, C., Drton, M., Grosdos, A., Homs, R., and Robeva, E. Third-order moment varieties of linear non-gaussian graphical models. Information and Inference: A Journal of the IMA, 12 0 (3): 0 iaad007, 2023

work page 2023
[5]

Learning linear bayesian networks with latent variables

Anandkumar, A., Hsu, D., Javanmard, A., and Kakade, S. Learning linear bayesian networks with latent variables. In International Conference on Machine Learning, pp.\ 249--257. PMLR, 2013

work page 2013
[6]

and van der Schaar, M

Bellot, A. and van der Schaar, M. Deconfounded score method: Scoring dags with dense unobserved confounding. arXiv preprint arXiv:2103.15106, 2021

work page arXiv 2021
[7]

and Risler, J.-J

Benedetti, R. and Risler, J.-J. Real algebraic and semi-algebraic sets. Actualit \'e s math \'e matiques. Hermann, Paris, 1990

work page 1990
[8]

Ordering-based causal structure learning in the presence of latent variables

Bernstein, D., Saeed, B., Squires, C., and Uhler, C. Ordering-based causal structure learning in the presence of latent variables. In International Conference on Artificial Intelligence and Statistics, pp.\ 4098--4108. PMLR, 2020

work page 2020
[9]

Bertsekas, D. P. Constrained Optimization and Lagrange Multiplier Methods . Academic Press, 1982

work page 1982
[10]

Bertsekas, D. P. Nonlinear Programming. Athena Scientific, 2nd edition, 1999

work page 1999
[11]

Differentiable causal discovery under unmeasured confounding

Bhattacharya, R., Nagarajan, T., Malinsky, D., and Shpitser, I. Differentiable causal discovery under unmeasured confounding. In International Conference on Artificial Intelligence and Statistics, 2021

work page 2021
[12]

Bollen, K. A. The General Model, Part I: Latent Variable and Measurement Models Combined, chapter Eight, pp.\ 319--394. John Wiley & Sons, Ltd, 1989. ISBN 9781118619179

work page 1989
[13]

and Pearl, J

Brito, C. and Pearl, J. A new identification condition for recursive models with correlated errors. Structural Equation Modeling: A Multidisciplinary Journal, 9 0 (4): 0 459--474, 2002. doi:10.1207/S15328007SEM0904\_1

work page doi:10.1207/s15328007sem0904 2002
[14]

Differentiable causal discovery from interventional data

Brouillard, P., Lachapelle, S., Lacoste, A., Lacoste-Julien, S., and Drouin, A. Differentiable causal discovery from interventional data. In Advances in Neural Information Processing Systems, 2020

work page 2020
[15]

H., Lu, P., Nocedal, J., and Zhu, C

Byrd, R. H., Lu, P., Nocedal, J., and Zhu, C. A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing, 16 0 (5): 0 1190--1208, 1995

work page 1995
[16]

Structural Equation Modeling With AMOS: Basic Concepts, Applications, and Programming

Byrne, B. Structural Equation Modeling With AMOS: Basic Concepts, Applications, and Programming. Multivariate Applications Series. Taylor & Francis, 2001

work page 2001
[17]

Triad constraints for learning causal structure of latent variables

Cai, R., Xie, F., Glymour, C., Hao, Z., and Zhang, K. Triad constraints for learning causal structure of latent variables. Advances in neural information processing systems, 32, 2019

work page 2019
[18]

Identification of linear latent variable model with arbitrary distribution

Chen, Z., Xie, F., Qiao, J., Hao, Z., Zhang, K., and Cai, R. Identification of linear latent variable model with arbitrary distribution. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pp.\ 6350--6357, 2022

work page 2022
[19]

Chickering, D. M. Optimal structure identification with greedy search. Journal of Machine Learning Research, 3 0 (Nov): 0 507--554, 2002

work page 2002
[20]

J., Tan, V

Choi, M. J., Tan, V. Y., Anandkumar, A., and Willsky, A. S. Learning latent tree graphical models. Journal of Machine Learning Research, 12: 0 1771--1812, 2011

work page 2011
[21]

and Bucur, I

Claassen, T. and Bucur, I. G. Greedy equivalence search in the presence of latent confounders. In Conference on Uncertainty in Artificial Intelligence, 2022

work page 2022
[22]

Learning Sparse Causal Models is not NP-hard

Claassen, T., Mooij, J., and Heskes, T. Learning sparse causal models is not np-hard. arXiv preprint arXiv:1309.6824, 2013

work page internal anchor Pith review Pith/arXiv arXiv 2013
[23]

H., Kalisch, M., and Richardson, T

Colombo, D., Maathuis, M. H., Kalisch, M., and Richardson, T. S. Learning high-dimensional directed acyclic graphs with latent and selection variables. The Annals of Statistics, pp.\ 294--321, 2012

work page 2012
[24]

A., Little, J., and O'Shea, D

Cox, D. A., Little, J., and O'Shea, D. Ideals, Varieties, and Algorithms. Springer, New York, fourth edition, 2015

work page 2015
[25]

Learning the causal structure of copula models with latent variables

Cui, R., Groot, P., Schauer, M., and Heskes, T. Learning the causal structure of copula models with latent variables. 2018

work page 2018
[26]

Independence testing-based approach to causal discovery under measurement error and linear non-gaussian models

Dai, H., Spirtes, P., and Zhang, K. Independence testing-based approach to causal discovery under measurement error and linear non-gaussian models. Advances in Neural Information Processing Systems, 35: 0 27524--27536, 2022

work page 2022
[27]

P., Laird, N

Dempster, A. P., Laird, N. M., and Rubin, D. B. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B, 39: 0 1--38, 1977

work page 1977
[28]

A versatile causal discovery framework to allow causally-related hidden variables

Dong, X., Huang, B., Ng, I., Song, X., Zheng, Y., Jin, S., Legaspi, R., Spirtes, P., and Zhang, K. A versatile causal discovery framework to allow causally-related hidden variables. arXiv preprint arXiv:2312.11001, 2023

work page arXiv 2023
[29]

Algebraic problems in structural equation modeling

Drton, M. Algebraic problems in structural equation modeling. In Advanced Studies in Pure Mathematics, pp.\ 35--86. Mathematical Society of Japan, 2018

work page 2018
[30]

Algebraic sparse factor analysis

Drton, M., Grosdos, A., Portakal, I., and Sturma, N. Algebraic sparse factor analysis. arXiv preprint arXiv:2312.14762, 2023

work page arXiv 2023
[31]

The frugal inference of causal relations

Forster, M., Raskutti, G., Stern, R., and Weinberger, N. The frugal inference of causal relations. The British Journal for the Philosophy of Science, 69, 04 2017

work page 2017
[32]

E., and Meek, C

Geiger, D., Heckerman, D. E., and Meek, C. Asymptotic model selection for directed networks with hidden variables. In Conference on Uncertainty in Artificial Intelligence, 1996

work page 1996
[33]

Stratified exponential families: Graphical models and model selection

Geiger, D., Heckerman, D., King, H., and Meek, C. Stratified exponential families: Graphical models and model selection. The Annals of Statistics, 29 0 (2): 0 505--529, 2001

work page 2001
[34]

Characterizing distribution equivalence and structure learning for cyclic and acyclic directed graphs

Ghassami, A., Yang, A., Kiyavash, N., and Zhang, K. Characterizing distribution equivalence and structure learning for cyclic and acyclic directed graphs. In International Conference on Machine Learning, 2020

work page 2020
[35]

The development of markers for the big five factor structure

Goldberg, L. The development of markers for the big five factor structure. Psychological Assessment, 4: 0 26--42, 03 1992

work page 1992
[36]

Haughton, D. M. A. On the choice of a model to fit data from an exponential family. The Annals of Statistics, 16 0 (1): 0 342--355, 1988

work page 1988
[37]

A., Buehner, M., Schwaighofer, M., Klapetek, A., and Hilbert, S

Himi, S. A., Buehner, M., Schwaighofer, M., Klapetek, A., and Hilbert, S. Multitasking behavior and its related constructs: Executive functions, working memory capacity, relational integration, and divided attention. Cognition, 189: 0 275--298, 08 2019

work page 2019
[38]

Latent hierarchical causal structure discovery with rank constraints

Huang, B., Low, C., Xie, F., Glymour, C., and Zhang, K. Latent hierarchical causal structure discovery with rank constraints. In Advances in Neural Information Processing Systems, 2022

work page 2022
[39]

Identifiability of latent-variable and structural-equation models: from linear to nonlinear

Hyv \"a rinen, A., Khemakhem, I., and Monti, R. Identifiability of latent-variable and structural-equation models: from linear to nonlinear. Annals of the Institute of Statistical Mathematics, 2023

work page 2023
[40]

Categorical reparameterization with gumbel-softmax

Jang, E., Gu, S., and Poole, B. Categorical reparameterization with gumbel-softmax. In International Conference on Learning Representations, 2017

work page 2017
[41]

and Ba, J

Kingma, D. and Ba, J. Adam: A method for stochastic optimization. In International Conference on Learning Representations, 2014

work page 2014
[42]

Learning latent causal graphs via mixture oracles

Kivva, B., Rajendran, G., Ravikumar, P., and Aragam, B. Learning latent causal graphs via mixture oracles. Advances in Neural Information Processing Systems, 34: 0 18087--18101, 2021

work page 2021
[43]

and Friedman, N

Koller, D. and Friedman, N. Probabilistic Graphical Models: Principles and Techniques. MIT Press, Cambridge, MA, 2009

work page 2009
[44]

and Ramsey, J

Kummerfeld, E. and Ramsey, J. Causal clustering for 1-factor measurement models. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp.\ 1655--1664, 2016

work page 2016
[45]

Identifiability of directed Gaussian graphical models with one latent source

Leung, D., Drton, M., and Hara, H. Identifiability of directed Gaussian graphical models with one latent source. Electronic Journal of Statistics, 10, 05 2015

work page 2015
[46]

J., Mnih, A., and Teh, Y

Maddison, C. J., Mnih, A., and Teh, Y. W. The concrete distribution: A continuous relaxation of discrete random variables. In International Conference on Learning Representations, 2017

work page 2017
[47]

Nandy, P., Hauser, A., and Maathuis, M. H. High-dimensional consistency in score-based and hybrid structure learning. The Annals of Statistics, 46 0 (6A): 0 3151--3183, 2018

work page 2018
[48]

On the role of sparsity and DAG constraints for learning linear DAGs

Ng, I., Ghassami, A., and Zhang, K. On the role of sparsity and DAG constraints for learning linear DAGs . In Advances in Neural Information Processing Systems, 2020

work page 2020
[49]

Masked gradient-based causal structure learning

Ng, I., Zhu, S., Fang, Z., Li, H., Chen, Z., and Wang, J. Masked gradient-based causal structure learning. In SIAM International Conference on Data Mining, 2022

work page 2022
[50]

Structure learning with continuous optimization: A sober look and beyond

Ng, I., Huang, B., and Zhang, K. Structure learning with continuous optimization: A sober look and beyond. In Proceedings of the Third Conference on Causal Learning and Reasoning, 2024

work page 2024
[51]

and Wright, S

Nocedal, J. and Wright, S. J. Numerical optimization. Springer series in operations research and financial engineering. Springer, 2nd edition, 2006

work page 2006
[52]

H., Evans, R

Nowzohour, C., Maathuis, M. H., Evans, R. J., and B \"u hlmann, P. Distributional equivalence and structure learning for bow-free acyclic path diagrams. 2017

work page 2017
[53]

PyTorch : An imperative style, high-performance deep learning library

Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Kopf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., Bai, J., and Chintala, S. PyTorch : An imperative style, high-performance deep learning library. In Advances in Neural Infor...

work page 2019
[54]

Causality

Pearl, J. Causality. Cambridge university press, 2009

work page 2009
[55]

Ramsey, J., Glymour, M., Sanchez-Romero, R., and Glymour, C. A million variables and more: the fast greedy equivalence search algorithm for learning high-dimensional graphical causal models, with an application to functional magnetic resonance images. International Journal of Data Science and Analytics, 3 0 (2): 0 121--129, 2017

work page 2017
[56]

Learning directed acyclic graphs based on sparsest permutations

Raskutti, G. and Uhler, C. Learning directed acyclic graphs based on sparsest permutations. arXiv preprint arXiv:1307.0366v3, 2014

work page internal anchor Pith review Pith/arXiv arXiv 2014
[57]

and Spirtes, P

Richardson, T. and Spirtes, P. Ancestral graph Markov models. The Annals of Statistics, 30 0 (4): 0 962--1030, 2002

work page 2002
[58]

Richardson, T. S. Models of feedback: interpretation and discovery. PhD thesis, Carnegie-Mellon University, 1996

work page 1996
[59]

Learning linear non-gaussian causal models in the presence of latent variables

Salehkaleybar, S., Ghassami, A., Kiyavash, N., and Zhang, K. Learning linear non-gaussian causal models in the presence of latent variables. The Journal of Machine Learning Research, 21 0 (1): 0 1436--1459, 2020

work page 2020
[60]

The TETRAD project: Constraint based aids to causal model specification

Scheines, R., Spirtes, P., Glymour, C., Meek, C., and Richardson, T. The TETRAD project: Constraint based aids to causal model specification. Multivariate Behavioral Research, 33: 0 65--117, 1998

work page 1998
[61]

R., Kalchbrenner, N., Goyal, A., and Bengio, Y

Sch \"o lkopf, B., Locatello, F., Bauer, S., Ke, N. R., Kalchbrenner, N., Goyal, A., and Bengio, Y. Towards causal representation learning. Proceedings of the IEEE, 109 0 (5): 0 612--634, 2021

work page 2021
[62]

Estimating the dimension of a model

Schwarz, G. Estimating the dimension of a model. The Annals of Statistics, 6 0 (2): 0 461--464, 1978

work page 1978
[63]

and Chechik, M

Shahin, R. and Chechik, M. Automatic and efficient variability-aware lifting of functional programs. Proceedings of the ACM on Programming Languages, 4 0 (OOPSLA): 0 1--27, 2020

work page 2020
[64]

O., and Hyv \"a rinen, A

Shimizu, S., Hoyer, P. O., and Hyv \"a rinen, A. Estimation of linear non-gaussian acyclic models for latent factors. Neurocomputing, 72 0 (7-9): 0 2024--2027, 2009

work page 2024
[65]

Parameter and Structure Learning in Nested Markov Models

Shpitser, I., Richardson, T. S., Robins, J. M., and Evans, R. Parameter and structure learning in nested Markov models. arXiv preprint arXiv:1207.5058, 2012

work page internal anchor Pith review Pith/arXiv arXiv 2012
[66]

and Scheines, R

Silva, R. and Scheines, R. Generalized measurement models. Technical report, Carnegie-Mellon Univ Pittsburgh PA School of Computer Science, 2005

work page 2005
[67]

Learning measurement models for unobserved variables

Silva, R., Scheines, R., Glymour, C., and Spirtes, P. Learning measurement models for unobserved variables. In Conference on Uncertainty in Artificial Intelligence, 2003

work page 2003
[68]

Learning the structure of linear latent variable models

Silva, R., Scheines, R., Glymour, C., and Spirtes, P. Learning the structure of linear latent variable models. Journal of Machine Learning Research, 7 0 (8): 0 191--246, 2006. URL http://jmlr.org/papers/v7/silva06a.html

work page 2006
[69]

Singh, A. P. and Moore, A. W. Finding optimal Bayesian networks by dynamic programming. Technical report, Carnegie Mellon University, 2005

work page 2005
[70]

Introduction to causal inference

Spirtes, P. Introduction to causal inference. Journal of Machine Learning Research, 11 0 (5), 2010

work page 2010
[71]

and Glymour, C

Spirtes, P. and Glymour, C. An algorithm for fast recovery of sparse causal graphs. Social Science Computer Review, 9: 0 62--72, 1991

work page 1991
[72]

Causation, Prediction, and Search

Spirtes, P., Glymour, C., and Scheines, R. Causation, Prediction, and Search. MIT press, 2nd edition, 2001

work page 2001
[73]

Causal Inference in the Presence of Latent Variables and Selection Bias

Spirtes, P. L., Meek, C., and Richardson, T. S. Causal inference in the presence of latent variables and selection bias. arXiv preprint arXiv:1302.4983, 2013

work page internal anchor Pith review Pith/arXiv arXiv 2013
[74]

Linear causal disentanglement via interventions

Squires, C., Seigal, A., Bhate, S., and Uhler, C. Linear causal disentanglement via interventions. In International Conference on Machine Learning, 2023

work page 2023
[75]

Unpaired multi-domain causal representation learning

Sturma, N., Squires, C., Drton, M., and Uhler, C. Unpaired multi-domain causal representation learning. arXiv preprint arXiv:2302.00993, 2023

work page arXiv 2023
[76]

Trek separation for gaussian graphical models

Sullivant, S., Talaska, K., and Draisma, J. Trek separation for gaussian graphical models. The Annals of Statistics, 38 0 (3): 0 1665--1685, 2010

work page 2010
[77]

and Tsamardinos, I

Triantafillou, S. and Tsamardinos, I. Score-based vs constraint-based causal learning in the presence of confounders. In Cfa@ uai, pp.\ 59--67, 2016

work page 2016
[78]

and Mooij, J

van Ommen, T. and Mooij, J. M. Algebraic equivalence of linear structural equation models. In Conference on Uncertainty in Artificial Intelligence, 2017

work page 2017
[79]

and Pearl, J

Verma, T. and Pearl, J. Equivalence and synthesis of causal models. In Conference on Uncertainty in Artificial Intelligence, 1991

work page 1991
[80]

E., Haberland , M., Reddy , T., Cournapeau , D., Burovski , E., Peterson , P., Weckesser , W., Bright , J., van der Walt , S

Virtanen , P., Gommers , R., Oliphant , T. E., Haberland , M., Reddy , T., Cournapeau , D., Burovski , E., Peterson , P., Weckesser , W., Bright , J., van der Walt , S. J., Brett , M., Wilson , J., Jarrod Millman , K., Mayorov , N., Nelson , A. R. J., Jones , E., Kern , R., Larson , E., Carey , C., Polat , \.I ., Feng , Y., Moore , E. W., Vand erPlas , J....

work page 2020

Showing first 80 references.

[1] [1]

Identification of partially observed linear causal models: Graphical conditions for the non-gaussian and heterogeneous cases

Adams, J., Hansen, N., and Zhang, K. Identification of partially observed linear causal models: Graphical conditions for the non-gaussian and heterogeneous cases. Advances in Neural Information Processing Systems, 34: 0 22822--22833, 2021

work page 2021

[2] [2]

Recursive causal structure learning in the presence of latent variables and selection bias

Akbari, S., Mokhtarian, E., Ghassami, A., and Kiyavash, N. Recursive causal structure learning in the presence of latent variables and selection bias. Advances in Neural Information Processing Systems, 34: 0 10119--10130, 2021

work page 2021

[3] [3]

Structure learning for cyclic linear causal models

Amendola, C., Dettling, P., Drton, M., Onori, F., and Wu, J. Structure learning for cyclic linear causal models. In Conference on Uncertainty in Artificial Intelligence, 2020

work page 2020

[4] [4]

Third-order moment varieties of linear non-gaussian graphical models

Am \'e ndola, C., Drton, M., Grosdos, A., Homs, R., and Robeva, E. Third-order moment varieties of linear non-gaussian graphical models. Information and Inference: A Journal of the IMA, 12 0 (3): 0 iaad007, 2023

work page 2023

[5] [5]

Learning linear bayesian networks with latent variables

Anandkumar, A., Hsu, D., Javanmard, A., and Kakade, S. Learning linear bayesian networks with latent variables. In International Conference on Machine Learning, pp.\ 249--257. PMLR, 2013

work page 2013

[6] [6]

and van der Schaar, M

Bellot, A. and van der Schaar, M. Deconfounded score method: Scoring dags with dense unobserved confounding. arXiv preprint arXiv:2103.15106, 2021

work page arXiv 2021

[7] [7]

and Risler, J.-J

Benedetti, R. and Risler, J.-J. Real algebraic and semi-algebraic sets. Actualit \'e s math \'e matiques. Hermann, Paris, 1990

work page 1990

[8] [8]

Ordering-based causal structure learning in the presence of latent variables

Bernstein, D., Saeed, B., Squires, C., and Uhler, C. Ordering-based causal structure learning in the presence of latent variables. In International Conference on Artificial Intelligence and Statistics, pp.\ 4098--4108. PMLR, 2020

work page 2020

[9] [9]

Bertsekas, D. P. Constrained Optimization and Lagrange Multiplier Methods . Academic Press, 1982

work page 1982

[10] [10]

Bertsekas, D. P. Nonlinear Programming. Athena Scientific, 2nd edition, 1999

work page 1999

[11] [11]

Differentiable causal discovery under unmeasured confounding

Bhattacharya, R., Nagarajan, T., Malinsky, D., and Shpitser, I. Differentiable causal discovery under unmeasured confounding. In International Conference on Artificial Intelligence and Statistics, 2021

work page 2021

[12] [12]

Bollen, K. A. The General Model, Part I: Latent Variable and Measurement Models Combined, chapter Eight, pp.\ 319--394. John Wiley & Sons, Ltd, 1989. ISBN 9781118619179

work page 1989

[13] [13]

and Pearl, J

Brito, C. and Pearl, J. A new identification condition for recursive models with correlated errors. Structural Equation Modeling: A Multidisciplinary Journal, 9 0 (4): 0 459--474, 2002. doi:10.1207/S15328007SEM0904\_1

work page doi:10.1207/s15328007sem0904 2002

[14] [14]

Differentiable causal discovery from interventional data

Brouillard, P., Lachapelle, S., Lacoste, A., Lacoste-Julien, S., and Drouin, A. Differentiable causal discovery from interventional data. In Advances in Neural Information Processing Systems, 2020

work page 2020

[15] [15]

H., Lu, P., Nocedal, J., and Zhu, C

Byrd, R. H., Lu, P., Nocedal, J., and Zhu, C. A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing, 16 0 (5): 0 1190--1208, 1995

work page 1995

[16] [16]

Structural Equation Modeling With AMOS: Basic Concepts, Applications, and Programming

Byrne, B. Structural Equation Modeling With AMOS: Basic Concepts, Applications, and Programming. Multivariate Applications Series. Taylor & Francis, 2001

work page 2001

[17] [17]

Triad constraints for learning causal structure of latent variables

Cai, R., Xie, F., Glymour, C., Hao, Z., and Zhang, K. Triad constraints for learning causal structure of latent variables. Advances in neural information processing systems, 32, 2019

work page 2019

[18] [18]

Identification of linear latent variable model with arbitrary distribution

Chen, Z., Xie, F., Qiao, J., Hao, Z., Zhang, K., and Cai, R. Identification of linear latent variable model with arbitrary distribution. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pp.\ 6350--6357, 2022

work page 2022

[19] [19]

Chickering, D. M. Optimal structure identification with greedy search. Journal of Machine Learning Research, 3 0 (Nov): 0 507--554, 2002

work page 2002

[20] [20]

J., Tan, V

Choi, M. J., Tan, V. Y., Anandkumar, A., and Willsky, A. S. Learning latent tree graphical models. Journal of Machine Learning Research, 12: 0 1771--1812, 2011

work page 2011

[21] [21]

and Bucur, I

Claassen, T. and Bucur, I. G. Greedy equivalence search in the presence of latent confounders. In Conference on Uncertainty in Artificial Intelligence, 2022

work page 2022

[22] [22]

Learning Sparse Causal Models is not NP-hard

Claassen, T., Mooij, J., and Heskes, T. Learning sparse causal models is not np-hard. arXiv preprint arXiv:1309.6824, 2013

work page internal anchor Pith review Pith/arXiv arXiv 2013

[23] [23]

H., Kalisch, M., and Richardson, T

Colombo, D., Maathuis, M. H., Kalisch, M., and Richardson, T. S. Learning high-dimensional directed acyclic graphs with latent and selection variables. The Annals of Statistics, pp.\ 294--321, 2012

work page 2012

[24] [24]

A., Little, J., and O'Shea, D

Cox, D. A., Little, J., and O'Shea, D. Ideals, Varieties, and Algorithms. Springer, New York, fourth edition, 2015

work page 2015

[25] [25]

Learning the causal structure of copula models with latent variables

Cui, R., Groot, P., Schauer, M., and Heskes, T. Learning the causal structure of copula models with latent variables. 2018

work page 2018

[26] [26]

Independence testing-based approach to causal discovery under measurement error and linear non-gaussian models

Dai, H., Spirtes, P., and Zhang, K. Independence testing-based approach to causal discovery under measurement error and linear non-gaussian models. Advances in Neural Information Processing Systems, 35: 0 27524--27536, 2022

work page 2022

[27] [27]

P., Laird, N

Dempster, A. P., Laird, N. M., and Rubin, D. B. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B, 39: 0 1--38, 1977

work page 1977

[28] [28]

A versatile causal discovery framework to allow causally-related hidden variables

Dong, X., Huang, B., Ng, I., Song, X., Zheng, Y., Jin, S., Legaspi, R., Spirtes, P., and Zhang, K. A versatile causal discovery framework to allow causally-related hidden variables. arXiv preprint arXiv:2312.11001, 2023

work page arXiv 2023

[29] [29]

Algebraic problems in structural equation modeling

Drton, M. Algebraic problems in structural equation modeling. In Advanced Studies in Pure Mathematics, pp.\ 35--86. Mathematical Society of Japan, 2018

work page 2018

[30] [30]

Algebraic sparse factor analysis

Drton, M., Grosdos, A., Portakal, I., and Sturma, N. Algebraic sparse factor analysis. arXiv preprint arXiv:2312.14762, 2023

work page arXiv 2023

[31] [31]

The frugal inference of causal relations

Forster, M., Raskutti, G., Stern, R., and Weinberger, N. The frugal inference of causal relations. The British Journal for the Philosophy of Science, 69, 04 2017

work page 2017

[32] [32]

E., and Meek, C

Geiger, D., Heckerman, D. E., and Meek, C. Asymptotic model selection for directed networks with hidden variables. In Conference on Uncertainty in Artificial Intelligence, 1996

work page 1996

[33] [33]

Stratified exponential families: Graphical models and model selection

Geiger, D., Heckerman, D., King, H., and Meek, C. Stratified exponential families: Graphical models and model selection. The Annals of Statistics, 29 0 (2): 0 505--529, 2001

work page 2001

[34] [34]

Characterizing distribution equivalence and structure learning for cyclic and acyclic directed graphs

Ghassami, A., Yang, A., Kiyavash, N., and Zhang, K. Characterizing distribution equivalence and structure learning for cyclic and acyclic directed graphs. In International Conference on Machine Learning, 2020

work page 2020

[35] [35]

The development of markers for the big five factor structure

Goldberg, L. The development of markers for the big five factor structure. Psychological Assessment, 4: 0 26--42, 03 1992

work page 1992

[36] [36]

Haughton, D. M. A. On the choice of a model to fit data from an exponential family. The Annals of Statistics, 16 0 (1): 0 342--355, 1988

work page 1988

[37] [37]

A., Buehner, M., Schwaighofer, M., Klapetek, A., and Hilbert, S

Himi, S. A., Buehner, M., Schwaighofer, M., Klapetek, A., and Hilbert, S. Multitasking behavior and its related constructs: Executive functions, working memory capacity, relational integration, and divided attention. Cognition, 189: 0 275--298, 08 2019

work page 2019

[38] [38]

Latent hierarchical causal structure discovery with rank constraints

Huang, B., Low, C., Xie, F., Glymour, C., and Zhang, K. Latent hierarchical causal structure discovery with rank constraints. In Advances in Neural Information Processing Systems, 2022

work page 2022

[39] [39]

Identifiability of latent-variable and structural-equation models: from linear to nonlinear

Hyv \"a rinen, A., Khemakhem, I., and Monti, R. Identifiability of latent-variable and structural-equation models: from linear to nonlinear. Annals of the Institute of Statistical Mathematics, 2023

work page 2023

[40] [40]

Categorical reparameterization with gumbel-softmax

Jang, E., Gu, S., and Poole, B. Categorical reparameterization with gumbel-softmax. In International Conference on Learning Representations, 2017

work page 2017

[41] [41]

and Ba, J

Kingma, D. and Ba, J. Adam: A method for stochastic optimization. In International Conference on Learning Representations, 2014

work page 2014

[42] [42]

Learning latent causal graphs via mixture oracles

Kivva, B., Rajendran, G., Ravikumar, P., and Aragam, B. Learning latent causal graphs via mixture oracles. Advances in Neural Information Processing Systems, 34: 0 18087--18101, 2021

work page 2021

[43] [43]

and Friedman, N

Koller, D. and Friedman, N. Probabilistic Graphical Models: Principles and Techniques. MIT Press, Cambridge, MA, 2009

work page 2009

[44] [44]

and Ramsey, J

Kummerfeld, E. and Ramsey, J. Causal clustering for 1-factor measurement models. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp.\ 1655--1664, 2016

work page 2016

[45] [45]

Identifiability of directed Gaussian graphical models with one latent source

Leung, D., Drton, M., and Hara, H. Identifiability of directed Gaussian graphical models with one latent source. Electronic Journal of Statistics, 10, 05 2015

work page 2015

[46] [46]

J., Mnih, A., and Teh, Y

Maddison, C. J., Mnih, A., and Teh, Y. W. The concrete distribution: A continuous relaxation of discrete random variables. In International Conference on Learning Representations, 2017

work page 2017

[47] [47]

Nandy, P., Hauser, A., and Maathuis, M. H. High-dimensional consistency in score-based and hybrid structure learning. The Annals of Statistics, 46 0 (6A): 0 3151--3183, 2018

work page 2018

[48] [48]

On the role of sparsity and DAG constraints for learning linear DAGs

Ng, I., Ghassami, A., and Zhang, K. On the role of sparsity and DAG constraints for learning linear DAGs . In Advances in Neural Information Processing Systems, 2020

work page 2020

[49] [49]

Masked gradient-based causal structure learning

Ng, I., Zhu, S., Fang, Z., Li, H., Chen, Z., and Wang, J. Masked gradient-based causal structure learning. In SIAM International Conference on Data Mining, 2022

work page 2022

[50] [50]

Structure learning with continuous optimization: A sober look and beyond

Ng, I., Huang, B., and Zhang, K. Structure learning with continuous optimization: A sober look and beyond. In Proceedings of the Third Conference on Causal Learning and Reasoning, 2024

work page 2024

[51] [51]

and Wright, S

Nocedal, J. and Wright, S. J. Numerical optimization. Springer series in operations research and financial engineering. Springer, 2nd edition, 2006

work page 2006

[52] [52]

H., Evans, R

Nowzohour, C., Maathuis, M. H., Evans, R. J., and B \"u hlmann, P. Distributional equivalence and structure learning for bow-free acyclic path diagrams. 2017

work page 2017

[53] [53]

PyTorch : An imperative style, high-performance deep learning library

Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Kopf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., Bai, J., and Chintala, S. PyTorch : An imperative style, high-performance deep learning library. In Advances in Neural Infor...

work page 2019

[54] [54]

Causality

Pearl, J. Causality. Cambridge university press, 2009

work page 2009

[55] [55]

Ramsey, J., Glymour, M., Sanchez-Romero, R., and Glymour, C. A million variables and more: the fast greedy equivalence search algorithm for learning high-dimensional graphical causal models, with an application to functional magnetic resonance images. International Journal of Data Science and Analytics, 3 0 (2): 0 121--129, 2017

work page 2017

[56] [56]

Learning directed acyclic graphs based on sparsest permutations

Raskutti, G. and Uhler, C. Learning directed acyclic graphs based on sparsest permutations. arXiv preprint arXiv:1307.0366v3, 2014

work page internal anchor Pith review Pith/arXiv arXiv 2014

[57] [57]

and Spirtes, P

Richardson, T. and Spirtes, P. Ancestral graph Markov models. The Annals of Statistics, 30 0 (4): 0 962--1030, 2002

work page 2002

[58] [58]

Richardson, T. S. Models of feedback: interpretation and discovery. PhD thesis, Carnegie-Mellon University, 1996

work page 1996

[59] [59]

Learning linear non-gaussian causal models in the presence of latent variables

Salehkaleybar, S., Ghassami, A., Kiyavash, N., and Zhang, K. Learning linear non-gaussian causal models in the presence of latent variables. The Journal of Machine Learning Research, 21 0 (1): 0 1436--1459, 2020

work page 2020

[60] [60]

The TETRAD project: Constraint based aids to causal model specification

Scheines, R., Spirtes, P., Glymour, C., Meek, C., and Richardson, T. The TETRAD project: Constraint based aids to causal model specification. Multivariate Behavioral Research, 33: 0 65--117, 1998

work page 1998

[61] [61]

R., Kalchbrenner, N., Goyal, A., and Bengio, Y

Sch \"o lkopf, B., Locatello, F., Bauer, S., Ke, N. R., Kalchbrenner, N., Goyal, A., and Bengio, Y. Towards causal representation learning. Proceedings of the IEEE, 109 0 (5): 0 612--634, 2021

work page 2021

[62] [62]

Estimating the dimension of a model

Schwarz, G. Estimating the dimension of a model. The Annals of Statistics, 6 0 (2): 0 461--464, 1978

work page 1978

[63] [63]

and Chechik, M

Shahin, R. and Chechik, M. Automatic and efficient variability-aware lifting of functional programs. Proceedings of the ACM on Programming Languages, 4 0 (OOPSLA): 0 1--27, 2020

work page 2020

[64] [64]

O., and Hyv \"a rinen, A

Shimizu, S., Hoyer, P. O., and Hyv \"a rinen, A. Estimation of linear non-gaussian acyclic models for latent factors. Neurocomputing, 72 0 (7-9): 0 2024--2027, 2009

work page 2024

[65] [65]

Parameter and Structure Learning in Nested Markov Models

Shpitser, I., Richardson, T. S., Robins, J. M., and Evans, R. Parameter and structure learning in nested Markov models. arXiv preprint arXiv:1207.5058, 2012

work page internal anchor Pith review Pith/arXiv arXiv 2012

[66] [66]

and Scheines, R

Silva, R. and Scheines, R. Generalized measurement models. Technical report, Carnegie-Mellon Univ Pittsburgh PA School of Computer Science, 2005

work page 2005

[67] [67]

Learning measurement models for unobserved variables

Silva, R., Scheines, R., Glymour, C., and Spirtes, P. Learning measurement models for unobserved variables. In Conference on Uncertainty in Artificial Intelligence, 2003

work page 2003

[68] [68]

Learning the structure of linear latent variable models

Silva, R., Scheines, R., Glymour, C., and Spirtes, P. Learning the structure of linear latent variable models. Journal of Machine Learning Research, 7 0 (8): 0 191--246, 2006. URL http://jmlr.org/papers/v7/silva06a.html

work page 2006

[69] [69]

Singh, A. P. and Moore, A. W. Finding optimal Bayesian networks by dynamic programming. Technical report, Carnegie Mellon University, 2005

work page 2005

[70] [70]

Introduction to causal inference

Spirtes, P. Introduction to causal inference. Journal of Machine Learning Research, 11 0 (5), 2010

work page 2010

[71] [71]

and Glymour, C

Spirtes, P. and Glymour, C. An algorithm for fast recovery of sparse causal graphs. Social Science Computer Review, 9: 0 62--72, 1991

work page 1991

[72] [72]

Causation, Prediction, and Search

Spirtes, P., Glymour, C., and Scheines, R. Causation, Prediction, and Search. MIT press, 2nd edition, 2001

work page 2001

[73] [73]

Causal Inference in the Presence of Latent Variables and Selection Bias

Spirtes, P. L., Meek, C., and Richardson, T. S. Causal inference in the presence of latent variables and selection bias. arXiv preprint arXiv:1302.4983, 2013

work page internal anchor Pith review Pith/arXiv arXiv 2013

[74] [74]

Linear causal disentanglement via interventions

Squires, C., Seigal, A., Bhate, S., and Uhler, C. Linear causal disentanglement via interventions. In International Conference on Machine Learning, 2023

work page 2023

[75] [75]

Unpaired multi-domain causal representation learning

Sturma, N., Squires, C., Drton, M., and Uhler, C. Unpaired multi-domain causal representation learning. arXiv preprint arXiv:2302.00993, 2023

work page arXiv 2023

[76] [76]

Trek separation for gaussian graphical models

Sullivant, S., Talaska, K., and Draisma, J. Trek separation for gaussian graphical models. The Annals of Statistics, 38 0 (3): 0 1665--1685, 2010

work page 2010

[77] [77]

and Tsamardinos, I

Triantafillou, S. and Tsamardinos, I. Score-based vs constraint-based causal learning in the presence of confounders. In Cfa@ uai, pp.\ 59--67, 2016

work page 2016

[78] [78]

and Mooij, J

van Ommen, T. and Mooij, J. M. Algebraic equivalence of linear structural equation models. In Conference on Uncertainty in Artificial Intelligence, 2017

work page 2017

[79] [79]

and Pearl, J

Verma, T. and Pearl, J. Equivalence and synthesis of causal models. In Conference on Uncertainty in Artificial Intelligence, 1991

work page 1991

[80] [80]

E., Haberland , M., Reddy , T., Cournapeau , D., Burovski , E., Peterson , P., Weckesser , W., Bright , J., van der Walt , S

Virtanen , P., Gommers , R., Oliphant , T. E., Haberland , M., Reddy , T., Cournapeau , D., Burovski , E., Peterson , P., Weckesser , W., Bright , J., van der Walt , S. J., Brett , M., Wilson , J., Jarrod Millman , K., Mayorov , N., Nelson , A. R. J., Jones , E., Kern , R., Larson , E., Carey , C., Polat , \.I ., Feng , Y., Moore , E. W., Vand erPlas , J....

work page 2020