Causal Structure Learning: a Bayesian approach based on random graphs

Hugo J. Escalante Balderas; Ivan R. Feliciano-Avelino; L. Enrique Sucar; Mauricio Gonzalez-Soto

arxiv: 2010.06164 · v1 · submitted 2020-10-13 · 💻 cs.AI

Pith reviewed 2026-05-24 14:25 UTC · model grok-4.3

classification 💻 cs.AI

keywords: causal structure learning · Bayesian inference · random graphs · causal discovery · machine learning

The pith

Bayesian updating with random graph priors recovers causal structures from observed interactions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes a Bayesian method for learning causal structures by representing uncertainty with random graphs. The approach updates beliefs about causal relationships through interactions with the environment. Experiments confirm that the method learns the causal structure in two different scenarios. In the first scenario, it also identifies the optimal action, while the second shows it works for tasks of different sizes and structures.

Core claim

The paper claims that a random graph prior over possible causal graphs, when updated by Bayes' rule on data from interactions, allows recovery of the true causal structure of the environment.

What carries the argument

Random graphs as priors for causal structures with Bayesian updating on interaction observations.
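Since the reviewed text gives no equations, the following is only a minimal sketch of one way such a scheme could look, not the authors' method. It assumes an independent-edge random graph in which each directed edge i -> j carries a Beta belief that is updated from a binary signal. The function names, the Beta(1, 1) prior, and the idea of a per-edge "supported" signal are all hypothetical choices made for illustration.

```python
import numpy as np

def init_beliefs(n_vars, a0=1.0, b0=1.0):
    """Beta(a0, b0) pseudo-counts for every ordered pair (i, j). Hypothetical prior."""
    alpha = np.full((n_vars, n_vars), a0)
    beta = np.full((n_vars, n_vars), b0)
    return alpha, beta

def update(alpha, beta, i, j, edge_supported):
    """Conjugate Beta-Bernoulli update for edge i -> j from one binary signal.

    What counts as 'support' for an edge is an assumption of this sketch;
    the reviewed text does not specify the likelihood.
    """
    if edge_supported:
        alpha[i, j] += 1.0
    else:
        beta[i, j] += 1.0

def edge_probabilities(alpha, beta):
    """Posterior mean p_ij = alpha / (alpha + beta); self-loops are set to 0."""
    p = alpha / (alpha + beta)
    np.fill_diagonal(p, 0.0)
    return p

# Three variables; edge 0 -> 1 receives three supporting signals and one against.
alpha, beta = init_beliefs(3)
for supported in [True, True, True, False]:
    update(alpha, beta, 0, 1, supported)

p = edge_probabilities(alpha, beta)
print(p[0, 1])  # (1 + 3) / (1 + 3 + 1 + 1) = 4/6
print(p[1, 0])  # untouched edge stays at the prior mean 0.5
```

Treating edges as independent keeps the update trivial, but it does not enforce acyclicity, which is one of the gaps the referee report raises.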

If this is right

The method learns both the causal structure and the optimal action in the tested environment.
It successfully handles causal structures of varying sizes and complexities.
Bayesian updating on the random graph model captures uncertainty in causal relationships.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach could be integrated with reinforcement learning agents to simultaneously learn structure and policy.
Applying the method to real-world datasets with known or partially known causal structures would test its robustness beyond simulations.
Extensions to continuous variables or nonlinear relationships might require adjustments to the random graph model.

Load-bearing premise

The random graph prior and the likelihood model based on interactions correctly represent the uncertainty and allow recovery of the true causal structure through Bayesian updating.

What would settle it

If the posterior distribution over graphs does not concentrate on the true causal graph after sufficient interactions in a controlled setting with known ground truth, the claim would be falsified.
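The check described above can be sketched as follows, under a strong simplifying assumption: the posterior is summarised by independent edge probabilities p_ij, so the mass on a graph is the product of p_ij over its edges and (1 - p_ij) over its non-edges. This factorisation, the function name, and the example belief matrices are illustrative assumptions, not taken from the paper, and the product also assigns mass to cyclic graphs.

```python
import numpy as np

def log_mass_on_graph(p, adj, eps=1e-12):
    """Log-probability of adjacency matrix `adj` under independent edge beliefs `p`.

    Diagonal entries (self-loops) are ignored. Probabilities are clipped
    away from 0 and 1 so the logarithms stay finite.
    """
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    adj = np.asarray(adj, dtype=bool)
    off_diag = ~np.eye(adj.shape[0], dtype=bool)
    terms = np.where(adj, np.log(p), np.log1p(-p))
    return float(terms[off_diag].sum())

# Ground truth: the chain 0 -> 1 -> 2.
true_adj = np.array([[0, 1, 0],
                     [0, 0, 1],
                     [0, 0, 0]])

early = np.full((3, 3), 0.5)                # uninformative beliefs
late = np.where(true_adj == 1, 0.95, 0.05)  # beliefs after hypothetical learning

mass_early = np.exp(log_mass_on_graph(early, true_adj))
mass_late = np.exp(log_mass_on_graph(late, true_adj))
print(mass_early, mass_late)
```

Plotting this quantity against the number of interactions would show whether the mass on the ground-truth graph approaches 1 or stalls.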

Figures

Figures reproduced from arXiv: 2010.06164 by Hugo J. Escalante Balderas, Ivan R. Feliciano-Avelino, L. Enrique Sucar, Mauricio Gonzalez-Soto.

**Figure 2.** Average value and standard deviation per round of each belief p_ij over 10 runs for each action policy. It is easy to observe that the relation between Treatment and Reaction is the easiest to learn for the three policies. Also, we can see that the beliefs about Treatment-Lives and Disease-Reaction remain very similar in all policies. However, there is a different behavior for Reaction-Lives and…

**Figure 3.** Evaluation metrics per interaction round over…

**Figure 4.** The agent starts with an initial configuration of the lights. It aims to reach the goal state of…

**Figure 5.** Examples of the 3 types of latent causal structures on the environment.

**Figure 6.** Average value and standard deviation per round of each metric over…

**Figure 7.** Example of heatmaps showing the changes over time of the beliefs against the ground truth,…

Original abstract

A Random Graph is a random object which take its values in the space of graphs. We take advantage of the expressibility of graphs in order to model the uncertainty about the existence of causal relationships within a given set of variables. We adopt a Bayesian point of view in order to capture a causal structure via interaction and learning with a causal environment. We test our method over two different scenarios, and the experiments mainly confirm that our technique can learn a causal structure. Furthermore, the experiments and results presented for the first test scenario demonstrate the usefulness of our method to learn a causal structure as well as the optimal action. On the other hand the second experiment, shows that our proposal manages to learn the underlying causal structure of several tasks with different sizes and different causal structures.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper sketches a Bayesian random-graph prior for causal discovery but supplies no equations, derivations, or concrete results, leaving the core claim uncheckable.

The central modeling move is to put a prior over random graphs to capture uncertainty in causal edges and update it from observed interactions. That is the main new element they present, along with tests on two synthetic scenarios where the method appears to recover structures and sometimes an optimal action. The second scenario also checks scaling across different graph sizes. Those are the concrete pieces they contribute.

The approach has the virtue of tying causal learning directly to Bayesian updating on graphs, which could be a clean way to handle uncertainty if the details work out. The experiments are described as confirming the idea, at least at a high level.

The soft spots are substantial and central. No equations appear for the prior, the likelihood from interactions, or the posterior computation. There is no discussion of how acyclicity is enforced or how the space of graphs is handled. The experiments give no numbers, baselines, sample sizes, or diagnostics on whether posterior mass actually concentrates on the true DAG. The stress-test concern about the prior correctly encoding causal uncertainty and the lack of consistency checks therefore stands; nothing in the description addresses it.

This version is too preliminary to be useful to most readers working on causal discovery. It does not show enough technical substance or reproducible evidence to merit sending out for serious refereeing.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes a Bayesian approach to causal structure learning that places a prior over random graphs to represent uncertainty about the existence of causal relationships among a set of variables. Beliefs are updated via observed interactions with a causal environment. The method is tested on two synthetic scenarios; the authors claim that experiments confirm the technique learns causal structures and, in the first scenario, also identifies the optimal action, while the second shows recovery of underlying structures across tasks of varying sizes.

Significance. If the modeling and inference steps are valid, the random-graph prior could offer a flexible way to encode uncertainty over causal graphs in interactive settings, potentially bridging causal discovery with sequential decision-making. The empirical demonstration on multiple tasks is a positive feature. However, the absence of any formal specification, consistency analysis, or quantitative evaluation means the work does not yet establish a clear advance over existing Bayesian causal discovery methods.

major comments (3)

[Abstract and method description] No equations, formal definition of the random-graph prior (including acyclicity handling), likelihood model, or posterior computation procedure appear in the manuscript. Without these, the central claim that Bayesian updating on interactions recovers the true causal structure cannot be verified and remains an untested modeling assumption.
[Experiments section] Experiments are reported only qualitatively ('mainly confirm', 'demonstrate the usefulness'). No quantitative metrics (e.g., structural Hamming distance, posterior probability on the true DAG), sample-size scaling, error bars, or baseline comparisons are provided, so the empirical support for both scenarios is not load-bearing.
[Theoretical justification (implicit throughout)] No consistency result, posterior-concentration argument, or prior-misspecification analysis is given. These are required to substantiate that the random-graph prior plus interaction likelihood yields recovery of the data-generating structure as the number of observations grows.

minor comments (2)

[Abstract] Grammatical error in abstract: 'which take its values' should read 'which takes its values'.
[Abstract] The description of the second experiment is vague on task sizes, number of variables, and how 'different causal structures' were generated or evaluated.
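
The second major comment names structural Hamming distance (SHD) as one metric the experiments could report. A minimal sketch of that metric follows, using the common convention in which a missing, extra, or reversed edge each counts once per unordered pair of variables. The function name and the example graphs are illustrative; they do not come from the manuscript.

```python
import numpy as np

def structural_hamming_distance(true_adj, est_adj):
    """Number of unordered variable pairs whose edge status differs.

    adj[i, j] == 1 means a directed edge i -> j. A reversed edge differs at
    both (i, j) and (j, i) but is counted once, because the comparison is
    made per unordered pair.
    """
    t = np.asarray(true_adj, dtype=bool)
    e = np.asarray(est_adj, dtype=bool)
    if t.shape != e.shape or t.ndim != 2 or t.shape[0] != t.shape[1]:
        raise ValueError("adjacency matrices must be square and equally sized")
    differs = t != e
    pair_differs = differs | differs.T
    return int(np.triu(pair_differs, k=1).sum())

# Ground truth: the chain 0 -> 1 -> 2.
true_adj = np.array([[0, 1, 0],
                     [0, 0, 1],
                     [0, 0, 0]])

# Estimate: 0 -> 1 is reversed, 1 -> 2 is correct, 0 -> 2 is spurious.
est_adj = np.array([[0, 0, 1],
                    [1, 0, 1],
                    [0, 0, 0]])

shd = structural_hamming_distance(true_adj, est_adj)
print(shd)  # one reversal + one extra edge = 2
```

An estimated graph for this comparison could be obtained, for instance, by thresholding edge beliefs at 0.5, which is again an assumption rather than a procedure described in the manuscript.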

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback. The comments correctly identify gaps in formalization, quantitative evaluation, and theoretical analysis. We address each major comment below and commit to revisions where appropriate.

Referee: [Abstract and method description] No equations, formal definition of the random-graph prior (including acyclicity handling), likelihood model, or posterior computation procedure appear in the manuscript. Without these, the central claim that Bayesian updating on interactions recovers the true causal structure cannot be verified and remains an untested modeling assumption.

Authors: We agree that the manuscript lacks the necessary formal specifications and equations. The submitted version provides only a high-level textual description. In the revised manuscript we will add a formal definition of the random-graph prior (including the mechanism used to enforce acyclicity), the likelihood model for observed interactions, and the procedure used to compute or approximate the posterior.
Revision: yes

Referee: [Experiments section] Experiments are reported only qualitatively ('mainly confirm', 'demonstrate the usefulness'). No quantitative metrics (e.g., structural Hamming distance, posterior probability on the true DAG), sample-size scaling, error bars, or baseline comparisons are provided, so the empirical support for both scenarios is not load-bearing.

Authors: We acknowledge that the experimental results are presented only qualitatively and lack the quantitative metrics, scaling plots, error bars, and baseline comparisons needed to make the claims load-bearing. In revision we will augment both scenarios with structural Hamming distance, posterior mass on the true DAG, performance versus number of interactions, standard errors across repeated runs, and comparisons to standard causal discovery methods.
Revision: yes

Referee: [Theoretical justification (implicit throughout)] No consistency result, posterior-concentration argument, or prior-misspecification analysis is given. These are required to substantiate that the random-graph prior plus interaction likelihood yields recovery of the data-generating structure as the number of observations grows.

Authors: The manuscript focuses on a modeling proposal and small-scale empirical demonstrations rather than theoretical analysis. A full consistency or posterior-concentration result would require substantial additional theoretical work (identifiability conditions, prior properties, and interaction model assumptions) that lies outside the current scope. We will add an explicit limitations section discussing this gap and outlining directions for future theoretical investigation.
Revision: partial

Circularity Check

0 steps flagged

No circularity; modeling assumptions stated explicitly without self-referential reduction.

full rationale

The provided abstract and description contain no equations, no fitted parameters presented as predictions, and no self-citations. The approach is described as placing a random-graph prior over causal graphs and performing Bayesian updates on observed interactions. This is presented as a modeling choice rather than a derived result that reduces to its inputs by construction. No load-bearing step matches any of the enumerated circularity patterns. The experiments are claimed to confirm the method but supply no internal reduction that would qualify as circular.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only; the modeling assumptions that turn a random graph into a causal model and the likelihood function that links observations to graph updates are not stated, so the ledger cannot be populated with concrete entries.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Choosing with unknown causal information: Action-outcome probabilities for decision making can be grounded in causal models
cs.AI 2019-07 unverdicted novelty 4.0

Action-outcome probabilities for rational choice can be grounded in causal models both when the causal structure is known and when it is unknown, with an extension to causal Nash Equilibrium.

Reference graph

Works this paper leans on

45 extracted references · 45 canonical work pages · cited by 1 Pith paper · 1 internal anchor

[1] Bernardo, J. M. and Smith, A. F. M. (2000). Bayesian theory. Wiley Series in Probability and Statistics.

[2] Bollobás, B. (2001). Random Graphs. Cambridge studies in advanced mathematics.

[3] Campbell, D. T. and Cook, T. D. (1979). Quasi-experimentation: Design & analysis issues for field settings. Rand McNally College Publishing Company Chicago.

[4] Chickering, D. M. (2002). Optimal structure identification with greedy search. Journal of machine learning research, 3(Nov):507--554.

[5] Danks, D. (2014). Unifying the mind: Cognitive representations as graphical models. MIT Press.

[6] Eberhardt, F. (2007). Causation and intervention. Unpublished doctoral dissertation, Carnegie Mellon University, page 93.

[7] Eberhardt, F. (2008a). Almost optimal intervention sets for causal discovery. In Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence, pages 161--168. AUAI Press.

[8] Eberhardt, F. (2008b). Causal discovery as a game. In Proceedings of the 2008th International Conference on Causality: Objectives and Assessment-Volume 6, pages 87--96. JMLR.org.

[9] Fernbach, P. M. and Sloman, S. A. (2009). Causal learning with local computations. Journal of experimental psychology: Learning, memory, and cognition, 35(3):678.

[10] Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., and Rubin, D. B. (2013). Bayesian data analysis. CRC press, 3rd edition.

[11] Gilboa, I. (2009). Theory of Decision under Uncertainty. Cambridge University Press.

[12] Glymour, C., Zhang, K., and Spirtes, P. (2019). Review of causal discovery methods based on graphical models. Frontiers in genetics, 10:524.

[13] Gonzalez-Soto, M., Sucar, L. E., and Escalante, H. J. (2018). Playing against nature: causal discovery for decision making under uncertainty. In Machine Learning for Causal Inference, Counterfactual Prediction and Autonomous Action (CausalML) Workshop at ICML 2018.

[14] Hauser, A. and Bühlmann, P. (2012). Two optimal strategies for active learning of causal models from interventions. In Proceedings of the 6th European Workshop on Probabilistic Graphical Models, pages 123--130.

[15] Hauser, A. and Bühlmann, P. (2014). Two optimal strategies for active learning of causal models from interventional data. International Journal of Approximate Reasoning, 55(4):926--939.

[16] He, Y.-B. and Geng, Z. (2008). Active learning of causal networks with intervention experiments and optimal designs. Journal of Machine Learning Research, 9(Nov):2523--2547.

[17] Hyttinen, A., Eberhardt, F., and Hoyer, P. O. (2013). Experiment selection for causal discovery. The Journal of Machine Learning Research, 14(1):3041--3071.

[18] Jackson, M. O. (2010). Social and economic networks. Princeton university press.

[19] Joyce, J. M. (1999). The Foundations of Causal Decision Theory. Cambridge University Press.

[20] Koller, D. and Friedman, N. (2009). Probabilistic graphical models: principles and techniques. MIT press.

[21] Lake, B. M., Ullman, T. D., Tenenbaum, J. B., and Gershman, S. J. (2017). Building machines that learn and think like people. Behavioral and Brain Sciences, 40.

[22] Lattimore, F., Lattimore, T., and Reid, M. D. (2016). Causal bandits: Learning good interventions via causal inference. In Lee, D. D., Sugiyama, M., Luxburg, U. V., Guyon, I., and Garnett, R., editors, Advances in Neural Information Processing Systems 29, pages 1181--1189. Curran Associates, Inc.

[23] Loh, P.-L. and Bühlmann, P. (2014). High-dimensional learning of linear causal networks via inverse covariance estimation. Journal of Machine Learning Research, 15(1):3065--3105.

[24] March, J. G. (1991). Exploration and exploitation in organizational learning. Organization science, 2(1):71--87.

[25] Meganck, S., Leray, P., and Manderick, B. (2006). Learning causal bayesian networks from observations and experiments: A decision theoretic approach. In International Conference on Modeling Decisions for Artificial Intelligence, pages 58--69. Springer.

[26] Mooij, J. M., Peters, J., Janzing, D., Zscheischler, J., and Schölkopf, B. (2016). Distinguishing cause from effect using observational data: methods and benchmarks. The Journal of Machine Learning Research, 17(1):1103--1204.

[27] Murphy, K. P. (2001). Active learning of causal bayes net structure.

[28] Nair, S., Zhu, Y., Savarese, S., and Fei-Fei, L. (2019). Causal induction from visual observations for goal directed tasks.

[29] Ness, R. O., Sachs, K., Mallick, P., and Vitek, O. (2017). A bayesian active learning experimental design for inferring signaling networks. In Sahinalp, S. C., editor, Research in Computational Molecular Biology, pages 134--156, Cham. Springer International Publishing.

[30] Newman, M. (2018). Networks. Oxford university press.

[31] Pearl, J. (2009). Causality: Models, Reasoning and Inference. Cambridge University Press, New York, NY, USA, 2nd edition.

[32] Pearl, J. and Mackenzie, D. (2018). The Book of Why: The New Science of Cause and Effect. Basic Books.

[33] Rubenstein, P. K., Tolstikhin, I., Hennig, P., and Schoelkopf, B. (2017). Probabilistic active learning of functions in structural causal models. arXiv preprint arXiv:1706.10234.

[34] Savage, L. (1954). The Foundations of Statistics. New York: John Wiley & Sons.

[35] Sen, R., Shanmugam, K., Dimakis, A. G., and Shakkottai, S. (2017). Identifying best interventions through online importance sampling. In International Conference on Machine Learning, pages 3057--3066.

[36] Shanmugam, K., Kocaoglu, M., Dimakis, A. G., and Vishwanath, S. (2015). Learning causal graphs with small interventions. In Advances in Neural Information Processing Systems, pages 3195--3203.

[37] Spirtes, P., Glymour, C. N., and Scheines, R. (2000). Causation, prediction and search. MIT Press.

[38] Sucar, L. E. (2015). Probabilistic Graphical Models. Advances in Computer Vision and Pattern Recognition. Springer London.

[39] Sutton, R. S. and Barto, A. G. (1998). Reinforcement Learning: An introduction. MIT Press.

[40] Tong, S. and Koller, D. (2001). Active learning for structure in bayesian networks. In International joint conference on artificial intelligence, volume 17, pages 863--869. Lawrence Erlbaum Associates Ltd.

[41] Verma, T. and Pearl, J. (1990). Equivalence and synthesis of causal models. In Proceedings of the Sixth Annual Conference on Uncertainty in Artificial Intelligence, pages 255--270.

[42] Von Neumann, J. and Morgenstern, O. (1944). Theory of games and economic behavior. Princeton University Press.

[43] Waldmann, M. R., Cheng, P. W., Hagmayer, Y., and Blaisdell, A. P. (2008). Causal learning in rats and humans: A minimal rational model. The probabilistic mind. Prospects for Bayesian cognitive science, pages 453--484.

[44] Wellen, S. and Danks, D. (2012). Learning causal structure through local prediction-error learning. In Proceedings of the Annual Meeting of the Cognitive Science Society, volume 34.

[45] Woodward, J. (2003). Making things happen: A theory of causal explanation. Oxford Studies in Philosophy of Science. Oxford University Press.
