Estimating Joint Interventional Distributions from Marginal Interventional Data

Armin Kekić; Atalanti Mastakouri; Bernhard Schölkopf; Elke Kirschbaum; Sergio Hernan Garrido Mejia

arXiv: 2409.01794 · v2 · submitted 2024-09-03 · 📊 stat.ME · cs.LG · stat.ML

Pith reviewed 2026-05-23 20:56 UTC · model grok-4.3

classification 📊 stat.ME · cs.LG · stat.ML

keywords causal inference · maximum entropy · interventional distributions · exponential family · Lagrange duality · causal feature selection

The pith

Marginal interventional distributions over variable subsets suffice to recover the joint interventional distribution over all variables.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper extends the Causal Maximum Entropy principle to incorporate interventional data in addition to observational data. Using Lagrange duality, it establishes that the resulting optimization problem has a solution in the exponential family. This framework supports two concrete tasks when marginal interventional distributions are supplied for arbitrary subsets of variables: causal feature selection from a mixture of observational and single-variable interventional data, and direct recovery of the full joint interventional distribution. A sympathetic reader cares because the approach combines datasets collected under different interventions without requiring joint observations of every variable at once.

Core claim

Extending the Causal Maximum Entropy objective to include interventional constraints yields, by Lagrange duality, a solution in the exponential family. When marginal interventional distributions are provided for any subset of the variables, the same objective recovers the joint interventional distribution over the full set and also enables causal feature selection from mixed observational and single-variable interventional data.

What carries the argument

The extended Causal Maximum Entropy objective with interventional constraints, solved via its Lagrange dual to produce an exponential-family distribution.
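As a reminder of why a Lagrange-dual argument leads to the exponential family, the standard derivation for plain maximum entropy with moment constraints is sketched here. The feature functions $f_k$, target moments $\tilde f_k$, and multipliers $\lambda_k, \nu$ are notation assumed for this sketch; the paper's Causal MaxEnt objective and its encoding of interventional marginals are not reproduced on this page and may differ in detail.

```latex
\begin{align}
\max_{p}\;& H(p) = -\sum_{x} p(x)\log p(x)
  \quad\text{s.t.}\quad \sum_{x} p(x) f_k(x) = \tilde f_k \;\; (k = 1,\dots,K),
  \quad \sum_{x} p(x) = 1, \\
\mathcal{L}(p,\lambda,\nu) &= -\sum_{x} p(x)\log p(x)
  + \sum_{k} \lambda_k \Big( \sum_{x} p(x) f_k(x) - \tilde f_k \Big)
  + \nu \Big( \sum_{x} p(x) - 1 \Big), \\
\frac{\partial \mathcal{L}}{\partial p(x)} &= -\log p(x) - 1 + \sum_{k} \lambda_k f_k(x) + \nu = 0
  \;\Longrightarrow\;
  p_\lambda(x) = \frac{1}{Z(\lambda)} \exp\Big( \sum_{k} \lambda_k f_k(x) \Big), \\
Z(\lambda) &= \sum_{x} \exp\Big( \sum_{k} \lambda_k f_k(x) \Big),
  \qquad
  g(\lambda) = \log Z(\lambda) - \sum_{k} \lambda_k \tilde f_k .
\end{align}
```

Minimizing the convex dual $g(\lambda)$ gives $\nabla g(\lambda) = \mathbb{E}_{p_\lambda}[f] - \tilde f$, so the optimal multipliers are exactly those at which the exponential-family distribution reproduces the supplied moments.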

If this is right

Causal feature selection can be performed from a mixture of observational data and single-variable interventional data, outperforming prior merging methods on synthetic examples.
The recovered joint interventional distributions match the performance of tests that require full joint observations.
The exponential-family form supplies an explicit parametric representation for any collection of marginal interventional constraints.
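To make the explicit parametric representation concrete, here is a minimal sketch of ordinary maximum entropy with marginal constraints on a toy discrete problem. It is not the paper's Causal MaxEnt procedure: the feature map `F`, the chain-structured ground truth, and the plain gradient-descent solver are all assumptions chosen for illustration.

```python
import itertools
import numpy as np

# All 8 joint states of three binary variables (x1, x2, x3).
STATES = np.array(list(itertools.product([0, 1], repeat=3)), dtype=float)

# Hypothetical feature map: its expectations pin down the (x1, x2) and (x2, x3) marginals.
X1, X2, X3 = STATES[:, 0], STATES[:, 1], STATES[:, 2]
F = np.stack([X1, X2, X3, X1 * X2, X2 * X3], axis=1)  # shape (8, 5)


def exp_family(lam):
    """Return p_lam(x) proportional to exp(lam . f(x)) over the 8 states."""
    logits = F @ lam
    weights = np.exp(logits - logits.max())  # subtract max for numerical stability
    return weights / weights.sum()


def fit_maxent(target, lr=0.5, steps=50_000, tol=1e-12):
    """Minimize the convex dual g(lam) = log Z(lam) - lam . target by gradient descent."""
    lam = np.zeros(F.shape[1])
    for _ in range(steps):
        grad = F.T @ exp_family(lam) - target  # E_p[f] minus the target moments
        if np.linalg.norm(grad) < tol:
            break
        lam -= lr * grad
    return lam


# Assumed ground truth: a Markov chain x1 -> x2 -> x3 with made-up probabilities.
p1 = np.array([0.7, 0.3])                # P(x1)
p2 = np.array([[0.8, 0.2], [0.3, 0.7]])  # P(x2 | x1), rows indexed by x1
p3 = np.array([[0.6, 0.4], [0.1, 0.9]])  # P(x3 | x2), rows indexed by x2
truth = np.array([p1[a] * p2[a, b] * p3[b, c]
                  for a, b, c in itertools.product([0, 1], repeat=3)])

target = F.T @ truth           # moments of the two overlapping marginals
lam_hat = fit_maxent(target)   # fitted Lagrange multipliers
p_hat = exp_family(lam_hat)    # maximum-entropy joint consistent with the marginals
print(np.abs(p_hat - truth).max())
```

In this toy case the two overlapping marginals determine the joint because the assumed ground truth is itself a member of the fitted exponential family; with a ground truth outside that family, the same code would return the maximum-entropy joint rather than the true one.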

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the causal graph is known, the same dual construction could be used to propagate constraints across unobserved interventions.
The method suggests a data-collection strategy in which separate experiments each intervene on only a few variables, with the joint recovered afterward.

Load-bearing premise

Marginal interventional distributions supplied for arbitrary subsets of variables are together sufficient to uniquely determine the joint interventional distribution over all variables.

What would settle it

A concrete data-generating process in which two different joint interventional distributions produce identical marginal interventional distributions on every proper subset, yet differ on the full joint.
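For ordinary (non-interventional) marginals, a classic pair of this kind is easy to write down. The sketch below treats two tables over three binary variables as candidate joints under some fixed regime; whether such a pair can arise as interventional distributions in the paper's setting depends on assumptions not visible on this page, so this is an illustration of the logical possibility only.

```python
import itertools
import math

STATES = list(itertools.product([0, 1], repeat=3))

# Joint A: three independent fair coins.
p_a = {s: 1 / 8 for s in STATES}
# Joint B: x1, x2 independent fair coins and x3 = x1 XOR x2.
p_b = {s: (1 / 4 if s[2] == (s[0] ^ s[1]) else 0.0) for s in STATES}


def marginal(p, keep):
    """Sum a joint table down to the variable indices listed in `keep`."""
    out = {}
    for state, prob in p.items():
        key = tuple(state[i] for i in keep)
        out[key] = out.get(key, 0.0) + prob
    return out


def entropy(p):
    """Shannon entropy in nats, skipping zero-probability states."""
    return -sum(q * math.log(q) for q in p.values() if q > 0)


# Every proper, non-empty subset of the three variables.
proper_subsets = [c for r in (1, 2) for c in itertools.combinations(range(3), r)]
same_marginals = all(marginal(p_a, s) == marginal(p_b, s) for s in proper_subsets)

print(same_marginals)               # True: every one- and two-variable marginal agrees
print(p_a == p_b)                   # False: the joints themselves differ
print(entropy(p_a) > entropy(p_b))  # True: log 8 > log 4
```

Given only these marginals, a maximum-entropy criterion would select joint A, so any claim of recovering joint B from them would need additional structural assumptions.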

Figures

Figures reproduced from arXiv: 2409.01794 by Armin Kekić, Atalanti Mastakouri, Bernhard Schölkopf, Elke Kirschbaum, Sergio Hernan Garrido Mejia.

**Figure 1.** Results for causal feature selection. (a), (b), and (c) show the graph structures used for our synthetic experiments.

**Figure 2.** Residuals between true and estimated joint interventional distributions. The violin plots show the residuals between ...

Original abstract

In this paper we show how to exploit interventional data to acquire the joint conditional distribution of all the variables using the Maximum Entropy principle. To this end, we extend the Causal Maximum Entropy method to make use of interventional data in addition to observational data. Using Lagrange duality, we prove that the solution to the Causal Maximum Entropy problem with interventional constraints lies in the exponential family, as in the Maximum Entropy solution. Our method allows us to perform two tasks of interest when marginal interventional distributions are provided for any subset of the variables. First, we show how to perform causal feature selection from a mixture of observational and single-variable interventional data, and, second, how to infer joint interventional distributions. For the former task, we show on synthetically generated data, that our proposed method outperforms the state-of-the-art method on merging datasets, and yields comparable results to the KCI-test which requires access to joint observations of all variables.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Extends Causal MaxEnt to turn marginal interventional data into joint interventional distributions and feature selection, but the uniqueness claim needs graph-dependent conditions to hold.

The paper's main contribution is extending Causal MaxEnt to treat marginal interventional distributions as constraints, then proving via Lagrange duality that the solution remains in the exponential family. This setup is used for two tasks: causal feature selection from mixed observational and single-variable interventional data, and recovering the full joint interventional distribution when only marginal interventional pieces are available for arbitrary subsets of variables. The synthetic experiments show the method outperforming a merging-datasets baseline on feature selection while matching the KCI test, which requires full joint observations. That gives a practical edge if the data regime matches the simulations.

The identifiability step is the soft spot. The abstract states the method works when marginal interventional distributions are supplied for any subset, but multiple joints can be consistent with the same marginal interventional constraints if the underlying DAG leaves some paths uncovered or admits latent structures that agree on the observed marginals. The stress-test concern lands here: without explicit conditions on the graph or the data-generating process, the recovered joint is not guaranteed to be the unique one. The paper reports only synthetic results, so it is unclear how sensitive the procedure is to violations of those unstated conditions.

This work is for researchers in causal discovery and causal ML who routinely face partial interventional data and want either feature rankings or joint interventional quantities. The technical route is new enough and the experiments concrete enough that a serious referee should see it, even if the identifiability argument needs tightening in revision.

Referee Report

2 major / 2 minor

Summary. The manuscript extends the Causal Maximum Entropy framework to incorporate interventional data alongside observational data. Using Lagrange duality, it claims to prove that the optimizer under interventional constraints remains in the exponential family. The method is applied to two tasks: causal feature selection from a mixture of observational and single-variable interventional data, and recovery of joint interventional distributions from marginal interventional distributions supplied for arbitrary subsets of variables. Synthetic experiments indicate that the approach outperforms dataset-merging baselines for feature selection and performs comparably to the KCI test (which requires joint observations).

Significance. If the duality argument is rigorous and the supplied marginal interventional constraints suffice for unique recovery of the joint, the work supplies a principled maximum-entropy route for fusing observational and interventional data without requiring full joint observations. The exponential-family preservation result would be a clean theoretical contribution, and the feature-selection experiments on synthetic data provide concrete empirical grounding.

major comments (2)

[Abstract / identifiability section] Abstract and the section presenting the identifiability claim: the statement that the method recovers the joint interventional distribution “when marginal interventional distributions are provided for any subset of the variables” is load-bearing for both claimed tasks. No explicit identifiability theorem or graph-dependent conditions are supplied showing that the marginal interventional constraints uniquely determine the joint; multiple joints can agree on the same do-marginals when intervened subsets leave paths or components unconstrained.
[Theoretical development / duality argument] The Lagrange-duality proof (referenced in the abstract and presumably in the main theoretical section): the claim that the solution remains in the exponential family under interventional constraints is central, yet the manuscript provides neither the explicit dual derivation nor the encoding of the marginal interventional expectations as constraints. Without these details the preservation result cannot be verified.

minor comments (2)

[Experiments] The synthetic-data section should report the precise data-generating process, the number of variables, the fraction of interventional samples, and the exact performance metrics (beyond “outperforms”) so that the feature-selection comparison can be reproduced.
[Notation / method section] Notation for the interventional constraints (e.g., how P(V_S | do(V_T)) is written inside the extended Causal MaxEnt objective) should be introduced once and used consistently.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which identify key areas where the manuscript requires greater rigor and explicit detail. We address each major comment below and will incorporate the necessary revisions.

Referee: [Abstract / identifiability section] Abstract and the section presenting the identifiability claim: the statement that the method recovers the joint interventional distribution “when marginal interventional distributions are provided for any subset of the variables” is load-bearing for both claimed tasks. No explicit identifiability theorem or graph-dependent conditions are supplied showing that the marginal interventional constraints uniquely determine the joint; multiple joints can agree on the same do-marginals when intervened subsets leave paths or components unconstrained.

Authors: We acknowledge that the current version does not supply an explicit identifiability theorem with graph-dependent conditions. The claim in the abstract and introduction is intended to hold under the maximum-entropy principle when the supplied marginal interventional constraints are sufficient to pin down the joint, but we agree that uniqueness is not automatic for arbitrary subsets. In the revision we will add a dedicated identifiability subsection that states the precise conditions (e.g., when the collection of intervened variable sets covers all relevant causal paths or satisfies a covering criterion on the underlying DAG) under which the joint interventional distribution is uniquely recoverable from the given marginals. revision: yes
Referee: [Theoretical development / duality argument] The Lagrange-duality proof (referenced in the abstract and presumably in the main theoretical section): the claim that the solution remains in the exponential family under interventional constraints is central, yet the manuscript provides neither the explicit dual derivation nor the encoding of the marginal interventional expectations as constraints. Without these details the preservation result cannot be verified.

Authors: The manuscript sketches the Lagrange-duality argument but does not expand the full derivation or the precise encoding of interventional marginals. We will revise the theoretical section to include the complete steps: (i) formulation of the constrained optimization problem that augments the observational entropy objective with both observational and interventional moment-matching constraints, (ii) construction of the Lagrangian that incorporates the do-marginal expectations as linear constraints on the interventional distributions, (iii) derivation of the dual problem, and (iv) explicit verification that the resulting primal optimizer belongs to the exponential family with parameters that absorb the interventional Lagrange multipliers. This will make the preservation result directly verifiable. revision: yes

Circularity Check

0 steps flagged

No circularity: derivation follows from standard Lagrange duality on extended MaxEnt

The abstract describes extending Causal MaxEnt with interventional constraints and applying Lagrange duality to obtain an exponential-family solution. This is a direct consequence of the optimization problem definition and does not reduce any target quantity (joint interventional distribution) to a fitted parameter or self-citation by construction. No load-bearing steps match the enumerated circularity patterns; the method is presented as building on the established MaxEnt principle with an independent duality argument. The identifiability of joints from marginals is an assumption whose validity is external to the derivation chain itself.

Axiom & Free-Parameter Ledger

1 free parameters · 2 axioms · 0 invented entities

The central claim rests on the maximum entropy principle being applicable once interventional marginals are added as constraints, plus the correctness of the Lagrange duality argument. No explicit free parameters or new entities are named in the abstract.

free parameters (1)

Lagrange multipliers for interventional constraints
These multipliers are introduced to enforce the marginal interventional distributions; their values are determined by the optimization and are therefore fitted to the supplied data.

axioms (2)

domain assumption The maximum entropy principle remains valid when observational and interventional marginal constraints are combined in a causal setting.
Invoked when the abstract states that the Causal MaxEnt method is extended to interventional data.
standard math Lagrange duality applies directly to the Causal MaxEnt objective with the added interventional constraints.
Used to prove the solution lies in the exponential family.

