On the Relationship between Bayesian Networks and Probabilistic Structural Causal Models

Eleonora Zullo; Fabio Stella; Peter J.F. Lucas

arXiv: 2603.27406 · v2 · submitted 2026-03-28 · cs.AI · cs.LG

Peter J.F. Lucas, Eleonora Zullo, Fabio Stella

Pith reviewed 2026-05-14 21:33 UTC · model grok-4.3

keywords: Bayesian networks, structural causal models, causality, linear algebra, linear programming, model transformation, probabilistic models

The pith

Bayesian networks can be mapped to probabilistic structural causal models through linear algebra and linear programming.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates the mapping between Bayesian networks learned from data or experts and probabilistic versions of structural causal models. It uses linear algebra and linear programming to perform the transformation and studies conditions for the existence and uniqueness of solutions based on the dimensions of the models. This matters because it clarifies whether standard probabilistic networks can support causal reasoning without major changes to their structure or distributions. The work also looks at how the meaning of the models shifts under this transformation.

Core claim

A Bayesian network can be converted into a probabilistic structural causal model by solving a system of linear equations that relates the parameters of the two models. Whether a solution exists, and whether it is unique, depends on the relative dimensions of the two models. The conversion also affects the semantic interpretation of causality in the resulting model.

What carries the argument

The linear system derived from equating the joint distributions of the Bayesian network and the probabilistic structural causal model, solved via matrix methods and linear programming.
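
As a minimal illustration of what such a linear system can look like, assume a binary child X with a single binary parent Y, and assume the exogenous variable U_X selects one of the four functions from Y to X. This response-function encoding and the symbols q_0, ..., q_3 are assumptions made here for illustration; the paper's own notation is not visible in this summary. Writing q_0, q_1, q_2, q_3 for the probabilities of "always 0", "copy Y", "negate Y" and "always 1", equating the two models gives:

```latex
\begin{aligned}
P(X=1 \mid Y=0) &= q_2 + q_3,\\
P(X=1 \mid Y=1) &= q_1 + q_3,\\
q_0 + q_1 + q_2 + q_3 &= 1, \qquad q_i \ge 0 .
\end{aligned}
```

Under these assumptions there are three equations in four unknowns, so the equalities alone leave a one-parameter family of candidates, and the inequalities q_i ≥ 0 are the part that calls for linear programming rather than plain linear algebra.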

If this is right

Solutions exist when the number of parameters in the structural model is sufficient to match the Bayesian network's degrees of freedom.
Uniqueness holds when the linear system has full rank.
The observational probabilities remain identical after transformation.
The causal semantics differ because structural models explicitly encode functional mechanisms.
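
The rank condition in the list above can be checked numerically on a small case. The sketch below assumes a binary child X with one binary parent Y and a noise variable that ranges over the four response functions of X to Y; this encoding is an illustrative assumption, not the paper's stated construction. It shows a system that has full row rank (so the equalities are consistent) but not full column rank (so the solution is not unique).

```python
import numpy as np

# Assumed encoding (not taken from the paper): binary child X, binary parent Y.
# Columns are the four response functions of X to Y:
#   u0: X=0, u1: X=Y, u2: X=1-Y, u3: X=1
A = np.array([
    [0.0, 0.0, 1.0, 1.0],   # P(X=1 | Y=0) = q2 + q3
    [0.0, 1.0, 0.0, 1.0],   # P(X=1 | Y=1) = q1 + q3
    [1.0, 1.0, 1.0, 1.0],   # normalisation: q0 + q1 + q2 + q3 = 1
])

rank = np.linalg.matrix_rank(A)
n_unknowns = A.shape[1]
null_dim = n_unknowns - rank   # number of free directions in the solution set

print("rank:", rank, "unknowns:", n_unknowns, "free directions:", null_dim)

# A direction along which q can move without changing the conditional table.
direction = np.array([1.0, -1.0, -1.0, 1.0])
assert np.allclose(A @ direction, 0.0)
```

In this toy case the null space is one-dimensional, so any two noise distributions that differ by a multiple of `direction` induce the same observational table.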

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This mapping suggests that causal inference techniques developed for structural models could be applied to Bayesian networks obtained from data.
Dimensional analysis offers a practical test for whether a given Bayesian network admits a causal interpretation in the structural sense.
Extensions to discrete or mixed variable cases may require similar but generalized linear programming approaches.
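
A dimension count of this kind can be sketched for a single node. The function below is hypothetical and assumes the noise variable ranges over all functions from parent configurations to child states, which may be a larger function class than the paper uses. Under that assumption it compares the number of unknown noise probabilities with the number of independent linear equations.

```python
def dimension_count(child_states: int, parent_configs: int) -> dict:
    """Count unknowns and independent equations for one node.

    Assumption (illustrative only): the exogenous variable ranges over all
    functions from parent configurations to child states.
    """
    unknowns = child_states ** parent_configs
    # (child_states - 1) free table entries per parent configuration,
    # plus one normalisation equation for the noise distribution.
    equations = parent_configs * (child_states - 1) + 1
    return {
        "unknowns": unknowns,
        "equations": equations,
        "free_dimensions": unknowns - equations,
        # Generic uniqueness only: degenerate (for example deterministic)
        # tables can pin the solution down even when free_dimensions > 0.
        "unique_by_dimension": unknowns == equations,
    }

for k, m in [(2, 1), (2, 2), (2, 4), (3, 2)]:
    print(f"states={k} parent_configs={m}", dimension_count(k, m))
```

With this unrestricted function class, the count only balances when there is a single parent configuration, which suggests that any uniqueness result for larger families would have to come from restricting the structural functions or from additional constraints.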

Load-bearing premise

Bayesian networks are compatible with probabilistic structural causal models such that their parameters can be related through linear equations without fundamental incompatibility.

What would settle it

Observing a Bayesian network for which no solution to the corresponding linear system exists, even when dimensions suggest it should, or where the transformation changes the probability distribution.
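
A search for such a counterexample could be organised as a small test harness. The sketch below does this for the simplest non-trivial case only: a binary child with one binary parent and a noise variable over the four response functions (an assumed encoding, with hypothetical function names). In this case the closed-form solution makes failure impossible, so the code mainly shows the shape of the check; the informative version would plug in the paper's actual construction.

```python
import random
from fractions import Fraction

def solution_interval(p0, p1):
    """Range of q3 (weight of 'X is always 1') that keeps every q_i >= 0."""
    return max(Fraction(0), p0 + p1 - 1), min(p0, p1)

def noise_distribution(p0, p1, t):
    """q = (q0, q1, q2, q3) for: always 0, copy Y, negate Y, always 1."""
    return (1 - p0 - p1 + t, p1 - t, p0 - t, t)

def induced_cpt(q):
    """P(X=1 | Y=0) and P(X=1 | Y=1) implied by the noise distribution."""
    return q[2] + q[3], q[1] + q[3]

def find_counterexample(trials=1000, seed=0):
    """Return a failing case, or None if every sampled table transforms cleanly."""
    rng = random.Random(seed)
    for _ in range(trials):
        p0 = Fraction(rng.randint(0, 100), 100)   # P(X=1 | Y=0)
        p1 = Fraction(rng.randint(0, 100), 100)   # P(X=1 | Y=1)
        lo, hi = solution_interval(p0, p1)
        if lo > hi:
            return ("no solution", p0, p1)
        for t in (lo, (lo + hi) / 2, hi):
            q = noise_distribution(p0, p1, t)
            if min(q) < 0 or sum(q) != 1 or induced_cpt(q) != (p0, p1):
                return ("distribution changed", p0, p1, t)
    return None

print(find_counterexample())
```

Exact rational arithmetic is used so that a reported mismatch would reflect the construction rather than floating-point rounding.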

Original abstract

In this paper, the relationship between probabilistic graphical models, in particular Bayesian networks, and causal diagrams, also called structural causal models, is studied. Structural causal models are deterministic models, based on structural equations or functions, that can be provided with uncertainty by adding independent, unobserved random variables to the models, equipped with probability distributions. One question that arises is whether a Bayesian network that has been obtained from expert knowledge or learnt from data can be mapped to a probabilistic structural causal model, and whether or not this has consequences for the network structure and probability distribution. We show that linear algebra and linear programming offer key methods for the transformation, and examine properties for the existence and uniqueness of solutions based on dimensions of the probabilistic structural model. Finally, we examine in what way the semantics of the models is affected by this transformation.

Keywords: Causality, probabilistic structural causal models, Bayesian networks, linear algebra, experimental software.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

Private letter to a colleague

The paper maps BNs to probabilistic SCMs via linear algebra and LP with dimension-based uniqueness, but the mapping's effect on conditional independencies is not clearly secured.

The main point is that the authors treat BN parameters as vectors and apply linear algebra plus linear programming to produce a corresponding probabilistic SCM, then use dimension counts to settle existence and uniqueness of the solution. This algebraic framing is the concrete step they add beyond the usual high-level comparisons between the two model classes. They also flag how the semantics can shift once the transformation is done, which is a useful reminder that the mapping is not cost-free. The dimension argument itself is cleanly stated and gives a practical test for when the correspondence is unambiguous.

The soft spot is whether the linear map keeps the original conditional independencies. Flattening the probability tables into a vector space does not automatically respect the nonlinear simplex constraints or the factorization imposed by the DAG, so a solution in the ambient space could still produce an SCM whose joint introduces spurious dependencies. The abstract asserts the methods work but supplies no derivations or small examples that would confirm d-separation is preserved, so that claim rests on the reader accepting the linear step without seeing the check.

This is aimed at researchers who already work at the intersection of graphical models and causality and who are open to algebraic tools for the connection. Someone looking for a formal handle on when a BN can be recast as a probabilistic SCM will get a usable starting point, even if they have to supply the missing verification steps themselves.

The paper deserves a serious referee because the mapping idea is worth tightening up rather than being dismissed outright. I would send it for review and ask specifically for proofs or counter-examples on the independence preservation question.

Referee Report

2 major / 1 minor

Summary. The paper examines the relationship between Bayesian networks (BNs) obtained from expert knowledge or data and probabilistic structural causal models (SCMs). It claims that linear algebra and linear programming provide methods to transform BN parameter vectors into SCM exogenous-noise distributions, derives conditions for existence and uniqueness of solutions from the dimensions of the probabilistic structural model, and analyzes how the transformation affects model semantics.

Significance. If the linear-algebraic and LP-based mappings are shown to preserve the joint distribution, conditional independencies, and causal semantics, the work would supply a concrete computational bridge between two standard representations in probabilistic graphical models and causality. The emphasis on dimension-driven existence/uniqueness arguments and the use of standard linear-algebra tools would be a clear methodological contribution, especially if accompanied by reproducible code or explicit algorithms.

major comments (2)

[Abstract / main claims] The central claim that a linear (or LP) transformation exists and is unique when the dimension condition holds is not supported by any derivation, example, or proof in the visible text. The abstract asserts the result but supplies no explicit mapping, no verification that the image lies inside the probability simplex, and no check that d-separation relations are preserved.
[Section on existence and uniqueness] The dimension-based existence/uniqueness argument only guarantees a solution in the ambient vector space. It does not automatically ensure that the resulting SCM induces a joint distribution whose support and conditional independencies match those of the original BN factorization; linear combinations of noise terms can introduce spurious dependencies that violate the DAG.
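
The gap between a solution in the ambient vector space and a valid noise distribution can be made concrete. The sketch below assumes a binary child X with one binary parent Y and a noise variable over the four response functions (an illustrative encoding, not taken from the paper), with a hypothetical table P(X=1 | Y=0) = P(X=1 | Y=1) = 0.9. It compares the minimum-norm solution from a pseudoinverse with a feasibility LP that enforces non-negativity.

```python
import numpy as np
from scipy.optimize import linprog

# Assumed encoding (illustrative): unknowns q0..q3 are the probabilities that
# the noise selects "always 0", "copy Y", "negate Y", "always 1".
A = np.array([
    [0.0, 0.0, 1.0, 1.0],   # P(X=1 | Y=0) = q2 + q3
    [0.0, 1.0, 0.0, 1.0],   # P(X=1 | Y=1) = q1 + q3
    [1.0, 1.0, 1.0, 1.0],   # normalisation
])
b = np.array([0.9, 0.9, 1.0])   # hypothetical conditional table

# Pure linear algebra: minimum-norm solution of A q = b, ignoring q >= 0.
q_linalg = np.linalg.pinv(A) @ b

# Linear programming: same equalities plus q >= 0 through the bounds.
# The zero objective turns this into a pure feasibility problem.
res = linprog(c=np.zeros(4), A_eq=A, b_eq=b,
              bounds=[(0, None)] * 4, method="highs")

print("pinv solution:", np.round(q_linalg, 3), "min entry:", q_linalg.min())
print("LP solution:  ", np.round(res.x, 3), "status:", res.status)
```

Both vectors satisfy the equalities, but the pseudoinverse solution has a negative first entry and so is not a probability distribution, whereas the LP solution stays inside the simplex. This addresses only the simplex part of the comment; it says nothing about dependencies between noise terms.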

minor comments (1)

[Keywords] The keyword list includes 'experimental software' yet the abstract and visible text contain no description of any implementation, experiments, or code.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive feedback on our manuscript. We address each major comment below and indicate planned revisions to improve clarity and rigor.

Referee: [Abstract / main claims] The central claim that a linear (or LP) transformation exists and is unique when the dimension condition holds is not supported by any derivation, example, or proof in the visible text. The abstract asserts the result but supplies no explicit mapping, no verification that the image lies inside the probability simplex, and no check that d-separation relations are preserved.

Authors: We acknowledge that the abstract is concise and does not detail the mapping. The manuscript derives the linear transformation in Section 3 by constructing a system of equations from the BN factorization and solving for the exogenous noise distributions via linear algebra (or LP when constraints are active). We will revise the abstract to briefly describe the mapping and add a concrete numerical example in the main text that verifies the solution lies in the probability simplex when the dimension condition holds. We will also explicitly note that d-separation is preserved because the underlying DAG is unchanged and the transformation equates the conditional probability tables exactly.
Revision: yes

Referee: [Section on existence and uniqueness] The dimension-based existence/uniqueness argument only guarantees a solution in the ambient vector space. It does not automatically ensure that the resulting SCM induces a joint distribution whose support and conditional independencies match those of the original BN factorization; linear combinations of noise terms can introduce spurious dependencies that violate the DAG.

Authors: We agree this point requires clarification. The dimension argument identifies when a unique solution exists in the vector space, but the linear system is specifically constructed by equating the BN's joint probabilities (via its factorization) to the SCM's implied distribution. This ensures the induced joint, support, and conditional independencies match by design. We will add a paragraph in the revised Section 4 explaining that independent exogenous noises and the exact replication of CPTs prevent spurious dependencies, together with a short proof sketch that the marginals and conditionals are identical to those of the original BN.
Revision: yes
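
The rebuttal's argument can be illustrated, though not proved, on a three-node chain A → B → C with binary variables. The sketch below uses hypothetical parameters and an assumed response-function encoding of the noise. It builds one valid noise distribution per node, pushes independent noises through the structural equations, and checks by exhaustive enumeration that the induced joint equals the BN factorization and that A is independent of C given B.

```python
from fractions import Fraction as F
from itertools import product

# Response functions of a binary child to a binary parent (assumed encoding):
# always 0, copy parent, negate parent, always 1.
RESPONSES = [lambda y: 0, lambda y: y, lambda y: 1 - y, lambda y: 1]

def noise_from_cpt(p0, p1):
    """One valid noise distribution for P(X=1|Y=0)=p0 and P(X=1|Y=1)=p1."""
    t = min(p0, p1)   # largest admissible weight on "always 1"
    return [1 - p0 - p1 + t, p1 - t, p0 - t, t]

# Hypothetical BN parameters for the chain A -> B -> C.
p_a = F(3, 10)
cpt_b = (F(1, 5), F(7, 10))   # P(B=1 | A=0), P(B=1 | A=1)
cpt_c = (F(1, 2), F(9, 10))   # P(C=1 | B=0), P(C=1 | B=1)

q_a = [1 - p_a, p_a]
q_b = noise_from_cpt(*cpt_b)
q_c = noise_from_cpt(*cpt_c)

# Joint induced by the SCM: independent noises pushed through the equations.
scm_joint = {abc: F(0) for abc in product((0, 1), repeat=3)}
for ua, ub, uc in product(range(2), range(4), range(4)):
    a = ua
    b = RESPONSES[ub](a)
    c = RESPONSES[uc](b)
    scm_joint[(a, b, c)] += q_a[ua] * q_b[ub] * q_c[uc]

def bern(p, x):
    return p if x == 1 else 1 - p

bn_joint = {
    (a, b, c): bern(p_a, a) * bern(cpt_b[a], b) * bern(cpt_c[b], c)
    for a, b, c in product((0, 1), repeat=3)
}
same_joint = scm_joint == bn_joint

def marginal(joint, keep):
    out = {}
    for abc, p in joint.items():
        key = tuple(abc[i] for i in keep)
        out[key] = out.get(key, F(0)) + p
    return out

p_b = marginal(scm_joint, [1])
p_ab = marginal(scm_joint, [0, 1])
p_bc = marginal(scm_joint, [1, 2])

# A is independent of C given B iff P(a,b,c) P(b) = P(a,b) P(b,c) everywhere.
a_indep_c_given_b = all(
    scm_joint[(a, b, c)] * p_b[(b,)] == p_ab[(a, b)] * p_bc[(b, c)]
    for a, b, c in product((0, 1), repeat=3)
)
print(same_joint, a_indep_c_given_b)
```

The check passes here because the noises are independent by construction and each conditional table is reproduced exactly; it does not cover constructions in which noise terms are shared or combined across nodes, which is the case the referee's comment is concerned with.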

Circularity Check

0 steps flagged

No circularity: standard linear-algebraic mapping applied to model dimensions

The paper's core derivation applies linear algebra and linear programming to transform between Bayesian network parameter vectors and probabilistic structural causal model noise distributions, then derives existence and uniqueness conditions from the relative dimensions of the two representations. No equation or claim reduces by construction to a fitted parameter, self-defined quantity, or load-bearing self-citation; the dimension-based arguments rest on ordinary linear-algebra facts external to the target mapping. The abstract and description contain no self-referential steps that would force the reported properties to be tautological with the inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The claim rests on the domain assumption that probabilistic SCMs are formed by adding independent unobserved random variables to deterministic structural equations, and that linear methods suffice for mapping without additional constraints.

axioms (1)

Domain assumption: Structural causal models can be made probabilistic by adding independent, unobserved random variables equipped with probability distributions.
Directly stated in the abstract as the basis for probabilistic SCMs.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage: We show that linear algebra and linear programming offer key methods for the transformation, and examine properties for the existence and uniqueness of solutions based on dimensions of the probabilistic structural model.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: the only uncertainty represented in a PSCM is in its associated exogenous variables U_w

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.