pith. sign in

arxiv: 1907.00221 · v1 · pith:JGUWWSJ2new · submitted 2019-06-29 · 💻 cs.LG · cs.AI· stat.ML

Causal Inference Under Interference And Network Uncertainty

Pith reviewed 2026-05-25 12:45 UTC · model grok-4.3

classification 💻 cs.LG cs.AIstat.ML
keywords causal inferenceinterferencenetwork uncertaintystructure learningdata dependencecausal effectssynthetic data
0
0 comments X

The pith

A method estimates causal effects under interference when the dependence network is unknown in advance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Classical causal methods assume independent data units, but many applications involve networks where one unit's treatment affects others. Standard interference adjustments require exact knowledge of those ties, which is often unavailable in practice such as community trials. The paper combines structure-learning algorithms with interference models to produce causal estimates without that exact prior network. The combined procedure is shown to recover effects on synthetic data that exhibits network dependence. A reader would care because this removes a major barrier to applying causal inference in real networked settings where ties cannot be queried directly.

Core claim

The paper claims that structure learning can be integrated with causal interference models to yield valid estimates of causal effects when the precise structure of dependence among units is not known beforehand, and demonstrates the approach recovers effects on synthetic datasets that exhibit network dependence.

What carries the argument

A joint structure-learning and interference-adjustment procedure that first infers the dependence network from data and then applies network-aware causal estimators.

Load-bearing premise

Structure learning recovers enough of the unknown dependence network to support valid interference adjustments.

What would settle it

A simulation in which the method produces biased effect estimates relative to an oracle estimator that is given the true network structure.

Figures

Figures reproduced from arXiv: 1907.00221 by Daniel Malinsky, Ilya Shpitser, Rohit Bhattacharya.

Figure 1
Figure 1. Figure 1: A chain graph over three variables (L, A, and Y ) on 4 individuals, representing possible relationships between disposable needle use and risk of blood-borne disease among heroin-users. tain causal effects without explicitly estimating them, if corresponding pathways are absent in the selected net￾work. As an example, if neighborhoods have 4 units, we may aim to learn a graphical model such as shown in [P… view at source ↗
Figure 2
Figure 2. Figure 2: The 2-regular CG for a block of size 4 L1 A1 Y1 Y2 A2 L2 L3 A3 Y3 Y4 A4 L4 [PITH_FULL_IMAGE:figures/full_fig_p008_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: The 3-regular CG for a block of size 4 From [PITH_FULL_IMAGE:figures/full_fig_p008_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Performance of structure learning algorithms as measured by precision and recall [PITH_FULL_IMAGE:figures/full_fig_p009_4.png] view at source ↗
read the original abstract

Classical causal and statistical inference methods typically assume the observed data consists of independent realizations. However, in many applications this assumption is inappropriate due to a network of dependences between units in the data. Methods for estimating causal effects have been developed in the setting where the structure of dependence between units is known exactly, but in practice there is often substantial uncertainty about the precise network structure. This is true, for example, in trial data drawn from vulnerable communities where social ties are difficult to query directly. In this paper we combine techniques from the structure learning and interference literatures in causal inference, proposing a general method for estimating causal effects under data dependence when the structure of this dependence is not known a priori. We demonstrate the utility of our method on synthetic datasets which exhibit network dependence.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes combining structure learning techniques with interference models from causal inference to estimate causal effects when the dependence network between units is unknown a priori. It demonstrates the approach on synthetic datasets exhibiting network dependence, addressing applications such as trial data from communities where social ties are hard to observe directly.

Significance. If the combination yields valid estimates, the work would address a practical gap in causal inference for dependent data with uncertain networks. The paper correctly identifies the limitation of existing methods that require exact network knowledge and attempts to bridge the structure learning and interference literatures. Credit is due for focusing on a realistic setting and providing synthetic demonstrations; however, the absence of theoretical guarantees or real-data validation limits the current impact.

major comments (3)
  1. [Proposed method (description of combining structure learning with interference)] The central proposal relies on applying off-the-shelf structure learning (e.g., conditional-independence tests in PC or GES) to data generated under interference. Interference induces direct cross-unit effects that violate the Markov factorization and conditional-independence assumptions required by these algorithms, yet no correction, robustness analysis, or bias bound is derived for the recovered graph or downstream estimator.
  2. [Experiments / synthetic data section] All empirical support consists of synthetic data demonstrations. No consistency theorem, finite-sample error bound, or sensitivity analysis is provided for the case when the learned graph contains false or missing edges due to interference-induced dependencies.
  3. [Abstract and introduction] The abstract states that the method produces 'valid causal estimates' under network uncertainty, but the manuscript supplies neither a proof of validity nor a counter-example analysis showing when the procedure fails.
minor comments (2)
  1. [Method] Notation for the learned graph versus the true interference graph should be distinguished more clearly when describing the two-stage procedure.
  2. [Experiments] The synthetic data generation process (how interference is injected and how the unknown network is sampled) could be described with an explicit algorithm or pseudocode for reproducibility.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the careful reading and constructive feedback. We address each major comment below.

read point-by-point responses
  1. Referee: The central proposal relies on applying off-the-shelf structure learning (e.g., conditional-independence tests in PC or GES) to data generated under interference. Interference induces direct cross-unit effects that violate the Markov factorization and conditional-independence assumptions required by these algorithms, yet no correction, robustness analysis, or bias bound is derived for the recovered graph or downstream estimator.

    Authors: We acknowledge that interference can violate the conditional-independence assumptions of standard structure-learning algorithms. The proposed method applies these algorithms as a practical heuristic without explicit correction, and the manuscript presents this as an empirical approach whose utility is shown via synthetic experiments rather than through derived robustness bounds. revision: no

  2. Referee: All empirical support consists of synthetic data demonstrations. No consistency theorem, finite-sample error bound, or sensitivity analysis is provided for the case when the learned graph contains false or missing edges due to interference-induced dependencies.

    Authors: The work is methodological and focuses on demonstrating feasibility through synthetic data that exhibit network dependence. No consistency theorems or formal error bounds are supplied, as developing such results for the combined setting is beyond the scope of the current manuscript. revision: no

  3. Referee: The abstract states that the method produces 'valid causal estimates' under network uncertainty, but the manuscript supplies neither a proof of validity nor a counter-example analysis showing when the procedure fails.

    Authors: The abstract employs 'valid' in the sense of empirical performance on the synthetic examples considered. We will revise the abstract and introduction to clarify that the estimates are shown to be accurate under the simulated conditions of network uncertainty, without claiming formal validity. revision: yes

Circularity Check

0 steps flagged

No circularity; method combines independent techniques demonstrated on synthetic data

full rationale

The paper proposes combining existing structure learning and interference techniques to estimate causal effects when the dependence network is unknown a priori. It relies on standard methods from the literature and validates the approach via synthetic datasets exhibiting network dependence. No self-definitional reductions, fitted parameters renamed as predictions, or load-bearing self-citations are present in the abstract or described approach. The central claim does not reduce to its inputs by construction and remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no identifiable free parameters, axioms, or invented entities; assessment limited by lack of full text.

pith-pipeline@v0.9.0 · 5657 in / 947 out tokens · 37067 ms · 2026-05-25T12:45:08.286979+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

45 extracted references · 45 canonical work pages

  1. [1]

    High- dimensional Ising model selection with Bayesian infor- mation criteria

    Rina Foygel Barber and Mathias Drton. High- dimensional Ising model selection with Bayesian infor- mation criteria. Electronic Journal of Statistics, 9(1):567– 607, 2015

  2. [2]

    Spatial interaction and the statistical anal- ysis of lattice systems

    Julian Besag. Spatial interaction and the statistical anal- ysis of lattice systems. Journal of the Royal Statistical Society: Series B (Methodological Statistics), 36(2):192– 236, 1974

  3. [3]

    The Oxford Handbook of the Economics of Networks

    Yann Bramoull ´e, Andrea Galeotti, and Brian Rogers. The Oxford Handbook of the Economics of Networks. Oxford University Press, 2016

  4. [4]

    Optimal structure identifica- tion with greedy search

    David Maxwell Chickering. Optimal structure identifica- tion with greedy search. Journal of Machine Learning Research, 3(Nov):507–554, 2002

  5. [5]

    Crawford, Peter M

    Forrest W. Crawford, Peter M. Aronow, Li Zeng, and Jianghong Li. Identification of homophily and preferen- tial recruitment in respondent-driven sampling. American Journal of Epidemiology, 187(1):153–160, 2017

  6. [6]

    Robin J. Evans. Model selection and local geometry. arXiv preprint arXiv:1801.08364, 2018

  7. [7]

    Extended Bayesian in- formation criteria for Gaussian graphical models

    Rina Foygel and Mathias Drton. Extended Bayesian in- formation criteria for Gaussian graphical models. In Ad- vances in Neural Information Processing Systems , pages 604–612, 2010

  8. [8]

    Dominique M. A. Haughton. On the choice of a model to fit data from an exponential family. Annals of Statistics, 16(1):342–355, 1988

  9. [9]

    Estimation of sparse binary pairwise markov networks using pseudo- likelihoods

    Holger H ¨ofling and Robert Tibshirani. Estimation of sparse binary pairwise markov networks using pseudo- likelihoods. Journal of Machine Learning Research , 10(Apr):883–906, 2009

  10. [10]

    Raudenbush

    Guanglei Hong and Stephen W. Raudenbush. Evaluat- ing kindergarten retention policy: A case study of causal inference for multilevel observational data. Journal of the American Statistical Association, 101(475):901–910, 2006

  11. [11]

    Peter J. Huber. The behavior of maximum likelihood es- timates under nonstandard conditions. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statis- tics and Probability, Volume 1: Statistics, pages 221–233. University of California Press, 1967

  12. [12]

    Hudgens and M

    Michael G. Hudgens and M. Elizabeth Halloran. Toward causal inference with interference. Journal of the Ameri- can Statistical Association, 103(482):832–842, 2008

  13. [13]

    Johnson, and Pradeep K

    Ali Jalali, Christopher C. Johnson, and Pradeep K. Ravikumar. On learning discrete graphical models us- ing greedy methods. In Advances in Neural Information Processing Systems, pages 1935–1943, 2011

  14. [14]

    A convex pseudolikelihood framework for high dimensional partial correlation estimation with convergence guaran- tees

    Kshitij Khare, Sang-Yun Oh, and Bala Rajaratnam. A convex pseudolikelihood framework for high dimensional partial correlation estimation with convergence guaran- tees. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 77(4):803–825, 2015

  15. [15]

    Adam D. I. Kramer, Jamie E Guillory, and Jeffrey T. Han- cock. Experimental evidence of massive-scale emotional contagion through social networks. Proceedings of the National Academy of Sciences, pages 8788–8790, 2014

  16. [16]

    Lauritzen

    Steffen L. Lauritzen. Graphical Models. Oxford Univer- sity Press, 1996

  17. [17]

    Lauritzen and Thomas S

    Steffen L. Lauritzen and Thomas S. Richardson. Chain graph models and their causal interpretations. Jour- nal of the Royal Statistical Society: Series B (Statistical Methodology), 64(3):321–348, 2002

  18. [18]

    On learning causal models from relational data

    Sanghack Lee and Vasant Honavar. On learning causal models from relational data. In Thirtieth AAAI Confer- ence on Artificial Intelligence, pages 3263–3270, 2016

  19. [19]

    So- cial selection and peer influence in an online social net- work

    Kevin Lewis, Marco Gonzalez, and Jason Kaufman. So- cial selection and peer influence in an online social net- work. Proceedings of the National Academy of Sciences, 109(1):68–72, 2012

  20. [20]

    Structural learning of chain graphs via decomposition

    Zongming Ma, Xianchao Xie, and Zhi Geng. Structural learning of chain graphs via decomposition. Journal of Machine Learning Research, 9(Dec):2847–2880, 2008

  21. [21]

    A sound and complete algorithm for learn- ing causal models from relational data

    Marc Maier, Katerina Marazopoulou, David Arbour, and David Jensen. A sound and complete algorithm for learn- ing causal models from relational data. In Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence, pages 371–380. AUAI Press, 2013

  22. [22]

    Consistent inference of a general model using the pseu- dolikelihood method

    Alexander Mozeika, Onur Dikmen, and Joonas Piili. Consistent inference of a general model using the pseu- dolikelihood method. Phys. Rev. E, 90:010101, 2014

  23. [23]

    Ogburn, Ilya Shpitser, and Youjin Lee

    Elizabeth L. Ogburn, Ilya Shpitser, and Youjin Lee. Causal inference, social networks, and chain graphs. arXiv preprint arXiv:1812.04990, 2018

  24. [24]

    Ogburn and Tyler J

    Elizabeth L. Ogburn and Tyler J. VanderWeele. Causal diagrams for interference. Statistical Science, 29(4):559– 578, 2014

  25. [25]

    Causality

    Judea Pearl. Causality. Cambridge University Press, 2009

  26. [26]

    An inclusion optimal algorithm for chain graph structure learning

    Jose Pe ˜na, Dag Sonntag, and Jens Nielsen. An inclusion optimal algorithm for chain graph structure learning. In Proceedings of the 17th International Conference on Ar- tificial Intelligence and Statistics, pages 778–786, 2014

  27. [27]

    Mooij, Dominik Janzing, and Bern- hard Sch¨olkopf

    Jonas Peters, Joris M. Mooij, Dominik Janzing, and Bern- hard Sch¨olkopf. Causal discovery with continuous addi- tive noise models. The Journal of Machine Learning Re- search, 15(1):2009–2053, 2014

  28. [28]

    Ravikumar, Martin J

    Pradeep K. Ravikumar, Martin J. Wainwright, and John D. Lafferty. High-dimensional Ising model selec- tion using L1-regularized logistic regression. Annals of Statistics, 38(3):1287–1319, 2010

  29. [29]

    Richardson and James M

    Thomas S. Richardson and James M. Robins. Single world intervention graphs (SWIGs): A unification of the counterfactual and graphical approaches to causality. Center for the Statistics and the Social Sciences, Univer- sity of Washington Series. Working Paper 128 , pages 1– 146, 2013

  30. [30]

    James M. Robins. A new approach to causal infer- ence in mortality studies with a sustained exposure pe- riod—application to control of the healthy worker sur- vivor effect. Mathematical Modelling , 7(9-12):1393– 1512, 1986

  31. [31]

    Rosenbaum

    Paul R. Rosenbaum. Interference between units in ran- domized experiments. Journal of the American Statistical Association, 102(477):191–200, 2007

  32. [32]

    Estimating the dimension of a model

    Gideon Schwarz. Estimating the dimension of a model. Annals of Statistics, 6(2):461–464, 1978

  33. [33]

    Cosma Rohilla Shalizi and Andrew C. Thomas. Ho- mophily and contagion are generically confounded in ob- servational social network studies. Sociological Methods & Research, 40(2):211–239, 2011

  34. [34]

    Identification and estima- tion of causal effects from dependent data

    Eli Sherman and Ilya Shpitser. Identification and estima- tion of causal effects from dependent data. InAdvances in Neural Information Processing Systems 31, pages 9424–

  35. [35]

    LiNGAM: non-Gaussian methods for estimating causal structures

    Shohei Shimizu. LiNGAM: non-Gaussian methods for estimating causal structures. Behaviormetrika, 41(1):65– 98, 2014

  36. [36]

    Michael E. Sobel. What do randomized studies of hous- ing mobility demonstrate? Causal inference in the face of interference. Journal of the American Statistical Associ- ation, 101(476):1398–1407, 2006

  37. [37]

    Causation, Prediction, and Search

    Peter Spirtes, Clark Glymour, and Richard Scheines. Causation, Prediction, and Search. MIT press, 2nd edi- tion, 2000

  38. [38]

    Rich, and Scott Burris

    Sharon Stancliff, Bruce Agins, Josiah D. Rich, and Scott Burris. Syringe access for the prevention of blood borne infections among injection drug users. BMC Public Health, 3(1):37, 2003

  39. [39]

    Tchetgen Tchetgen, Isabel Fulcher, and Ilya Sh- pitser

    Eric J. Tchetgen Tchetgen, Isabel Fulcher, and Ilya Sh- pitser. Auto-G-Computation of causal effects on a net- work. arXiv:1709.01577, 2017

  40. [40]

    Tchetgen Tchetgen and Tyler J

    Eric J. Tchetgen Tchetgen and Tyler J. VanderWeele. On causal inference in the presence of interference. Statisti- cal Methods in Medical Research, 21(1):55–75, 2012

  41. [41]

    VanderWeele, Eric J

    Tyler J. VanderWeele, Eric J. Tchetgen Tchetgen, and M. Elizabeth Halloran. Components of the indirect ef- fect in vaccine trials: identification of contagion and in- fectiousness effects. Epidemiology, 23(5):751, 2012

  42. [42]

    Cole, Jessica G

    Daniel Westreich, Stephen R. Cole, Jessica G. Young, Frank Palella, Phyllis C. Tien, Lawrence Kingsley, Stephen J. Gange, and Miguel A. Hern ´an. The paramet- ric g-formula to estimate the effect of highly active an- tiretroviral therapy on incident aids or death. Statistics in Medicine, 31(18):2000–2009, 2012. Supplementary Material CAUSAL CHAIN GRAPHS A...

  43. [43]

    showed that a property called local consistency , which follows from decomposability and consistency of the score, is sufficient to design a consistent forward- backward greedy search in the space of (Markov equiv- alent) DAGs. The forward stepwise search considers ad- ditions, rather than deletions, of single edges to improve the score, which typically pr...

  44. [44]

    Vi ̸⊥ ⊥G0 Vj | bdG(Vi) or Vj ̸⊥ ⊥G0 Vi | bdG(Vj) then limn→∞ P (S(D; G′) > S(D; G)) → 1

  45. [45]

    Vi ⊥ ⊥G0 Vj | bdG′(Vi) and Vj ⊥ ⊥G0 Vi | bdG(Vj) then limn→∞ P (S(D; G′) < S(D; G)) → 1 Such a property requires a stronger notion of decompos- ability than is available in our general setting. In Section 4.2 we mention that if our model is an MRF that is mul- tivariate normal, or corresponds to a log linear discrete model with only main effects and pairw...