Estimating Heterogeneous Causal Effect on Networks via Orthogonal Learning

Yuanchen Wu; Yubai Yuan

arxiv: 2509.18484 · v2 · submitted 2025-09-23 · 📊 stat.ML · cs.LG

Pith reviewed 2026-05-18 15:17 UTC · model grok-4.3

keywords: causal inference · heterogeneous effects · network interference · orthogonal learning · graph neural networks · spillover effects · attention model

The pith

A two-stage orthogonal learning method estimates heterogeneous direct and spillover causal effects on networks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a procedure that first trains graph neural networks to capture how covariates and network links create confounding and dependence. It then residualizes those estimates and fits an attention-based model for interference in a second stage. Neyman orthogonalization combined with cross-fitting ensures that mistakes in the first stage affect the final causal estimates only at higher order. The result is edge-level spillover estimates plus node and population summaries, together with a bootstrap procedure for uncertainty quantification. A reader would care because the approach makes it feasible to recover varying treatment effects that spread unevenly across connected units without the usual first-stage bias dominating the answer.
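To make the three levels of output concrete, here is a minimal sketch of how an estimated edge-level spillover matrix could be rolled up into node and population summaries. The function name `summarize_spillovers`, the convention that `S[i, j]` is the effect of unit j's treatment on unit i's outcome, and the choice of row sums as the node-level summary are all assumptions for illustration; the paper may define its summaries differently.

```python
import numpy as np

def summarize_spillovers(S, A):
    """Roll an edge-level spillover matrix up to node and population summaries.

    S[i, j]: estimated effect of unit j's treatment on unit i's outcome
             (assumed convention, not taken from the paper).
    A[i, j]: 1 if j is a neighbor of i, else 0.
    """
    S = np.where(A > 0, S, 0.0)                 # only existing edges carry spillover
    node_total = S.sum(axis=1)                  # total spillover received by each unit
    population_mean = float(node_total.mean())  # average over all units
    # Most influential neighbor per unit: largest absolute edge effect, -1 if isolated.
    strength = np.where(A > 0, np.abs(S), -np.inf)
    top_neighbor = np.where(A.sum(axis=1) > 0, strength.argmax(axis=1), -1)
    signs = np.sign(S).astype(int)              # +1 / -1 on edges, 0 elsewhere
    return node_total, population_mean, top_neighbor, signs
```

The same matrix then supports the downstream analyses mentioned in the abstract: `top_neighbor` is one simple reading of influential-neighbor detection, and `signs` of spillover-sign recovery.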

Core claim

The central claim is that a two-stage procedure—graph neural networks for nuisance functions in stage one, followed by residualization and an attention-based interference model in stage two—delivers consistent estimates of heterogeneous direct and spillover effects on networks once Neyman orthogonal scores and cross-fitting are applied, so that first-stage estimation errors enter the second-stage expansion only at higher order.
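The two-stage logic can be sketched in a few lines. This is a deliberately simplified stand-in, not the authors' implementation: ridge regression on own and neighbor-averaged covariates replaces the graph neural networks, a single homogeneous direct effect and a single spillover coefficient replace the attention model, and every name (`ridge_fit_predict`, `cross_fit_residuals`, `two_stage_estimate`) is hypothetical.

```python
import numpy as np

def ridge_fit_predict(X_train, y_train, X_test, lam=1.0):
    """Stand-in nuisance learner (the paper uses GNNs; ridge keeps the sketch short)."""
    Xtr = np.column_stack([np.ones(len(X_train)), X_train])
    Xte = np.column_stack([np.ones(len(X_test)), X_test])
    beta = np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(Xtr.shape[1]), Xtr.T @ y_train)
    return Xte @ beta

def cross_fit_residuals(features, target, folds):
    """Out-of-fold residuals: each unit is predicted by a model not trained on it."""
    resid = np.empty(len(target), dtype=float)
    for k in np.unique(folds):
        test = folds == k
        pred = ridge_fit_predict(features[~test], target[~test], features[test])
        resid[test] = target[test] - pred
    return resid

def two_stage_estimate(Y, T, X, A, n_folds=5, seed=0):
    """Return [direct effect, spillover coefficient] from residual-on-residual OLS."""
    rng = np.random.default_rng(seed)
    n = len(Y)
    folds = rng.permutation(n) % n_folds            # plain random folds
    deg = np.maximum(A.sum(axis=1), 1.0)
    feats = np.column_stack([X, (A @ X) / deg[:, None]])  # own + neighbor-mean covariates
    # Stage 1: cross-fitted nuisance residuals for outcome and treatment.
    Y_res = cross_fit_residuals(feats, Y, folds)
    T_res = cross_fit_residuals(feats, np.asarray(T, dtype=float), folds)
    # Stage 2: regress outcome residuals on own and neighbor-mean treatment residuals.
    exposure = (A @ T_res) / deg
    design = np.column_stack([T_res, exposure])
    coef, *_ = np.linalg.lstsq(design, Y_res, rcond=None)
    return coef
```

Note that the neighbor-averaged features are computed on the full graph, so a held-out unit's features still depend on units in the training folds. That is exactly the kind of leakage the referee report below raises about cross-fitting under network dependence.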

What carries the argument

The Neyman-orthogonal score inside a cross-fitted two-stage estimator, where graph neural networks model the nuisance functions that capture covariate and network dependence, and an attention-based interference model extracts the heterogeneous effects in the second stage.
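For readers who want the orthogonality condition written out, the following is a sketch under an assumed additive outcome model. The notation ($g$, $\tau_i$, $w_{ij}$, $m_i$, $e_i$) is illustrative and not taken from the paper, and $\tau_i$ and $w_{ij}$ are assumed to be functions of the covariates $X$ and graph $G$ only.

```latex
% Assumed additive model (illustrative notation, not the authors').
\begin{align*}
Y_i &= g(X, G) + \tau_i T_i + \sum_{j \in \mathcal{N}(i)} w_{ij} T_j + \varepsilon_i,
  & \mathbb{E}[\varepsilon_i \mid X, G, T] &= 0. \\
% Nuisance functions \eta = (m, e): conditional means of outcome and treatment.
m_i &= \mathbb{E}[Y_i \mid X, G], & e_i &= \mathbb{E}[T_i \mid X, G].
\end{align*}
% Residualized loss for the target \theta = (\tau, w).
\[
L(\theta; \eta) = \mathbb{E}\Big[\Big( (Y_i - m_i) - \tau_i (T_i - e_i)
  - \sum_{j \in \mathcal{N}(i)} w_{ij} (T_j - e_j) \Big)^{2}\Big].
\]
% Neyman orthogonality: the cross derivative vanishes at the truth (\theta_0, \eta_0).
\[
D_\eta D_\theta L(\theta_0; \eta_0)[\theta - \theta_0,\; \eta - \eta_0] = 0 .
\]
```

Under this assumed model the condition can be checked directly: a perturbation of $m$ or $e$ is a function of $(X, G)$, and it multiplies either the treatment residual $T_j - e_j$ or the noise $\varepsilon_i$, both of which have conditional mean zero. First-stage errors then reach the second stage through products and squares of nuisance errors rather than linearly, which is the sense of "higher order" used here.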

Load-bearing premise

The graph neural networks in the first stage must capture the dependence on covariates and network structure well enough that residualizing them removes all leading bias from the second-stage attention model.

What would settle it

Run the procedure on simulated networks where the first-stage graph neural networks are deliberately misspecified so they leave a non-negligible linear term in the residuals, then check whether the estimated heterogeneous spillover effects remain consistent with the known ground truth.
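A reduced version of this check can be sketched as follows. It is a toy illustration rather than the paper's experiment: a single homogeneous direct effect replaces the heterogeneous spillovers, in-sample least squares replaces the cross-fitted GNNs, and the data-generating process and all names are assumptions. The misspecified first stage drops the neighborhood covariate, which leaves a linear term in the residuals.

```python
import numpy as np

def ols_fitted(features, target):
    """In-sample least-squares fit with intercept; stand-in for a first-stage learner."""
    D = np.column_stack([np.ones(len(target)), features])
    coef, *_ = np.linalg.lstsq(D, target, rcond=None)
    return D @ coef

def residual_slope(Y, T, features):
    """Residualize Y and T on the same features, then regress residual on residual."""
    y_res = Y - ols_fitted(features, Y)
    t_res = T - ols_fitted(features, T)
    return float(t_res @ y_res / (t_res @ t_res))

def misspecification_check(n=2000, tau=1.0, mean_degree=4.0, seed=0):
    """Return (estimate with network-aware nuisance, estimate with misspecified nuisance)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=n)
    upper = np.triu(rng.random((n, n)) < mean_degree / n, k=1)
    A = (upper | upper.T).astype(float)
    Z = (A @ X) / np.maximum(A.sum(axis=1), 1.0)   # neighbor-mean covariate
    T = 0.8 * Z + rng.normal(size=n)               # treatment driven by the neighborhood
    Y = tau * T + 2.0 * Z + X + rng.normal(size=n) # Z confounds treatment and outcome
    good = residual_slope(Y, T, np.column_stack([X, Z]))  # nuisance sees the network term
    bad = residual_slope(Y, T, X[:, None])                 # nuisance ignores it
    return good, bad
```

In this toy setup the well-specified estimate should land near `tau`, while the misspecified one is pulled upward by the confounding left in the residuals. Orthogonality protects against small first-stage errors, not against a nuisance model that misses a first-order term.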

Figures

Figures reproduced from arXiv: 2509.18484 by Yuanchen Wu, Yubai Yuan.

**Figure 1.** Causal diagram for an ego unit i on a network, where units j and k are two neighbors of i. The magnitude of these spillover effects varies among voters depending on their ideological alignment and socioeconomic status. Critically, the sign of the spillover effects also differs based on voters’ ideological positions, which is the well-documented phenomenon of political polarization and echo chambers [3]. Moreove…

**Figure 2.** Two-stage orthogonal learning framework for estimating direct and spillover effects under an additive …

**Figure 3.** Spillover effects in political polarization.

**Figure 4.** Edge-level interference estimation. (a) Pairwise influence recovery. (b, c) Influential neighbors.

Original abstract

Estimating causal effects on networks is challenging because treatments may affect both treated units and their neighbors, while network homophily induces dependence and confounding. These challenges are amplified when causal effects are heterogeneous across units and edges. We propose a two-stage orthogonal learning framework for estimating heterogeneous direct and spillover effects on networks. The first stage uses graph neural networks to estimate nuisance components that capture complex dependence on covariates and network structure. The second stage residualizes these nuisance components and estimates causal effects through an interpretable attention-based interference model, yielding edge-level spillover estimates as well as node- and population-level summaries. Neyman orthogonalization and cross-fitting reduce sensitivity to first-stage estimation error, so nuisance errors enter only at higher order. We further develop a bootstrap-based uncertainty quantification procedure for the estimated spillover matrix, enabling pointwise and simultaneous inference for heterogeneous edge- and node-level effects. Experiments show that our method improves heterogeneous effect estimation while supporting interpretable downstream analyses such as influential-neighbor detection and spillover-sign recovery.
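The difference between pointwise and simultaneous inference can be illustrated with a short post-processing sketch. It assumes bootstrap replicates of the edge-level estimates are already available and says nothing about how the paper generates them; the max-|t| construction and the name `bootstrap_bands` are assumptions, chosen as one standard way to obtain simultaneous bands.

```python
import numpy as np

def bootstrap_bands(point, draws, alpha=0.05):
    """Pointwise and simultaneous bands from bootstrap replicates.

    point: shape (E,), point estimates for E edges.
    draws: shape (B, E), bootstrap replicates of the same estimates.
    Assumes every edge has nonzero bootstrap spread.
    """
    point = np.asarray(point, dtype=float)
    draws = np.asarray(draws, dtype=float)
    se = draws.std(axis=0, ddof=1)                     # bootstrap standard errors
    t = np.abs(draws - point) / se                     # studentized deviations, (B, E)
    c_point = np.quantile(t, 1 - alpha, axis=0)        # one critical value per edge
    c_sim = np.quantile(t.max(axis=1), 1 - alpha)      # one critical value for all edges
    return {
        "pointwise": (point - c_point * se, point + c_point * se),
        "simultaneous": (point - c_sim * se, point + c_sim * se),
    }
```

The simultaneous band is never narrower than the pointwise one, because the maximum deviation across edges dominates each individual deviation. That extra width is the price of statements that hold for all edges at once.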

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper introduces a two-stage GNN-plus-attention framework for heterogeneous network causal effects with Neyman orthogonality, but the cross-fitting claim under network dependence is the part that needs checking.

The core contribution is a two-stage procedure: graph neural nets estimate the nuisance functions that soak up covariate and network dependence, then an attention-based model on the residuals targets heterogeneous direct effects and edge-level spillovers. Neyman orthogonalization plus cross-fitting is supposed to push first-stage errors into higher-order terms only. That combination is new enough in the network setting to be worth noting, and the bootstrap for the spillover matrix is a practical addition for inference on heterogeneous effects. The setup also gives node- and population-level summaries plus some downstream tasks like spotting influential neighbors, which could be useful in applied work. Experiments are claimed to show gains, though the abstract gives no numbers or baselines to judge how large those gains are.

The main soft spot is the dependence issue. Networks induce correlation across units through edges and homophily, so random or even k-fold splits do not make the nuisance estimates independent of the second-stage data. If that dependence leaks into the remainder term, the higher-order bias guarantee can fail and first-order contamination remains. The paper would need explicit assumptions or a network-aware splitting scheme to close this gap; without it the central robustness claim rests on shaky ground.

Minor issues include the usual hyperparameter choices for the GNN and attention weights, which are free parameters and could affect results.

This is aimed at people doing causal work on graphs who already know the standard orthogonal-learning toolkit. A reader who wants a concrete method for heterogeneous spillovers will find something to try, even if the dependence fix is incomplete. It deserves a serious referee because the problem is real and the proposed architecture is coherent, though revisions on the cross-fitting justification would be needed.

Referee Report

1 major / 2 minor

Summary. The paper proposes a two-stage orthogonal learning framework for estimating heterogeneous direct and spillover causal effects on networks. Graph neural networks estimate nuisance components in the first stage to capture covariate and network dependencies. The second stage residualizes these and fits an attention-based interference model to obtain edge-level spillover estimates along with node- and population-level summaries. Neyman orthogonality combined with cross-fitting is invoked to ensure first-stage estimation errors affect the target estimator only at higher order. A bootstrap procedure is developed for uncertainty quantification of the spillover matrix, and experiments are reported to show gains in heterogeneous effect estimation and support for downstream tasks such as influential-neighbor detection.

Significance. If the higher-order bias property holds under network dependence, the framework would offer a practical advance for causal inference with interference by delivering interpretable heterogeneous spillover estimates via attention weights. The bootstrap for pointwise and simultaneous inference on edge- and node-level effects is a concrete strength. The work adapts standard Neyman orthogonalization to GNN nuisance estimation and attention-based modeling, which could be useful when network homophily and complex dependence are present.

major comments (1)

[Cross-fitting and Neyman orthogonality (methods / theoretical analysis)] The central claim that Neyman orthogonalization and cross-fitting reduce first-stage errors to higher order (stated in the abstract and elaborated in the two-stage framework) assumes that cross-fit folds produce nuisance estimates that are asymptotically independent of the second-stage observations. On networks, however, units remain dependent through edges and homophily; standard random or k-fold splits do not necessarily break this dependence when the network is connected or contains dense clusters. Consequently, the remainder term in the orthogonal expansion may retain a first-order component proportional to network-induced covariance between folds. This directly undermines the higher-order bias guarantee and requires either additional theoretical conditions (e.g., network mixing or sparsity assumptions) or a modified cross-fitting scheme that respects network structure.

minor comments (2)

[Abstract] The abstract states that experiments demonstrate improvement, yet specific quantitative comparisons (e.g., MSE or coverage rates against baselines) are not summarized; adding one or two key metrics would strengthen the claim.
[Model description] Notation for the attention weights and the spillover matrix should be introduced with a clear mapping to the estimands (direct vs. spillover) to improve readability for readers unfamiliar with the attention-based interference model.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their insightful comments, which help clarify the scope of our theoretical guarantees. We address the major comment point by point below.

Referee: [Cross-fitting and Neyman orthogonality (methods / theoretical analysis)] The central claim that Neyman orthogonalization and cross-fitting reduce first-stage errors to higher order (stated in the abstract and elaborated in the two-stage framework) assumes that cross-fit folds produce nuisance estimates that are asymptotically independent of the second-stage observations. On networks, however, units remain dependent through edges and homophily; standard random or k-fold splits do not necessarily break this dependence when the network is connected or contains dense clusters. Consequently, the remainder term in the orthogonal expansion may retain a first-order component proportional to network-induced covariance between folds. This directly undermines the higher-order bias guarantee and requires either additional theoretical conditions (e.g., network mixing or sparsity assumptions) or a modified cross-fitting scheme that respects network structure.

Authors: We agree that the validity of the higher-order bias property under network dependence merits explicit discussion. Our analysis relies on the network satisfying standard weak-dependence conditions (bounded maximum degree and network mixing) that make the covariance between cross-fit folds vanish at a sufficient rate; these conditions are implicit in the GNN nuisance estimation step but were not stated as formal assumptions. We will revise the theoretical section to add these conditions explicitly and to note that the result may not hold for fully dense or non-mixing networks. We will also add a brief discussion of network-aware splitting (e.g., via graph partitioning) as a practical safeguard, together with a small simulation check. These changes strengthen the manuscript without altering the core method or empirical results.

revision: partial
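As a rough illustration of what network-aware splitting could look like, the sketch below grows contiguous folds by breadth-first search and measures how many edges cross fold boundaries. It is a simple stand-in for a proper graph partitioner (such as the multilevel scheme in reference [19]) and is not the authors' procedure; the function names are hypothetical.

```python
from collections import deque
import random

def bfs_block_folds(neighbors, n_folds, seed=0):
    """Assign nodes to folds by growing contiguous BFS blocks.

    neighbors: dict mapping each node to an iterable of its neighbors (undirected).
    Returns a dict node -> fold id in range(n_folds).
    """
    rng = random.Random(seed)
    order = list(neighbors)
    rng.shuffle(order)                      # random BFS starting points
    cap = -(-len(order) // n_folds)         # ceil(n / n_folds) nodes per fold
    fold, current, filled = {}, 0, 0
    for start in order:
        if start in fold:
            continue
        queue, seen = deque([start]), {start}
        while queue:
            u = queue.popleft()
            fold[u] = current
            filled += 1
            if filled == cap and current < n_folds - 1:
                current, filled = current + 1, 0   # next fold continues from the frontier
            for v in neighbors[u]:
                if v not in fold and v not in seen:
                    seen.add(v)
                    queue.append(v)
    return fold

def cross_fold_edge_fraction(neighbors, fold):
    """Share of undirected edges whose endpoints fall in different folds."""
    cut = total = 0
    for u, nbrs in neighbors.items():
        for v in nbrs:
            if u < v:                       # count each undirected edge once
                total += 1
                cut += fold[u] != fold[v]
    return cut / total if total else 0.0
```

With uniformly random folds on K folds, roughly a fraction 1 - 1/K of edges cross fold boundaries; contiguous blocks cut far fewer, so fewer held-out units have neighbors in the training data. Whether this is enough to restore the higher-order guarantee still depends on the weak-dependence conditions the authors mention.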

Circularity Check

0 steps flagged

No significant circularity; standard Neyman orthogonalization applied to distinct stages

The paper's derivation chain applies established Neyman orthogonalization and cross-fitting to a two-stage procedure (GNN nuisance estimation followed by attention-based interference modeling). These techniques are invoked as external properties that ensure higher-order remainder terms, without the central result reducing to a self-definition, a fitted parameter renamed as prediction, or a load-bearing self-citation chain. The first- and second-stage models are explicitly separated, and network dependence is treated as an assumption rather than derived from the estimator itself. The framework remains self-contained against external benchmarks for orthogonal learning.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 1 invented entity

Only the abstract is available, so the ledger is necessarily incomplete; the framework depends on modeling choices for nuisance estimation and interference that are not fully specified.

free parameters (2)

GNN architecture and hyperparameters
Chosen to estimate nuisance components that capture covariate and network dependence; values are fitted during the first stage.
Attention weights in the interference model
Learned in the second stage to produce edge-level spillover estimates.

axioms (2)

[domain assumption] Neyman orthogonality holds for the chosen first-stage estimators
Invoked so that first-stage errors affect the target parameters only at higher order.
[ad hoc to paper] The attention-based interference model correctly represents the spillover mechanism
Assumed when moving from residualized data to edge-level estimates.

invented entities (1)

attention-based interference model [no independent evidence]
Purpose: to produce interpretable edge-level spillover estimates and node-level summaries.
Introduced as the second-stage estimator; no independent evidence outside the paper is provided in the abstract.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · tag: unclear
Paper passage: "Neyman orthogonalization and cross-fitting reduce sensitivity to first-stage estimation error, so nuisance errors enter only at higher order."

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear
Paper passage: "We use graph neural networks to estimate nuisance components... attention-based interference model"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

42 extracted references · 42 canonical work pages · 1 internal anchor

[1] Peter M Aronow and Cyrus Samii. Estimating average causal effects under general interference, with application to a social network experiment. 2017.

[2] Peter M. Aronow and Cyrus Samii. Estimating average causal effects under general interference, with application to a social network experiment. The Annals of Applied Statistics, 11(4):1912–1947, 2017.

[3] Christopher A Bail, Laura P Argyle, Taylor W Brown, John P Bumpus, Haohan Chen, M Brooke Hunzaker, Jaemin Lee, Marcus Mann, Friedolin Merhout, and Alexander Volfovsky. Exposure to opposing views on social media can increase political polarization. Proceedings of the National Academy of Sciences, 115(37):9216–9221, 2018.

[4] Falco J Bargagli-Stoffi, Costanza Tortù, and Laura Forastiere. Heterogeneous treatment and spillover effects under clustered network interference. The Annals of Applied Statistics, 19(1):28–55, 2025.

[5] Béla Bollobás. Modern graph theory, volume 184. Springer Science & Business Media, 1998.

[6] Robert M Bond, Christopher J Fariss, Jason J Jones, Adam DI Kramer, Cameron Marlow, Jaime E Settle, and James H Fowler. A 61-million-person experiment in social influence and political mobilization. Nature, 489(7415):295–298, 2012.

[7] Weilin Chen, Ruichu Cai, Zeqin Yang, Jie Qiao, Yuguang Yan, Zijian Li, and Zhifeng Hao. Doubly robust causal effect estimation under networked interference via targeted learning. In Proceedings of the 41st International Conference on Machine Learning, pages 6457–6485. PMLR, 2024.

[8] Victor Chernozhukov, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, Whitney Newey, and James Robins. Double/debiased machine learning for treatment and structural parameters, 2018.

[9] Laura Forastiere, Edoardo M Airoldi, and Fabrizia Mealli. Identification and estimation of treatment and interference effects in observational studies on networks. Journal of the American Statistical Association, 116(534):901–918, 2021.

[10] Dylan J Foster and Vasilis Syrgkanis. Orthogonal statistical learning. The Annals of Statistics, 51(3):879–908, 2023.

[11] Vikas Garg, Stefanie Jegelka, and Tommi Jaakkola. Generalization and representational limits of graph neural networks. In International Conference on Machine Learning, pages 3419–3430. PMLR, 2020.

[12] Paul Goldsmith-Pinkham and Guido W Imbens. Social networks and the identification of peer effects. Journal of Business & Economic Statistics, 31(3):253–264, 2013.

[13] Ruocheng Guo, Jundong Li, and Huan Liu. Learning individual causal effects from networked observational data. In Proceedings of the 13th International Conference on Web Search and Data Mining (WSDM), pages 232–240. ACM, 2020.

[14] Kevin Han and Johan Ugander. Model-based regression adjustment with model-free covariates for network interference. Journal of Causal Inference, 11(1):20230005, 2023.

[15] Qiang Huang, Jing Ma, Jundong Li, Ruocheng Guo, Huiyan Sun, and Yi Chang. Modeling interference for individual treatment effect estimation from networked observational data. ACM Transactions on Knowledge Discovery from Data, 18(3):1–21, 2023.

[16] Michael G Hudgens and M Elizabeth Halloran. Toward causal inference with interference. Journal of the American Statistical Association, 103(482):832–842, 2008.

[17] Song Jiang, Yaliang Li, Jing Gao, and Aidong Zhang. Estimating causal effects on networked observational data via representation learning. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management, pages 6457–6466. ACM, 2022.

[18] Fredrik D. Johansson, Uri Shalit, Nathan Kallus, and David Sontag. Generalization bounds and representation learning for estimation of potential outcomes and causal effects. Journal of Machine Learning Research, 23(166):1–48, 2022.

[19] George Karypis and Vipin Kumar. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM Journal on Scientific Computing, 20(1):359–392, 1998.

[20] Edward H Kennedy. Towards optimal doubly robust estimation of heterogeneous causal effects. Electronic Journal of Statistics, 17(2):3008–3049, 2023.

[21] Edward H Kennedy. Semiparametric doubly robust targeted double machine learning: a review. Handbook of Statistical Methods for Precision Medicine, pages 207–236, 2024.

[22] Edward H Kennedy, Zongming Ma, Matthew D McHugh, and Dylan S Small. Non-parametric methods for doubly robust estimation of continuous treatment effects. Journal of the Royal Statistical Society Series B: Statistical Methodology, 79(4):1229–1245, 2017.

[23] Seyedeh Baharan Khatami, Harsh Parikh, Haowei Chen, Sudeepa Roy, and Babak Salimi. Graph machine learning based doubly robust estimator for network causal effects. arXiv preprint arXiv:2403.11332, 2024.

[24] Sören R Künzel, Jasjeet S Sekhon, Peter J Bickel, and Bin Yu. Metalearners for estimating heterogeneous treatment effects using machine learning. Proceedings of the National Academy of Sciences, 116(10):4156–4165, 2019.

[25] Michael P Leung. Treatment and spillover effects under network interference. Review of Economics and Statistics, 102(2):368–380, 2020.

[26] Michael P Leung. Causal inference under approximate neighborhood interference. Econometrica, 90(1):267–293, 2022.

[27] Shuangning Li and Stefan Wager. Random graph asymptotics for treatment effect estimation under network interference. The Annals of Statistics, 50(4):2334–2358, 2022.

[28] Jing Ma, Mengting Wan, Longqi Yang, Jundong Li, Brent Hecht, and Jaime Teevan. Learning causal effects on hypergraphs. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 1202–1212, 2022.

[29] Yunpu Ma and Volker Tresp. Causal inference under networked interference and intervention policy enhancement. In Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, pages 3700–3708. PMLR, 2021.

[30] Charles F Manski. Identification of endogenous social effects: The reflection problem. The Review of Economic Studies, 60(3):531–542, 1993.

[31] Charles F Manski. Identification of treatment response with social interactions. The Econometrics Journal, 16(1):S1–S23, 2013.

[32] Xinkun Nie and Stefan Wager. Quasi-oracle estimation of heterogeneous treatment effects. Biometrika, 108(2):299–319, 2021.

[33] Elizabeth L Ogburn, Oleg Sofrygin, Ivan Diaz, and Mark J Van der Laan. Causal inference for social network data. Journal of the American Statistical Association, 119(545):597–611, 2024.

[34] Harsh Parikh, Carlos Varjao, Louise Xu, and Eric Tchetgen Tchetgen. Validating causal inference methods. In Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvári, Gang Niu, and Sivan Sabato, editors, Proceedings of the 39th International Conference on Machine Learning, volume 162 of Proceedings of Machine Learning Research, pages 17346–17358. PMLR...

[35] Donald B Rubin. Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5):688, 1974.

[36] Vira Semenova and Victor Chernozhukov. Debiased machine learning of conditional average treatment effects and other causal functions. The Econometrics Journal, 24(2):264–289, 2021.

[37] Huayi Tang and Yong Liu. Towards understanding generalization of graph neural networks. In International Conference on Machine Learning, pages 33674–33719. PMLR, 2023.

[38] Panos Toulis and Edward Kao. Estimation of causal peer influence effects. In International Conference on Machine Learning, pages 1489–1497. PMLR, 2013.

[39] Antonis Vasileiou, Stefanie Jegelka, Ron Levie, and Christopher Morris. Survey on generalization theory for graph neural networks. arXiv preprint arXiv:2503.15650, 2025.

[40] Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. Graph attention networks. arXiv preprint arXiv:1710.10903, 2017.

[41] Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma, Michael M. Bronstein, and Justin M. Solomon. Dynamic graph CNN for learning on point clouds. ACM Transactions on Graphics (TOG), 38(5):146, 2019.

[42] Anpeng Wu, Haiyi Qiu, Zhengming Chen, Zijian Li, Ruoxuan Xiong, Fei Wu, and Kun Zhang. Causal graph transformer for treatment effect estimation under unknown interference. In Proceedings of the 13th International Conference on Learning Representations (ICLR), 2025.

[1] [1]

Estimating average causal effects under general interference, with application to a social network experiment

Peter M Aronow and Cyrus Samii. Estimating average causal effects under general interference, with application to a social network experiment. 2017

work page 2017

[2] [2]

Aronow and Cyrus Samii

Peter M. Aronow and Cyrus Samii. Estimating average causal effects under general interference, with application to a social network experiment.The Annals of Applied Statistics, 11(4):1912 – 1947, 2017

work page 1912

[3] [3]

Exposure to opposing views on social media can increase political polarization.Proceedings of the National Academy of Sciences, 115(37):9216–9221, 2018

Christopher A Bail, Laura P Argyle, Taylor W Brown, John P Bumpus, Haohan Chen, M Brooke Hunzaker, Jaemin Lee, Marcus Mann, Friedolin Merhout, and Alexander V olfovsky. Exposure to opposing views on social media can increase political polarization.Proceedings of the National Academy of Sciences, 115(37):9216–9221, 2018

work page 2018

[4] [4]

Heterogeneous treatment and spillover effects under clustered network interference.The Annals of Applied Statistics, 19(1):28– 55, 2025

Falco J Bargagli-Stoffi, Costanza Tortù, and Laura Forastiere. Heterogeneous treatment and spillover effects under clustered network interference.The Annals of Applied Statistics, 19(1):28– 55, 2025

work page 2025

[5] [5]

Springer Science & Business Media, 1998

Béla Bollobás.Modern graph theory, volume 184. Springer Science & Business Media, 1998

work page 1998

[6] [6]

A 61-million-person experiment in social influence and political mobilization.Nature, 489(7415):295–298, 2012

Robert M Bond, Christopher J Fariss, Jason J Jones, Adam DI Kramer, Cameron Marlow, Jaime E Settle, and James H Fowler. A 61-million-person experiment in social influence and political mobilization.Nature, 489(7415):295–298, 2012

work page 2012

[7] [7]

Doubly robust causal effect estimation under networked interference via targeted learning

Weilin Chen, Ruichu Cai, Zeqin Yang, Jie Qiao, Yuguang Yan, Zijian Li, and Zhifeng Hao. Doubly robust causal effect estimation under networked interference via targeted learning. In Proceedings of the 41st International Conference on Machine Learning, pages 6457–6485. PMLR, 2024

work page 2024

[8] [8]

Double/debiased machine learning for treatment and structural parameters, 2018

Victor Chernozhukov, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, Whitney Newey, and James Robins. Double/debiased machine learning for treatment and structural parameters, 2018

work page 2018

[9] Laura Forastiere, Edoardo M Airoldi, and Fabrizia Mealli. Identification and estimation of treatment and interference effects in observational studies on networks. Journal of the American Statistical Association, 116(534):901–918, 2021

[10] Dylan J Foster and Vasilis Syrgkanis. Orthogonal statistical learning. The Annals of Statistics, 51(3):879–908, 2023

[11] Vikas Garg, Stefanie Jegelka, and Tommi Jaakkola. Generalization and representational limits of graph neural networks. In International Conference on Machine Learning, pages 3419–3430. PMLR, 2020

[12] Paul Goldsmith-Pinkham and Guido W Imbens. Social networks and the identification of peer effects. Journal of Business & Economic Statistics, 31(3):253–264, 2013

[13] Ruocheng Guo, Jundong Li, and Huan Liu. Learning individual causal effects from networked observational data. In Proceedings of the 13th International Conference on Web Search and Data Mining (WSDM), pages 232–240. ACM, 2020

[14] Kevin Han and Johan Ugander. Model-based regression adjustment with model-free covariates for network interference. Journal of Causal Inference, 11(1):20230005, 2023

[15] Qiang Huang, Jing Ma, Jundong Li, Ruocheng Guo, Huiyan Sun, and Yi Chang. Modeling interference for individual treatment effect estimation from networked observational data. ACM Transactions on Knowledge Discovery from Data, 18(3):1–21, 2023

[16] Michael G Hudgens and M Elizabeth Halloran. Toward causal inference with interference. Journal of the American Statistical Association, 103(482):832–842, 2008

[17] Song Jiang, Yaliang Li, Jing Gao, and Aidong Zhang. Estimating causal effects on networked observational data via representation learning. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management, pages 6457–6466. ACM, 2022

[18] Fredrik D. Johansson, Uri Shalit, Nathan Kallus, and David Sontag. Generalization bounds and representation learning for estimation of potential outcomes and causal effects. Journal of Machine Learning Research, 23(166):1–48, 2022

[19] George Karypis and Vipin Kumar. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM Journal on Scientific Computing, 20(1):359–392, 1998

[20] Edward H Kennedy. Towards optimal doubly robust estimation of heterogeneous causal effects. Electronic Journal of Statistics, 17(2):3008–3049, 2023

[21] Edward H Kennedy. Semiparametric doubly robust targeted double machine learning: a review. Handbook of Statistical Methods for Precision Medicine, pages 207–236, 2024

[22] Edward H Kennedy, Zongming Ma, Matthew D McHugh, and Dylan S Small. Non-parametric methods for doubly robust estimation of continuous treatment effects. Journal of the Royal Statistical Society Series B: Statistical Methodology, 79(4):1229–1245, 2017

[23] Seyedeh Baharan Khatami, Harsh Parikh, Haowei Chen, Sudeepa Roy, and Babak Salimi. Graph machine learning based doubly robust estimator for network causal effects. arXiv preprint arXiv:2403.11332, 2024

[24] Sören R Künzel, Jasjeet S Sekhon, Peter J Bickel, and Bin Yu. Metalearners for estimating heterogeneous treatment effects using machine learning. Proceedings of the National Academy of Sciences, 116(10):4156–4165, 2019

[25] Michael P Leung. Treatment and spillover effects under network interference. Review of Economics and Statistics, 102(2):368–380, 2020

[26] Michael P Leung. Causal inference under approximate neighborhood interference. Econometrica, 90(1):267–293, 2022

[27] Shuangning Li and Stefan Wager. Random graph asymptotics for treatment effect estimation under network interference. The Annals of Statistics, 50(4):2334–2358, 2022

[28] Jing Ma, Mengting Wan, Longqi Yang, Jundong Li, Brent Hecht, and Jaime Teevan. Learning causal effects on hypergraphs. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 1202–1212, 2022

[29] Yunpu Ma and Volker Tresp. Causal inference under networked interference and intervention policy enhancement. In Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, pages 3700–3708. PMLR, 2021

[30] Charles F Manski. Identification of endogenous social effects: The reflection problem. The Review of Economic Studies, 60(3):531–542, 1993

[31] Charles F Manski. Identification of treatment response with social interactions. The Econometrics Journal, 16(1):S1–S23, 2013

[32] Xinkun Nie and Stefan Wager. Quasi-oracle estimation of heterogeneous treatment effects. Biometrika, 108(2):299–319, 2021

[33] Elizabeth L Ogburn, Oleg Sofrygin, Ivan Diaz, and Mark J Van der Laan. Causal inference for social network data. Journal of the American Statistical Association, 119(545):597–611, 2024

[34] Harsh Parikh, Carlos Varjao, Louise Xu, and Eric Tchetgen Tchetgen. Validating causal inference methods. In Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvári, Gang Niu, and Sivan Sabato, editors, Proceedings of the 39th International Conference on Machine Learning, volume 162 of Proceedings of Machine Learning Research, pages 17346–17358. PMLR...

[35] Donald B Rubin. Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5):688, 1974

[36] Vira Semenova and Victor Chernozhukov. Debiased machine learning of conditional average treatment effects and other causal functions. The Econometrics Journal, 24(2):264–289, 2021

[37] Huayi Tang and Yong Liu. Towards understanding generalization of graph neural networks. In International Conference on Machine Learning, pages 33674–33719. PMLR, 2023

[38] Panos Toulis and Edward Kao. Estimation of causal peer influence effects. In International Conference on Machine Learning, pages 1489–1497. PMLR, 2013

[39] Antonis Vasileiou, Stefanie Jegelka, Ron Levie, and Christopher Morris. Survey on generalization theory for graph neural networks. arXiv preprint arXiv:2503.15650, 2025

[40] Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. Graph attention networks. arXiv preprint arXiv:1710.10903, 2017

[41] Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma, Michael M. Bronstein, and Justin M. Solomon. Dynamic graph CNN for learning on point clouds. ACM Transactions on Graphics (TOG), 38(5):146, 2019

[42] Anpeng Wu, Haiyi Qiu, Zhengming Chen, Zijian Li, Ruoxuan Xiong, Fei Wu, and Kun Zhang. Causal graph transformer for treatment effect estimation under unknown interference. In Proceedings of the 13th International Conference on Learning Representations (ICLR), 2025