Network-aware IV Regression for Causal Node Discovery and Estimation

Dhrubajyoti Ghosh; Samhita Pal

arxiv: 2604.24969 · v1 · submitted 2026-04-27 · 📊 stat.ME

Pith reviewed 2026-05-08 02:04 UTC · model grok-4.3

keywords: instrumental variables, graph regularization, causal inference, high-dimensional regression, network data, variable selection, fused lasso, brain imaging

The pith

A two-stage IV regression method adds graph-fused penalties to recover sparse causal effects among network-structured exposures while tolerating some invalid instruments.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a statistical procedure that first projects exposures onto valid instruments and then applies a fused penalty across edges of a known network to encourage similar causal coefficients for connected nodes. This produces an estimator that selects which exposures exert direct causal influence on an outcome and estimates their magnitudes with explicit error bounds. A reader would care because many domains supply both instrumental variables and a network of dependencies among candidate causes, yet standard high-dimensional IV methods treat predictors as isolated and can therefore miss or mis-estimate effects.
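The two steps described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`first_stage`, `incidence_matrix`, `graph_fused_second_stage`), the ADMM solver, the added plain lasso term, the tuning values, and the simulated data are all hypothetical choices made here, and the sketch treats every instrument as valid (it does not attempt the paper's handling of partially invalid instruments).

```python
import numpy as np

def first_stage(Z, X):
    """Stage 1: least-squares projection of each exposure onto the instruments."""
    coef, *_ = np.linalg.lstsq(Z, X, rcond=None)
    return Z @ coef

def incidence_matrix(edges, p):
    """One row per edge, with +1 and -1 at the two endpoint nodes."""
    D = np.zeros((len(edges), p))
    for e, (j, k) in enumerate(edges):
        D[e, j], D[e, k] = 1.0, -1.0
    return D

def graph_fused_second_stage(X_hat, y, edges, lam_sparse, lam_fuse,
                             rho=1.0, n_iter=500):
    """Stage 2 (assumed form): ADMM for
    (1/2n)||y - X_hat b||^2 + lam_sparse*||b||_1 + lam_fuse*sum_edges |b_j - b_k|.
    """
    n, p = X_hat.shape
    # Stack identity (sparsity) and edge differences (fusion) into one operator.
    A = np.vstack([np.eye(p), incidence_matrix(edges, p)])
    thresh = np.concatenate([np.full(p, lam_sparse),
                             np.full(len(edges), lam_fuse)]) / rho
    lhs = X_hat.T @ X_hat / n + rho * (A.T @ A)
    Xty = X_hat.T @ y / n
    z = np.zeros(A.shape[0])
    u = np.zeros(A.shape[0])
    for _ in range(n_iter):
        beta = np.linalg.solve(lhs, Xty + rho * (A.T @ (z - u)))
        v = A @ beta + u
        z = np.sign(v) * np.maximum(np.abs(v) - thresh, 0.0)  # soft-threshold
        u = v - z                                             # scaled dual update
    return z[:p]  # thresholded copy of beta, so zeros are exact

# Hypothetical simulation: 6 exposures on a chain graph, 2 instruments each,
# and one latent confounder that enters both the exposures and the outcome.
rng = np.random.default_rng(0)
n, p = 2000, 6
Gamma = 1.5 * np.kron(np.eye(p), np.ones((2, 1)))   # instrument strengths, (12, 6)
Z = rng.normal(size=(n, Gamma.shape[0]))
conf = rng.normal(size=n)
X = Z @ Gamma + conf[:, None] + rng.normal(size=(n, p))
beta_true = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
y = X @ beta_true + conf + rng.normal(size=n)
edges = [(j, j + 1) for j in range(p - 1)]

beta_hat = graph_fused_second_stage(first_stage(Z, X), y, edges, 0.05, 0.05)
print(np.round(beta_hat, 2))
```

Returning the thresholded variable rather than the smooth iterate is a design choice that yields exact zeros, which makes the selected support easy to read off.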

Core claim

The central claim is that embedding a graph-fused lasso penalty inside the second stage of an IV regression yields non-asymptotic guarantees on both coefficient estimation error and exact recovery of the causal support, even when some instruments are only partially valid. The fused penalty shrinks differences between coefficients of adjacent nodes in the supplied graph, thereby exploiting the network structure to improve sparsity recovery without requiring all instruments to be valid.

What carries the argument

The graph-fused penalty added to the second-stage objective, which penalizes absolute differences in regression coefficients between nodes joined by an edge in the given network.
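For concreteness, a penalty of this kind is often written as below. This is an illustrative form only, not necessarily the paper's exact objective: the symbols $\hat{X}$ (first-stage fitted exposures), $E$ (edge set of the supplied graph), and the tuning parameters $\lambda_1, \lambda_2$ are notation assumed here, and the separate $\ell_1$ term is included on the assumption that sparsity is enforced alongside fusion.

```latex
\hat{\beta} \in \arg\min_{\beta \in \mathbb{R}^p}
  \frac{1}{2n}\,\lVert y - \hat{X}\beta \rVert_2^2
  + \lambda_1 \lVert \beta \rVert_1
  + \lambda_2 \sum_{(j,k) \in E} \lvert \beta_j - \beta_k \rvert
```

The last term is zero exactly when connected nodes share the same coefficient, which is why its usefulness depends on the graph reflecting genuine similarity of effects.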

If this is right

The estimator remains consistent for the causal support when the network encodes true similarity and the instruments satisfy standard IV conditions on average.
Non-asymptotic bounds quantify how estimation error decreases with sample size and how selection error depends on the minimum signal strength and the graph's connectivity.
The procedure can be applied directly to brain-region exposures with a known anatomical or functional connectivity graph to identify regions causally linked to cognitive scores.
Performance degrades gracefully when only a fraction of instruments are invalid, without needing to identify which ones.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

If the network is misspecified, the fused penalty could pull unrelated coefficients together and increase false negatives; a data-driven way to learn or refine the graph would be a natural next step.
The same penalty structure could be inserted into other IV estimators, such as those based on deep networks or other regularizers, to handle networked predictors beyond linear models.
In applications where the graph itself is uncertain, one could treat the penalty strength as a tuning parameter and examine how selected causal sets change with different graphs.

Load-bearing premise

The supplied network graph correctly identifies pairs of nodes whose causal effects on the outcome are similar enough that penalizing their coefficient difference improves recovery.

What would settle it

Generate synthetic data from a known sparse causal model whose true coefficient vector is not constant across the given graph's edges. If the method then selects fewer true causal nodes, or produces larger estimation error, than a plain two-stage IV estimator that ignores the graph, the claimed performance gain depends on the graph matching the true effect structure rather than holding in general.
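To set up such a test, it helps to quantify how far a coefficient vector is from being constant on the graph's edges. The sketch below uses the sum of absolute differences across edges for that purpose; the function name `graph_total_variation` and both example vectors are hypothetical and chosen only for illustration.

```python
import numpy as np

def graph_total_variation(beta, edges):
    """Sum of |beta_j - beta_k| over edges; 0 means constant on every edge."""
    return float(sum(abs(beta[j] - beta[k]) for j, k in edges))

# Chain graph on 6 nodes: 0-1-2-3-4-5.
edges = [(j, j + 1) for j in range(5)]

# Same sparsity level (3 nonzeros), very different agreement with the graph.
beta_smooth = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])  # one jump along the chain
beta_rough = np.array([1.0, 0.0, 1.0, 0.0, 1.0, 0.0])   # a jump on every edge

tv_smooth = graph_total_variation(beta_smooth, edges)
tv_rough = graph_total_variation(beta_rough, edges)
print(tv_smooth, tv_rough)  # 1.0 5.0
```

Simulating from the second kind of vector, with the sparsity held fixed, isolates the effect of graph mismatch from the effect of signal strength.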

Figures

Figures reproduced from arXiv: 2604.24969 by Dhrubajyoti Ghosh, Samhita Pal.

**Figure 1.** Boxplots of MCC for IVGL and IVL across 100 simulations for …

**Figure 2.** Boxplots of MCC for IVGL-S and IVGL across 100 simulations for …

**Figure 3.** Orthogonal slice views (coronal, sagittal, axial) of baseline T1-weighted MRI scans

**Figure 4.** Selected ROIs overlaid on DKT for slices 40, 60, and 80
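Figures 1 and 2 report MCC, presumably the Matthews correlation coefficient computed between the true and the selected causal supports. A minimal sketch of that metric follows, assuming supports are given as boolean lists; the function name `support_mcc`, the example supports, and the convention of returning 0 when the denominator vanishes are choices made here, not details taken from the paper.

```python
import math

def support_mcc(true_support, est_support):
    """Matthews correlation coefficient between two boolean support vectors."""
    pairs = list(zip(true_support, est_support))
    tp = sum(1 for t, e in pairs if t and e)
    tn = sum(1 for t, e in pairs if not t and not e)
    fp = sum(1 for t, e in pairs if not t and e)
    fn = sum(1 for t, e in pairs if t and not e)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: MCC is reported as 0 when any margin is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

truth = [True, True, True, False, False, False]
selected = [True, True, False, False, False, True]  # one miss, one false alarm

mcc = support_mcc(truth, selected)
print(round(mcc, 3))  # 0.333
```

Unlike raw accuracy, MCC stays informative when the true support is a small fraction of all nodes, which is the typical regime in sparse selection problems.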

Original abstract

Estimating causal effects from high-dimensional, structured exposures is a fundamental challenge in modern applications ranging from neuroscience and finance to environmental science. While the literature has addressed high-dimensional instrumental variable (IV) regression, and separately leveraged graph structure in penalized regression, the integration of both, especially for causal support recovery in the presence of latent confounding, remains unexplored. In this work, we propose a novel two-stage regression framework that incorporates instrumental variables and graph-based regularization to uncover sparse causal effects among network-structured exposures. Our method accommodates both valid and partially invalid instruments, and encourages structural similarity among connected predictors through a graph-fused penalty. We establish non-asymptotic guarantees for estimation accuracy and causal variable selection, and demonstrate that our approach yields improved performance over existing methods that ignore network dependencies or invalid IVs. Applied to ADNI brain imaging and genetic data, our method identifies interpretable causal ROIs associated with cognitive outcomes, underscoring the utility of graph-assisted IV regression in neuroscience and beyond.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper combines high-dimensional IV regression with a graph-fused penalty for causal node selection under latent confounding, but the non-asymptotic guarantees hinge on an unexamined assumption that the given network encodes effect similarity.

The central new piece is the two-stage estimator that folds a graph penalty into IV regression to recover sparse causal effects among networked exposures while allowing some invalid instruments. The abstract positions this integration as unexplored, and the ADNI application shows it can surface interpretable ROIs tied to cognitive scores, which is a concrete plus over purely statistical IV methods that ignore structure.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a two-stage IV regression framework that incorporates a graph-fused penalty to estimate sparse causal effects among network-structured exposures. The method accommodates valid and partially invalid instruments, establishes non-asymptotic guarantees for estimation accuracy and causal variable selection, reports improved performance over baselines that ignore network structure or invalid IVs, and applies the approach to ADNI brain imaging and genetic data to identify interpretable causal ROIs linked to cognitive outcomes.

Significance. If the non-asymptotic guarantees are rigorously derived and the performance gains hold under realistic conditions, the work could meaningfully advance high-dimensional causal inference by integrating graph regularization with IV methods, with particular relevance to neuroscience applications where network data is common. The explicit handling of partially invalid instruments and the real-data application are strengths.

major comments (2)

[Theoretical analysis and method definition] The central theoretical claims rest on the assumption that the given network graph correctly encodes similarity of causal effects among connected nodes so that the graph-fused penalty improves both estimation and support recovery. No sensitivity analysis, robustness bounds, or misspecification results are provided for the case when this assumption is violated (a common practical concern with brain networks or other empirical graphs). This assumption is load-bearing for the stated non-asymptotic guarantees and the claim of improved performance.
[Abstract and theoretical results] The abstract asserts non-asymptotic guarantees for estimation accuracy and causal variable selection, yet the provided text supplies no derivation details, explicit assumptions on the penalty term, or simulation setups with error bars or variability measures. Without these, it is impossible to verify whether the bounds properly incorporate the fused penalty and remain valid under the paper's stated conditions.

minor comments (2)

[Numerical experiments] Simulation results should include explicit error bars, number of replications, and sensitivity checks for the regularization parameter and graph density to support the performance claims.
[Method section] Notation for the two-stage estimator and the graph-fused penalty term should be introduced with a clear table of symbols to improve readability.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight important aspects of our theoretical framework and presentation. We address each major comment below and indicate the revisions we will make to strengthen the manuscript.

Referee: The central theoretical claims rest on the assumption that the given network graph correctly encodes similarity of causal effects among connected nodes so that the graph-fused penalty improves both estimation and support recovery. No sensitivity analysis, robustness bounds, or misspecification results are provided for the case when this assumption is violated (a common practical concern with brain networks or other empirical graphs). This assumption is load-bearing for the stated non-asymptotic guarantees and the claim of improved performance.

Authors: We agree that the graph-fused penalty is predicated on the assumption that the provided network encodes similarity in causal effects, which is standard in the graph-regularized literature but merits explicit robustness analysis. In the revised manuscript we will add a new subsection (Section 3.4) deriving sensitivity bounds on the estimation error under additive perturbations to the graph Laplacian, along with a simulation study that systematically perturbs the adjacency matrix and reports degradation in support recovery and MSE. These additions will clarify the range of graph misspecification under which the non-asymptotic guarantees and performance gains remain meaningful.

Revision: yes

Referee: The abstract asserts non-asymptotic guarantees for estimation accuracy and causal variable selection, yet the provided text supplies no derivation details, explicit assumptions on the penalty term, or simulation setups with error bars or variability measures. Without these, it is impossible to verify whether the bounds properly incorporate the fused penalty and remain valid under the paper's stated conditions.

Authors: The non-asymptotic results, including the explicit assumptions on the graph-fused penalty (Assumption 3.2) and the incorporation of the fused term into the error bounds (Theorem 3.1 and Corollary 3.2), are fully derived in Section 3. Simulation protocols are described in Section 4.1, with all reported metrics computed as averages over 100 independent replications and accompanied by standard errors; the corresponding figures already contain error bars. To improve accessibility we will (i) expand the abstract by one sentence referencing the key assumptions and the location of the proofs, and (ii) add a short paragraph in Section 4.1 explicitly stating the number of replications and the variability measures used. No new derivations are required.

Revision: partial
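The adjacency-perturbation study promised in the first response could start from something like the sketch below. It is a hypothetical illustration, not the authors' protocol: the function names `perturb_adjacency` and `graph_laplacian`, the choice to flip a fixed fraction of node pairs uniformly at random, and the 6-node chain graph are all assumptions made here.

```python
import numpy as np

def perturb_adjacency(A, flip_frac, rng):
    """Flip a fraction of node pairs (edge <-> non-edge) in a 0/1 adjacency matrix."""
    p = A.shape[0]
    iu, ju = np.triu_indices(p, k=1)           # each unordered pair once
    n_flip = int(round(flip_frac * len(iu)))
    pick = rng.choice(len(iu), size=n_flip, replace=False)
    B = A.copy()
    B[iu[pick], ju[pick]] = 1 - B[iu[pick], ju[pick]]
    B[ju[pick], iu[pick]] = B[iu[pick], ju[pick]]  # keep the matrix symmetric
    return B

def graph_laplacian(A):
    """Unnormalized Laplacian: degree matrix minus adjacency matrix."""
    return np.diag(A.sum(axis=1)) - A

# Chain graph on 6 nodes as the "true" network.
p = 6
A = np.zeros((p, p), dtype=int)
for j in range(p - 1):
    A[j, j + 1] = A[j + 1, j] = 1

# Flip 20% of the 15 node pairs, i.e. 3 pairs.
B = perturb_adjacency(A, 0.2, np.random.default_rng(0))
n_changed = int(np.sum(np.triu(A != B, k=1)))
L_perturbed = graph_laplacian(B)
print(n_changed)  # 3
```

Sweeping `flip_frac` from 0 upward and re-fitting the estimator on each perturbed graph would trace out how support recovery and MSE degrade with misspecification.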

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The abstract and provided context describe a two-stage IV regression with graph-fused penalty plus non-asymptotic guarantees for estimation and selection. No equations, fitted parameters renamed as predictions, or self-citation chains are quoted that reduce the central claims to inputs by construction. The network is treated as an external given input for the penalty term, and the guarantees are positioned as independent results rather than tautological. This matches the default expectation of a self-contained methods paper with external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Review based solely on abstract; no specific free parameters, axioms, or invented entities are detailed in the provided text.
