arxiv: 2509.10974 · v3 · submitted 2025-09-13 · 📊 stat.ME

A Latent Factor Panel Approach to Spatiotemporal Causal Inference

Pith reviewed 2026-05-18 16:20 UTC · model grok-4.3

classification 📊 stat.ME
keywords causal inference · spatiotemporal data · latent factors · unmeasured confounding · panel data · interference · environmental epidemiology

The pith

Latent factor models partially identify causal effects in spatiotemporal data with interference

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops a method to address unmeasured confounding in observational data varying over space and time by assuming that hidden confounders act on both exposures and outcomes through a shared latent factor structure. The factor confounding assumption allows causal effects to be partially identified even when there is interference between units. Adding assumptions that restrict how far that interference spreads across space and time then permits full point identification. A sympathetic reader would care because many real-world studies, such as those on environmental exposures, face confounders that do not vary smoothly and therefore defeat standard spatial or panel adjustments. Simulations show the approach cuts omitted-variable bias relative to common baselines, and the method is illustrated by estimating the effect of prenatal fine-particulate exposure on birth weight in California.

Core claim

Under a factor confounding assumption the effects of unmeasured confounders on exposures and outcomes are captured by a shared latent factor model. This assumption alone is sufficient to partially identify causal effects even when units interfere with one another. Additional assumptions that limit the degree of spatiotemporal interference are then sufficient to point-identify the effects.

What carries the argument

The factor confounding assumption, which models unmeasured confounding through shared latent factors in a panel-data framework for spatiotemporal settings.
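
One way to make the assumption concrete is the linear sketch below. The notation (X_it, Y_it, f_t, gamma_i, lambda_i, beta) and the linear form are assumed here for illustration and are not taken from the paper, whose actual model may be more general.

```latex
% Illustrative notation only (not from the paper): unit i = 1..N, time t = 1..T.
\begin{align}
  X_{it} &= \gamma_i^{\top} f_t + \nu_{it}, \\
  Y_{it} &= \beta X_{it} + \lambda_i^{\top} f_t + \varepsilon_{it}.
\end{align}
% f_t in R^r: latent factors shared by the exposure and outcome equations.
% gamma_i, lambda_i in R^r: unit-specific loadings.
% nu_it, varepsilon_it: idiosyncratic noise terms.
```

In this sketch a naive regression of Y on X is biased because the exposure term gamma_i^T f_t and the outcome term lambda_i^T f_t share the factors f_t. The confounder does not need to vary smoothly over space or time; it only needs to have low rank r.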

If this is right

  • The method substantially reduces omitted-variable bias relative to spatial-smoothing and standard panel-data baselines in simulation studies.
  • The approach can be applied to estimate the effect of prenatal PM2.5 exposure on birth weight using California data.
  • Partial identification of causal effects holds without requiring the assumption of no interference between units.
  • Point identification follows once additional assumptions, which the authors describe as reasonable in most applications, limit the degree of spatiotemporal interference.
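
The first bullet can be illustrated with a toy simulation. The sketch below is not the estimator proposed in the paper: it uses an assumed data-generating process (loadings shared between exposure and outcome for simplicity, no interference) and a generic interactive-fixed-effects style alternating least squares. All function names and parameter values are hypothetical.

```python
import numpy as np

def simulate_panel(n_units=100, n_times=50, rank=2, beta=1.0, seed=0):
    """Simulate exposure X and outcome Y (n_units x n_times) confounded by shared factors."""
    rng = np.random.default_rng(seed)
    factors = rng.normal(size=(n_times, rank))    # latent factors f_t
    loadings = rng.normal(size=(n_units, rank))   # loadings, shared by X and Y (assumption)
    confounder = loadings @ factors.T             # low-rank unmeasured confounder
    X = confounder + rng.normal(size=(n_units, n_times))
    Y = beta * X + confounder + 0.5 * rng.normal(size=(n_units, n_times))
    return X, Y

def naive_ols(X, Y):
    """Pooled no-intercept regression of Y on X, ignoring the confounder."""
    return float(np.sum(X * Y) / np.sum(X * X))

def factor_adjusted(X, Y, rank=2, n_iter=200):
    """Alternate between a rank-`rank` SVD fit of Y - b*X and an OLS update of b."""
    b = naive_ols(X, Y)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(Y - b * X, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]   # estimated factor component
        b = float(np.sum(X * (Y - low_rank)) / np.sum(X * X))
    return b

X, Y = simulate_panel()
beta_naive = naive_ols(X, Y)
beta_adjusted = factor_adjusted(X, Y)
print(f"naive OLS: {beta_naive:.3f}, factor-adjusted: {beta_adjusted:.3f} (true beta = 1)")
```

In this toy setting the naive estimate absorbs the low-rank confounder, while removing an estimated rank-2 component before regressing brings the estimate much closer to the true value. The rank is treated as known here, which would not be the case in practice.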

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same latent-factor adjustment could be tested in other observational domains that feature both geographic structure and potential interference, such as economic or social-network panels.
  • Sensitivity checks that vary the number of latent factors or examine residual spatial correlation would help assess how strongly results depend on the factor model.
  • Connections to multivariate causal-inference techniques for multiple outcomes or treatments may offer further ways to strengthen identification.
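
The second bullet, a sensitivity check on the number of latent factors, could start from the singular values of a residual matrix. The sketch below is one possible heuristic (the ratio of consecutive singular values), chosen here for brevity and not taken from the paper. The `residuals` matrix is simulated, and all names are hypothetical.

```python
import numpy as np

def singular_value_profile(residuals, max_rank=10):
    """Return the leading singular values and the share of total variance each explains."""
    s = np.linalg.svd(residuals, compute_uv=False)
    share = s**2 / np.sum(s**2)
    return s[:max_rank], share[:max_rank]

def ratio_rank(residuals, max_rank=10):
    """Pick the rank k in 1..max_rank that maximises the ratio s_k / s_{k+1}."""
    s = np.linalg.svd(residuals, compute_uv=False)
    ratios = s[:max_rank] / s[1:max_rank + 1]
    return int(np.argmax(ratios)) + 1

# Simulated stand-in for a units-by-times residual matrix with two true factors.
rng = np.random.default_rng(1)
true_rank = 2
low_rank = rng.normal(size=(100, true_rank)) @ rng.normal(size=(true_rank, 50))
residuals = low_rank + rng.normal(size=(100, 50))

values, share = singular_value_profile(residuals)
suggested_rank = ratio_rank(residuals)
print("variance share of leading components:", np.round(share[:4], 3))
print("rank suggested by the ratio heuristic:", suggested_rank)
```

Refitting the causal model at ranks around the suggested value and comparing the resulting effect estimates would then show how strongly the conclusions depend on the factor count.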

Load-bearing premise

That the influence of unmeasured confounders on both exposures and outcomes can be represented by a shared latent factor model.

What would settle it

A setting in which the true confounding structure cannot be captured by low-rank latent factors, yet the method still produces estimates that match those from a randomized experiment or other gold-standard design in the same data.

Figures

Figures reproduced from arXiv: 2509.10974 by Alexander Franks, Jiaxi Wu.

Figure 1. Relative error of estimated causal effects under (a) fixed spatial effects and (b) …
Figure 2. Estimated causal dose-response curve and marginal causal effect under factor …
Figure 3. (Left) Comparison of effect estimates when controlling for all …
Figure 4. Sampling distributions of β̂ across replications, faceted by confounding strength ρ (rows) and panel length T (columns). The dashed line marks the true effect β = 1. For readability, we provide facet-specific y-scales.
Figure 5. Box plots of causal effect estimates under neighborhood interference across …
Figure 6. Comparison of estimated and true causal effects for each outcome under the …
Figure 7. Causal effect estimates under various distributions of unmeasured confounders and …
Figure 8. True causal effect versus RMSE of the estimates derived by adjusting for …
Figure 9. Estimated direct, lag, and neighborhood effects under DML (NUC), IFE, and factor …
Original abstract

Unmeasured confounding can severely bias causal effect estimates from spatiotemporal observational data, especially when the confounders do not vary smoothly in time and space. In this work, we develop a method for addressing unmeasured confounding in spatiotemporal contexts by building on models from the panel data literature and methods in multivariate causal inference. Our method is based on a factor confounding assumption, which posits that effects of unmeasured confounders on exposures and outcomes can be captured by a shared latent factor model. Factor confounding is sufficient to partially identify causal effects, even when there is interference between units. Additional assumptions that limit the degree of spatiotemporal interference, reasonable in most applications, are sufficient to point identify the effects. Simulation studies demonstrate that the proposed approach can substantially reduce omitted variable bias relative to other spatial smoothing and panel data baselines. We illustrate our method in a case study of the effect of prenatal PM2.5 exposure on birth weight in California.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript proposes a latent factor panel model to address unmeasured confounding in spatiotemporal causal inference. It assumes that unmeasured confounders affect both exposures and outcomes through a shared low-dimensional latent factor structure. This assumption is claimed to allow partial identification of causal effects even under unit interference, with further assumptions on limited interference enabling point identification. The approach is evaluated through simulations demonstrating bias reduction relative to spatial smoothing and panel baselines, and applied to estimate the effect of prenatal PM2.5 exposure on birth weight using California data.

Significance. Should the identification strategy prove robust, the method could advance causal analysis in fields like environmental epidemiology and spatial statistics by handling complex confounding and interference without requiring strong smoothness assumptions on confounders. The simulation results and case study provide initial evidence of practical performance, though generalizability hinges on the factor confounding premise holding in real applications.

major comments (1)
  1. [Identification argument (abstract and §3)] The central claim that factor confounding suffices for partial identification under interference requires explicit verification that cross-unit confounding paths introduced by the interference kernel remain within the span of the shared latent factors. If the potential outcome model includes a low-rank factor term plus an interference kernel, it is not immediate that the observed data moments continue to bound the target parameter when the kernel is nonzero; this step appears to be the least secure and should be elaborated with a formal derivation or counterexample check.
minor comments (2)
  1. [Simulation studies] The simulation results should report standard errors or confidence intervals around bias estimates to allow assessment of the magnitude of improvement over baselines.
  2. [Abstract] Clarify the precise additional assumptions that limit the degree of spatiotemporal interference to achieve point identification, as these are described only at a high level in the abstract.
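
The first minor comment asks for uncertainty around the simulated bias. A minimal sketch of that calculation follows, using a normal-approximation interval. The function name and the five input estimates are hypothetical placeholders, not results from the paper.

```python
import math
import statistics

def bias_summary(estimates, true_effect):
    """Monte Carlo bias, its standard error, and a normal-approximation 95% interval."""
    n_reps = len(estimates)
    bias = statistics.fmean(estimates) - true_effect
    mcse = statistics.stdev(estimates) / math.sqrt(n_reps)  # Monte Carlo standard error
    return {"bias": bias, "mcse": mcse, "ci95": (bias - 1.96 * mcse, bias + 1.96 * mcse)}

# Placeholder estimates from five replications with true effect 1.0 (made-up numbers).
summary = bias_summary([1.10, 0.95, 1.05, 1.20, 0.90], true_effect=1.0)
print(summary)
```

Reporting the interval alongside each method's bias makes it possible to judge whether differences between methods exceed simulation noise.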

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for their careful and constructive review of our manuscript. The major comment raises an important point about the rigor of the identification argument under interference, and we address it directly below with a plan for revision.

Point-by-point responses
  1. Referee: [Identification argument (abstract and §3)] The central claim that factor confounding suffices for partial identification under interference requires explicit verification that cross-unit confounding paths introduced by the interference kernel remain within the span of the shared latent factors. If the potential outcome model includes a low-rank factor term plus an interference kernel, it is not immediate that the observed data moments continue to bound the target parameter when the kernel is nonzero; this step appears to be the least secure and should be elaborated with a formal derivation or counterexample check.

    Authors: We agree that the current exposition of the identification result under nonzero interference would benefit from greater explicitness. In the revised manuscript we will expand Section 3 with a formal derivation that (i) writes the potential-outcome equation as the sum of the low-rank factor component and the interference kernel, (ii) shows that any additional cross-unit confounding paths generated by the kernel remain in the column space of the shared latent factors, and (iii) verifies that the observed-data moments therefore continue to deliver the same partial-identification bounds on the target causal parameter. We will also include a short discussion of the boundary case in which the kernel introduces confounding outside the factor span, confirming that our maintained assumptions rule this out. These additions will be placed immediately after the statement of the main identification theorem.
    Revision: yes
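
Step (i) of the planned derivation might be written as follows. This is a sketch under assumed notation: the linear form and the kernel weights kappa_{ij,ts} are illustrative and are not taken from the paper.

```latex
% Illustrative notation only; kappa is a hypothetical interference kernel.
\begin{align}
  X_{js} &= \gamma_j^{\top} f_s + \nu_{js}, \\
  Y_{it} &= \beta X_{it}
            + \sum_{(j,s) \neq (i,t)} \kappa_{ij,ts}\, X_{js}
            + \lambda_i^{\top} f_t + \varepsilon_{it}.
\end{align}
% Substituting the exposure equation into the interference term gives
\begin{equation}
  \sum_{(j,s) \neq (i,t)} \kappa_{ij,ts}\, X_{js}
  = \sum_{(j,s) \neq (i,t)} \kappa_{ij,ts}\, \gamma_j^{\top} f_s
  + \sum_{(j,s) \neq (i,t)} \kappa_{ij,ts}\, \nu_{js}.
\end{equation}
```

In this notation the referee's question is whether the first sum on the right, which involves factors f_s from other units and times, can still be absorbed by the factor term in the outcome equation, so that the observed moments continue to bound beta.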

Circularity Check

0 steps flagged

No significant circularity; derivation rests on explicit modeling assumptions with external validation

full rationale

The paper states the factor confounding assumption as an explicit premise in the abstract and methods, then derives partial identification results from it under additional interference-limiting assumptions. Simulations and the California birth-weight case study serve as external checks rather than internal reductions. No quoted equations show a target causal effect being recovered by construction from a fitted parameter, nor does any load-bearing step collapse to a self-citation chain or renamed ansatz. The central claim therefore remains independent of its own fitted outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The central claim rests primarily on the posited factor confounding assumption and the additional limited-interference assumptions needed for point identification; no free parameters or invented entities beyond the latent factors are detailed in the abstract.

axioms (2)
  • domain assumption · Factor confounding assumption: effects of unmeasured confounders on exposures and outcomes can be captured by a shared latent factor model
    Invoked in abstract to enable partial identification of causal effects
  • domain assumption · Limited spatiotemporal interference assumptions sufficient for point identification
    Added to move from partial to point identification; described as reasonable in most applications
invented entities (1)
  • shared latent factors · no independent evidence
    purpose: To capture unmeasured confounders affecting both exposures and outcomes
    Introduced as modeling device under the factor confounding assumption; no independent evidence provided in abstract
