Spatial causal inference in the presence of preferential sampling to study the impacts of marine protected areas
Pith reviewed 2026-05-23 18:48 UTC · model grok-4.3
The pith
A joint spatial model identifies the causal effect of marine protected areas while correcting for preferential sampling bias.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We propose a spatial causal inference method that simultaneously accounts for unmeasured spatial confounders in both the sampling process and the treatment allocation. We prove the identifiability of key parameters in the model and the consistency of the posterior distributions of those parameters. Simulation studies confirm that the causal effect of interest can be reliably estimated, and the Australian MPA application shows evidence of preferential sampling whose proper accounting changes the causal effect estimate.
What carries the argument
A joint hierarchical model for sampling locations, treatment assignment, and response, in which the three components are linked through shared spatial random effects that represent unmeasured confounders.
If this is right
- The causal effect of MPAs on fish biomass is identifiable and its posterior is consistent under the joint model.
- Simulation studies recover the true causal effect when data are generated from the assumed model.
- In the Australian coast data, evidence of preferential sampling exists and adjusting for it alters the estimated causal effect.
- Modeling sampling, treatment, and outcome separately would ignore the shared spatial confounder and could leave the causal effect biased or unidentified.
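The points above can be illustrated with a toy simulation. This is a minimal sketch, not the paper's model or estimator: the 1-D cosine field stands in for a spatial random effect, and the names `latent_field`, `naive_difference`, `confound`, and `prefer` are hypothetical. It shows how a naive treated-minus-control difference drifts away from the true effect when an unmeasured field drives either the treatment assignment or the choice of sampled sites.

```python
import math
import random


def expit(x):
    """Logistic function mapping a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def latent_field(n, rng, n_waves=6):
    """Smooth 1-D stand-in for an unmeasured spatial confounder U(s).

    A sum of cosines with random phases; over the full grid it has mean 0
    and variance 1. This is an assumption for illustration and is NOT a
    Matern Gaussian random field.
    """
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_waves)]
    scale = math.sqrt(2.0 / n_waves)
    return [
        scale * sum(math.cos(2.0 * math.pi * (k + 1) * i / n + p)
                    for k, p in enumerate(phases))
        for i in range(n)
    ]


def naive_difference(n=20000, tau=1.0, confound=0.0, prefer=0.0, seed=0):
    """Mean outcome of treated minus control sites, over sampled sites only.

    confound: how strongly U raises the treatment probability.
    prefer:   how strongly U raises the sampling probability inside
              treated sites (sampling depends on policy and on U).
    """
    rng = random.Random(seed)
    treated, control = [], []
    for u in latent_field(n, rng):
        a = 1 if rng.random() < expit(confound * u) else 0
        y = tau * a + u + rng.gauss(0.0, 0.5)  # U also drives the response
        if rng.random() < expit(prefer * u * a):  # site enters the sample
            (treated if a else control).append(y)
    return sum(treated) / len(treated) - sum(control) / len(control)


if __name__ == "__main__":
    print("true effect          :", 1.0)
    print("no bias source       :", round(naive_difference(), 3))
    print("confounding only     :", round(naive_difference(confound=2.0), 3))
    print("preferential sampling:", round(naive_difference(prefer=2.0), 3))
```

In this sketch the naive difference stays close to the true effect only when both `confound` and `prefer` are zero. Either source alone is enough to inflate it, which is the motivation for modeling the three processes jointly.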
Where Pith is reading between the lines
- The same joint-modeling strategy could be used for causal questions in other spatial domains where data collection depends on the outcome, such as pollution monitoring or forest inventories.
- Policy evaluations that treat sampling locations as fixed may systematically misstate the benefits of protected areas or regulations.
- Extensions that relax the parametric form of the spatial random effects while preserving identifiability would widen applicability.
Load-bearing premise
The joint model for the sampling locations, treatment assignment, and response correctly captures the dependence induced by unmeasured spatial confounders.
What would settle it
A simulation study or real dataset in which the posterior mean of the causal effect changes substantially when the preferential-sampling component is removed from the joint model.
Original abstract
Marine Protected Areas (MPAs) have been established globally to conserve marine resources. Given their maintenance costs and impact on commercial fishing, it is critical to evaluate their effectiveness to support future conservation. In this paper, we use data collected from the Australian coast to estimate the effect of MPAs on biodiversity. Environmental studies such as these are often observational, and processes of interest exhibit spatial dependence, which presents challenges in estimating the causal effects. Spatial data can also be subject to preferential sampling, where the sampling locations are related to the policy and the response variable, further complicating inference and prediction. To address these challenges, we propose a spatial causal inference method that simultaneously accounts for unmeasured spatial confounders in both the sampling process and the treatment allocation. We prove the identifiability of key parameters in the model and the consistency of the posterior distributions of those parameters. We show via simulation studies that the causal effect of interest can be reliably estimated under the proposed model. The proposed method is applied to assess the effect of MPAs on fish biomass. We find evidence of preferential sampling and that properly accounting for this source of bias impacts the estimate of the causal effect.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a joint spatial model for sampling locations, treatment assignment (MPA status), and response (biodiversity/fish biomass) to estimate causal effects of marine protected areas while accounting for preferential sampling and unmeasured spatial confounders. It claims to prove identifiability of key parameters and posterior consistency, shows via simulations that the causal effect can be reliably estimated, and applies the method to Australian coast data, finding evidence of preferential sampling that impacts the causal effect estimate.
Significance. If the identifiability and consistency results hold under the stated model, the work provides a principled approach to causal inference in spatially dependent observational data with preferential sampling, a common issue in environmental policy evaluation. The explicit proofs, simulation validation, and real-data application are strengths that could inform future MPA assessments.
major comments (2)
- [identifiability section / abstract] The identifiability proof (referenced in the abstract and likely detailed in the model/identifiability section): the result conditions on a specific joint distribution form for the sampling intensity, treatment propensity, and outcome that relies on the shared spatial process (typically a Gaussian random field) inducing exactly the dependence needed to separate sampling bias from spatial confounding. Without explicit conditions on the covariance kernel or link functions, or verification that no additional latent factors are present, the separation may not hold generally, undermining the claim that the causal parameters are identifiable.
- [theoretical results] Posterior consistency claim (abstract and theoretical results section): consistency is stated for the key parameters, but the proof sketch appears to inherit the same joint-model restrictions as the identifiability result; if the spatial process specification is misspecified relative to the true data-generating process, consistency may fail even if identifiability holds conditionally.
minor comments (2)
- [abstract/introduction] The abstract and introduction could more clearly distinguish the proposed joint model from existing spatial causal methods that handle confounding but not preferential sampling.
- [simulation studies] Simulation section: provide more detail on the range of spatial kernels and link functions tested to assess robustness of the identifiability result.
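As a concrete reference for the kernel-robustness comment, the sketch below evaluates the Matérn covariance at the three half-integer smoothness values where it has a simple closed form. The function name `matern` and the parameter names `rho` (range) and `sigma2` (variance) are illustrative choices, and nothing here indicates which kernels the paper's simulations actually used.

```python
import math


def matern(h, nu, rho=1.0, sigma2=1.0):
    """Matern covariance at distance h for half-integer smoothness nu."""
    r = abs(h) / rho
    if nu == 0.5:  # exponential kernel
        poly, rate = 1.0, 1.0
    elif nu == 1.5:
        rate = math.sqrt(3.0)
        poly = 1.0 + rate * r
    elif nu == 2.5:
        rate = math.sqrt(5.0)
        poly = 1.0 + rate * r + 5.0 * r * r / 3.0
    else:
        raise ValueError("only nu in {0.5, 1.5, 2.5} is implemented")
    return sigma2 * poly * math.exp(-rate * r)


# Compare how quickly correlation decays with distance for each smoothness.
for nu in (0.5, 1.5, 2.5):
    print(nu, [round(matern(h, nu), 3) for h in (0.0, 0.5, 1.0, 2.0)])
```

A robustness study along the lines the referee suggests could generate data under one smoothness value and fit under another, then check how far the recovered causal effect moves.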
Simulated Author's Rebuttal
We thank the referee for their thoughtful and constructive comments, which help clarify the scope of our theoretical results. We respond to each major comment below.
-
Referee: [identifiability section / abstract] The identifiability proof (referenced in the abstract and likely detailed in the model/identifiability section): the result conditions on a specific joint distribution form for the sampling intensity, treatment propensity, and outcome that relies on the shared spatial process (typically a Gaussian random field) inducing exactly the dependence needed to separate sampling bias from spatial confounding. Without explicit conditions on the covariance kernel or link functions, or verification that no additional latent factors are present, the separation may not hold generally, undermining the claim that the causal parameters are identifiable.
Authors: We agree that making the modeling assumptions fully explicit strengthens the presentation. The identifiability result is derived under a joint model in which a single Gaussian random field with Matérn covariance drives the dependence among the sampling intensity, treatment propensity, and outcome processes, with logistic links and no additional latent factors. The proof uses the positive-definiteness of the kernel and the strict monotonicity of the links to separate the shared spatial effect from the causal parameter. In the revision we will add an explicit statement of these conditions (kernel class, link properties, and absence of further latent structure) immediately preceding the identifiability theorem. revision: yes
-
Referee: [theoretical results] Posterior consistency claim (abstract and theoretical results section): consistency is stated for the key parameters, but the proof sketch appears to inherit the same joint-model restrictions as the identifiability result; if the spatial process specification is misspecified relative to the true data-generating process, consistency may fail even if identifiability holds conditionally.
Authors: The consistency theorem is stated under correct specification of the joint model, which is the conventional setting for posterior consistency results. We acknowledge that misspecification of the spatial process can invalidate the consistency guarantee, which is asymptotic and conditional on the model, and can also degrade finite-sample performance. Our simulation section already examines performance under several departures from the exact model; we will expand the discussion to note the conditional nature of the consistency guarantee and its practical implications. revision: partial
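The model class described in the first response can be written schematically as follows. This is a sketch under stated assumptions, not the paper's specification: every symbol is introduced here for illustration, the sampling process is reduced to a site-level indicator $S(s)$ (the paper may instead use a point-process intensity), and the Gaussian outcome model is assumed. $C_\theta$ denotes a Matérn covariance and $\operatorname{expit}$ the inverse logistic link.

```latex
\begin{align*}
U(\cdot) &\sim \mathrm{GP}\bigl(0,\, C_\theta\bigr), \\
\Pr\{S(s) = 1 \mid U\} &= \operatorname{expit}\{\alpha_0 + \alpha_1 U(s)\}, \\
\Pr\{A(s) = 1 \mid U\} &= \operatorname{expit}\{\beta_0 + \beta_1 U(s)\}, \\
Y(s) \mid A, U &\sim \mathcal{N}\bigl(\mu + \tau A(s) + \gamma U(s),\, \sigma_\varepsilon^2\bigr).
\end{align*}
```

In this notation $\tau$ is the causal effect of interest, $\alpha_1 \neq 0$ encodes preferential sampling, and $\beta_1 \neq 0$ together with $\gamma \neq 0$ encodes unmeasured spatial confounding. The referee's concern amounts to asking under which conditions on $C_\theta$ and the links these roles can be told apart from the data.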
Circularity Check
No circularity: identifiability and consistency proved under explicit joint model with external simulation validation
The paper introduces a joint spatial model for sampling locations, treatment assignment, and response to handle preferential sampling and unmeasured confounders. It states that identifiability of causal parameters and posterior consistency are proved for this model, with simulation studies confirming reliable estimation of the causal effect. No quoted equations or self-citations reduce any prediction or parameter to a fitted input by construction, nor does any load-bearing step rely on prior author work as an unverified uniqueness theorem. The derivation chain is therefore self-contained, with the central claims resting on model-specific proofs and independent simulation checks rather than tautological redefinitions or renamings.