arxiv: 2605.14284 · v1 · submitted 2026-05-14 · 💻 cs.LG

Smooth Multi-Policy Causal Effect Estimation in Longitudinal Settings

Wenxin Chen, Weishen Pan, Kyra Gan, Fei Wang

Pith reviewed 2026-05-15 05:18 UTC · model grok-4.3

classification 💻 cs.LG

keywords causal inference · longitudinal data · dynamic treatment · LTMLE · multi-policy estimation · Q-network · kernel mean embedding

The pith

A shared policy encoder with kernel mean embeddings enables joint multi-policy causal estimation and constrains the second-order remainder after LTMLE, reducing finite-sample variance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper demonstrates that estimating multiple dynamic treatment policies separately creates uncontrolled second-order bias and high variance, even after standard LTMLE debiasing. To fix this, it introduces a policy-aware reparameterization of ICE Q-functions inside the PEQ-Net architecture. A shared policy encoder trained on kernel mean embeddings lets the system borrow statistical strength across similar policies. After the LTMLE correction step, this design structurally limits the second-order remainder term, which the authors show stabilizes estimates in practice.

Core claim

After applying an LTMLE correction step, the PEQ-Net design imposes a structural constraint on the second-order remainder, thereby stabilizing finite-sample variance for joint multi-policy estimation.
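
To make the "LTMLE correction step" concrete, here is a minimal single-time-point targeting update in Python. It is a sketch under simplifying assumptions (binary outcome, one treatment decision, a known propensity score, and a deliberately crude constant initial fit), not the paper's procedure; the longitudinal version applies an analogous update to each ICE regression in the backward pass. The function name `tmle_update` and the toy data-generating process are illustrative and do not come from the paper.

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

def logit(p):
    return np.log(p / (1.0 - p))

def tmle_update(y, a, q_obs, q_target, g_target, target_action=1, n_iter=50):
    """One logistic fluctuation step for E[Y^a], with Y in [0, 1].

    q_obs    : initial estimate of E[Y | A, W] at the observed action
    q_target : initial estimate of E[Y | A = target_action, W]
    g_target : estimate of P(A = target_action | W)
    """
    h_obs = (a == target_action) / g_target       # clever covariate, observed data
    h_target = 1.0 / g_target                     # clever covariate under the target action
    offset_obs = logit(np.clip(q_obs, 1e-6, 1 - 1e-6))
    offset_target = logit(np.clip(q_target, 1e-6, 1 - 1e-6))
    eps = 0.0
    for _ in range(n_iter):                       # Newton steps on the 1-D score equation
        p = expit(offset_obs + eps * h_obs)
        score = np.sum(h_obs * (y - p))
        info = np.sum(h_obs ** 2 * p * (1.0 - p))
        if info < 1e-12:
            break
        step = score / info
        eps += step
        if abs(step) < 1e-12:
            break
    q_star_obs = expit(offset_obs + eps * h_obs)
    q_star_target = expit(offset_target + eps * h_target)
    # Mean of the solved score term; it should be numerically zero after targeting.
    residual = np.mean(h_obs * (y - q_star_obs))
    return q_star_target.mean(), eps, residual

# Toy data (assumed, for illustration only): confounder W, treatment A, binary outcome Y.
rng = np.random.default_rng(0)
n = 4000
w = rng.normal(size=n)
g = expit(0.5 * w)                     # true propensity, treated as known here
a = rng.binomial(1, g)
y = rng.binomial(1, expit(w + a))
q0 = np.full(n, y.mean())              # crude initial outcome model (a constant)
psi, eps, residual = tmle_update(y, a, q0, q0, g)
print(f"targeted estimate of E[Y^1]: {psi:.3f}, eps: {eps:.3f}")
```

The update leaves the first-order (efficient-score) term solved by construction; what remains is the second-order remainder that the core claim is about.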

What carries the argument

PEQ-Net shared policy encoder trained with kernel mean embeddings that reflect population-level policy dissimilarities, enabling joint ICE Q-function estimation.
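
As a rough illustration of what "conditioning shared Q-functions on an encoded policy" can look like, the following NumPy sketch passes a low-dimensional policy embedding through a shared encoder and concatenates the result with the history and action. Everything here is an assumption made for illustration: the layer sizes, the random untrained weights, and the names `encode_policy` and `q_value` are hypothetical and are not taken from the paper. The only point shown is that such a network is Lipschitz in the policy embedding, so nearby embeddings yield nearby Q-values.

```python
import numpy as np

rng = np.random.default_rng(0)
HIST_DIM, EMB_DIM, HIDDEN = 4, 2, 16      # hypothetical sizes

# Shared (untrained, random) weights used for every policy.
W_enc = rng.normal(scale=0.5, size=(EMB_DIM, HIDDEN))
W_q1 = rng.normal(scale=0.5, size=(HIST_DIM + 1 + HIDDEN, HIDDEN))
w_q2 = rng.normal(scale=0.5, size=HIDDEN)

def encode_policy(z):
    """Map a policy embedding z (e.g. MDS coordinates) to a policy code."""
    return np.tanh(z @ W_enc)

def q_value(history, action, z):
    """Shared Q-function evaluated for the policy with embedding z."""
    code = encode_policy(z)
    n = history.shape[0]
    x = np.column_stack([history, action, np.tile(code, (n, 1))])
    return np.tanh(x @ W_q1) @ w_q2

history = rng.normal(size=(5, HIST_DIM))
action = np.ones(5)
z_a = np.array([0.0, 0.0])
z_b = np.array([0.05, 0.0])               # a nearby policy embedding

gap = np.max(np.abs(q_value(history, action, z_a) - q_value(history, action, z_b)))
# tanh is 1-Lipschitz, so the product of spectral norms bounds the sensitivity to z.
lipschitz = (np.linalg.norm(W_enc, 2)
             * np.linalg.norm(W_q1[-HIDDEN:], 2)
             * np.linalg.norm(w_q2))
print(gap, lipschitz * np.linalg.norm(z_a - z_b))
```

This kind of smoothness in the embedding is the property that a bound of the form "remainder is at most a constant times the embedding distance" would rely on.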

Load-bearing premise

The kernel mean embeddings accurately capture population-level policy dissimilarities to enable effective information sharing in the shared encoder.

What would settle it

If re-running the semi-synthetic experiments shows no RMSE reduction for closely related policies when using the shared encoder versus separate estimation, the variance-stabilization claim is false.
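
A check of this kind could be scripted roughly as follows. The sketch compares the RMSE of two estimators over repeated simulation runs and attaches a bootstrap interval to the gap. The arrays below are synthetic placeholders with arbitrary noise levels, not results from the paper; `rmse` and `paired_rmse_gap` are hypothetical helper names.

```python
import numpy as np

def rmse(estimates, truth):
    estimates = np.asarray(estimates, dtype=float)
    return float(np.sqrt(np.mean((estimates - truth) ** 2)))

def paired_rmse_gap(est_separate, est_shared, truth, n_boot=2000, seed=0):
    """RMSE(separate) - RMSE(shared), with a percentile bootstrap interval.

    Replications are resampled jointly so that the pairing is preserved.
    """
    rng = np.random.default_rng(seed)
    est_separate = np.asarray(est_separate, dtype=float)
    est_shared = np.asarray(est_shared, dtype=float)
    n = len(est_separate)
    gaps = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        gaps[b] = rmse(est_separate[idx], truth) - rmse(est_shared[idx], truth)
    point = rmse(est_separate, truth) - rmse(est_shared, truth)
    lower, upper = np.percentile(gaps, [2.5, 97.5])
    return point, lower, upper

# Placeholder inputs: 200 replications of a contrast whose true value is 0.5.
rng = np.random.default_rng(1)
truth = 0.5
est_separate = truth + 0.3 * rng.normal(size=200)   # arbitrary noise level
est_shared = truth + 0.1 * rng.normal(size=200)     # arbitrary noise level
point, lower, upper = paired_rmse_gap(est_separate, est_shared, truth)
print(f"RMSE gap: {point:.3f} (95% bootstrap interval {lower:.3f} to {upper:.3f})")
```

An interval that covers zero for closely related policies would count against the variance-stabilization claim.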

Figures

Figures reproduced from arXiv: 2605.14284 by Fei Wang, Kyra Gan, Weishen Pan, Wenxin Chen.

**Figure 1.** Illustration of the PEQ-Net. Step 1 computes per-step policy embeddings using pairwise MMD distances followed by MDS. Step 2 aggregates the resulting embeddings with a policy encoder and conditions the shared Q-functions on the encoded policy representation.
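
Step 1 of the caption (pairwise MMD distances followed by MDS) can be sketched in a few lines of NumPy. This is an illustrative reading, not the authors' code: it assumes each policy is summarised by (state, action) samples on a shared set of states, uses an RBF kernel with a fixed bandwidth, the biased (V-statistic) MMD estimator, and classical (Torgerson) MDS. The three linear "policies" are made up for the example.

```python
import numpy as np

def rbf_gram(x, y, bandwidth):
    sq = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / (2.0 * bandwidth ** 2))

def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (rbf_gram(x, x, bandwidth).mean()
            + rbf_gram(y, y, bandwidth).mean()
            - 2.0 * rbf_gram(x, y, bandwidth).mean())

def classical_mds(sq_dists, dim):
    """Classical MDS: double-centre the squared distances, keep top eigenpairs."""
    n = sq_dists.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ sq_dists @ centering
    evals, evecs = np.linalg.eigh(gram)
    top = np.argsort(evals)[::-1][:dim]
    return evecs[:, top] * np.sqrt(np.clip(evals[top], 0.0, None))

# Hypothetical policies: action = slope * state + small noise, on shared states.
rng = np.random.default_rng(0)
states = rng.normal(size=(200, 1))
slopes = [1.0, 1.1, -1.0]                 # first two policies are close, third is not
samples = [np.hstack([states, s * states + 0.1 * rng.normal(size=states.shape)])
           for s in slopes]

k = len(samples)
sq_dists = np.zeros((k, k))
for i in range(k):
    for j in range(i + 1, k):
        value = max(mmd_squared(samples[i], samples[j]), 0.0)
        sq_dists[i, j] = sq_dists[j, i] = value

embeddings = classical_mds(sq_dists, dim=2)   # one row per policy
print(np.round(embeddings, 3))
```

With three policies, two MDS dimensions reproduce the MMD distances exactly; with more policies than dimensions the embedding is a low-rank approximation, and the bandwidth becomes a tuning choice.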

**Figure 2.** Both strategies improve over fully separate estimation, suggesting that sharing parameters across policies can reduce estimation variance. Notably, the multiQ-head variant outperforms independent fine-tuning, indicating that jointly training within a unified model is more effective than adapting separate models after pretraining. Nevertheless, the proposed PEQ-Net achieves substantially lower…

**Figure 3.** Higher MAP target associated with higher lactate level.

Original abstract

Comparative evaluation of multiple dynamic treatment policies is essential for healthcare and policy decisions, yet conventional longitudinal causal inference methods estimate each in isolation, preventing information sharing across counterfactuals. We demonstrate that this separate estimation paradigm induces a structurally uncontrolled second-order bias, inflating finite-sample variance even after standard debiasing with longitudinal targeted maximum likelihood estimation (LTMLE). To address this, we propose a policy-aware reparameterization of Iterative Conditional Expectation (ICE) Q-functions that enables joint estimation through shared representations. We implement this approach in the Policy-Encoded Q Network (PEQ-Net), an architecture centered on a shared policy encoder. The encoder is trained using kernel mean embeddings, ensuring that the learned representation space reflects population-level policy dissimilarities. After applying an LTMLE correction step, we prove this design imposes a structural constraint on the second-order remainder, thereby stabilizing finite-sample variance. Experiments on semi-synthetic datasets demonstrate that PEQ-Net consistently outperforms existing ICE-based methods, achieving substantial reductions in root-mean-square error, particularly when evaluating closely related policies.
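
For orientation, the "separate estimation" baseline that the abstract argues against can be sketched as plain ICE g-computation run once per policy. The code below is a minimal illustration under strong assumptions: two time steps, linear outcome regressions fitted by least squares, static policies, and a simulated data-generating process chosen so that the linear models are correctly specified. The helper names (`ice_estimate`, `fit_linear`, `predict_linear`) are hypothetical, and this is one common ICE variant rather than the paper's implementation.

```python
import numpy as np

def fit_linear(features, target):
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef

def predict_linear(coef, features):
    return np.column_stack([np.ones(len(features)), features]) @ coef

def ice_estimate(covs, acts, y, policy):
    """Backward ICE recursion for one policy.

    covs, acts : lists of (n,) arrays L_1..L_T and A_1..A_T
    policy     : list of functions mapping L_t to the action the policy assigns
    """
    q = y.astype(float)
    for t in reversed(range(len(acts))):
        history = covs[: t + 1] + acts[:t]                 # data observed before A_t
        coef = fit_linear(np.column_stack(history + [acts[t]]), q)
        policy_action = policy[t](covs[t])
        # Replace the observed action with the policy's action and predict.
        q = predict_linear(coef, np.column_stack(history + [policy_action]))
    return q.mean()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Simulated two-step data (assumed for illustration); true E[Y] is 3 under
# "always treat" and 0 under "never treat".
rng = np.random.default_rng(0)
n = 5000
l1 = rng.normal(size=n)
a1 = rng.binomial(1, sigmoid(l1))
l2 = 0.5 * l1 + a1 + rng.normal(size=n)
a2 = rng.binomial(1, sigmoid(l2))
y = l2 + 2.0 * a2 + rng.normal(size=n)

always = [lambda l: np.ones_like(l)] * 2
never = [lambda l: np.zeros_like(l)] * 2

# Each policy gets its own independent backward pass: no parameters are shared.
est_always = ice_estimate([l1, l2], [a1, a2], y, always)
est_never = ice_estimate([l1, l2], [a1, a2], y, never)
print(f"always treat: {est_always:.2f}, never treat: {est_never:.2f}")
```

Because the two passes are fitted independently, their errors are not tied together in any way, which is the situation the policy-aware reparameterization is meant to change.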

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper's joint estimation trick for multiple longitudinal policies via a shared kernel-embedded encoder is a fresh angle that delivers RMSE gains on semi-synthetic data, but the claimed structural constraint on the remainder term still needs a tighter derivation to hold up.

The core contribution is a policy-aware reparameterization of the ICE Q-functions inside a shared encoder (PEQ-Net) trained on kernel mean embeddings of the policies. This lets the model borrow strength across related dynamic treatment regimes instead of estimating each one separately, which the authors say leaves an uncontrolled second-order bias even after standard LTMLE. They then claim that the combination of the shared representation and the LTMLE step imposes a structural limit on that remainder, cutting finite-sample variance. On semi-synthetic data the RMSE drops noticeably, especially when policies are close to each other. That empirical pattern is the clearest positive signal so far.

The architecture itself is straightforward to implement once the kernel embeddings are in place, and the motivation for joint estimation in healthcare or policy settings is solid.

The main soft spot is the proof. The abstract asserts that the design constrains the remainder, but the link runs through the claim that the learned embeddings accurately reflect population-level policy dissimilarities and thereby couple the nuisance errors across policies. If the kernel mean embedding loss only encourages similarity in expectation without bounding the actual finite-sample cross-policy error component, the variance stabilization does not automatically follow. The stress-test note correctly flags this as the load-bearing assumption, and the provided abstract does not include the step-by-step argument that would let a reader check it.

The experiments reported in the abstract are limited to semi-synthetic setups, so on that evidence we still lack a picture of how the method behaves with real longitudinal records or when policy dissimilarities are misspecified. Kernel parameters also remain free and will need sensible defaults or cross-validation.

This work is aimed at causal-inference researchers who already use ICE or LTMLE for dynamic regimes and want to handle several policies at once. A reader who cares about variance reduction in multi-policy comparisons will find the empirical results useful even if they treat the theoretical claim as provisional. I would send it to peer review. The idea is new enough and the reported gains are concrete enough that referees should see the full derivation and any additional checks on the embedding assumption.

Referee Report

2 major / 2 minor

Summary. The paper proposes the Policy-Encoded Q Network (PEQ-Net) for joint estimation of causal effects under multiple dynamic treatment policies in longitudinal settings. It reparameterizes Iterative Conditional Expectation (ICE) Q-functions via a shared policy encoder trained with kernel mean embeddings to reflect policy dissimilarities, enabling information sharing across counterfactuals. The central claim is that, after an LTMLE correction step, this architecture imposes a structural constraint on the second-order remainder term, stabilizing finite-sample variance; semi-synthetic experiments report consistent RMSE reductions relative to separate ICE-based estimators, especially for closely related policies.

Significance. If the claimed structural constraint on the second-order remainder holds and produces the reported variance stabilization, the work would offer a principled way to improve efficiency in multi-policy longitudinal causal inference without uncontrolled bias, which is relevant for comparative effectiveness research in healthcare and policy settings where multiple regimes must be evaluated simultaneously.

major comments (2)

[Proof of structural constraint (abstract and theoretical section)] The abstract states that after the LTMLE correction the PEQ-Net design 'imposes a structural constraint on the second-order remainder.' No explicit derivation is supplied showing how the kernel mean embedding loss directly bounds or zeros the cross-policy component of the remainder (as opposed to merely encouraging encoder similarity in expectation). This step is load-bearing for the variance-stabilization claim.
[Theoretical analysis and assumption discussion] The weakest assumption—that kernel mean embeddings of policies accurately capture population-level dissimilarities sufficient to couple Q-function estimates across policies—is not accompanied by finite-sample bounds relating the KME loss to the nuisance estimation error that enters the remainder term. Without such bounds the structural constraint does not necessarily materialize.

minor comments (2)

[Methods] The notation for the policy-encoded Q-functions and the precise form of the shared encoder should be defined explicitly with an equation or diagram in the methods section to aid reproducibility.
[Experiments] The semi-synthetic data generation process and the exact policy sampling mechanism used to create 'closely related policies' should be described in greater detail, including any hyperparameters of the kernel mean embeddings.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major point below and commit to revisions that will make the theoretical claims more explicit and self-contained without altering the core contributions.

Referee: [Proof of structural constraint (abstract and theoretical section)] The abstract states that after the LTMLE correction the PEQ-Net design 'imposes a structural constraint on the second-order remainder.' No explicit derivation is supplied showing how the kernel mean embedding loss directly bounds or zeros the cross-policy component of the remainder (as opposed to merely encouraging encoder similarity in expectation). This step is load-bearing for the variance-stabilization claim.

Authors: We agree that the derivation should be more prominent. The appendix contains the full proof (Section A.3) showing that the KME loss term directly constrains the cross-policy component of the second-order remainder after LTMLE by bounding the relevant covariance term via the embedding distance; the main text only summarizes the result. We will move the key steps of this derivation into the main theoretical section (Section 3.3) and add an explicit lemma stating that the loss bounds the cross-policy remainder contribution (rather than acting only in expectation). This change will be made in the revision. revision: yes

Referee: [Theoretical analysis and assumption discussion] The weakest assumption—that kernel mean embeddings of policies accurately capture population-level dissimilarities sufficient to couple Q-function estimates across policies—is not accompanied by finite-sample bounds relating the KME loss to the nuisance estimation error that enters the remainder term. Without such bounds the structural constraint does not necessarily materialize.

Authors: We acknowledge that the current analysis is stated at the population level and does not supply explicit finite-sample bounds linking KME estimation error to the nuisance functions. We will add a new subsection (Section 3.4) that (i) states the assumption more precisely, (ii) provides a high-level propagation argument under Lipschitz continuity of the Q-functions and bounded kernel, and (iii) discusses the resulting impact on the remainder term. Full non-asymptotic bounds would require additional technical development beyond the scope of the present work; we will therefore also note this as a limitation and outline the conditions under which the constraint holds in finite samples. revision: partial
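
One way to read the "high-level propagation argument" promised above is through a plain triangle inequality. The sketch below is not the authors' derivation. It takes as given the population-level bound quoted on this page from Theorem 4.2, and it introduces $\hat\mu$ as hypothetical notation for an estimated embedding.

```latex
% Assumption: the population bound |Rem^{(i),(j)}| <= L_R ||mu^{(i)} - mu^{(j)}||
% quoted from Theorem 4.2; \hat\mu is hypothetical notation for an estimate.
\begin{align*}
\bigl|\mathrm{Rem}^{(i),(j)}\bigr|
  &\le L_R \,\bigl\|\mu^{(i)}_{1:\tau} - \mu^{(j)}_{1:\tau}\bigr\| \\
  &\le L_R \Bigl(
       \bigl\|\hat\mu^{(i)}_{1:\tau} - \hat\mu^{(j)}_{1:\tau}\bigr\|
     + \bigl\|\hat\mu^{(i)}_{1:\tau} - \mu^{(i)}_{1:\tau}\bigr\|
     + \bigl\|\hat\mu^{(j)}_{1:\tau} - \mu^{(j)}_{1:\tau}\bigr\|
     \Bigr).
\end{align*}
```

The first term on the last line is computable from data; the other two are embedding estimation errors. For a plain empirical mean embedding of i.i.d. draws under a bounded kernel, such errors shrink at the $n^{-1/2}$ rate in RKHS norm. Whether a comparable rate holds for the paper's per-step policy embeddings is the open question behind the referee's request for finite-sample bounds.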

Circularity Check

0 steps flagged

No significant circularity; central proof is design-dependent but not self-referential by construction

The paper's core claim is a proof that the PEQ-Net shared encoder (trained on kernel mean embeddings) plus LTMLE imposes a structural constraint on the second-order remainder term. This is presented as following from the proposed reparameterization of ICE Q-functions and the LTMLE correction step. No equations or steps reduce the claimed variance stabilization directly to fitted parameters by construction, nor does the argument rely on self-citations, uniqueness theorems imported from prior work, or renaming of known results. The kernel mean embedding step is an explicit modeling assumption rather than a hidden tautology, and the derivation chain remains independent of its own outputs.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The central claim rests on standard longitudinal causal assumptions plus the new neural architecture; the only free parameters listed are the kernel parameters learned during training, and the invented entity is the PEQ-Net itself.

free parameters (1)

kernel parameters for mean embeddings
Used to train the policy encoder to reflect policy dissimilarities; values are learned during training.

axioms (1)

domain assumption · Standard assumptions for longitudinal causal inference, including no unmeasured confounding
Required for validity of LTMLE correction step.

invented entities (1)

Policy-Encoded Q Network (PEQ-Net) · no independent evidence
purpose: Joint estimation of multiple policies through shared representations
New neural architecture introduced to enable the policy-aware reparameterization.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

After applying an LTMLE correction step, we prove this design imposes a structural constraint on the second-order remainder, thereby stabilizing finite-sample variance. ... Theorem 4.2 (Lipschitz control of the CATE second-order remainder) ... $|\mathrm{Rem}^{(i),(j)}| \le L_R \, \|\mu^{(i)}_{1:\tau} - \mu^{(j)}_{1:\tau}\|_{F_{1:\tau}}$

IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · embed_injective · unclear

The encoder is trained using kernel mean embeddings, ensuring that the learned representation space reflects population-level policy dissimilarities.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
