crumble: A comprehensive framework for modern causal mediation analysis with intermediate confounding
Pith reviewed 2026-05-10 16:38 UTC · model grok-4.3
The pith
The crumble package gives practical nonparametric tools for estimating direct and indirect effects while handling intermediate confounding.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Crumble enables nonparametric estimation of several mediation parameters, even when mediators are continuous and/or multi-dimensional or when treatments are non-binary, while accommodating intermediate confounding.
What carries the argument
The crumble R package, which implements estimators for the reviewed mediation parameters under standard identification assumptions and modified treatment policies for non-binary treatments.
If this is right
- Applied researchers can estimate natural direct and indirect effects with continuous mediators without relying on strong parametric models.
- The package supports analysis of non-binary treatments through modified treatment policies.
- Suitable mediation parameters can be chosen to account for intermediate confounding in the data.
- Demonstrations with real data lower the barrier for implementing these estimators in practice.
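For reference, the natural direct and indirect effects named in the first point are conventionally defined for a binary treatment as follows. This is standard textbook notation and may differ from the paper's own: $Y(a, m)$ is assumed to denote the potential outcome under treatment $a$ and mediator value $m$, and $M(a)$ the potential mediator under treatment $a$.

```latex
\begin{align*}
\text{NDE} &= \mathbb{E}\big[Y(1, M(0)) - Y(0, M(0))\big],\\
\text{NIE} &= \mathbb{E}\big[Y(1, M(1)) - Y(1, M(0))\big],\\
\text{ATE} &= \mathbb{E}\big[Y(1, M(1)) - Y(0, M(0))\big] = \text{NDE} + \text{NIE}.
\end{align*}
```

The decomposition in the last line follows by adding and subtracting $Y(1, M(0))$. These natural effects are generally not identified when an intermediate confounder is itself affected by treatment, which is the setting that motivates the randomized interventional and recanting-twin effects.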
Where Pith is reading between the lines
- This could encourage wider adoption of mediation analysis in observational studies across health and social sciences.
- The tutorial approach suggests potential for similar guides on related causal inference methods for longitudinal or time-varying settings.
- Users might extend the framework by combining it with other tools for sensitivity analysis when positivity assumptions are borderline.
Load-bearing premise
The standard causal identification assumptions hold, including no unmeasured confounding, consistency, positivity, and correct specification of the nuisance functions.
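As a rough illustration of how the positivity part of this premise might be probed in practice, the sketch below counts estimated propensity scores that fall close to 0 or 1. It is not part of crumble; the function name `positivity_summary` and the cut-offs 0.05 and 0.95 are assumptions chosen for illustration, and the input scores are made up.

```python
def positivity_summary(propensities, lower=0.05, upper=0.95):
    """Summarise how many estimated propensity scores lie near 0 or 1.

    `propensities` holds estimated P(A = 1 | W) values.
    `lower` and `upper` are illustrative cut-offs, not recommended defaults.
    """
    scores = list(propensities)
    if not scores:
        raise ValueError("propensities must not be empty")
    if any(p < 0.0 or p > 1.0 for p in scores):
        raise ValueError("propensities must lie in [0, 1]")
    # Count scores outside the [lower, upper] band.
    n_extreme = sum(1 for p in scores if p < lower or p > upper)
    return {
        "n": len(scores),
        "min": min(scores),
        "max": max(scores),
        "n_extreme": n_extreme,
        "share_extreme": n_extreme / len(scores),
    }


# Made-up scores: two of the five fall outside the band.
example = positivity_summary([0.02, 0.30, 0.55, 0.97, 0.60])
print(example)
```

A large share of extreme scores would suggest that some covariate strata rarely receive one of the treatment levels. The check covers treatment positivity only and says nothing about unmeasured confounding or nuisance specification.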
What would settle it
Applying crumble to the Job Search Intervention Study data produces estimates that contradict results from established parametric methods without clear justification from the identification assumptions.
Original abstract
Causal mediation analysis is widely used to investigate how causal effects operate through specific pathways linking treatments or exposures to outcomes. Recently, \texttt{crumble} was developed to enable nonparametric estimation of several mediation parameters, even when mediators are continuous and/or multi-dimensional or when treatments are non-binary. But a practical and accessible guide to using \texttt{crumble} -- one that does not require deep familiarity with mediation analysis or semiparametric theory -- is currently lacking. This tutorial aims to provide an accessible introduction to \texttt{crumble} while minimizing technical complexity. We first review the mediation parameters implemented in \texttt{crumble} -- natural direct and indirect effects, randomized interventional effects, and recanting-twin effects. For each, we give the definition, interpretation, identification assumptions, and suitability in the presence or absence of intermediate confounding. Then, we demonstrate the usage of \texttt{crumble} by examining an example configuration. Next, we describe how \texttt{crumble} accommodates non-binary treatments through modified treatment policies. Finally, we illustrate the practical use of \texttt{crumble} through two case studies -- one with a binary treatment and one with a non-binary treatment -- based on the Job Search Intervention Study data.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript is a tutorial for the crumble R package, which supports nonparametric estimation of causal mediation parameters including natural direct and indirect effects, randomized interventional effects, and recanting-twin effects. It reviews definitions, interpretations, and identification assumptions for each parameter (with and without intermediate confounding), explains accommodation of continuous or multi-dimensional mediators and non-binary treatments via modified treatment policies, and demonstrates usage through case studies on the Job Search Intervention Study data.
Significance. If the crumble implementation follows the referenced semiparametric results, the tutorial meaningfully lowers the barrier for applied researchers to conduct modern mediation analysis in settings with intermediate confounding or complex mediator structures. By providing accessible explanations and concrete examples rather than new theory, it addresses a practical gap and could increase appropriate use of these methods.
minor comments (4)
- The tutorial sections reviewing identification assumptions for each effect could include brief practical guidance on assessing positivity and correct nuisance specification in the context of the Job Search Intervention Study example, to better support applied readers.
- In the case study sections, clarify how the modified treatment policies are defined and implemented for the non-binary treatment example, including any specific parameter choices or sensitivity checks.
- The manuscript would benefit from a short table comparing the mediation parameters (e.g., when each is identifiable, their interpretations, and handling of intermediate confounding) to aid quick reference.
- Ensure that all code snippets are self-contained and that the package version and dependencies are explicitly stated for reproducibility.
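To make the second comment concrete, a modified treatment policy is a function that maps each observed treatment value to the value it would take under a hypothetical intervention. The sketch below shows one common form, an additive shift that is applied only when the shifted value stays within a feasible upper bound. The name `shift_policy`, the shift size, and the bound are hypothetical; the policy actually used in the paper's non-binary case study is not shown in this review.

```python
def shift_policy(a, delta=1.0, upper=None):
    """Return the treatment value under an additive shift policy.

    d(a) = a + delta when the shifted value does not exceed `upper`;
    otherwise the observed value `a` is returned unchanged.
    `delta` and `upper` are illustrative choices.
    """
    shifted = a + delta
    if upper is not None and shifted > upper:
        return a
    return shifted


# Made-up observed treatment values on a scale capped at 5.0.
observed = [1.0, 2.5, 4.0, 4.8]
shifted = [shift_policy(a, delta=0.5, upper=5.0) for a in observed]
print(shifted)
```

Leaving values unchanged when the shift would exceed the bound keeps the policy inside the range of treatment values seen in the data, which is one way such policies are kept compatible with positivity.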
Simulated Author's Rebuttal
We thank the referee for their positive review of the manuscript and for recommending acceptance. We are pleased that the tutorial is viewed as addressing a practical gap by providing accessible explanations and examples for modern causal mediation analysis using the crumble package.
Circularity Check
No significant circularity; tutorial references prior package and standard assumptions
full rationale
The paper is structured as an accessible tutorial on the existing crumble package rather than a new theoretical derivation. It reviews mediation parameters (natural direct/indirect effects, randomized interventional effects, recanting-twin effects), standard causal identification assumptions (no unmeasured confounding, consistency, positivity), and demonstrates usage on the Job Search Intervention Study data. No quantities are defined in terms of fitted parameters from the current analysis, no predictions reduce to inputs by construction, and no uniqueness theorems or ansatzes are smuggled via self-citation. The mention of prior crumble development is a normal reference to the package being tutorialized and does not bear the load of any derivation. The work is self-contained against external benchmarks of causal mediation literature.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: No unmeasured confounding between treatment, mediator, and outcome; consistency; positivity; and correct specification of nuisance functions for identification of natural direct/indirect effects and related quantities.
Reference graph
Works this paper leans on
-
[1]
David Benkeser, Iván Díaz, and Jialu Ran. Inference for natural mediation effects under case-cohort sampling with applications in identifying covid-19 vaccine correlates of protection. arXiv preprint arXiv:2103.02643,
-
[2]
Graphical Models for Processing Missing Data
doi: 10.1080/01621459.2021.1955691. URL https://doi.org/10.1080/01621459.2021.1955691. Daniel Falbel and Javier Luraschi. torch: Tensors and Neural Networks with 'GPU' Acceleration,
-
[3]
Katherine L Hoffman, Diego Salazar-Barreto, Kara E Rudolph, and Iván Díaz. Introducing longitudinal modified treatment policies: a unified framework for studying complex exposures. arXiv preprint arXiv:2304.09460,
-
[4]
Semiparametric doubly robust targeted double machine learning: a review
Edward H Kennedy. Semiparametric doubly robust targeted double machine learning: a review. arXiv preprint arXiv:2203.06469,
-
[5]
Richard Liu, Nicholas T Williams, Kara E Rudolph, and Iván Díaz. General targeted machine learning for modern causal mediation analysis. arXiv preprint arXiv:2408.14620,
-
[6]
Judith J Lok. Organic direct and indirect effects with post-treatment common causes of mediator and outcome. arXiv preprint arXiv:1510.02753,
-
[7]
Judith J Lok. Causal organic direct and indirect effects: closer to Baron and Kenny. arXiv preprint arXiv:1903.04697,
-
[8]
Xiaxian Ou, Xinwei He, David Benkeser, and Razieh Nabi. Assessing racial disparities in healthcare expenditures via mediator distribution shifts. arXiv preprint arXiv:2504.21688,
-
[9]
Amiram D Vinokur and Yaacov Schul
doi: 10.1097/EDE.0000000000000596. Amiram D Vinokur and Yaacov Schul. Mastery and inoculation against setbacks as active ingredients in the jobs intervention for the unemployed. Journal of Consulting and Clinical Psychology, 65(5):867,
-
[10]
Tat-Thang Vo, Nicholas Williams, Richard Liu, Kara E Rudolph, and Iván Díaz. Recanting twins: addressing intermediate confounding in mediation analysis. arXiv preprint arXiv:2401.04450,
-
[11]
URL https://CRAN.R-project.org/package=crumble. R package version 0.1.2. Nicholas T Williams, Oliver J Hines, and Kara E Rudolph. Riesz representers for the rest of us. arXiv preprint arXiv:2507.19413, 2025a. Nicholas T Williams, Anton Hung, and Kara E Rudolph. Re: don’t let your analysis go to seed: on the impact of random seed on machine learning-based c...
-
[12]
A Incorporating MTPs into common mediation parameters. This section briefly introduces the definition and identification of the common mediation parameters reviewed in §3 when MTPs are incorporated for non-binary treatments/exposures. Details on the estimation of these parameters are provided in §5 of (Liu et al., 2024). In what follows, we denoted 1 ...
-
[13]
C Technical Details for Estimation: Challenges and Solutions in crumble. In this section, we briefly introduce the two estimation challenges that come up when either M or Z is continuous or high-dimensional. Then, we explain how crumble addresses these estimation challenges by examining its implementation details. C.1 Challenges in Estimation. One challenge comes...
-
[14]
However, not all identification formulas we reviewed in §3 are repeated conditional expectations of Y
or longitudinal causal effects (Díaz et al., 2021), for example, can be applied to estimate $\psi(\hat{P})$ without worrying about the data complexity. However, not all identification formulas we reviewed in §3 are repeated conditional expectations of Y. In fact, some of the identification formulas become hard to estimate when either M or Z is continuous and/or high-di...
-
[15]
suggests that the conditions required for the Bayes reparameterization approach to achieve desirable statistical properties are more stringent than those needed for the approach implemented in crumble (see Remark 2 in (Liu et al., 2024)). C.2 Implementation Details in crumble. The crumble() function in crumble encompasses all the functionalities for estimation...
-
[16]
for more technical details about lines 4-7, and (Williams et al., 2025a) for another tutorial about Riesz regression, which was public after (Liu et al., 2024). We also note that estimate_phi_n_alpha() and estimate_phi_r_alpha() in lines 4-7 rely on an input parameter nn_module because crumble solves the unconstrained optimization problems using deep learning o...