pith. sign in

arxiv: 1906.11658 · v2 · pith:G3RU7KFLnew · submitted 2019-06-27 · 📊 stat.ME · cs.LG

Interpretable Almost-Matching-Exactly With Instrumental Variables

Pith reviewed 2026-05-25 14:42 UTC · model grok-4.3

classification 📊 stat.ME cs.LG
keywords instrumental variablesmatchingcausal inferencecategorical confoundersobservational studiesalmost exact matchingtreatment effect estimation
0
0 comments X

The pith

A matching framework for instrumental variables first matches units exactly on categorical confounders and then drops variables sequentially to approximately match the rest.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a matching method for estimating causal effects with instrumental variables when observed confounders are categorical. It begins by exactly matching units on all variables and then consecutively drops variables to allow approximate matches on as many remaining variables as possible. This is intended to avoid strong parametric assumptions, arbitrary distance metrics, and poor scalability in prior IV approaches. A reader would care because improved matches could yield more reliable estimates of causal effects in observational data affected by unmeasured confounding. The authors demonstrate superior performance on simulated datasets and report results from an application to political canvassing.

Core claim

The paper claims that its almost-matching-exactly procedure for IV estimation with categorical confounders produces better matches than existing methods. The procedure works by first constructing exact matches and then sequentially dropping variables to approximately match the remaining units on the largest possible number of variables, as shown through improved performance on simulated data and an application to political canvassing.

What carries the argument

The almost-matching-exactly procedure: exact matching on all categorical confounders followed by consecutive variable dropping to achieve approximate matches on as many variables as possible.

If this is right

  • The method avoids strong parametric assumptions required by some existing IV estimators.
  • It replaces arbitrary distance metrics with a deterministic sequence of exact and approximate matches on categorical variables.
  • It scales to large datasets by avoiding exhaustive distance computations.
  • Superior matches on simulated data imply more accurate causal effect estimates under the IV assumptions.
  • Application to political canvassing yields interpretable results on treatment effects.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The sequential dropping rule could be adapted to prioritize variables by their strength as instruments or confounders.
  • The framework might extend to settings with some continuous confounders if they are first discretized into categories.
  • Results on simulated data suggest the method could be tested on benchmark causal inference datasets with hidden confounding.
  • Interpretable matches may allow domain experts to inspect which confounders are retained at each step.

Load-bearing premise

Observed confounders are categorical so that exact matching on subsets is feasible, and sequentially dropping variables produces approximately valid matches that preserve instrumental variable identification without new bias.

What would settle it

A simulation study with known true causal effect in which the method's estimates deviate more from the truth than those from competing IV matching approaches.

Figures

Figures reproduced from arXiv: 1906.11658 by Alexander Volfovsky, Cynthia Rudin, Marco Morucci, M. Usaid Awan, Sudeepa Roy, Yameng Liu.

Figure 1
Figure 1. Figure 1: Causal DAG for instrumental variables. Arrows repre [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗
Figure 3
Figure 3. Figure 3: Performance for nonlinear generation model with [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗
Figure 2
Figure 2. Figure 2: Performance for linear generation model with various [PITH_FULL_IMAGE:figures/full_fig_p007_2.png] view at source ↗
Figure 4
Figure 4. Figure 4: True Individual Causal Effect vs. Estimated Individual Causal Effect. The numbers on each plot represent the total number of instrumented units for calculating unit-level LATE, and MSE of our predictions. The concentration parameter is the same for the whole dataset, set to 288.84 for the linear outcome model, and 272.92 for the nonlinear outcome model [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Running Time for FLAME-IV and Full Matching. Left [PITH_FULL_IMAGE:figures/full_fig_p008_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Running Time for FLAME-IV on large dataset. [PITH_FULL_IMAGE:figures/full_fig_p014_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Performance for linear generation model with con [PITH_FULL_IMAGE:figures/full_fig_p015_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Performance for nonlinear generation model with [PITH_FULL_IMAGE:figures/full_fig_p015_8.png] view at source ↗
read the original abstract

Uncertainty in the estimation of the causal effect in observational studies is often due to unmeasured confounding, i.e., the presence of unobserved covariates linking treatments and outcomes. Instrumental Variables (IV) are commonly used to reduce the effects of unmeasured confounding. Existing methods for IV estimation either require strong parametric assumptions, use arbitrary distance metrics, or do not scale well to large datasets. We propose a matching framework for IV in the presence of observed categorical confounders that addresses these weaknesses. Our method first matches units exactly, and then consecutively drops variables to approximately match the remaining units on as many variables as possible. We show that our algorithm constructs better matches than other existing methods on simulated datasets, and we produce interesting results in an application to political canvassing.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript proposes an interpretable almost-exact matching procedure for instrumental variable estimation when observed confounders are categorical. Units are first matched exactly on all confounders; remaining units are then matched approximately by sequentially dropping the fewest variables possible. Section 5 simulations report higher match counts, improved covariate balance, and stronger instruments relative to existing IV matching methods. Section 6 applies the procedure to a political-canvassing dataset and reports substantive findings.

Significance. If the reported improvements in match quality and instrument strength hold under the stated categorical-confounder restriction, the method supplies a scalable, non-parametric alternative that avoids arbitrary distance metrics and strong parametric assumptions. The explicit algorithm in §3, the simulation design comparing multiple performance metrics, and the real-data application are strengths that make the contribution falsifiable and reproducible.

minor comments (3)
  1. [§3] §3: The description of the sequential dropping rule would be clearer with pseudocode or a small worked example showing which variable is dropped at each step and how the IV validity is preserved.
  2. [§5] §5: Table or figure captions should explicitly state the number of Monte Carlo replications and the exact definition of 'better matches' (e.g., the precise balance metric and instrument-strength threshold) used for the reported comparisons.
  3. Notation for the set of dropped variables is introduced inconsistently between the algorithm statement and the simulation results; a single symbol would improve readability.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of the manuscript, accurate summary of the proposed almost-exact matching procedure for IV estimation, and recommendation for minor revision. No specific major comments were raised.

Circularity Check

0 steps flagged

No significant circularity; algorithmic proposal validated empirically

full rationale

The paper describes a matching algorithm (exact matching on categorical confounders followed by sequential variable dropping) and validates it via simulation comparisons and a real-data application. No equations, fitted parameters, or predictions are defined in terms of themselves. No self-citations are invoked as load-bearing uniqueness theorems or ansatzes. The central claim reduces to an explicit algorithmic procedure whose performance is assessed against external benchmarks rather than by construction from its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated. Standard IV assumptions (relevance, exclusion, monotonicity) are implicitly required but not detailed.

pith-pipeline@v0.9.0 · 5667 in / 1130 out tokens · 39580 ms · 2026-05-25T14:42:09.336471+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

29 extracted references · 29 canonical work pages

  1. [1]

    Semiparametric instrumental variable estimation of treatment response models

    Alberto Abadie. Semiparametric instrumental variable estimation of treatment response models. Journal of Econometrics, 113 0 (2): 0 231--263, 2003

  2. [2]

    The colonial origins of comparative development: An empirical investigation

    Daron Acemoglu, Simon Johnson, and James A Robinson. The colonial origins of comparative development: An empirical investigation. American Economic Review, 91 0 (5): 0 1369--1401, 2001

  3. [3]

    Does compulsory school attendance affect schooling and earnings? The Quarterly Journal of Economics, 106 0 (4): 0 979--1014, 1991

    Joshua D Angrist and Alan B Keueger. Does compulsory school attendance affect schooling and earnings? The Quarterly Journal of Economics, 106 0 (4): 0 979--1014, 1991

  4. [4]

    Identification of causal effects using instrumental variables

    Joshua D Angrist, Guido W Imbens, and Donald B Rubin. Identification of causal effects using instrumental variables. Journal of the American Statistical Association, 91 0 (434): 0 444--455, 1996

  5. [5]

    Autor, David Dorn, and Gordon H

    David H. Autor, David Dorn, and Gordon H. Hanson. The china syndrome: Local labor market effects of import competition in the united states. American Economic Review, 103 0 (6): 0 2121--68, 2013

  6. [6]

    Building a stronger instrument in an observational study of perinatal care for premature infants

    Mike Baiocchi, Dylan S Small, Scott Lorch, and Paul R Rosenbaum. Building a stronger instrument in an observational study of perinatal care for premature infants. Journal of the American Statistical Association, 105 0 (492): 0 1285--1296, 2010

  7. [7]

    Generalized instrumental variables

    Carlos Brito and Judea Pearl. Generalized instrumental variables. In Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence, pages 85--93. Morgan Kaufmann Publishers Inc., 2002

  8. [8]

    Using geographic variation in college proximity to estimate the return to schooling

    David Card. Using geographic variation in college proximity to estimate the return to schooling. Technical report, National Bureau of Economic Research, 1993

  9. [9]

    Incorporating knowledge into structural equation models using auxiliary variables

    Bryant Chen, Judea Pearl, and Elias Bareinboim. Incorporating knowledge into structural equation models using auxiliary variables. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, pages 3577--3583. AAAI Press, 2016

  10. [10]

    Interpretable almost-exact matching for causal inference

    Awa Dieng, Yameng Liu, Sudeepa Roy, Cynthia Rudin, and Alexander Volfovsky. Interpretable almost-exact matching for causal inference. In Proceedings of Artificial Intelligence and Statistics (AISTATS) , pages 2445--2453, 2019

  11. [11]

    Nonparametric IV estimation of local average treatment effects with covariates

    Markus Fr \"o lich. Nonparametric IV estimation of local average treatment effects with covariates. Journal of Econometrics, 139 0 (1): 0 35--75, July 2007

  12. [12]

    The effects of canvassing, telephone calls, and direct mail on voter turnout: A field experiment

    Alan S Gerber and Donald P Green. The effects of canvassing, telephone calls, and direct mail on voter turnout: A field experiment. American Political Science Review, 94 0 (3): 0 653--663, 2000

  13. [13]

    Deep IV : A flexible approach for counterfactual prediction

    Jason Hartford, Greg Lewis, Kevin Leyton-Brown, and Matt Taddy. Deep IV : A flexible approach for counterfactual prediction. In Proceedings of the 34th International Conference on Machine Learning, volume 70, pages 1414--1423, 2017

  14. [14]

    Propensity-score matching with instrumental variables

    Hidehiko Ichimura and Christopher Taber. Propensity-score matching with instrumental variables. American Economic Review, 91 0 (2): 0 119--124, 2001

  15. [15]

    Identification and estimation of local average treatment effects

    Guido W Imbens and Joshua D Angrist. Identification and estimation of local average treatment effects. Econometrica, 62 0 (2): 0 467--475, Mar. 1994

  16. [16]

    Bayesian inference for causal effects in randomized experiments with noncompliance

    Guido W Imbens and Donald B Rubin. Bayesian inference for causal effects in randomized experiments with noncompliance. The Annals of Statistics, pages 305--327, 1997

  17. [17]

    Causal inference in statistics, social, and biomedical sciences

    Guido W Imbens and Donald B Rubin. Causal inference in statistics, social, and biomedical sciences. Cambridge University Press, 2015

  18. [18]

    Contract duration and relationship-specific investments: Empirical evidence from coal markets

    Paul L Joskow. Contract duration and relationship-specific investments: Empirical evidence from coal markets. The American Economic Review, pages 168--185, 1987

  19. [19]

    The causal effect of malaria on stunting: a mendelian randomization and matching approach

    Hyunseung Kang, Benno Kreuels, Ohene Adjei, Ralf Krumkamp, J \"u rgen May, and Dylan S Small. The causal effect of malaria on stunting: a mendelian randomization and matching approach. International Journal of Epidemiology, 42 0 (5): 0 1390--1398, 2013

  20. [20]

    Full matching approach to instrumental variables estimation with application to the effect of malaria on stunting

    Hyunseung Kang, Benno Kreuels, J \"u rgen May, Dylan S Small, et al. Full matching approach to instrumental variables estimation with application to the effect of malaria on stunting. The Annals of Applied Statistics, 10 0 (1): 0 335--364, 2016

  21. [21]

    Instrumental variable estimation of nonparametric models

    Whitney K Newey and James L Powell. Instrumental variable estimation of nonparametric models. Econometrica, 71 0 (5): 0 1565--1578, 2003

  22. [22]

    Doubly robust estimation of the local average treatment effect curve

    Elizabeth L Ogburn, Andrea Rotnitzky, and James M Robins. Doubly robust estimation of the local average treatment effect curve. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 77 0 (2): 0 373--396, 2015

  23. [23]

    Will a five-minute discussion change your mind? a countrywide experiment on voter choice in france

    Vincent Pons. Will a five-minute discussion change your mind? a countrywide experiment on voter choice in france. American Economic Review, 108 0 (6): 0 1322--63, 2018

  24. [24]

    Design of observational studies, volume 10

    Paul R Rosenbaum. Design of observational studies, volume 10. Springer, 2010

  25. [25]

    Learning instrumental variables with structural and non-gaussianity assumptions

    Ricardo Silva and Shohei Shimizu. Learning instrumental variables with structural and non-gaussianity assumptions. Journal of Machine Learning Research, 18 0 (120): 0 1--49, 2017. URL http://jmlr.org/papers/v18/17-014.html

  26. [26]

    A survey of weak instruments and weak identification in generalized method of moments

    James H Stock, Jonathan H Wright, and Motohiro Yogo. A survey of weak instruments and weak identification in generalized method of moments. Journal of Business & Economic Statistics, 20 0 (4): 0 518--529, 2002

  27. [27]

    Regression and weighting methods for causal inference using instrumental variables

    Zhiqiang Tan. Regression and weighting methods for causal inference using instrumental variables. Journal of the American Statistical Association, 101 0 (476): 0 1607--1618, 2006

  28. [28]

    Usaid Awan, Yameng Liu, Sudeepa Roy, Cynthia Rudin, and Alexander Volfovsky

    Tianyu Wang, Marco Morucci, M. Usaid Awan, Yameng Liu, Sudeepa Roy, Cynthia Rudin, and Alexander Volfovsky. FLAME : A fast large-scale almost matching exactly approach to causal inference. arXiv:1707.06315, 2019

  29. [29]

    Econometric analysis of cross section and panel data

    Jeffrey M Wooldridge. Econometric analysis of cross section and panel data. MIT press, 2010