IV regression with distribution-valued outcomes
Pith reviewed 2026-06-29 09:03 UTC · model grok-4.3
The pith
IV Fréchet regression projects IV-weighted quantile curves onto the space of valid distributions, then recovers coefficient functions for distribution-valued outcomes with endogenous covariates.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
IVFR extends global Fréchet regression to the case with endogenous covariates by projecting IV-weighted quantile curves onto the space of valid distributions and then recovering the corresponding regression coefficient functions. The IVFR estimator converges weakly to a mean-zero Gaussian process, the multiplier bootstrap is valid for uniform inference, and the projection provably reduces estimation error in finite samples while guaranteeing valid fitted distributions.
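One plausible reading of "IV-weighted quantile curves" is a just-identified IV fit computed separately at each quantile level, so that every fitted curve is a linear combination of the observed quantile curves. The sketch below illustrates that reading only. The data-generating process, the function name `pointwise_iv`, and the single-instrument setup are assumptions made for illustration, not the paper's estimator.

```python
import numpy as np

def pointwise_iv(Q, x, z):
    """Just-identified IV intercept and slope at each quantile level.

    Q: (n, m) array, row i is unit i's quantile curve on a common grid.
    x: (n,) endogenous regressor; z: (n,) instrument.
    """
    zc = z - z.mean()
    slope = (zc @ Q) / (zc @ x)              # Cov(z, Q(u)) / Cov(z, x) per level u
    intercept = Q.mean(axis=0) - slope * x.mean()
    return intercept, slope

# Hypothetical data: the confounder v moves both x and the outcome distribution.
rng = np.random.default_rng(0)
n = 2000
grid = np.linspace(0.05, 0.95, 19)
z = rng.standard_normal(n)
v = rng.standard_normal(n)
x = z + v
true_slope = 1.0
Q = grid[None, :] + true_slope * x[:, None] + v[:, None]   # location-shift model

_, iv_slope = pointwise_iv(Q, x, z)
xc = x - x.mean()
ols_slope = (xc @ Q) / (xc @ x)              # naive fit that ignores endogeneity

print("max |IV  slope - truth|:", np.abs(iv_slope - true_slope).max())
print("max |OLS slope - truth|:", np.abs(ols_slope - true_slope).max())
```

In this toy design the OLS slope is pushed away from the assumed true value of 1 by the confounder, while the IV slope stays close to it at every quantile level. A fitted curve of this kind need not be monotone in general, which is the problem the projection step is meant to address.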
What carries the argument
The projection of IV-weighted quantile curves onto the space of valid distributions, which reduces estimation error and produces valid fitted distributions.
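As a rough illustration of the projection idea, the sketch below takes a quantile curve on a uniform grid that violates monotonicity and replaces it with its closest nondecreasing curve in the L2 sense, using the pool-adjacent-violators algorithm. Treating "valid distributions" as "nondecreasing quantile curves on a grid" is an assumption here, as are the function name `project_monotone` and the example numbers; the paper's exact projection operator is not shown in this review.

```python
def project_monotone(y):
    """L2 projection of a sequence onto nondecreasing sequences (pool adjacent violators)."""
    blocks = []                                   # each block stores [sum, count]
    for value in y:
        blocks.append([float(value), 1])
        # Merge backwards while the previous block mean exceeds the last block mean.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    out = []
    for s, c in blocks:
        out.extend([s / c] * c)                   # every point in a block gets the block mean
    return out

def squared_error(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

# Hypothetical numbers: a monotone "true" curve and a noisy estimate that crosses twice.
truth = [0.0, 0.2, 0.3, 0.5, 0.6, 0.9]
raw = [0.0, 0.3, 0.2, 0.5, 0.45, 0.9]
projected = project_monotone(raw)

print("projected curve:", projected)
print("squared error before:", squared_error(raw, truth))
print("squared error after: ", squared_error(projected, truth))
```

Because the set of nondecreasing sequences is convex and contains the true curve, this projection cannot move the estimate further from the truth in squared error. That is the mechanism behind a finite-sample improvement of this kind, at least for the curve itself.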
If this is right
- The IVFR estimator converges weakly to a mean-zero Gaussian process.
- The multiplier bootstrap delivers valid uniform confidence bands.
- The projection step reduces integrated mean squared error (IMSE) by up to 63 percent relative to existing methods in simulations.
- In the import competition application the method produces 9-10 percent narrower confidence bands than existing methods and detects wage effects only between the 10th and 35th quantiles.
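To make the uniform-band claim concrete, here is a minimal multiplier-bootstrap sketch for the simplest possible curve estimator, a sample mean of curves on a grid. It stands in for the IVFR estimator, whose influence function is not given in this review; the function name `multiplier_band`, the Gaussian multipliers, and the simulated curves are all assumptions.

```python
import numpy as np

def multiplier_band(curves, alpha=0.05, n_boot=500, seed=0):
    """Uniform (sup-t) confidence band for the mean curve via Gaussian multipliers."""
    rng = np.random.default_rng(seed)
    n, _ = curves.shape
    center = curves.mean(axis=0)
    resid = curves - center                      # estimated influence terms
    sd = resid.std(axis=0, ddof=1)               # pointwise standard deviation
    xi = rng.standard_normal((n_boot, n))        # one multiplier per unit and draw
    boot = xi @ resid / np.sqrt(n)               # bootstrap copies of the limit process
    sup_t = np.max(np.abs(boot) / sd, axis=1)    # studentized sup statistic per draw
    crit = np.quantile(sup_t, 1 - alpha)
    half_width = crit * sd / np.sqrt(n)
    return center - half_width, center + half_width, crit

# Hypothetical curves: random intercept plus random positive slope in u.
rng = np.random.default_rng(0)
n = 200
grid = np.linspace(0.05, 0.95, 19)
a = rng.standard_normal(n)
b = rng.uniform(0.5, 1.5, n)
curves = a[:, None] + b[:, None] * grid[None, :]

lower, upper, crit = multiplier_band(curves)
print("sup-t critical value:", crit)
print("band width at the median:", upper[9] - lower[9])
```

The critical value comes from the supremum over the whole grid rather than from a single quantile level, which is what allows statements such as "effects only between the 10th and 35th quantiles" to be read off one band.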
Where Pith is reading between the lines
- The same projection device could be tested on other distributional outcomes such as county-level income or health distributions.
- The Wasserstein-space framing may allow direct comparison with other optimal-transport approaches to causal distributional analysis.
- Extending the method to local rather than global Fréchet regression would test whether the projection benefit persists under heterogeneity.
Load-bearing premise
The projection of IV-weighted quantile curves onto valid distributions preserves the identifying power of the instruments and does not distort the recovered regression coefficient functions.
What would settle it
A Monte Carlo experiment in which the projected estimator shows higher integrated mean squared error than the unprojected version, or an empirical application in which the projected and unprojected estimates disagree on whether the instruments remain valid after projection.
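A stripped-down version of such a Monte Carlo check might look like the sketch below. It compares the average squared error of a noisy quantile curve with and without an L2 monotone projection. The true curve q(u) = u, the noise level, the grid, and the helper names are hypothetical, and the sketch covers only the curve itself, not the recovered regression coefficient functions, which is where the two estimators could in principle differ.

```python
import random

def project_monotone(y):
    """L2 projection onto nondecreasing sequences (pool adjacent violators)."""
    blocks = []                                   # each block stores [sum, count]
    for value in y:
        blocks.append([float(value), 1])
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    return [s / c for s, c in blocks for _ in range(c)]

def mse(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)

random.seed(0)
truth = [(j + 0.5) / 20 for j in range(20)]       # assumed true curve q(u) = u on a grid
reps, noise_sd = 500, 0.2                         # assumed design, not the paper's

imse_raw = 0.0
imse_proj = 0.0
for _ in range(reps):
    raw = [q + random.gauss(0.0, noise_sd) for q in truth]
    imse_raw += mse(raw, truth) / reps
    imse_proj += mse(project_monotone(raw), truth) / reps

print("IMSE without projection:", imse_raw)
print("IMSE with projection:   ", imse_proj)
```

In this simplified setting the projected curve can never do worse, because projecting onto a convex set that contains the truth does not increase the L2 distance to it. A reversal would therefore have to come from the coefficient-recovery step or from a different error metric.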
Original abstract
We develop IV Fréchet regression (IVFR), an instrumental-variable (IV) method for settings where the outcome is an entire distribution. Framing the problem as an IV regression in 2-Wasserstein space, IVFR extends global Fréchet regression to the case with endogenous covariates. IVFR projects IV-weighted quantile curves onto the space of valid distributions and then recovers the corresponding regression coefficient functions. The projection provably reduces the estimation error in finite samples and guarantees valid fitted distributions. We show that the IVFR estimator converges weakly to a mean-zero Gaussian process and establish the validity of a multiplier bootstrap procedure for uniform inference. In simulations, the projection reduces the integrated mean squared error (IMSE) by up to 63% relative to existing methods. Revisiting the effects of Chinese import competition on the wage distribution within commuting zones, the proposed method produces 9-10% narrower confidence bands than existing methods. Using our novel uniform confidence bands, we find no evidence that import competition reduced wages at the very bottom of the distribution, but only between the 10th and 35th quantile. We also revisit the effect of county food stamp programs on the county's birth weight distribution and find no significant effects.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper develops IV Fréchet regression (IVFR) for instrumental-variables estimation with distribution-valued outcomes, by embedding the problem in 2-Wasserstein space, constructing an IV-weighted quantile curve, projecting it onto the space of valid distributions, and recovering the associated regression coefficient functions. It claims that the resulting estimator converges weakly to a mean-zero Gaussian process, that a multiplier bootstrap is valid for uniform inference, that the projection step reduces integrated mean squared error by up to 63 percent in simulations, and that, in the Chinese import-competition application, it yields 9–10 percent narrower confidence bands while detecting wage effects only between the 10th and 35th quantiles.
Significance. If the asymptotic claims hold, the contribution is a useful methodological advance that extends global Fréchet regression to endogenous regressors while supplying both a finite-sample error-reduction device and uniform inference tools. The reported simulation gains and the empirical finding of quantile-specific effects (rather than uniform wage depression) illustrate practical value. The explicit projection step that guarantees valid fitted distributions is a concrete strength.
major comments (2)
- [theoretical results on asymptotic convergence] Theoretical results on weak convergence (abstract and the section stating the main limit theorem): the claim that the post-projection IVFR estimator converges weakly to a mean-zero Gaussian process is stated without interior-point conditions on the true quantile curve or tangent-cone analysis for the nonlinear projection operator. When the true parameter lies on the boundary of the constraint set (monotonicity or integrability to a proper CDF), the limiting distribution after projection is generally the projection of the pre-projection Gaussian process onto the tangent cone and need not be centered or Gaussian; the paper therefore requires an additional assumption or case analysis to justify both the mean-zero Gaussian limit and the subsequent bootstrap validity.
- [theoretical results on asymptotic convergence] Bootstrap validity claim (same section): the multiplier bootstrap is asserted to be valid for uniform inference, but this rests on the same unverified interior-point condition; without it, the bootstrap may not consistently approximate the (possibly non-Gaussian) limiting law induced by the projection.
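The boundary concern can be seen in a two-point toy example. Suppose the true quantile curve is flat across two grid points, so it sits on the boundary of the monotonicity constraint, and the pre-projection estimator has a standard Gaussian limit. The sketch below simulates the projected limit directly. The two-point grid and the independent standard normal limit are assumptions chosen to keep the example small and are not taken from the paper.

```python
import math
import random

random.seed(1)
draws = 20000
gaps = []
for _ in range(draws):
    z1, z2 = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
    # Projection of (z1, z2) onto the cone {a <= b}: pool the pair when the order is violated.
    if z1 > z2:
        z1 = z2 = 0.5 * (z1 + z2)
    gaps.append(z2 - z1)                          # projected limit of the estimated gap

mean_gap = sum(gaps) / draws
frac_zero = sum(g == 0.0 for g in gaps) / draws

print("mean of projected gap:", mean_gap)         # theory: 1 / sqrt(pi), not 0
print("share exactly zero:   ", frac_zero)        # theory: 1/2, a point mass
print("1 / sqrt(pi):         ", 1.0 / math.sqrt(math.pi))
```

The projected gap has a point mass at zero and a strictly positive mean, so it is neither centered nor Gaussian. This is the kind of behavior an interior-point condition would rule out.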
minor comments (2)
- [abstract] The abstract and introduction would benefit from an explicit list of the maintained assumptions (e.g., moment conditions on the instruments, regularity on the conditional quantile curves) under which the convergence and bootstrap results are derived.
- [simulation section] Simulation tables should report the precise definition of the IMSE metric and the number of Monte Carlo replications used to obtain the 63 percent reduction figure.
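For reference, one common way to define and approximate the IMSE of an estimated coefficient function is written out below. This is a generic definition stated as an assumption; the paper's own definition may differ in the norm, the integration range, or the weighting. Here $R$ denotes the number of Monte Carlo replications, $u_1, \dots, u_M$ an equally spaced grid on $(0,1)$, and $\hat{\beta}^{(r)}$ the estimate from replication $r$.

```latex
\mathrm{IMSE}(\hat{\beta})
  = \mathbb{E}\!\left[ \int_0^1 \bigl\lVert \hat{\beta}(u) - \beta(u) \bigr\rVert^2 \, du \right]
  \approx \frac{1}{R} \sum_{r=1}^{R} \frac{1}{M} \sum_{m=1}^{M}
      \bigl\lVert \hat{\beta}^{(r)}(u_m) - \beta(u_m) \bigr\rVert^2 .
```

Under this definition, a "63 percent reduction" would mean that the ratio of the projected to the unprojected IMSE is about 0.37 in the most favorable design.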
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments on the asymptotic theory. The points raised about boundary behavior are well taken, and we address them point by point below, indicating the revisions we will make.
Point-by-point responses
- Referee: Theoretical results on weak convergence (abstract and the section stating the main limit theorem): the claim that the post-projection IVFR estimator converges weakly to a mean-zero Gaussian process is stated without interior-point conditions on the true quantile curve or tangent-cone analysis for the nonlinear projection operator. When the true parameter lies on the boundary of the constraint set (monotonicity or integrability to a proper CDF), the limiting distribution after projection is generally the projection of the pre-projection Gaussian process onto the tangent cone and need not be centered or Gaussian; the paper therefore requires an additional assumption or case analysis to justify both the mean-zero Gaussian limit and the subsequent bootstrap validity.
  Authors: We agree that the current theorem statement does not explicitly impose an interior-point condition. To justify that the nonlinear projection operator is Hadamard differentiable at the true parameter (hence preserving the mean-zero Gaussian limit), we will add an explicit assumption requiring the true IV-weighted quantile curve to lie in the relative interior of the constraint set. We will also insert a brief remark noting that, on the boundary, the limit would instead be the projection of the pre-projection process onto the tangent cone. These changes will be reflected in both the abstract and the main theorem.
  Revision: yes
- Referee: Bootstrap validity claim (same section): the multiplier bootstrap is asserted to be valid for uniform inference, but this rests on the same unverified interior-point condition; without it, the bootstrap may not consistently approximate the (possibly non-Gaussian) limiting law induced by the projection.
  Authors: The bootstrap consistency argument relies on the same differentiability of the projection map. Once the interior-point assumption is added, the standard multiplier-bootstrap arguments for Hadamard-differentiable functionals apply directly and deliver uniform consistency. We will update the bootstrap theorem statement and the accompanying proof sketch to reference the new assumption explicitly.
  Revision: yes
Circularity Check
No significant circularity; derivation self-contained against external benchmarks
Full rationale
The abstract and description frame IVFR as an extension of Fréchet regression to endogenous covariates via projection onto valid distributions in Wasserstein space, followed by claims of weak convergence to a mean-zero Gaussian process and bootstrap validity. No quoted equations or steps reduce any central result (e.g., the limiting distribution or IMSE reduction) to a fitted parameter renamed as prediction, a self-citation chain, or a definitional tautology. The projection is presented as an additional operator with claimed finite-sample benefits, and the limiting results are stated as theorems without visible reduction to inputs by construction. This is the normal case of an independent methodological contribution.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: Outcomes are random elements of the 2-Wasserstein space of probability distributions.
- Domain assumption: Instrumental variables satisfy the usual relevance and exogeneity conditions extended to the distributional setting.