Improving estimation of the volume under the ROC surface when data are missing not at random
Pith reviewed 2026-05-25 19:05 UTC · model grok-4.3
The pith
Mean score equations produce consistent estimators for the volume under the ROC surface under nonignorable verification bias.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By deriving and solving mean score equations from a parametric regression model of the verification process, four estimators for the volume under the ROC surface can be constructed that correct for nonignorable verification bias, achieve consistency, and possess asymptotic normality, with the disease model needed only for verified subjects.
What carries the argument
The mean score equation derived from the parametric verification model. Solving it requires a preliminary fit of the disease model on verified subjects; the estimated verification and disease probabilities are then used to correct the VUS calculation for missingness.
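To make the mechanism concrete, here is a sketch of the form a mean score equation typically takes in this setting. The notation is assumed for illustration and is not taken from the paper: $V_i$ is the verification indicator, $T_i$ the test result (with any covariates), $D_i \in \{1,2,3\}$ the disease class, $\pi_k(T;\beta) = \Pr(V=1 \mid T, D=k)$ the verification model, and $S_k(\beta; V, T)$ the full-data score of that model when $D=k$.

```latex
\sum_{i=1}^{n}\Big[ V_i\, S_{D_i}(\beta; V_i, T_i)
  + (1-V_i) \sum_{k=1}^{3} \Pr(D_i = k \mid T_i, V_i = 0)\, S_k(\beta; 0, T_i) \Big] = 0,
\qquad
\Pr(D = k \mid T, V = 0)
  = \frac{\Pr(D = k \mid T, V = 1)\,\{1-\pi_k(T;\beta)\}/\pi_k(T;\beta)}
         {\sum_{l=1}^{3} \Pr(D = l \mid T, V = 1)\,\{1-\pi_l(T;\beta)\}/\pi_l(T;\beta)} .
```

The second identity follows from Bayes' rule. It suggests why a disease model fitted on verified subjects alone can suffice: the class probabilities for unverified subjects are recovered by tilting the verified-subject probabilities with the verification odds. The paper's exact equations may differ from this sketch.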
If this is right
- The estimators are consistent when the verification model is correct.
- Asymptotic normality holds and supports inference procedures.
- Instrumental variables can be used to address identifiability in the verification model.
- The disease model needs to be specified only for verified subjects.
- Four distinct estimators can be formed from different combinations of the estimated probabilities.
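As a rough illustration of how different combinations of estimated probabilities could yield different estimators, the sketch below plugs class-membership surrogates into a weighted empirical VUS. The function names, the two surrogate types shown (imputation-style and inverse-probability-weighting-style), and the array layout are assumptions made for this example. The paper's four estimators are not spelled out in this summary and may be defined differently.

```python
import numpy as np

def weighted_vus(t, w):
    """Weighted empirical VUS for three ordered classes.

    t: (n,) test results; w: (n, 3) nonnegative class surrogates.
    Ties in t count as misorderings (a continuous test is assumed).
    """
    t = np.asarray(t, dtype=float)
    w = np.asarray(w, dtype=float)
    a, b, c = w[:, 0], w[:, 1], w[:, 2]
    less = (t[:, None] < t[None, :]).astype(float)  # less[i, j] = 1{t_i < t_j}
    # Numerator: sum over (i, j, k) of a_i * b_j * c_k * 1{t_i < t_j < t_k}.
    num = ((a @ less) * b) @ (less @ c)
    # Denominator: sum over distinct (i, j, k) of a_i * b_j * c_k,
    # obtained from the unrestricted sum by inclusion-exclusion.
    den = (a.sum() * b.sum() * c.sum()
           - c.sum() * (a * b).sum()
           - b.sum() * (a * c).sum()
           - a.sum() * (b * c).sum()
           + 2.0 * (a * b * c).sum())
    return num / den

def imputation_surrogate(d_onehot, v, rho):
    """Observed class for verified subjects, estimated probabilities rho otherwise."""
    verified = np.asarray(v).astype(bool)[:, None]
    return np.where(verified, d_onehot, rho)

def ipw_surrogate(d_onehot, v, pi):
    """Verified subjects weighted by 1 / pi; unverified subjects get zero weight."""
    verified = np.asarray(v).astype(bool)[:, None]
    scaled = np.asarray(d_onehot, dtype=float) / np.asarray(pi, dtype=float)[:, None]
    return np.where(verified, scaled, 0.0)

if __name__ == "__main__":
    t = np.array([0.2, 0.9, 1.7, 0.5, 1.1])
    v = np.array([1, 1, 1, 0, 0])
    d_onehot = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
                         [0, 0, 0], [0, 0, 0]], dtype=float)
    rho = np.full((5, 3), 1.0 / 3.0)  # placeholder disease probabilities
    pi = np.full(5, 0.6)              # placeholder verification probabilities
    print(weighted_vus(t, imputation_surrogate(d_onehot, v, rho)))
    print(weighted_vus(t, ipw_surrogate(d_onehot, v, pi)))
```

In a real analysis `rho` and `pi` would come from the fitted disease and verification models rather than the placeholder constants used here.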
Where Pith is reading between the lines
- The mean-score route may prove simpler to implement than full-likelihood maximization in some settings.
- Similar score-equation adjustments could be explored for other diagnostic accuracy summaries under missing data.
- Relaxing the verification model to semiparametric form would be a natural next step to test robustness.
Load-bearing premise
The parametric regression model for the verification process must be correctly specified.
What would settle it
A simulation in which the verification model is misspecified and the resulting VUS estimators exhibit persistent bias or fail to converge to the true value.
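A check of this kind could be prototyped along the following lines. This is a toy design, not the paper's simulation study: the class means, the verification mechanism `pi_true`, the sample size, and all helper names are hypothetical. It also swaps in a plain inverse-probability-weighted VUS rather than the paper's mean-score estimators, comparing weights from the true nonignorable verification probabilities against weights from a misspecified model that drops the dependence on disease status.

```python
import numpy as np

def expit(x):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-x))

def ipw_vus(t, d, v, pi):
    """IPW-type VUS for classes 1 < 2 < 3; only verified subjects (v == 1) contribute."""
    t = np.asarray(t, dtype=float)
    d = np.asarray(d)
    w = np.asarray(v, dtype=float) / np.asarray(pi, dtype=float)
    w1, w2, w3 = (w * (d == k) for k in (1, 2, 3))
    less = (t[:, None] < t[None, :]).astype(float)  # less[i, j] = 1{t_i < t_j}
    num = ((w1 @ less) * w2) @ (less @ w3)
    den = w1.sum() * w2.sum() * w3.sum()
    return num / den

def fit_logistic(x, y, n_iter=25):
    """Fitted probabilities from a logistic regression of y on (1, x), via Newton-Raphson."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, X.T @ (y - p))
    return expit(X @ beta)

def simulate(n=1000, seed=0):
    """One replicate; the disease class d is known here only because the data are simulated."""
    rng = np.random.default_rng(seed)
    d = rng.choice([1, 2, 3], size=n, p=[0.4, 0.35, 0.25])
    t = rng.normal(loc=1.2 * (d - 1), scale=1.0)
    # Hypothetical nonignorable mechanism: verification depends on t AND on d.
    pi_true = 0.2 + 0.75 * expit(-1.0 + 0.8 * t + 1.5 * (d == 1) - 1.0 * (d == 3))
    v = (rng.random(n) < pi_true).astype(float)
    ones = np.ones(n)
    pi_mar = fit_logistic(t, v)  # misspecified: ignores d (and the true functional form)
    return {
        "full_data": ipw_vus(t, d, ones, ones),
        "complete_case": ipw_vus(t, d, v, ones),
        "ipw_true_pi": ipw_vus(t, d, v, pi_true),
        "ipw_misspecified_pi": ipw_vus(t, d, v, pi_mar),
    }

if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Repeating `simulate` over many seeds and increasing `n`, then checking whether the gap between `ipw_misspecified_pi` and `full_data` fails to shrink, would give the kind of evidence described above.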
Original abstract
In this paper, we propose a mean score equation-based approach to estimate the the volume under the receiving operating characteristic (ROC) surface (VUS) of a diagnostic test, under nonignorable (NI) verification bias. The proposed approach involves a parametric regression model for the verification process, which accommodates for possible NI missingness in the disease status of sample subjects, and may use instrumental variables, which help avoid possible identifiability problems. In order to solve the mean score equation derived by the chosen verification model, we preliminarily need to estimate the parameters of a model for the disease process, but its specification is required only for verified subjects under study. Then, by using the estimated verification and disease probabilities, we obtain four verification bias-corrected VUS estimators, which are alternative to those recently proposed by To Duc et al. (2019), based on a full likelihood approach. Consistency and asymptotic normality of the new estimators are established. Simulation experiments are conducted to evaluate their finite sample performances, and an application to a dataset from a research on epithelial ovarian cancer is presented.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a mean score equation-based approach to estimate the volume under the ROC surface (VUS) of a diagnostic test under nonignorable verification bias. It specifies a parametric regression model for the verification process (accommodating NI missingness and possibly using instrumental variables), estimates parameters of a disease model only for verified subjects, derives four verification bias-corrected VUS estimators as alternatives to the full-likelihood method of To Duc et al. (2019), establishes consistency and asymptotic normality of the estimators, and evaluates performance via simulation experiments and an application to epithelial ovarian cancer data.
Significance. If the consistency and asymptotic normality results hold under the stated parametric assumptions, the work supplies a computationally lighter alternative to full-likelihood estimation for VUS correction in the presence of nonignorable verification bias, a frequent challenge in diagnostic accuracy studies. The explicit provision of simulation studies and a real-data example, together with the theoretical guarantees, strengthens the methodological contribution to statistical methods for incomplete diagnostic data.
major comments (2)
- [Theoretical results on consistency] The consistency and asymptotic normality results (abstract and theoretical development) are established only under correct specification of the parametric verification model for the entire sample; this assumption is load-bearing for the NI correction yet the manuscript provides no sensitivity analyses or robustness checks when the verification model is misspecified.
- [Estimation procedure and identifiability] The approach invokes instrumental variables to restore identifiability under NI missingness (abstract), but supplies neither formal conditions on IV validity/strength nor any diagnostic procedures or sensitivity checks for the chosen instruments; this directly affects practical use of the four proposed estimators.
minor comments (1)
- [Abstract] Abstract contains a typographical error ('estimate the the volume').
Simulated Author's Rebuttal
We thank the referee for the constructive comments and positive overall assessment of our work. We address each major comment point by point below, indicating planned revisions where the manuscript can be strengthened.
Point-by-point responses
- Referee: The consistency and asymptotic normality results (abstract and theoretical development) are established only under correct specification of the parametric verification model for the entire sample; this assumption is load-bearing for the NI correction yet the manuscript provides no sensitivity analyses or robustness checks when the verification model is misspecified.
  Authors: We agree that the consistency and asymptotic normality results are derived under the assumption of correct specification of the parametric verification model, which is standard for such parametric estimators. The manuscript does not include sensitivity analyses for misspecification. In the revised version, we will add simulation experiments that deliberately misspecify the verification model to assess the robustness of the four proposed estimators.
  Revision: yes
- Referee: The approach invokes instrumental variables to restore identifiability under NI missingness (abstract), but supplies neither formal conditions on IV validity/strength nor any diagnostic procedures or sensitivity checks for the chosen instruments; this directly affects practical use of the four proposed estimators.
  Authors: The manuscript mentions that instrumental variables may be used to help with identifiability under nonignorable missingness but does not provide formal conditions on validity or strength, nor diagnostics or sensitivity checks. We will revise the manuscript to include a dedicated discussion of these conditions (drawing on standard IV literature), along with practical guidance on diagnostics and sensitivity checks for the instruments.
  Revision: yes
Circularity Check
No significant circularity; estimators derived from independent parametric models with explicit consistency proof
The paper specifies separate parametric regression models for the verification process (using instrumental variables for identifiability under NI missingness) and the disease process (only on verified subjects). Mean-score equations are solved using these fitted probabilities to produce four VUS estimators. Consistency and asymptotic normality are established directly under the assumption of correct verification-model specification. The method is presented as an alternative to the authors' own prior full-likelihood approach (To Duc et al. 2019), but the new derivation does not reduce to that prior work or to any fitted quantity by construction. No self-definitional steps, fitted inputs renamed as predictions, or load-bearing self-citations appear in the derivation chain. The result remains falsifiable via model misspecification checks external to the fitted values.
Axiom & Free-Parameter Ledger
free parameters (2)
- verification model parameters
- disease model parameters
axioms (1)
- domain assumption: The chosen parametric regression model for the verification process is correctly specified and accommodates nonignorable missingness.
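For concreteness, a nonignorable verification model of the kind this assumption refers to is often written in logistic form. The specification below is an assumed example: the symbols $\beta$, $\lambda$, and the covariate split $A = (A_1, A_2)$ are not taken from the paper, and $D_k = 1\{D = k\}$.

```latex
\Pr(V = 1 \mid T, A, D)
  = \frac{\exp\{\beta_0 + \beta_1 T + \beta_2^\top A_1 + \lambda_1 D_1 + \lambda_2 D_2\}}
         {1 + \exp\{\beta_0 + \beta_1 T + \beta_2^\top A_1 + \lambda_1 D_1 + \lambda_2 D_2\}}
```

Under this form, missingness is nonignorable whenever $(\lambda_1, \lambda_2) \neq (0, 0)$, and the model reduces to missing at random when both are zero. In the instrumental-variable approach of the nonignorable nonresponse literature, $A_2$ would play the role of the instrument: it is related to disease status but excluded from the verification model.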
Reference graph
Works this paper leans on
- [1] Goeman, J. J. and le Cessie, S. (2006). A goodness-of-fit test for multinomial logistic regression. Biometrics, 62(4):980–985.
- [2] Kim, J. K. and Shao, J. (2013). Statistical methods for handling incomplete data. Chapman and Hall/CRC.
- [3] Liu, D. and Zhou, X. H. (2010). A model for adjusting for nonignorable verification bias in estimation of the ROC curve and its area with likelihood-based approach. Biometrics, 66(4):1119–1128.
- [4] Louis, T. A. (1982). Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 44(2):226–233.
- [5] Mor, G., Visintin, I., Lai, Y., Zhao, H., Schwartz, P., Rutherford, T., Yue, L., Bray-Ward, P., and Ward, D. C. (2005). Serum protein markers for early detection of ovarian cancer. Proceedings of the National Academy of Sciences, 102(21):7677–7682.
- [6] Morikawa, K., Kim, J. K., and Kano, Y. (2017). Semiparametric maximum likelihood estimation with data missing not at random. Canadian Journal of Statistics, 45(4):393–409.
- [7] Nakas, C. T. and Yiannoutsos, C. T. (2004). Ordered multiple-class ROC analysis with continuous measurements. Statistics in Medicine, 23(22):3437–3449.
- [8] Riddles, M. K., Kim, J. K., and Im, J. (2016). A propensity-score-adjustment method for nonignorable nonresponse. Journal of Survey Statistics and Methodology, 4(2):215–245. Scurfield, B. K. (1996). Multiple-event forced-choice tasks in the theory of signal detectability. Journal of Mathematical Psychology, 40(3):253–269. To Duc, K., Chiogna, M., and Adi...
- [9] Visintin, I., Feng, Z., Longton, G., Ward, D. C., Alvero, A. B., Lai, Y., Tenthorey, J., Leiser, A., Flores-Saaib, R., Yu, H., et al. (2008). Diagnostic markers for early detection of ovarian cancer. Clinical Cancer Research, 14(4):1065–1072.
- [10] Wang, S., Shao, J., and Kim, J. K. (2014). An instrumental variable approach for identification and estimation with nonignorable nonresponse. Statistica Sinica, 24(3):1097–1116.
- [11] Yu, W., Kim, J. K., and Park, T. (2018). Estimation of area under the ROC curve under the nonignorable verification bias. Statistica Sinica, 28(4):2149–2166.
- [12] Zhang, Y. and Alonzo, T. A. (2018). Estimation of the volume under the receiver-operating characteristic surface adjusting for non-ignorable verification bias. Statistical Methods in Medical Research, 27(3):715–739.