HIMCE: High-dimensional multiple imputation via covariance-mode updating for neuroimaging and spatiotemporal blocks
Pith reviewed 2026-05-08 17:42 UTC · model grok-4.3
The pith
HIMCE approximates covariance uncertainty via mode updating to impute high-dimensional blocks with lower posterior-mean error than HIMA or screened MICE, at HIMA-like speed and under half the MICE runtime.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
HIMCE is a hybrid multiple-imputation procedure for continuous blocks that preserves the Gaussian conditional imputation law and propagates mean-parameter uncertainty through stochastic coefficient or local-ridge draws. In high-dimensional blocks it approximates covariance uncertainty through covariance-mode updating, optionally with a scalar bridge; in small blocks it restores exact covariance uncertainty through a conditional inverse-Wishart refresh. The authors record the exact Bayesian reference sampler, prove fixed-dimensional posterior consistency, and establish asymptotic equivalence of mode plug-in prediction in total variation.
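The Gaussian conditional imputation law named above can be illustrated with a minimal NumPy sketch. It assumes the working MVN mean `mu` and covariance `sigma` are already available (for example from a mode update or an exact refresh); the function name `impute_row_gaussian` and the NaN convention for missing entries are hypothetical and not taken from the paper.

```python
import numpy as np

def impute_row_gaussian(x, mu, sigma, rng):
    """Draw the missing entries of one row from x_mis | x_obs under N(mu, sigma).

    x     : 1-D array, np.nan marks missing entries (illustrative convention)
    mu    : mean vector of the working MVN model (assumed given)
    sigma : covariance matrix of the working MVN model (assumed given)
    rng   : numpy.random.Generator
    """
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    miss = np.isnan(x)
    obs = ~miss
    out = x.copy()
    if not miss.any():
        return out                      # nothing to impute
    if not obs.any():
        return rng.multivariate_normal(mu, sigma)  # fully missing row: marginal draw
    s_oo = sigma[np.ix_(obs, obs)]
    s_mo = sigma[np.ix_(miss, obs)]
    s_mm = sigma[np.ix_(miss, miss)]
    # Regression coefficients S_mo S_oo^{-1}, computed with a linear solve
    coef = np.linalg.solve(s_oo, s_mo.T).T
    cond_mean = mu[miss] + coef @ (x[obs] - mu[obs])
    cond_cov = s_mm - coef @ s_mo.T
    cond_cov = (cond_cov + cond_cov.T) / 2  # remove round-off asymmetry
    out[miss] = rng.multivariate_normal(cond_mean, cond_cov)
    return out

# Usage: impute the second coordinate given the first
rng = np.random.default_rng(0)
mu = np.zeros(2)
sigma = np.array([[1.0, 0.8], [0.8, 1.0]])
completed = impute_row_gaussian(np.array([1.0, np.nan]), mu, sigma, rng)
```

Repeating the draw with different parameter values across imputations is what would propagate parameter uncertainty; this sketch covers only the conditional draw itself.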
What carries the argument
Covariance-mode updating, which replaces full posterior sampling of the covariance matrix with direct updates at its mode to approximate uncertainty without repeated matrix factorizations.
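One plausible reading of the contrast between a mode update and a full covariance draw is sketched below. It assumes a conjugate inverse-Wishart posterior IW(scale, df) for the covariance with the mean treated as known; the abstract does not give the authors' actual update rule, so the helper names and the prior values here are hypothetical. The mode of IW(scale, df) in dimension p is scale / (df + p + 1), which needs no sampling, whereas a draw requires generating and inverting a Wishart matrix.

```python
import numpy as np

def inverse_wishart_mode(scale, df):
    """Mode of the inverse-Wishart IW(scale, df): scale / (df + p + 1)."""
    p = scale.shape[0]
    return scale / (df + p + 1)

def inverse_wishart_draw(scale, df, rng):
    """One draw from IW(scale, df) for integer df >= p.

    Uses the fact that if W ~ Wishart(df, scale^{-1}) then W^{-1} ~ IW(scale, df).
    """
    p = scale.shape[0]
    z = rng.multivariate_normal(np.zeros(p), np.linalg.inv(scale), size=df)
    w = z.T @ z                      # Wishart(df, scale^{-1}) as a sum of outer products
    return np.linalg.inv(w)

# Illustration with a known zero mean and a hypothetical prior IW(I, p + 2)
rng = np.random.default_rng(0)
p, n = 3, 200
x = rng.standard_normal((n, p))
scale_n = np.eye(p) + x.T @ x        # posterior scale: prior scale + sum of squares
df_n = (p + 2) + n                   # posterior degrees of freedom
sigma_mode = inverse_wishart_mode(scale_n, df_n)       # deterministic plug-in
sigma_draw = inverse_wishart_draw(scale_n, df_n, rng)  # exact-refresh style draw
```

The plug-in `sigma_mode` is the same at every call for fixed data, which is why a mode-based scheme needs some other device to represent covariance uncertainty; the draw varies from call to call at the cost of extra factorizations.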
If this is right
- In the primary spatial benchmark, HIMCE reduces posterior-mean error relative to HIMA and screened MICE.
- Runtime is comparable to HIMA's and stays below half the runtime of MICE.
- Interval coverage improves over HIMA although MICE remains better calibrated.
- Fixed-dimensional posterior consistency and total-variation equivalence of the mode-plug-in predictor hold.
- Randomized rank-cell PIT, PIT-consistent empirical coverage, and marginal overlays supply practical diagnostics.
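The PIT diagnostics in the last item can be sketched generically. The abstract does not define the randomized rank-cell PIT or PIT-consistent coverage, so the code below is an assumption based on the standard randomized rank construction: a held-out value is ranked among its M imputation draws, ties are broken at random, and the value is jittered uniformly within its rank cell. Under a calibrated imputer the resulting PIT values are uniform on (0, 1). Coverage is then read off the same PIT values, which is one way to keep it consistent with the PIT histogram. Both function names are hypothetical.

```python
import numpy as np

def randomized_rank_pit(y_true, draws, rng):
    """Randomized rank-based PIT for held-out values.

    y_true : (n,) masked-then-held-out true values
    draws  : (n, M) imputation draws for the same cells
    Returns (n,) values in [0, 1); uniform if y_true and draws are exchangeable.
    """
    y = np.asarray(y_true, dtype=float)[:, None]
    d = np.asarray(draws, dtype=float)
    m = d.shape[1]
    below = (d < y).sum(axis=1)          # draws strictly below the true value
    ties = (d == y).sum(axis=1)          # draws tied with the true value
    cell = below + rng.integers(0, ties + 1)  # random tie-breaking over rank cells
    return (cell + rng.random(cell.shape[0])) / (m + 1)  # uniform jitter within the cell

def coverage_from_pit(pit, level=0.95):
    """Share of PIT values inside the central interval of the given level."""
    alpha = 1.0 - level
    pit = np.asarray(pit, dtype=float)
    return float(np.mean((pit >= alpha / 2) & (pit <= 1 - alpha / 2)))

# Usage: a calibrated imputer versus one whose draws are too narrow
rng = np.random.default_rng(0)
n, m = 2000, 50
y = rng.standard_normal(n)
pit_ok = randomized_rank_pit(y, rng.standard_normal((n, m)), rng)
pit_narrow = randomized_rank_pit(y, 0.5 * rng.standard_normal((n, m)), rng)
cov_ok = coverage_from_pit(pit_ok, level=0.9)
cov_narrow = coverage_from_pit(pit_narrow, level=0.9)
```

A histogram of `pit_narrow` would pile up near 0 and 1, the usual signature of intervals that are too short, and the same diagnostic applies unchanged to draws from any imputation method.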
Where Pith is reading between the lines
- The same mode-updating device could be inserted into other imputation pipelines that already rely on a multivariate normal working model.
- Because the method separates the conditional imputation law from the covariance refresh, it offers a modular template for hybrid imputers that mix exact and approximate steps.
- The PIT-based diagnostics could be applied directly to any chained-equation or data-augmentation procedure to compare calibration across methods.
- Asymptotic equivalence in total variation suggests that, for fixed dimension and growing sample size, HIMCE predictions converge to those of the exact sampler.
Load-bearing premise
The multivariate normal working model supplies a coherent posterior predictive target and the covariance-mode updating approximates the full covariance uncertainty closely enough to avoid substantial bias.
What would settle it
A simulation study on high-dimensional MVN data with known parameters in which HIMCE produces higher posterior-mean imputation error or worse interval calibration than exact MVN data augmentation would falsify the central performance claims.
Original abstract
High-dimensional neuroimaging and spatiotemporal blocks often contain structured missingness from acquisition artifacts, preprocessing failures, and sensor dropout. Multiple imputation propagates uncertainty, but fully conditional specification methods such as multivariate imputation by chained equations (MICE) can be slow or unstable when block dimension is large and correlations are strong. A multivariate normal (MVN) working model provides a coherent posterior predictive target and an exact data augmentation sampler, but repeated covariance sampling and matrix factorizations become costly in large dimensions. We propose High-dimensional Imputation via covariance Mode and Chained Equations (HIMCE), a hybrid multiple-imputation procedure for continuous blocks. Relative to exact MVN data augmentation, HIMCE preserves the Gaussian conditional imputation law and propagates mean-parameter uncertainty through stochastic coefficient or local-ridge draws. In high-dimensional blocks, it approximates covariance uncertainty through covariance-mode updating, optionally with a scalar bridge; in small blocks, it can restore exact covariance uncertainty through a conditional inverse-Wishart refresh. We record the exact Bayesian reference sampler and prove fixed-dimensional posterior consistency and asymptotic equivalence of mode plug-in prediction in total variation. We also develop diagnostics based on randomized rank-cell probability integral transform (PIT), PIT-consistent empirical coverage, and marginal distribution overlays. In the primary spatial benchmark, HIMCE improves posterior-mean error relative to HIMA and screened MICE, runs at HIMA-like speed and below half the MICE runtime, and improves interval coverage over HIMA, although MICE remains better calibrated. A repeated low-dimensional NHANES illustration shows improved coverage with competitive point prediction.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes HIMCE, a hybrid multiple-imputation method for high-dimensional continuous blocks with structured missingness in neuroimaging and spatiotemporal data. It preserves the Gaussian conditional imputation law from an MVN working model, propagates mean-parameter uncertainty via stochastic draws, and approximates covariance uncertainty through covariance-mode updating (optionally with a scalar bridge) in large blocks while allowing exact inverse-Wishart refresh in small blocks. The authors record an exact Bayesian reference sampler, prove fixed-dimensional posterior consistency and asymptotic equivalence of mode plug-in prediction in total variation, introduce PIT-based diagnostics (randomized rank-cell PIT, empirical coverage, marginal overlays), and report benchmark results showing reduced posterior-mean error and runtime versus HIMA and screened MICE, plus improved coverage over HIMA (though MICE is better calibrated) in a primary spatial benchmark, with a repeated low-dimensional NHANES illustration.
Significance. If the covariance-mode approximation controls bias adequately, HIMCE would supply a practical, faster alternative to full MVN data augmentation and MICE for high-dimensional blocks, supported by explicit consistency proofs (fixed dimension), new PIT diagnostics, and concrete benchmark gains in point estimation and speed. The work credits the exact reference sampler and develops falsifiable diagnostics that could aid reproducibility in neuroimaging imputation.
major comments (1)
- [Abstract] The manuscript states proofs of fixed-dimensional posterior consistency and asymptotic equivalence of mode plug-in prediction in total variation, yet the central empirical claims and motivating regime concern high-dimensional blocks (where dimension may grow with sample size). No extension, rate, or bound is supplied showing that the total-variation equivalence or the bias introduced by covariance-mode updating remains controlled when p grows with n, which is the regime in which the method is benchmarked and motivated.
minor comments (1)
- [Abstract] The description of the optional scalar bridge and its effect on the approximation could be expanded with a brief equation or pseudocode to clarify when it is activated versus the full mode update.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments on our manuscript. We address the single major comment below.
- Referee: [Abstract] The manuscript states proofs of fixed-dimensional posterior consistency and asymptotic equivalence of mode plug-in prediction in total variation, yet the central empirical claims and motivating regime concern high-dimensional blocks (where dimension may grow with sample size). No extension, rate, or bound is supplied showing that the total-variation equivalence or the bias introduced by covariance-mode updating remains controlled when p grows with n, which is the regime in which the method is benchmarked and motivated.
Authors: We agree that the theoretical results are derived under fixed dimension p, as stated in the manuscript. The covariance-mode updating is introduced precisely to enable scalable approximation of covariance uncertainty in the high-dimensional regime where exact inverse-Wishart sampling becomes computationally infeasible; the procedure preserves the exact conditional Gaussian imputation law from the MVN working model while propagating mean-parameter uncertainty via stochastic draws. No rates or bounds are supplied for the growing-p case, and extending the total-variation equivalence result to p = o(n) or similar regimes would require substantial additional technical work that lies outside the present scope. The high-dimensional performance claims rest on the design of the approximation together with the reported benchmark evidence (reduced posterior-mean error and competitive coverage relative to HIMA in the primary spatial example). We will revise the abstract to make the fixed-dimensional scope of the consistency and equivalence statements explicit while retaining the description of the empirical high-dimensional results. revision: yes
Circularity Check
No significant circularity detected; derivation remains self-contained
The paper presents HIMCE as a hybrid procedure that preserves the Gaussian conditional law while approximating covariance uncertainty via mode updating (with optional scalar bridge) in large blocks and exact inverse-Wishart refresh in small blocks. It separately records an exact Bayesian reference sampler and states proofs of fixed-dimensional posterior consistency plus total-variation equivalence of mode plug-in prediction. These elements are introduced as distinct from the approximation itself; no equation or claim reduces a prediction, consistency result, or empirical performance metric to a fitted parameter or self-referential definition by construction. External benchmarks against HIMA and MICE further anchor the claims without load-bearing self-citation loops.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Data blocks follow a multivariate normal distribution
- ad hoc to paper: Covariance-mode updating approximates full posterior covariance uncertainty sufficiently well
Reference graph
Works this paper leans on
- [1] D. B. Rubin, Multiple Imputation for Nonresponse in Surveys, Wiley, New York, 2004.
- [2] S. van Buuren, K. Groothuis-Oudshoorn, mice: Multivariate imputation by chained equations in R, Journal of Statistical Software 45 (3) (2011) 1–67.
- [3] S. van Buuren, Flexible Imputation of Missing Data, 2nd Edition, Chapman and Hall/CRC, Boca Raton, 2018.
- [4] J. L. Schafer, Analysis of Incomplete Multivariate Data, Chapman and Hall/CRC, London, 1997.
- [5] J. L. Schafer, J. W. Graham, Missing data: Our view of the state of the art, Psychological Methods 7 (2) (2002) 147–177.
- [6] R. J. A. Little, D. B. Rubin, Statistical Analysis with Missing Data, 3rd Edition, Wiley, Hoboken, 2019.
- [7]
- [8] T. Lu, P. Kochunov, C. Chen, H.-H. Huang, L. E. Hong, S. Chen, A new multiple imputation method for high-dimensional neuroimaging data, Human Brain Mapping 46 (5) (2025) e70161.
- [9] C. J. Champion, Empirical Bayesian estimation of normal variances and covariances, Journal of Multivariate Analysis 87 (1) (2003) 60–79. doi: 10.1016/S0047-259X(02)00076-3.