A Direct Variance Estimation (DiVE) for Meta-Analysis of Median Differences
Pith reviewed 2026-05-25 04:08 UTC · model grok-4.3
The pith
DiVE directly estimates the variance of a pooled median difference from study medians and sample sizes alone.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
DiVE directly estimates the variance of the pooled median difference using only study-level median differences and their sample sizes. This avoids the need for dispersion measures such as quartiles or ranges that two-stage methods require. A comprehensive simulation study across a wide range of distributional scenarios shows that DiVE performs comparably to or better than conventional two-stage methods, with clear advantages when the number of studies is small. A re-analysis of published meta-analyses demonstrates that DiVE enables the inclusion of studies lacking dispersion statistics, leading to a more comprehensive and potentially less biased synthesis of evidence.
What carries the argument
The Direct Variance Estimation (DiVE) formula that computes the variance of the pooled median difference directly from the vector of study medians and study sample sizes.
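This page does not reproduce the formula itself, so the following is only a minimal sketch of what a direct estimator of this kind could look like, not the authors' Eq. (4). It assumes, as the simulated rebuttal further down describes, that all studies share one underlying distribution, so that Var(d_i) = sigma2 / n_i with a common sigma2. The function name `direct_pooled_variance`, the sample-size weights, and the use of total study size as n_i are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def direct_pooled_variance(
    median_diffs: Sequence[float], sample_sizes: Sequence[int]
) -> Tuple[float, float]:
    """Pooled median difference and a direct variance estimate.

    ASSUMPTION (not from the paper): Var(d_i) = sigma2 / n_i with one
    common sigma2, i.e. no between-study heterogeneity in location or scale.
    """
    k = len(median_diffs)
    if k != len(sample_sizes) or k < 2:
        raise ValueError("need at least two studies, one size per study")

    total_n = sum(sample_sizes)
    # Sample-size-weighted pooled estimate.
    pooled = sum(n * d for n, d in zip(sample_sizes, median_diffs)) / total_n
    # Cross-study spread of the medians, rescaled by n_i, estimates sigma2.
    sigma2_hat = sum(
        n * (d - pooled) ** 2 for n, d in zip(sample_sizes, median_diffs)
    ) / (k - 1)
    # Under the common-sigma2 assumption, Var(pooled) = sigma2 / total_n.
    return pooled, sigma2_hat / total_n


pooled, var_pooled = direct_pooled_variance([1.0, 2.0, 3.0], [10, 10, 10])
```

Because the variance is read off the spread of the study medians, any genuine between-study heterogeneity would be absorbed into the same quantity. That is the concern raised in the first major comment of the referee report.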
If this is right
- Studies lacking quartiles or ranges can now be retained rather than dropped from the meta-analysis.
- Pooled estimates become feasible from a larger and potentially less selected set of evidence.
- Performance gains appear most clearly when the total number of studies is small.
- The same median-difference data can be re-analyzed under both DiVE and two-stage methods for comparison.
Where Pith is reading between the lines
- DiVE could be adapted to other location measures that are commonly reported without accompanying scale statistics.
- Journals might begin to require only medians and sample sizes for certain trial summaries if DiVE becomes standard.
- Past meta-analyses that excluded many studies for missing dispersion data could be revisited with the new estimator.
- A practical next step would be to apply DiVE to a registry of published median-difference trials and compare the resulting pooled intervals with those from restricted two-stage analyses.
Load-bearing premise
The direct variance formula remains valid and unbiased across the range of real-world distributions and study sizes where median differences are reported, without requiring the dispersion statistics that two-stage methods use.
What would settle it
A simulation in which DiVE produces empirical coverage rates for the pooled median difference that fall well below the nominal level in heavy-tailed distributions or with very small per-study sample sizes.
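A coverage check of this kind could be sketched as follows. Because the DiVE formula is not given on this page, `sketch_interval` is a hypothetical stand-in (sample-size weights, common-variance assumption), and the critical value, the per-arm size, and the two samplers are illustrative choices. Group 1 is shifted by a constant and group 2 is not, so the true median difference equals the shift.

```python
import math
import random
from statistics import median
from typing import Callable, List, Sequence, Tuple


def sketch_interval(
    diffs: Sequence[float], sizes: Sequence[int], critical_value: float
) -> Tuple[float, float]:
    """Hypothetical stand-in for DiVE, NOT the paper's estimator."""
    k = len(diffs)
    total_n = sum(sizes)
    pooled = sum(n * d for n, d in zip(sizes, diffs)) / total_n
    sigma2_hat = sum(n * (d - pooled) ** 2 for n, d in zip(sizes, diffs)) / (k - 1)
    half_width = critical_value * math.sqrt(sigma2_hat / total_n)
    return pooled - half_width, pooled + half_width


def empirical_coverage(
    draw: Callable[[random.Random], float],
    k: int,
    n_per_arm: int,
    shift: float,
    critical_value: float,
    replicates: int,
    seed: int = 0,
) -> float:
    """Share of replicates whose interval contains the true difference."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        diffs: List[float] = []
        for _ in range(k):
            arm1 = [draw(rng) + shift for _ in range(n_per_arm)]
            arm2 = [draw(rng) for _ in range(n_per_arm)]
            diffs.append(median(arm1) - median(arm2))
        lower, upper = sketch_interval(diffs, [2 * n_per_arm] * k, critical_value)
        hits += lower <= shift <= upper
    return hits / replicates


def draw_normal(rng: random.Random) -> float:
    return rng.gauss(0.0, 1.0)


def draw_cauchy(rng: random.Random) -> float:
    # Heavy-tailed case via the inverse CDF of the standard Cauchy.
    return math.tan(math.pi * (rng.random() - 0.5))


T_975_DF4 = 2.776  # 97.5th percentile of Student's t with k - 1 = 4 df

coverage_normal = empirical_coverage(draw_normal, 5, 20, 1.0, T_975_DF4, 1000)
coverage_cauchy = empirical_coverage(draw_cauchy, 5, 20, 1.0, T_975_DF4, 1000)
```

Comparing each rate against the nominal 95% level, and repeating with smaller `n_per_arm`, would show whether this stand-in degrades in the settings named above. It says nothing directly about DiVE itself.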
Original abstract
Meta-analyses of two-group studies that report median differences typically rely on methods that require, in addition to the median difference and sample size, summary measures of dispersion such as quartiles or ranges. Studies that do not report such statistics are often excluded from the meta-analysis. Existing two-stage approaches first estimate the asymptotic variance of the median difference within each study under parametric assumptions, and then combine these study-specific estimates to obtain the pooled median difference and its variance. We propose Direct Variance Estimation (DiVE), a method that directly estimates the variance of the pooled difference using only study-level median differences and their sample sizes. A comprehensive simulation study across a wide range of distributional scenarios shows that DiVE performs comparably to or better than conventional two-stage methods, with clear advantages when the number of studies is small. A re-analysis of published meta-analyses demonstrates that DiVE enables the inclusion of studies lacking dispersion statistics, leading to a more comprehensive and potentially less biased synthesis of evidence.
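For contrast, the second stage of a conventional two-stage approach can be sketched as standard fixed-effect inverse-variance pooling. The per-study variances are taken as given here; in practice they come from the first stage, which is where quartiles or ranges are needed. The function name is an illustrative choice, and the paper's comparators may use a different pooling model.

```python
from typing import Sequence, Tuple


def inverse_variance_pool(
    estimates: Sequence[float], variances: Sequence[float]
) -> Tuple[float, float]:
    """Fixed-effect inverse-variance pooling of study-level estimates."""
    if len(estimates) != len(variances) or not estimates:
        raise ValueError("need one variance per estimate")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * d for w, d in zip(weights, estimates)) / total_weight
    # Variance of the pooled estimate is the reciprocal of the total weight.
    return pooled, 1.0 / total_weight


pooled, var_pooled = inverse_variance_pool([1.0, 3.0], [1.0, 1.0])
```

A study that reports no dispersion statistic has no entry in `variances`, which is why such studies are typically dropped under this scheme.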
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces Direct Variance Estimation (DiVE) for meta-analysis of two-group median differences. Unlike conventional two-stage methods that first estimate study-specific asymptotic variances (requiring dispersion statistics such as quartiles) and then pool, DiVE claims to estimate the variance of the pooled median difference directly from the vector of study medians and sample sizes alone. Simulations across distributional scenarios are reported to show that DiVE performs comparably to or better than two-stage estimators, with advantages at small k; a re-analysis of published meta-analyses is used to illustrate that DiVE permits inclusion of studies lacking dispersion data.
Significance. If the direct estimator is unbiased and its performance generalizes beyond the simulated designs, the method would allow meta-analysts to retain studies that currently must be dropped for lack of quartiles or ranges, thereby reducing selection bias and increasing the number of includable studies, especially in small-k settings common in median-difference meta-analyses.
major comments (3)
- [§2, Eq. (4)] §2, Eq. (4): the claimed direct variance formula is presented without an explicit derivation showing how the term involving the unknown density f(m) at the median is recovered from medians and n_i alone; the asymptotic variance 1/(4n f(m)^2) per arm cannot be obtained from the median value and sample size without either a shape assumption or an implicit proxy that uses cross-study median variation, which would conflate sampling variance with heterogeneity.
- [§4] Simulation design (described in §4): all scenarios appear to hold the underlying dispersion (hence f(m)) constant across studies within each replicate; this does not test the heterogeneous-scale case that would be required to establish robustness when real-world studies report medians on different measurement scales or with different spreads.
- [Table 3] Table 3, k=5 rows: the reported coverage probabilities and MSE advantages for DiVE are shown only under the constant-dispersion design; without results under scale heterogeneity, the claim of “clear advantages when the number of studies is small” remains conditional on an assumption that may not hold in the target application.
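The asymptotic variance cited in the first major comment can be made concrete with a short calculation. The sketch below evaluates 1/(4n f(m)^2) for two normal populations that have the same median and the same n but different spread, and adds the two per-arm terms for a median difference under the assumption of independent arms. The normal distributions and the function names are illustrative choices.

```python
import math
from statistics import NormalDist


def median_variance(n: int, density_at_median: float) -> float:
    """Large-sample variance of a sample median: 1 / (4 n f(m)^2)."""
    return 1.0 / (4.0 * n * density_at_median ** 2)


def median_difference_variance(n1: int, f1: float, n2: int, f2: float) -> float:
    """Variance of a difference of medians, assuming independent arms."""
    return median_variance(n1, f1) + median_variance(n2, f2)


n = 100
narrow = NormalDist(mu=0.0, sigma=1.0)
wide = NormalDist(mu=0.0, sigma=2.0)

# Same median (0) and same n, yet the variances differ by a factor of four,
# because the density at the median scales as 1 / sigma.
var_narrow = median_variance(n, narrow.pdf(0.0))
var_wide = median_variance(n, wide.pdf(0.0))
var_diff = median_difference_variance(n, narrow.pdf(0.0), n, wide.pdf(0.0))
```

Since `var_narrow` and `var_wide` come from identical medians and sample sizes, the example illustrates why the referee asks where the density information enters a formula that uses only those two inputs.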
minor comments (2)
- [§2] Notation for the pooled estimator and its variance should be introduced once in §2 and used consistently thereafter; occasional reuse of “V” for both study-specific and pooled quantities is confusing.
- [§5] The re-analysis section would benefit from a table listing, for each meta-analysis, how many additional studies are recovered by DiVE and the resulting change in the pooled point estimate and CI width.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our manuscript. We address each major comment below and indicate where revisions will be made to improve clarity and robustness.
-
Referee: [§2, Eq. (4)] §2, Eq. (4): the claimed direct variance formula is presented without an explicit derivation showing how the term involving the unknown density f(m) at the median is recovered from medians and n_i alone; the asymptotic variance 1/(4n f(m)^2) per arm cannot be obtained from the median value and sample size without either a shape assumption or an implicit proxy that uses cross-study median variation, which would conflate sampling variance with heterogeneity.
Authors: We agree that an explicit derivation of Equation (4) is required. In the revised manuscript we will insert a full derivation that shows how the density term is recovered under the fixed-effect model using only the vector of medians and the n_i values. The approach relies on the fact that, when all studies share the same underlying distribution, the observed dispersion among the study medians supplies an estimate of the common within-study variance component; we will explicitly state that this construction assumes absence of heterogeneity and will add a paragraph discussing the consequences if that assumption is violated. revision: yes
-
Referee: [§4] Simulation design (described in §4): all scenarios appear to hold the underlying dispersion (hence f(m)) constant across studies within each replicate; this does not test the heterogeneous-scale case that would be required to establish robustness when real-world studies report medians on different measurement scales or with different spreads.
Authors: The referee is correct that the reported simulations maintain constant dispersion within each replicate. This design isolates performance under the model assumptions used to derive DiVE. To address the concern, the revised manuscript will include an additional set of simulations in which scale (and therefore f(m)) varies across studies within each replicate, allowing direct assessment of robustness under heterogeneous dispersion. revision: yes
-
Referee: [Table 3] Table 3, k=5 rows: the reported coverage probabilities and MSE advantages for DiVE are shown only under the constant-dispersion design; without results under scale heterogeneity, the claim of “clear advantages when the number of studies is small” remains conditional on an assumption that may not hold in the target application.
Authors: We acknowledge that the advantages reported for small k in Table 3 are obtained under constant dispersion. In the revision we will either qualify the claim to reflect this scope or, preferably, augment Table 3 (or add a supplementary table) with the corresponding results from the new heterogeneous-scale simulations, thereby making the small-k advantage claim less conditional. revision: partial
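The scale-heterogeneity concern in the second and third exchanges can be illustrated with exact arithmetic for a simple sample-size-weighted estimator. This is a hypothetical construction, not the paper's Eq. (4). Suppose independent studies with Var(d_i) = s_i / n_i, a pooled estimate weighted by n_i, and a variance estimate that assumes one common s. The sketch compares the true variance of the pooled estimate with the expected value of that estimate; all names are illustrative.

```python
from typing import Sequence, Tuple


def true_and_expected_variance(
    sample_sizes: Sequence[int], scale_factors: Sequence[float]
) -> Tuple[float, float]:
    """Return (Var(pooled), E[estimated variance]) when Var(d_i) = s_i / n_i.

    ASSUMPTIONS: independent studies, weights n_i, and an estimator
    sum(n_i * (d_i - pooled)^2) / ((k - 1) * N) built for a common scale.
    """
    k = len(sample_sizes)
    if k != len(scale_factors) or k < 2:
        raise ValueError("need at least two studies, one scale per study")

    total_n = sum(sample_sizes)
    weighted = sum(n * s for n, s in zip(sample_sizes, scale_factors))
    # Var(sum n_i d_i / N) = sum n_i s_i / N^2.
    true_var = weighted / total_n ** 2
    # E[sum n_i (d_i - pooled)^2] = sum s_i - sum n_i s_i / N.
    expected_estimate = (sum(scale_factors) - weighted / total_n) / (
        (k - 1) * total_n
    )
    return true_var, expected_estimate


equal_sizes = true_and_expected_variance([50, 50, 50], [1.0, 4.0, 9.0])
unequal_sizes = true_and_expected_variance([10, 100], [1.0, 4.0])
```

With equal study sizes the two quantities coincide even though the scales differ. When the larger study also has the larger spread, the expected estimate falls below the true variance. Under these assumptions, heterogeneous-scale simulations are most informative when study size and spread vary together.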
Circularity Check
No circularity detected; derivation is self-contained
The paper derives DiVE as a standalone formula for the variance of the pooled median difference that uses only the reported medians and sample sizes per study, without fitting parameters to the target pooled quantity or invoking self-citations for uniqueness. No equations reduce the claimed variance estimator to a re-expression of the input medians by construction, nor does any simulation step serve as the derivation itself. The method is presented as mathematically independent of two-stage dispersion-based estimators, and external validation via simulation does not create a fitted-input loop. This is the normal case of a self-contained statistical derivation.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The variance of the pooled median difference estimator can be obtained directly from study-level medians and sample sizes without intermediate per-study variance estimates.
Reference graph
Works this paper leans on
-
[1]
Bahadur, R. R. (1966). A note on quantiles in large samples. Ann. Math. Stat. 37 (3), 577–580. Bland, M. (2014). Estimating mean and standard deviation from the sample size, three quartiles, minimum, and maximum. Int. J. Stat. Med. Res. 4 (1), 57–64. Borenstein, M., Hedges, L. V., Higgins, J. P. T., and Rothstein, H. R. (2009). Introduction to meta-analysis. H...
-
[2]
Calatayud, D., Arora, S., Aggarwal, R., Kruglikova, I., Schulze, S., Funch-Jensen, P., and Grantcharov, T. (2010). Warm-up in a virtual reality environment improves performance in the operating room. Ann. Surg. 251 (6), 1181–1185. Chowdhry, A. K., Dworkin, R. H., and McDermott, M. P. (2016). Meta-analysis with missing study-level sample variance data. Stat. ...
-
[3]
DerSimonian, R. and Laird, N. (1986). Meta-analysis in clinical trials. Control. Clin. Trials 7 (3), 177–188. Desender, L., Van Herzeele, I., Lachat, M., Duchateau, J., Bicknell, C., Teijink, J., Heyligers, J., Vermassen, F., and PAVLOV Study Group (2017). A multicentre trial of patient specific rehearsal prior to EVAR: impact on procedural planning and tea...
-
[4]
Katzenschlager, S., Zimmer, A. J., Gottschalk, C., Grafeneder, J., Schmitz, S., Kraker, S., Ganslmeier, M., Muth, A., Seitel, A., Maier-Hein, L., Benedetti, A., Larmann, J., Weigand, M. A., McGrath, S., and Denkinger, C. M. (2021). Can we predict the severe course of COVID-19 - a systematic review and meta-analysis of indicators of clinical outcome? PLoS O...
-
[5]
Langhorne, P., Baylan, S., and Early Supported Discharge Trialists (2017). Early supported discharge services for people with acute stroke. Cochrane Database Syst. Rev. 7 (7), CD000443. Lehmann, E. L. and Casella, G. (1998). Theory of point estimation. 2nd ed. New York, NY: Springer. Luo, D., Wan, X., Liu, J., and Tong, T. (2018). Optimally estimating the ...
-
[6]
Page, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, I., Hoffmann, T. C., Mulrow, C. D., Shamseer, L., Tetzlaff, J. M., Akl, E. A., Brennan, S. E., Chou, R., Glanville, J., Grimshaw, J. M., Hróbjartsson, A., Lalu, M. M., Li, T., Loder, E. W., Mayo-Wilson, E., McDonald, S., McGuinness, L. A., Stewart, L. A., Thomas, J., Tricco, A. C., Welch, V. A., Whi...
-
[7]
Saldanha, I. J., Lindsley, K. B., Money, S., Kimmel, H. J., Smith, B. T., and Dickersin, K. (2020). Outcome choice and definition in systematic reviews leads to few eligible studies included in meta-analyses: a case study. BMC Med. Res. Methodol. 20 (1),
-
[8]
Shore, E. M., Grantcharov, T. P., Husslein, H., Shirreff, L., Dedy, N. J., McDermott, C. D., and Lefebvre, G. G. (2016). Validating a standardized laparoscopy curriculum for gynecology residents: a randomized controlled trial. Am. J. Obstet. Gynecol. 215 (2), 204.e1–204.e11. Sutton, A. J. and Higgins, J. P. T. (2008). Recent developments in meta-analysis. St...
-
[9]
Weir, C. J., Butcher, I., Assi, V., Lewis, S. C., Murray, G. D., Langhorne, P., and Brady, M. C. (2018). Dealing with missing standard deviation and mean values in meta-analysis of continuous outcomes: a systematic review. BMC Med. Res. Methodol. 18 (1),
-
[10]
White, M. T., Conteh, L., Cibulskis, R., and Ghani, A. C. (2011). Costs and cost-effectiveness of malaria control interventions: a systematic review. Malar. J. 10 (1),
-
[11]
Wiebe, N., Vandermeer, B., Platt, R. W., Klassen, T. P., Moher, D., and Barrowman, N. J. (2006). A systematic review identifies a lack of standardization in methods for handling missing variance data. J. Clin. Epidemiol. 59 (4), 342–353. Appendix A. Direct estimator for $\mathrm{Var}(\hat{\mu}_s)$: derivation and proof. A.1 Notation and assumptions. Recall $\hat{\mu}_s = \sum_{i=1}^{N} \tilde{w}^{s}_{i}$ ...
-
[12]
Let $R$ denote the number of replicates per design setting. Within a replicate and study $i$, let $\sigma_i^2$ be the theoretical variance of the study-level median difference, computed from the population densities at the group-specific medians and the realized sample sizes (standard large-sample formula; see McGrath et al. (2020)). Given a target heterogeneity $I^2$, $\tau$...
-
[13]
It is set to a pre-specified non-zero constant in the scenarios, following the mean-shift approach of McGrath et al. (2020). Group 2 receives no shift, so the true median difference equals c. Skew-normal parameters follow the direct parameterisation (location, scale, shape). Log-normal parameters are on the natural-log scale. [Figure residue: panels labelled I² = 0%, I² = 25%, I² = 50%, ...]