pith. sign in

arxiv: 2606.26309 · v1 · pith:3SYBM2S5new · submitted 2026-06-24 · 📊 stat.ME

Variance Deltas for Visualizing and Explaining Posterior Uncertainty

Pith reviewed 2026-06-26 01:18 UTC · model grok-4.3

classification 📊 stat.ME
keywords variance deltasposterior uncertaintymissing informationBayesian visualizationinteractive explorationcausal inferencepolling forecasts
0
0 comments X

The pith

Variance deltas build trees of unobserved model subsets to explain what is missing from Bayesian posteriors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces variance deltas, an interactive visualization tool that identifies missing information capable of reducing uncertainty about a target quantity in a Bayesian posterior. It treats this missing information as subsets of unobserved model quantities and organizes those subsets into a tree ordered by explanatory strength. The system generates candidate subsets automatically from limited user input and lets users interactively split or merge them to find useful explanations. Demonstrations apply the method to a simulated causal inference setting for a treatment effect and to a real polling forecast for a population proportion.

Core claim

Variance deltas represent missing information as subsets of unobserved model quantities, arranged in a tree according to how much each subset accounts for uncertainty about the quantity of interest, and provide interactive operations to refine those subsets from a Bayesian posterior.

What carries the argument

The variance deltas tree, which ranks subsets of unobserved quantities by their power to explain posterior uncertainty and supports interactive division and combination.

If this is right

  • The system can surface nonobvious drivers of uncertainty about a treatment effect in causal models.
  • It can isolate sources of bias affecting a population proportion estimate in polling data.
  • Automated subset construction plus interactive refinement allows efficient exploration without exhaustive manual specification.
  • The tree structure directly links explanatory power to concrete unobserved quantities rather than abstract sensitivity measures.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach could guide targeted data collection by highlighting which additional observations would most reduce uncertainty.
  • It may apply to sequential decision problems where new measurements are chosen to shrink posterior variance on a policy parameter.
  • Integration with existing Bayesian software would let analysts apply the tree view to models they already fit.

Load-bearing premise

The posterior already contains the structure needed to decompose missing information into ranked subsets without extra modeling assumptions.

What would settle it

A simulation in which the highest-ranked subset in the variance deltas tree produces little or no actual reduction in posterior variance when its quantities are observed.

Figures

Figures reproduced from arXiv: 2606.26309 by Collin Cademartori.

Figure 1
Figure 1. Figure 1: A variance delta constructed for the toy model given in ( [PITH_FULL_IMAGE:figures/full_fig_p006_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: The tree from Figure [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: The simulated panel data. Units 1-3 are all identical in the pre-treatment period, [PITH_FULL_IMAGE:figures/full_fig_p021_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: The generated variance delta with leaves for [PITH_FULL_IMAGE:figures/full_fig_p024_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: The variance delta for the leaves corresponding to units 1-3, with leaves 1 & 2 [PITH_FULL_IMAGE:figures/full_fig_p024_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: The path connecting the node for λ2· and λ3· to the root. 4 Polling Model Example Our second example applies variance deltas to characterize uncertainty in a complex model of polls from the 2016 U.S. presidential election. In particular, we analyze the model 25 [PITH_FULL_IMAGE:figures/full_fig_p025_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Initial variance delta with a single leaf representing all hypothetical polls. [PITH_FULL_IMAGE:figures/full_fig_p029_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: The result of subdividing the initial variance delta , adding a new node that [PITH_FULL_IMAGE:figures/full_fig_p030_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: The result of adding branches from node 7 for each combination of additional [PITH_FULL_IMAGE:figures/full_fig_p030_9.png] view at source ↗
read the original abstract

In observational settings, where the data generating process and possibly the sample size are not controlled, it is typically impossible to guarantee a priori that quantities of interest will be estimated with sufficient precision. However, even when the data do not determine the quantities of interest, they may still allow determination of what is missing -- unobserved information which, if observed, would meaningfully reduce uncertainty. We propose an interactive visualization system, termed variance deltas, to enable the discovery of such missing information from a Bayesian posterior distribution. This system, which we provide as a software package, represents missing information as subsets of unobserved model quantities, organized into a tree based on how well each subset explains uncertainty about the quantity of interest. This system both automates the construction of candidate subsets from minimal user input and implements interactive operations for the division and combination of subsets, allowing the efficient discovery of interesting and useful explanations. We demonstrate this system by using it to discover nonobvious explanations of uncertainty for (1) a treatment effect parameter in a simulated causal inference problem and (2) a population proportion in a forecasting model of real polling data with many sources of bias.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes 'variance deltas', an interactive visualization system for discovering missing information in Bayesian posteriors. It represents uncertainty-reducing subsets of unobserved model quantities as nodes in a tree, ranked by explanatory power, with automated construction from minimal user input and interactive subset division/combination operations. The system is implemented as open software and demonstrated on (1) a simulated causal inference model for treatment effect uncertainty and (2) a real polling dataset for population proportion uncertainty with multiple bias sources.

Significance. If the tree construction and ranking reliably surface non-obvious, actionable explanations of posterior uncertainty, the tool could support exploratory analysis in observational settings where a priori precision guarantees are unavailable. The provision of a software package is a clear strength for reproducibility. The work is primarily methodological and exploratory rather than providing formal statistical guarantees.

major comments (2)
  1. [§3] §3 (tree construction): The manuscript states that subsets are organized 'based on how well each subset explains uncertainty' via a tree metric, but provides no derivation or invariance result showing that the ranking is robust to posterior sampling variability or the choice of MCMC approximation; this is load-bearing for the claim that the automated construction discovers meaningful explanations.
  2. [§4.2] §4.2 (polling data demonstration): The surfaced explanations for bias sources are presented as non-obvious, yet the section reports no quantitative measure (e.g., actual conditional variance reduction) or comparison against a baseline such as random subset selection or expert-identified sources, weakening the claim that the interface enables efficient discovery beyond manual inspection.
minor comments (2)
  1. [Figure 3] Figure 3: Axis labels and color scale for variance delta values are not fully described in the caption, making it difficult to interpret the magnitude of explanatory power.
  2. The software package is referenced but the manuscript does not include a link or DOI in the main text or reproducibility statement.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive assessment and constructive comments on our manuscript. We address the major comments point by point below and will make the indicated revisions to strengthen the presentation.

read point-by-point responses
  1. Referee: [§3] §3 (tree construction): The manuscript states that subsets are organized 'based on how well each subset explains uncertainty' via a tree metric, but provides no derivation or invariance result showing that the ranking is robust to posterior sampling variability or the choice of MCMC approximation; this is load-bearing for the claim that the automated construction discovers meaningful explanations.

    Authors: We acknowledge that the manuscript provides no formal derivation or invariance result establishing robustness of the ranking to posterior sampling variability or MCMC choice. The tree metric is defined directly from empirical conditional variances computed on the available posterior samples, and the construction is intended as a practical, exploratory heuristic rather than a procedure with theoretical guarantees. In the revision we will add an explicit discussion of this point in §3, noting that the metric is invariant to affine transformations of the posterior samples and that stability can be assessed by recomputing the tree on independent MCMC chains; we will also include a brief empirical check using the simulated example to illustrate sensitivity. revision: yes

  2. Referee: [§4.2] §4.2 (polling data demonstration): The surfaced explanations for bias sources are presented as non-obvious, yet the section reports no quantitative measure (e.g., actual conditional variance reduction) or comparison against a baseline such as random subset selection or expert-identified sources, weakening the claim that the interface enables efficient discovery beyond manual inspection.

    Authors: We agree that the polling demonstration would be strengthened by quantitative evidence. In the revised §4.2 we will report the actual conditional variance reductions (as fractions of the marginal variance) for each surfaced subset and will add a simple baseline comparison: the variance reductions obtained by 100 randomly chosen subsets of comparable cardinality drawn from the same pool of unobserved quantities. This will allow readers to see that the tree-guided selections achieve larger reductions than random selection on average, supporting the claim that the interface aids discovery beyond unaided manual inspection. revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper proposes an interactive visualization system (variance deltas) that decomposes posterior uncertainty over subsets of unobserved quantities into a tree structure. No formal derivation, theorem, or prediction is presented that reduces by construction to fitted parameters, self-citations, or ansatzes. The method is described as a software layer operating on an existing Bayesian posterior, with demonstrations on simulated and real data serving as exploratory illustrations rather than load-bearing claims. No equations or self-referential steps appear in the provided text that would trigger any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only; no explicit free parameters, axioms, or invented entities are stated. The central claim rests on the unstated premise that the posterior already contains the relevant missing-information structure and that a tree metric can rank explanatory power without further assumptions.

pith-pipeline@v0.9.1-grok · 5718 in / 1132 out tokens · 11385 ms · 2026-06-26T01:18:10.718552+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

36 extracted references · 14 canonical work pages · 2 internal anchors

  1. [1]

    Matematicheskoe modelirovanie , volume=

    On sensitivity estimation for nonlinear mathematical models , author=. Matematicheskoe modelirovanie , volume=. 1990 , publisher=

  2. [2]

    Ivanova and Freddie Bickford Smith , title =

    Tom Rainforth and Adam Foster and Desi R. Ivanova and Freddie Bickford Smith , title =. Statistical Science , number =. 2024 , doi =

  3. [3]

    Journal of Statistical Software , author=

    ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R , volume=. Journal of Statistical Software , author=. 2017 , pages=. doi:10.18637/jss.v077.i01 , number=

  4. [4]

    A Review on Global Sensitivity Analysis Methods , bookTitle=

    Iooss, Bertrand and Lema. A Review on Global Sensitivity Analysis Methods , bookTitle=. 2015 , publisher=

  5. [5]

    Quantum marginal inequalities and the conjectured entropic inequalities

    Stijn Hawinkel and Willem Waegeman and Steven Maere , title =. The American Statistician , volume =. 2024 , publisher =. doi:10.1080/00031305.2023.2216252 , URL =

  6. [6]

    Statistical Science , number =

    Sander Greenland , title =. Statistical Science , number =. 2009 , doi =

  7. [7]

    Journal of the Royal Statistical Society: Series A (Statistics in Society) , volume =

    Greenland, Sander , title =. Journal of the Royal Statistical Society: Series A (Statistics in Society) , volume =. doi:https://doi.org/10.1111/j.1467-985X.2004.00349.x , url =. https://rss.onlinelibrary.wiley.com/doi/pdf/10.1111/j.1467-985X.2004.00349.x , year =

  8. [8]

    and Lee, Su-In , title =

    Lundberg, Scott M. and Lee, Su-In , title =. Proceedings of the 31st International Conference on Neural Information Processing Systems , pages =. 2017 , isbn =

  9. [9]

    , title =

    Owen, Art B. , title =. SIAM/ASA Journal on Uncertainty Quantification , volume =. 2014 , doi =

  10. [10]

    Journal of the American Statistical Association , volume =

    Alberto Abadie and Alexis Diamond and Jens Hainmueller , title =. Journal of the American Statistical Association , volume =. 2010 , publisher =. doi:10.1198/jasa.2009.ap08746 , URL =

  11. [11]

    Political Analysis , author=

    A Bayesian Alternative to Synthetic Control for Comparative Case Studies , volume=. Political Analysis , author=. 2022 , pages=. doi:10.1017/pan.2021.22 , number=

  12. [12]

    Entropy , VOLUME =

    Linardatos, Pantelis and Papastefanopoulos, Vasilis and Kotsiantis, Sotiris , TITLE =. Entropy , VOLUME =. 2021 , NUMBER =

  13. [13]

    Elliott , journal =

    Heidemanns, Merlin and Gelman, Andrew and Morris, G. Elliott , journal =. An. 2020 , month =

  14. [14]

    Journal of Business & Economic Statistics , volume =

    Jushan Bai and Peng Wang , title =. Journal of Business & Economic Statistics , volume =. 2015 , publisher =. doi:10.1080/07350015.2014.941467 , URL =

  15. [15]

    2022 , note =

    shinystan: Interactive Visual and Numerical Diagnostics and Posterior Analysis for Bayesian Models , author =. 2022 , note =

  16. [16]

    2025 , eprint=

    Explanations are a means to an end , author=. 2025 , eprint=

  17. [17]

    Designing for

    Hullman, Jessica and Gelman, Andrew , journal =. Designing for. 2021 , month =

  18. [18]

    2025 , publisher =

    VMC: A Grammar for Visualizing Statistical Model Checks , volume=. IEEE Transactions on Visualization and Computer Graphics , author=. 2025 , month=jan, pages=. doi:10.1109/TVCG.2024.3456402 , number=

  19. [19]

    Visualizing Variable Importance and Variable Interaction Effects in Machine Learning Models , url=

    Inglis, Alan and Parnell, Andrew and Hurley, Catherine , year=. Visualizing Variable Importance and Variable Interaction Effects in Machine Learning Models , url=. doi:10.48550/arXiv.2108.04310 , note=

  20. [20]

    Visualizing the Consequences of Evidence in Bayesian Networks

    Champion, Clifford and Elkan, Charles , year=. Visualizing the Consequences of Evidence in Bayesian Networks , url=. doi:10.48550/arXiv.1707.00791 , note=

  21. [21]

    BMC Bioinformatics , author=

    Addressing the unmet need for visualizing conditional random fields in biological data , volume=. BMC Bioinformatics , author=. 2014 , month=july, pages=. doi:10.1186/1471-2105-15-202 , number=

  22. [22]

    2024 , eprint=

    Sensitivity of MCMC-based analyses to small-data removal , author=. 2024 , eprint=

  23. [23]

    Friedman and Werner Stuetzle , title =

    Jerome H. Friedman and Werner Stuetzle , title =. The Annals of Statistics , number =. 2002 , doi =

  24. [24]

    2025 , eprint=

    Open Problems in Mechanistic Interpretability , author=. 2025 , eprint=

  25. [25]

    2020 , publisher=

    Handbook of Graphical Models , author=. 2020 , publisher=

  26. [26]

    Stan User's Guide , year=

  27. [27]

    A visual analytics workflow for probabilistic modeling , journal =

    Julien Klaus and Mark Blacher and Andreas Goral and Philipp Lucas and Joachim Giesen , keywords =. A visual analytics workflow for probabilistic modeling , journal =. 2023 , issn =. doi:https://doi.org/10.1016/j.visinf.2023.05.001 , url =

  28. [28]

    2020 , eprint=

    Bayesian Workflow , author=. 2020 , eprint=

  29. [29]

    2025 , note =

    bayesplot: Plotting for Bayesian Models , author =. 2025 , note =

  30. [30]

    Tukey , title =

    John W. Tukey , title =. Statistical Science , number =. 1990 , doi =

  31. [31]

    Sensitivity analysis in presence of model uncertainty and correlated inputs , journal =

    Julien Jacques and Christian Lavergne and Nicolas Devictor , keywords =. Sensitivity analysis in presence of model uncertainty and correlated inputs , journal =. 2006 , note =. doi:https://doi.org/10.1016/j.ress.2005.11.047 , url =

  32. [32]

    2016 , eprint=

    Global sensitivity metrics from active subspaces , author=. 2016 , eprint=

  33. [33]

    Martins and Leonhard Held and H

    Małgorzata Roos and Thiago G. Martins and Leonhard Held and H. Bayesian Analysis , number =. 2015 , doi =

  34. [34]

    Detecting and diagnosing prior and likelihood sensitivity with power-scaling , volume=

    Kallioinen, Noa and Paananen, Topi and Bürkner, Paul-Christian and Vehtari, Aki , year=. Detecting and diagnosing prior and likelihood sensitivity with power-scaling , volume=. Statistics and Computing , publisher=. doi:10.1007/s11222-023-10366-5 , number=

  35. [35]

    Felli and Gordon B

    James C. Felli and Gordon B. Hazen , title =. Medical Decision Making , volume =. 1998 , doi =

  36. [36]

    Skinny gibbs: A consistent and scalable gibbs sampler for model selection

    Christopher Jackson and Anne Presanis and Stefano Conti and Daniela De Angelis , title =. Journal of the American Statistical Association , volume =. 2019 , publisher =. doi:10.1080/01621459.2018.1562932 , URL =