Investigating Performance and Practices with Univariate Distribution Charts
Pith reviewed 2026-05-10 16:58 UTC · model grok-4.3
The pith
Different charts for univariate distributions yield varying accuracy on analysis tasks, and popular options like histograms or boxplots are not always the most effective.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Charts for univariate distributions differ in the accuracy they support for low-level tasks, with measurable gaps in performance, distinct patterns of misunderstanding, and a mismatch between what users prefer or know and how well they actually perform; interviews confirm that widely adopted charts such as histograms and boxplots are not the best choice for every task.
What carries the argument
A mixed-methods user study measures task accuracy via click-to-select on four representative charts (boxplots, violin plots, jittered strip plots, histograms), collects preference and familiarity data from the same participants, and adds interviews with practitioners.
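To make concrete why chart choice can matter for a given task, here is a minimal sketch of the summaries that two of the four charts draw: the five-number summary behind a basic boxplot and the bin counts behind a histogram. It is not taken from the paper. The sample values, bin settings, and the function names `five_number_summary` and `histogram_counts` are hypothetical and chosen only for illustration.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a basic boxplot encodes."""
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


def histogram_counts(values, bins, lo, hi):
    """Count values in `bins` equal-width bins covering [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp so that a value equal to `hi` falls into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


# Hypothetical bimodal sample (illustrative only, not study data).
sample = [1, 2, 2, 3, 3, 3, 4, 16, 17, 17, 18, 18, 18, 19]

summary = five_number_summary(sample)
counts = histogram_counts(sample, bins=4, lo=0, hi=20)

print("boxplot summary:", summary)
print("histogram counts:", counts)
```

In this made-up sample the median falls in a region that contains no observations, which the bin counts reveal and the five-number summary alone does not. Whether readers actually perceive such features more accurately in one chart than another is the empirical question the study addresses.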
If this is right
- Task accuracy is not uniform across chart types; some charts support certain low-level questions more reliably than others.
- User preference and prior familiarity with a chart do not reliably predict higher accuracy on analysis tasks.
- Commonly used charts such as histograms and boxplots can underperform relative to less familiar alternatives on specific tasks.
- Visualization practitioners select charts based on convention or audience preference rather than measured effectiveness for the intended task.
Where Pith is reading between the lines
- Design tools could default to the chart type shown to be most accurate for the current analysis goal rather than the user's stated favorite.
- Training materials might need to separate familiarity from effectiveness, teaching users to switch chart types when accuracy matters.
- Follow-up work could test whether the accuracy gaps persist when participants use the charts in their own data-analysis workflows rather than in a controlled benchmark.
Load-bearing premise
The selected low-level benchmark tasks, click-to-select measurement, and sample of 215 participants stand in for real-world visual analysis needs and broader user populations.
What would settle it
A replication with a different set of tasks or a new participant pool in which accuracy differences between the four charts disappear or reverse would falsify the reported performance gaps.
Original abstract
A range of charts with different strengths and weaknesses exists to support the visual analysis of univariate distributions, with a limited understanding of which charts best support which tasks and users, and how practitioners use charts. We categorize the available charts for univariate distributions into four groups and present the results of a mixed-methods comparison (n=215) of participants' perception and preferences across boxplots, violinplots, jittered stripplots, and histograms as representatives of their respective categories. The click-to-select approach in our study, combined with data on participants' subjective experiences and preferences, allows us to both measure accuracy on benchmark tasks and discuss participants' choices qualitatively. Our analysis reveals differences between charts in task accuracy, common misunderstandings, and preferences across various low-level tasks, and indicates that chart preference and familiarity do not necessarily align with participants' task performance. Interviews with five visualization practitioners further reveal that charts widely preferred by general audiences (such as histograms) or commonly used in scientific domains (such as boxplots) are not inherently the most effective for all tasks.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript reports on a mixed-methods empirical study (n=215 participants) comparing four representative univariate distribution charts—boxplots, violin plots, jittered strip plots, and histograms—using click-to-select tasks for low-level analysis operations. It finds variations in accuracy, common errors, and user preferences, notes misalignment between preference/familiarity and performance, and based on five practitioner interviews concludes that widely used charts are not inherently optimal for all tasks.
Significance. If the results are robust, the work offers practical guidance for visualization design by identifying performance differences across chart types and highlighting the value of empirical evaluation over reliance on convention or preference. The mixed-methods approach, combining quantitative accuracy data with qualitative insights, is a notable strength, as is the relatively large participant sample for the user study component.
major comments (2)
- The central claims about differences in effectiveness and the conclusion that common charts (histograms, boxplots) are not inherently optimal rest on the assumption that the chosen low-level benchmark tasks and click-to-select measurement are representative of real-world univariate distribution analysis. The Methods section provides no explicit justification or validation for how these isolated tasks map to integrated workflows or domain-expert use cases, and the general participant sample (rather than experts) limits extrapolation; this is load-bearing for the practitioner-interview claims.
- Abstract and Results sections: the description of the study design and sample size is given, but detailed statistical results (including error bars, exact p-values or effect sizes for accuracy differences, and exclusion criteria) are not reported, preventing full verification of the claimed differences in task accuracy and misunderstandings.
minor comments (1)
- The four-group categorization of univariate distribution charts in the Introduction could be strengthened by explicit references to prior visualization taxonomies.
Simulated Author's Rebuttal
We thank the referee for their thoughtful and constructive feedback. We value the recognition of the mixed-methods design and sample size as strengths. We address each major comment below, with planned revisions to improve clarity and verifiability.
- Referee: The central claims about differences in effectiveness and the conclusion that common charts (histograms, boxplots) are not inherently optimal rest on the assumption that the chosen low-level benchmark tasks and click-to-select measurement are representative of real-world univariate distribution analysis. The Methods section provides no explicit justification or validation for how these isolated tasks map to integrated workflows or domain-expert use cases, and the general participant sample (rather than experts) limits extrapolation; this is load-bearing for the practitioner-interview claims.
  Authors: We agree that an explicit justification for the task selection would strengthen the paper. The click-to-select tasks were adapted from established low-level analysis operations in the visualization literature (e.g., identifying extrema, spread, outliers, and shape features). We will add a subsection in Methods that cites relevant task taxonomies and explains their mapping to common univariate analysis workflows. The general participant sample was intentional to assess broad perceptual performance rather than domain expertise; we will expand the Limitations and Discussion sections to address extrapolation to experts and clarify that the five practitioner interviews provide complementary qualitative context on real-world usage rather than direct validation of the quantitative findings.
  Revision: partial
- Referee: Abstract and Results sections: the description of the study design and sample size is given, but detailed statistical results (including error bars, exact p-values or effect sizes for accuracy differences, and exclusion criteria) are not reported, preventing full verification of the claimed differences in task accuracy and misunderstandings.
  Authors: We acknowledge that fuller statistical reporting is required for verification. In the revision we will add exact p-values, effect sizes (e.g., Cramér’s V), 95% confidence intervals or error bars to all accuracy figures, and a transparent account of exclusion criteria and data-cleaning steps in both Methods and Results. The Abstract will be updated to reference the key statistical outcomes within length constraints.
  Revision: yes
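The statistics named in the response above can be sketched as follows, assuming accuracy is summarized as a table of correct and incorrect answers per chart type. This is a minimal illustration, not the authors' analysis. The counts are invented, the function names are hypothetical, and the Wilson score interval is one common way to get a 95% confidence interval for a proportion; the report does not say which interval method the authors plan to use.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic and total count for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, n


def cramers_v(table):
    """Cramér's V effect size: sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    stat, n = chi_square(table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(stat / (n * k))


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a proportion (z=1.96 for roughly 95%)."""
    p = successes / trials
    denom = 1 + z ** 2 / trials
    centre = (p + z ** 2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z ** 2 / (4 * trials ** 2)) / denom
    return centre - half, centre + half


# Hypothetical counts (correct, incorrect) per chart type; NOT study data.
accuracy_table = {
    "boxplot": (40, 10),
    "violin plot": (30, 20),
    "jittered strip plot": (35, 15),
    "histogram": (25, 25),
}

v = cramers_v([list(counts) for counts in accuracy_table.values()])
print(f"Cramér's V: {v:.3f}")
for chart, (correct, incorrect) in accuracy_table.items():
    low, high = wilson_interval(correct, correct + incorrect)
    print(f"{chart}: accuracy {correct / (correct + incorrect):.2f}, "
          f"95% CI [{low:.2f}, {high:.2f}]")
```

Reporting the effect size alongside per-chart intervals lets a reader judge both whether accuracy depends on chart type and how large the gaps are, which is the verification the referee asks for.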
Circularity Check
No significant circularity in empirical user study
The paper reports results from a mixed-methods empirical study with 215 participants completing click-to-select benchmark tasks on four chart types plus qualitative interviews with five practitioners. All central claims (task accuracy differences, common misunderstandings, preference-performance misalignment, and practitioner insights) are grounded in this independently collected data rather than any derivation, fitted parameter, or self-citation chain. No equations, ansatzes, or uniqueness theorems appear; the work contains no load-bearing self-referential steps that reduce outputs to inputs by construction. This matches the default case of an empirical paper whose findings are falsifiable against external replication.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Accuracy on click-to-select benchmark tasks serves as a valid proxy for chart effectiveness in univariate distribution analysis
- domain assumption: The participant pool and task set generalize to broader visualization practice