Reading Speed, Image Quality Ratings, and Comfort Ratings in Augmented Reality
Pith reviewed 2026-05-14 21:16 UTC · model grok-4.3
The pith
The Read-AR dataset collects over 11,000 reading speeds and nearly 6,000 quality and comfort ratings to benchmark AR headset text display.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors assembled a large collection of reading speed data along with subjective image quality and comfort ratings under tightly controlled conditions on one experimental AR setup, producing a resource that functions as a common reference for benchmarking the text display quality of varied AR headset architectures.
What carries the argument
The single consistent and controlled experimental setup used across all 80-plus conditions.
Load-bearing premise
The tested conditions and participant responses produce ratings and speeds that apply beyond this specific lab setup to other AR devices and real-world situations.
What would settle it
New measurements of reading speed, quality ratings, or comfort, taken with the same text content on a different AR headset or in uncontrolled daily environments, whose values deviate substantially from those in the dataset.
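A minimal sketch of how such a comparison could be run, assuming reading speeds are available per condition for both the reference dataset and a second headset. The function names, condition labels, and the 10% tolerance are hypothetical choices for illustration, and the numbers are placeholders rather than values from Read-AR.

```python
from statistics import mean


def relative_deviation(reference, measured):
    """Per-condition (measured mean - reference mean) / reference mean."""
    deviations = {}
    for condition, ref_speeds in reference.items():
        if condition not in measured:
            continue  # skip conditions that were not re-measured
        ref_mean = mean(ref_speeds)
        new_mean = mean(measured[condition])
        deviations[condition] = (new_mean - ref_mean) / ref_mean
    return deviations


def flag_deviations(deviations, tolerance=0.10):
    """Conditions whose relative deviation exceeds the (assumed) tolerance."""
    return sorted(c for c, d in deviations.items() if abs(d) > tolerance)


# Placeholder reading speeds for illustration only; NOT values from Read-AR.
reference = {"cond_a": [200.0, 220.0, 210.0], "cond_b": [150.0, 160.0, 155.0]}
measured = {"cond_a": [205.0, 215.0, 210.0], "cond_b": [120.0, 125.0, 130.0]}

deviations = relative_deviation(reference, measured)
flagged = flag_deviations(deviations)
print(flagged)
```

A comparison of means is the simplest possible check; a real analysis would likely also account for participant variability and sample size before calling a deviation substantial.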
Original abstract
The rendering and display of text is a key use-case for augmented reality (AR). Here, we present the Read-AR, a dataset of reading in AR, for which we collected over 11,000 reading speeds and almost 6000 visual quality and comfort ratings across over 80 different experiment conditions on the same experiment set-up. The consistent, controlled set-up enables the dataset to function as a reference for benchmarking the quality of different AR headset architectures.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents the Read-AR dataset, which comprises over 11,000 reading speed measurements and nearly 6,000 visual quality and comfort ratings collected across more than 80 experimental conditions using a single fixed augmented reality setup. The authors assert that the consistent, controlled nature of this data collection allows the dataset to serve as a reference benchmark for evaluating the quality of different AR headset architectures.
Significance. If the experimental conditions are fully specified and replicable, the large-scale collection of reading speeds and subjective ratings under controlled conditions could provide a useful standardized reference point for future AR text rendering research. However, the claimed benchmarking utility across architectures does not hold without supporting cross-headset data or mapping procedures.
major comments (1)
- [Abstract] The claim that 'the consistent, controlled set-up enables the dataset to function as a reference for benchmarking the quality of different AR headset architectures' is not supported by the reported work. All 11,000+ speeds and nearly 6,000 ratings derive from a single rig; the manuscript contains no cross-headset measurements, no sensitivity analysis to architectural parameters, and no normalization or mapping procedure that would allow transfer to other headsets.
minor comments (1)
- The abstract asserts data collection at scale but provides no details on participant numbers, statistical analysis methods, error handling, or validation procedures; these must be explicitly reported in the methods section for the dataset to be usable as a reference.
Simulated Author's Rebuttal
We thank the referee for their careful review. We agree that the abstract claim about benchmarking across AR headset architectures is not supported by the single-rig data and will revise it. We respond to the comment below.
Point-by-point responses
- Referee: [Abstract] The claim that 'the consistent, controlled set-up enables the dataset to function as a reference for benchmarking the quality of different AR headset architectures' is not supported by the reported work. All 11,000+ speeds and nearly 6,000 ratings derive from a single rig; the manuscript contains no cross-headset measurements, no sensitivity analysis to architectural parameters, and no normalization or mapping procedure that would allow transfer to other headsets.
Authors: We agree that the dataset was collected exclusively on a single fixed AR setup with no cross-headset measurements, sensitivity analysis to architectural parameters, or mapping/normalization procedures. The original abstract phrasing overstated the dataset's direct utility for benchmarking different architectures. We will revise the abstract to remove this claim and instead describe the work as providing a large-scale, controlled collection of reading speeds and subjective ratings under replicable conditions that can serve as a reference baseline for future AR text rendering studies.
Revision: yes
Circularity Check
No circularity: purely empirical dataset with no derivations or self-referential claims
Full rationale
The manuscript collects reading speeds and subjective ratings across more than 80 conditions on a single fixed AR rig and presents the resulting dataset. No equations, models, fitted parameters, or derivations appear in the provided text. The claim that the controlled setup allows the data to serve as a benchmark is a descriptive statement about the experimental design, not a prediction or a result obtained by reducing one quantity to another by construction. No self-citations, uniqueness theorems, or ansatzes are invoked. The contribution rests on the raw measurements themselves.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction (tag: unclear)
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "The consistent, controlled set-up enables the dataset to function as a reference for benchmarking the quality of different AR headset architectures."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.