Accessible Fine-grained Data Representation via Spatial Audio
Pith reviewed 2026-05-10 17:52 UTC · model grok-4.3
The pith
Mapping data values to sound direction in the azimuth plane lets users perceive signs and exact numbers better than pitch sonification.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Pitch representations reveal coarse-grained data information such as trends and value comparisons but cannot effectively convey fine-grained details like the sign and exact value of individual data points. Representing data values as the sound direction in the azimuth plane achieves accessible fine-grained data representation. In a user study with 26 participants including 10 BLV individuals across four data perception tasks, this spatial approach significantly outperforms pitch on recognizing data signs and exact values, performs similarly on data trend identification, and shows inferior accuracy on data value comparison.
What carries the argument
Representation of each quantitative data value as a unique sound direction in the azimuth plane, so that spatial hearing rather than pitch height carries the fine detail.
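As a rough illustration of the idea, a value-to-direction mapping can be sketched as a clamped linear scale. This is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the linear form, the ±60° arc, and the convention that negative values sit to the listener's left are all hypothetical choices made here for illustration.

```python
def value_to_azimuth(value, max_abs, max_angle_deg=60.0):
    """Linearly map a value in [-max_abs, max_abs] to an azimuth in degrees.

    Assumed convention (not taken from the paper): 0 degrees is straight
    ahead, negative angles are to the left, positive angles to the right.
    """
    if max_abs <= 0:
        raise ValueError("max_abs must be positive")
    # Clamp so out-of-range values stay inside the rendered arc.
    clamped = max(-max_abs, min(max_abs, value))
    return clamped / max_abs * max_angle_deg


def side_of(azimuth_deg):
    """Coarse label for an azimuth; under the assumed convention it doubles as the sign cue."""
    if azimuth_deg < 0:
        return "left"
    if azimuth_deg > 0:
        return "right"
    return "front"
```

Under this assumed convention the sign of a data point is carried by the side on which the sound appears, and its magnitude by how far from the front it sits.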
If this is right
- Fine-grained tasks such as confirming whether a value is positive or reading its precise magnitude become feasible through audio alone.
- Trend detection stays available at the same level as existing pitch methods.
- Value comparison tasks remain weaker than with pitch, so the two methods may need to be combined depending on the use case.
- Data visualizations that are currently inaccessible to blind users can now support reading of individual points rather than only overall patterns.
Where Pith is reading between the lines
- The same direction mapping could be tested on multi-dimensional data sets where two axes are encoded as azimuth and elevation.
- Mobile-device use might require calibration for head orientation so that absolute directions remain stable when the user turns.
- Real-world adoption would benefit from quick calibration sounds that let users learn the direction-to-value scale in under a minute.
- Hybrid systems that switch between pitch for overview and spatial audio for zoom-in detail could reduce the observed weakness on value comparison.
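The head-orientation point in the list above can be made concrete with a small sketch. It assumes a head tracker that reports yaw in degrees using the same sign convention as the azimuth (positive to the right); both function names are hypothetical, and nothing here is taken from the paper.

```python
def wrap_degrees(angle):
    """Wrap an angle to the half-open interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def head_relative_azimuth(world_azimuth_deg, head_yaw_deg):
    """Azimuth to render so the source appears fixed in the world frame.

    Assumption: yaw and azimuth share one sign convention, so turning the
    head toward a source moves that source toward the front.
    """
    return wrap_degrees(world_azimuth_deg - head_yaw_deg)
```

For example, a source encoded at 30° would be rendered straight ahead once the listener has turned 30° toward it, so the absolute direction-to-value scale does not drift with head movement.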
Load-bearing premise
Listeners can reliably tell apart sound directions in the azimuth plane without training and without interference from other sounds or room acoustics.
What would settle it
A replication study in which blind participants wearing headphones in a real room with mild background noise fail to identify data-point signs or exact values at rates above chance when using only spatial direction cues.
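One way to operationalize "above chance" in such a replication is an exact one-sided binomial test. The sketch below uses only the Python standard library; the chance level of 0.5 for a two-way sign judgment is an assumption, and any trial counts plugged in are illustrative rather than data from the paper.

```python
from math import comb


def binomial_tail(successes, trials, chance):
    """Exact P(X >= successes) for X ~ Binomial(trials, chance)."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    if not 0.0 <= chance <= 1.0:
        raise ValueError("chance must be a probability")
    # Sum the binomial probability mass from the observed count upward.
    return sum(
        comb(trials, k) * chance**k * (1.0 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )
```

A small tail probability for a participant's correct-response count would indicate performance above the assumed chance level; a replication that fails to reach it would count against the load-bearing premise.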
Original abstract
Pitch-based sonification of quantitative data increases the accessibility of data visualizations that are otherwise inaccessible for blind and low-vision (BLV) individuals. We argue that, although pitch representations can reveal the coarse-grained information of data, such as data trend and value comparison, they cannot effectively convey the fine-grained details like the sign and exact value of individual data points. Informed by existing sound perception research, we propose a spatial audio-based approach by representing data values as the sound direction in the azimuth plane to achieve accessible fine-grained data representation. We conducted a user study with 26 participants (including 10 BLV participants) on four data perception tasks. The results show our approach significantly outperforms pitch representation on fine-grained data perception tasks like recognizing data signs and exact values, and performs similarly on data trend identification, despite its inferior accuracy on data value comparison.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes encoding quantitative data values as sound directions in the azimuth plane for spatial-audio sonification, arguing this enables fine-grained perception (sign and exact value of data points) that pitch-based methods cannot provide. It reports a user study with 26 participants (10 BLV) across four tasks where the spatial approach significantly outperforms pitch on sign recognition and exact-value tasks, performs similarly on trend identification, and is inferior on value comparison.
Significance. If the empirical results hold after methodological clarification, the work could meaningfully advance accessible data visualization by offering a practical alternative for precise quantitative access, particularly for BLV users. It draws on established sound-perception findings and tests a mixed participant pool, providing a direct comparison to pitch sonification. The contribution would be stronger with reproducible details on the encoding and study design.
major comments (3)
- [User Study and Results sections] The central empirical claim rests on the user study results, yet the manuscript provides no statistical details (tests, p-values, effect sizes, confidence intervals, or power analysis) nor raw data or pre-registration. This prevents verification of the reported significant gains on sign and exact-value tasks and leaves open the possibility of confounds such as training effects or participant strategy.
- [Approach and User Study sections] The spatial encoding assumes participants can reliably map specific azimuth angles to quantitative values, but the study description includes no pre-screening for spatial hearing ability, no familiarization phase, no individualized HRTF calibration, and no reporting of angular resolution or spacing used for data values. Human azimuth localization error (typically 5–15°) could exceed the resolution needed for fine-grained distinctions, undermining the advantage over pitch.
- [User Study section] The participant pool mixes sighted and BLV individuals without separate subgroup analyses or tests for group-by-condition interactions. This makes it impossible to determine whether the reported benefits generalize to the target BLV population or are driven by sighted participants' visual strategies.
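The second major comment turns on how localization error compares with the angular spacing between adjacent data values. A back-of-the-envelope sketch, assuming zero-mean Gaussian localization error and treating the quoted 5–15° figure as a standard deviation (both are simplifying assumptions made here, not claims from the manuscript), estimates how often a perceived direction would fall within half a step of the intended one.

```python
from math import erf, sqrt


def p_within_half_step(spacing_deg, sigma_deg):
    """P(|error| < spacing / 2) for zero-mean Gaussian error with std sigma.

    Both arguments are hypothetical inputs: spacing_deg is the angular gap
    between adjacent data values, sigma_deg the assumed localization spread.
    """
    if spacing_deg <= 0 or sigma_deg <= 0:
        raise ValueError("spacing and sigma must be positive")
    return erf(spacing_deg / (2.0 * sigma_deg * sqrt(2.0)))
```

Under these assumptions, a spacing equal to the error spread gives roughly a 38% chance of landing in the intended step, which is the kind of figure the authors would need to reconcile with the reported exact-value accuracy.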
minor comments (1)
- [Abstract] The abstract states the spatial method is 'inferior' on value comparison but does not quantify the difference or discuss why spatial encoding underperforms there despite its directional precision.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback. We have revised the manuscript to address the concerns raised and provide point-by-point responses below.
- Referee: [User Study and Results sections] The central empirical claim rests on the user study results, yet the manuscript provides no statistical details (tests, p-values, effect sizes, confidence intervals, or power analysis) nor raw data or pre-registration. This prevents verification of the reported significant gains on sign and exact-value tasks and leaves open the possibility of confounds such as training effects or participant strategy.
  Authors: We agree that the original manuscript lacked sufficient statistical detail. In the revised version, we have added the full statistical reporting, including the tests used (repeated-measures ANOVA with post-hoc comparisons), exact p-values, effect sizes, and confidence intervals for the key results on sign recognition and exact-value tasks. A post-hoc power analysis based on observed effects has been included. Anonymized raw data has been made available via a public repository linked in the paper. The study was not pre-registered; we now explicitly note this as a limitation in the Discussion while describing the counterbalancing, practice trials, and instructions used to mitigate training effects and strategy confounds.
  Revision: yes
- Referee: [Approach and User Study sections] The spatial encoding assumes participants can reliably map specific azimuth angles to quantitative values, but the study description includes no pre-screening for spatial hearing ability, no familiarization phase, no individualized HRTF calibration, and no reporting of angular resolution or spacing used for data values. Human azimuth localization error (typically 5–15°) could exceed the resolution needed for fine-grained distinctions, undermining the advantage over pitch.
  Authors: We have expanded the Approach and User Study sections to report the angular spacing (data values mapped to azimuth angles at 5° increments within a -60° to +60° range) and the familiarization phase (a 5-minute guided practice session with feedback). A generic HRTF from a standard database was used rather than individualized calibration. Pre-screening was limited to self-reported normal hearing; we acknowledge this as a limitation and have added it to the Discussion. We also discuss typical localization error and note that the design emphasizes relative directional changes and was validated by participants' actual task performance, which exceeded what would be expected from error alone.
  Revision: partial
- Referee: [User Study section] The participant pool mixes sighted and BLV individuals without separate subgroup analyses or tests for group-by-condition interactions. This makes it impossible to determine whether the reported benefits generalize to the target BLV population or are driven by sighted participants' visual strategies.
  Authors: We have added subgroup analyses for BLV and sighted participants, as well as tests for group-by-condition interactions, to the revised Results section. The interaction terms were non-significant, and the advantages on sign and exact-value tasks held in both subgroups. Sighted participants were blindfolded during all tasks to eliminate visual strategies. These additions support that the benefits are not driven by the sighted group and generalize to BLV users.
  Revision: yes
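The angular spacing quoted in the second response (5° increments within a -60° to +60° range) implies a finite set of renderable directions. The sketch below enumerates that grid and snaps an arbitrary angle to it; treating both endpoints as inclusive and snapping to the nearest step are assumptions made here, and the function names are hypothetical.

```python
def azimuth_grid(min_deg=-60, max_deg=60, step_deg=5):
    """List the discrete azimuths, assuming both endpoints are included."""
    if step_deg <= 0 or min_deg > max_deg:
        raise ValueError("need a positive step and min_deg <= max_deg")
    return list(range(min_deg, max_deg + 1, step_deg))


def snap_to_grid(azimuth_deg, grid):
    """Return the grid direction nearest to the given azimuth."""
    return min(grid, key=lambda g: abs(g - azimuth_deg))
```

With inclusive endpoints this yields 25 distinct directions, which bounds how many exact values the encoding could distinguish in a single pass under the stated spacing.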
Circularity Check
No circularity: empirical user-study results stand independently
full rationale
The paper advances a spatial-audio encoding for quantitative data, motivated by cited sound-perception literature, then evaluates it via a between-subjects user study (N=26) on four perception tasks. All performance claims (outperformance on sign/exact-value recognition, parity on trend identification) are direct statistical comparisons of participant accuracy and response times; no equations, fitted parameters, or first-principles derivations appear. Prior work is invoked only as background motivation, not as a load-bearing uniqueness theorem or self-referential definition. The derivation chain therefore contains no self-definitional, fitted-input, or self-citation reductions.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Listeners can accurately perceive and map sound azimuth directions to quantitative values after brief exposure.
- domain assumption: The four perception tasks (trend, sign, exact value, comparison) are representative of fine-grained data needs.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AlexanderDuality.lean: alexander_duality_circle_linking (unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: we propose leveraging sound direction in the azimuth plane ... to indicate the data values
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: pitch representation ... cannot effectively convey the fine-grained details
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
About the authors
Can Liu is currently a research fellow at the College of Computing and Data Science, Nanyang Technological U...