arxiv: 2604.14637 · v1 · submitted 2026-04-16 · 💻 cs.HC

Touching Space: Accessible Map Exploration Through Conversational Audio-Haptic Interaction

Pith reviewed 2026-05-10 11:20 UTC · model grok-4.3

classification 💻 cs.HC
keywords assistive technology · haptic feedback · conversational agents · cognitive maps · spatial exploration · blind and low vision · map interfaces

The pith

Touching Space lets Blind and Low-Vision users build cognitive maps of places through touch and spoken questions on standard hardware.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Touching Space as an end-to-end system that loads map data into an interface for exploration. Users feel spatial layouts through touch while asking a conversational agent spoken questions, and the agent answers with audio descriptions of landmarks and the relations between them. This targets a gap left by tools that focus only on real-time directions: the system instead supports pre-travel mental models of how landmarks relate to each other. A reader would care because such understanding can improve planning, orientation, and independence when visiting unfamiliar spaces without relying on sight.

Core claim

Touching Space is an end-to-end system that retrieves map data for a target place and loads it into a frontend interface for exploration. The system combines haptic and audio feedback: users explore spatial layouts through touch and ask spoken questions to a conversational agent during exploration. It contributes a conversational interface that supports BLV users in building cognitive maps on commodity hardware.

What carries the argument

The conversational audio-haptic interface that lets users simultaneously explore map layouts by touch and query spatial relations through speech.
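
As a rough illustration of the touch half of this loop, the sketch below maps a finger position to the nearest landmark and returns a vibration pattern for it. It is not the authors' implementation: the `Landmark` class, the `hit_test` function, the pixel coordinates, the 40-pixel radius, and the pattern durations are all assumptions made for the example. Only the landmark names come from the Figure 2 caption.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Landmark:
    name: str
    x: float            # hypothetical screen position in pixels
    y: float
    pattern_ms: tuple   # hypothetical alternating vibrate/pause durations


def hit_test(landmarks, touch_x, touch_y, radius_px=40.0):
    """Return the closest landmark within radius_px of the touch, or None."""
    best, best_dist = None, radius_px
    for lm in landmarks:
        dist = math.hypot(lm.x - touch_x, lm.y - touch_y)
        if dist <= best_dist:
            best, best_dist = lm, dist
    return best


# Illustrative layout only; positions and patterns are made up.
LANDMARKS = [
    Landmark("MoPOP", 100.0, 200.0, (60,)),
    Landmark("Hyatt House", 220.0, 260.0, (30, 30, 30)),
    Landmark("Space Needle", 300.0, 120.0, (120, 40, 120)),
]
```

A frontend could call `hit_test` on each touch-move event and pass the returned `pattern_ms` to whatever vibration facility the device exposes. A `None` result means the finger is over empty map space.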

If this is right

  • Users can learn relative positions such as a fountain being south of a tower before arriving on site.
  • The approach runs on everyday phones and tablets rather than requiring specialized hardware.
  • It shifts assistive map tools from live guidance only toward supporting holistic pre-travel understanding.
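
The first consequence depends on turning coordinates into a statement such as "the fountain is south of the tower". The sketch below shows one way that step could work, assuming landmarks are available as latitude/longitude pairs and that an eight-way compass vocabulary is enough. The function names and coordinates are hypothetical, and this summary does not say how the paper's agent actually derives relations.

```python
import math

COMPASS = ["north", "northeast", "east", "southeast",
           "south", "southwest", "west", "northwest"]


def bearing_deg(origin, target):
    """Initial great-circle bearing from origin to target.

    Both points are (latitude, longitude) in degrees; the result is in
    degrees clockwise from north, in the range [0, 360).
    """
    lat1, lon1 = map(math.radians, origin)
    lat2, lon2 = map(math.radians, target)
    dlon = lon2 - lon1
    x = math.sin(dlon) * math.cos(lat2)
    y = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    return math.degrees(math.atan2(x, y)) % 360.0


def relation(name_a, pos_a, name_b, pos_b):
    """Describe where landmark A sits relative to landmark B."""
    # Measure the bearing from B to A, then snap it to a 45-degree sector.
    sector = int((bearing_deg(pos_b, pos_a) + 22.5) // 45) % 8
    return f"{name_a} is {COMPASS[sector]} of {name_b}"


# Hypothetical coordinates chosen only to exercise the function.
tower = (47.6200, -122.3500)
fountain = (47.6190, -122.3500)
print(relation("the fountain", fountain, "the tower", tower))
```

Snapping to eight sectors keeps the spoken answer short. A finer vocabulary or an added distance estimate would be a design choice for the agent, not something this sketch settles.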

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same touch-plus-speech pattern might extend to learning interior floor plans or museum layouts.
  • Integration with existing navigation apps could let users plan with the system and then receive real-time cues.
  • Long-term studies could check whether repeated sessions improve users' ability to navigate the same places from memory.

Load-bearing premise

The assumption that pairing touch-based exploration with spoken answers from a conversational agent will let users form accurate mental representations of spatial layouts.

What would settle it

A controlled test in which BLV participants using the system describe the relative positions of landmarks no more accurately than participants given only verbal descriptions, with no haptic component.
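
To make this test concrete, the sketch below scores a participant's reported landmark relations against ground truth and compares mean accuracy between two conditions. It is an illustration under stated assumptions: the function names, the data layout (a dict keyed by landmark pair), and the example answers are hypothetical, and the paper reports no such data.

```python
from statistics import mean


def relation_accuracy(reported, truth):
    """Fraction of landmark pairs whose reported direction matches truth.

    Both arguments map (landmark_a, landmark_b) to a direction word.
    Pairs the participant did not answer count as incorrect.
    """
    if not truth:
        raise ValueError("ground truth must contain at least one pair")
    correct = sum(1 for pair, direction in truth.items()
                  if reported.get(pair) == direction)
    return correct / len(truth)


def condition_gap(system_scores, verbal_scores):
    """Mean accuracy with the system minus mean accuracy with verbal only."""
    return mean(system_scores) - mean(verbal_scores)


# Hypothetical ground truth and one participant's answers.
TRUTH = {("fountain", "tower"): "south", ("fountain", "hotel"): "west"}
ANSWERS = {("fountain", "tower"): "south", ("fountain", "hotel"): "east"}
print(relation_accuracy(ANSWERS, TRUTH))
```

A gap near zero or below it would count against the premise. A real study would also need a significance test and a sample size chosen in advance, which this sketch leaves out.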

Figures

Figures reproduced from arXiv: 2604.14637 by David T. Lee, Jiaming Qu, Leilani H. Gilpin, Li Liu, Marc Jowell Bagaoisan.

Figure 1: Touching Space: an audio-haptic map system with conversational agent support for spatial learning. (1) The system retrieves raw …
Figure 2: The frontend interface of Touching Space showing a three-turn conversation during map exploration. The map panel sits at the left and the conversation panel shows on the right. The user’s finger trajectory (the red path) moves from (1) MoPOP to (2) Hyatt House, and then to (3) Space Needle with different vibration patterns. At each location, the user asks a spatial question and receives a voice response. I…

Original abstract

Most existing assistive navigation tools focus on providing real-time guidance for Blind and Low-Vision (BLV) people, but few support building a holistic spatial understanding of unfamiliar environments before travel. Such cognitive map construction (e.g., knowing that a fountain is south of a tower and west of a hotel) is important for pre-travel planning, yet remains underexplored in prior work. To address this gap, we present Touching Space, an end-to-end system that retrieves map data for a target place and loads it into a frontend interface for exploration. The system combines haptic and audio feedback: users explore spatial layouts through touch and ask spoken questions to a conversational agent during exploration. Touching Space contributes a conversational interface that supports BLV users in building cognitive maps on commodity hardware.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript presents Touching Space, an end-to-end system that retrieves map data for a target location and loads it into a frontend for exploration. The interface combines haptic rendering of spatial layouts with audio feedback from a conversational spoken QA agent, targeting Blind and Low-Vision (BLV) users. The stated goal is to support construction of cognitive maps (e.g., relative landmark positions) for pre-travel planning on commodity hardware, addressing a gap in existing real-time navigation tools.

Significance. The work identifies a genuine underexplored niche in assistive technologies by shifting focus from real-time guidance to pre-travel holistic spatial understanding. A validated system of this type could meaningfully improve independence for BLV users. The manuscript provides a clear pipeline description (map retrieval, haptic frontend, conversational agent) and emphasizes commodity hardware, which is a practical strength. However, because no empirical evidence is supplied, the significance remains that of a design proposal rather than a demonstrated contribution.

major comments (1)
  1. [Abstract] The central claim that the system 'supports BLV users in building cognitive maps' is load-bearing for the contribution yet is presented without any user studies, pre/post spatial recall tests, quantitative metrics on cognitive map accuracy (e.g., landmark positioning or route knowledge), or baseline comparisons. The full text details the technical pipeline but contains no evaluation section or data to substantiate the 'supports' assertion.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive review and for identifying the need to clarify the scope and evidential basis of our claims. We address the major comment below.

Point-by-point responses
  1. Referee: [Abstract] The central claim that the system 'supports BLV users in building cognitive maps' is load-bearing for the contribution yet is presented without any user studies, pre/post spatial recall tests, quantitative metrics on cognitive map accuracy (e.g., landmark positioning or route knowledge), or baseline comparisons. The full text details the technical pipeline but contains no evaluation section or data to substantiate the 'supports' assertion.

    Authors: We acknowledge that the manuscript presents Touching Space as a system design and implementation contribution without accompanying user studies or quantitative evaluation of cognitive map outcomes. The abstract's phrasing that the system 'supports BLV users in building cognitive maps' is intended to describe the design intent and technical capabilities (haptic rendering of spatial layouts combined with conversational audio queries for relational information), which are grounded in prior literature on audio-haptic interfaces for spatial learning. However, we agree this wording overstates the current evidence. In the revised version we will (1) rephrase the abstract and introduction to state that the system is designed to enable exploration that can support cognitive map construction, (2) add an explicit Limitations section noting the absence of empirical validation, and (3) outline concrete metrics (e.g., landmark positioning accuracy, route knowledge recall) that could be used in future user studies. These changes will be made without altering the technical novelty of the end-to-end pipeline on commodity hardware.

    revision: yes

Circularity Check

0 steps flagged

No circularity: system description with no derivations or self-referential claims

The paper is a descriptive HCI system proposal for an audio-haptic conversational interface. It contains no equations, no fitted parameters, no predictions, and no derivation chain. The central claim (that the interface supports cognitive map construction) is presented as a design contribution: it is not reduced to its own inputs by construction, it does not rest on load-bearing self-citation, and it does not smuggle in an ansatz. Per the guidelines, this qualifies as a self-contained non-mathematical paper with no circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper rests on the domain assumption that cognitive map construction via multi-modal feedback is both feasible and valuable for BLV users, with no free parameters or invented entities introduced in the abstract.

axioms (1)
  • domain assumption: Cognitive map construction (e.g., knowing relative positions of landmarks) is important for pre-travel planning by BLV people and remains underexplored.
    Explicitly stated as motivation and gap in the abstract.

pith-pipeline@v0.9.0 · 5447 in / 1187 out tokens · 32474 ms · 2026-05-10T11:20:48.619757+00:00 · methodology
