
arXiv: 2604.03451 · v1 · submitted 2026-04-03 · cs.RO · cs.CY · cs.HC

Do Robots Need Body Language? Comparing Communication Modalities for Legible Motion Intent in Human-Shared Spaces

Pith reviewed 2026-05-13 18:20 UTC · model grok-4.3

classification: cs.RO · cs.CY · cs.HC
keywords: human-robot interaction · legible motion · signaling modalities · quadruped robot · navigation intent · implicit vs explicit cues · multimodal communication

The pith

Expressive robot motion conveys navigation intent about as well as lights, text, or audio.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether a quadruped robot can communicate upcoming navigation actions through its natural movements or needs added explicit signals. Researchers ran an online video study with a Boston Dynamics Spot robot across four common scenarios. They presented expressive motion alone or paired with lights, text, and audio, then measured prediction accuracy, confidence, and trust in safe behavior. Results indicate that implicit motion cues perform comparably to explicit ones, with aligned multimodal signals improving outcomes and conflicts reducing them. A sympathetic reader would care because better intent communication could make robots safer and less disruptive in shared human spaces.

Core claim

Expressive motions enable humans to predict a quadruped robot's upcoming navigation actions with accuracy comparable to lights, text, or audio. Aligned combinations of modalities raise prediction confidence and trust in safe behavior, while conflicting cues lower both. The study supplies initial evidence that implicit signaling strategies can match explicit channels for conveying motion intent in shared environments.

What carries the argument

Video comparison of four signaling modalities (expressive motion, lights, text, audio) on a quadruped robot's navigation behaviors, evaluated by prediction accuracy, confidence, and trust.

Load-bearing premise

Participants' responses to video clips of robot behavior accurately reflect how they would perceive and react during actual physical interactions in shared spaces.

What would settle it

A live physical interaction study using the same scenarios and measures that shows substantially lower prediction accuracy or trust for expressive motion than the video results.

Figures

Figures reproduced from arXiv: 2604.03451 by Allen Song, Jonathan Albert Cohen, Kent Larson, Kye Shimizu, Pattie Maes, Vishnu Bharath.

Figure 2: Accuracy vs. Communication Signal Category. Extracted caption text (truncated at the start): "... produced moderate accuracy (≈58%). Redundancy did not increase accuracy for any explicit channels"
Figure 3: Mean Confidence vs. Communication Signal Type.
Figure 4: Mean Trust vs. Communication Signal Type.
Original abstract

Robots in shared spaces often move in ways that are difficult for people to interpret, placing the burden on humans to adapt. High-DoF robots exhibit motion that people read as expressive, intentionally or not, making it important to understand how such cues are perceived. We present an online video study evaluating how different signaling modalities, expressive motion, lights, text, and audio, shape people's ability to understand a quadruped robot's upcoming navigation actions (Boston Dynamics Spot). Across four common scenarios, we measure how each modality influences humans' (1) accuracy in predicting the robot's next navigation action, (2) confidence in that prediction, and (3) trust in the robot to act safely. The study tests how expressive motions compare to explicit channels, whether aligned multimodal cues enhance interpretability, and how conflicting cues affect user confidence and trust. We contribute initial evidence on the relative effectiveness of implicit versus explicit signaling strategies.
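
The three measures named in the abstract (prediction accuracy, confidence, and trust) can be tabulated per signaling modality. The sketch below is illustrative only and does not reproduce the authors' analysis: the `Trial` record, its field names, the action labels, the rating scales, and the sample responses are all assumptions made up for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Trial:
    """One participant response to one video clip (hypothetical schema)."""
    modality: str     # e.g. "motion", "lights", "text", "audio"
    predicted: str    # navigation action the participant expected
    actual: str       # navigation action the robot then performed
    confidence: int   # self-reported rating (scale is an assumption)
    trust: int        # self-reported rating (scale is an assumption)


def summarize(trials):
    """Return {modality: (accuracy, mean confidence, mean trust)}."""
    grouped = defaultdict(list)
    for t in trials:
        grouped[t.modality].append(t)
    return {
        m: (
            # Accuracy is the fraction of correct predictions.
            mean(1.0 if t.predicted == t.actual else 0.0 for t in ts),
            mean(t.confidence for t in ts),
            mean(t.trust for t in ts),
        )
        for m, ts in grouped.items()
    }


# Made-up responses for illustration; these are not data from the study.
trials = [
    Trial("motion", "yield", "yield", 5, 6),
    Trial("motion", "pass_left", "yield", 3, 4),
    Trial("lights", "yield", "yield", 6, 6),
    Trial("lights", "yield", "yield", 4, 4),
]

summary = summarize(trials)
for modality, (acc, conf, trust) in summary.items():
    print(f"{modality}: accuracy={acc:.2f} confidence={conf:.1f} trust={trust:.1f}")
```

A real analysis would also need the participant counts, statistical tests, and effect sizes that the referee report below asks for; per-modality means alone do not show whether two modalities differ.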

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript reports an online video study with a Boston Dynamics Spot quadruped that compares four signaling modalities—expressive motion, lights, text, and audio—across four navigation scenarios. It measures participants' accuracy in predicting the robot's next action, their confidence in those predictions, and their trust in the robot's safe behavior, while also testing effects of aligned versus conflicting multimodal cues.

Significance. If the relative-effectiveness findings hold under more realistic conditions, the work supplies initial empirical data on implicit versus explicit channels that could inform legible-motion design guidelines for robots operating in human-shared spaces.

major comments (2)
  1. [Methods / Discussion] The central claim that the video results inform real shared-space behavior rests on the untested assumption that offline ratings of pre-recorded clips generalize to live physical interactions; the manuscript does not report any validation against embodied or risk-bearing conditions (e.g., §4 or §5).
  2. [Results] Participant counts, exact statistical tests, effect sizes, and raw data or preregistration details are not provided in the abstract or visible sections, leaving the degree of support for modality comparisons unverifiable.
minor comments (2)
  1. [Methods] Clarify the exact wording of the prediction questions and the timing of modality presentation in each video clip.
  2. [Discussion] Add a limitations paragraph explicitly addressing the absence of real-time timing, embodiment, and collision risk.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments on our manuscript. Below we provide point-by-point responses to the major comments and describe the revisions made.

Point-by-point responses
  1. Referee: The central claim that the video results inform real shared-space behavior rests on the untested assumption that offline ratings of pre-recorded clips generalize to live physical interactions; the manuscript does not report any validation against embodied or risk-bearing conditions (e.g., §4 or §5).

    Authors: We agree that the study is limited to video-based evaluations and does not provide direct evidence from live physical interactions. The original manuscript already frames the results as initial evidence from an online video study rather than claiming direct generalization. To address this, we have revised the Discussion section (§5) to more explicitly discuss the limitations of video studies and the need for future embodied validation in risk-bearing conditions.
    Revision: yes

  2. Referee: Participant counts, exact statistical tests, effect sizes, and raw data or preregistration details are not provided in the abstract or visible sections, leaving the degree of support for modality comparisons unverifiable.

    Authors: The full manuscript reports participant counts, statistical tests, and effect sizes in the Results section. However, to improve visibility, we have revised the abstract to include these key details. Additionally, we have added explicit references to the preregistration and data availability in the abstract and methods section. We believe this makes the support for the comparisons verifiable.
    Revision: yes

Circularity Check

0 steps flagged

No circularity: purely empirical video study with direct participant data

Full rationale

The paper reports an online video-based user study measuring prediction accuracy, confidence, and trust from participant responses to pre-recorded robot navigation clips across modalities. No equations, derivations, fitted parameters, or self-citation chains appear in the provided text or abstract. All claims reduce directly to collected empirical responses rather than any constructed or self-referential quantities, satisfying the criteria for a self-contained non-circular empirical result.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Empirical human-subjects study with no mathematical models, free parameters, or invented entities. Relies only on standard domain assumptions from experimental HRI and psychology.

axioms (1)
  • Domain assumption: Video stimuli sufficiently represent real robot behavior for intent perception.
    Common assumption in HRI video studies but may not capture all physical cues or dynamic context.



Received 2025-12-08; accepted 2026-01-12