Do Robots Need Body Language? Comparing Communication Modalities for Legible Motion Intent in Human-Shared Spaces
Pith reviewed 2026-05-13 18:20 UTC · model grok-4.3
The pith
Expressive robot motion lets people predict navigation intent about as well as lights, text, or audio.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Expressive motions enable humans to predict a quadruped robot's upcoming navigation actions with accuracy comparable to lights, text, or audio. Aligned combinations of modalities raise prediction confidence and trust in safe behavior, while conflicting cues lower both. The study supplies initial evidence that implicit signaling strategies can match explicit channels for conveying motion intent in shared environments.
What carries the argument
Video comparison of four signaling modalities (expressive motion, lights, text, audio) on a quadruped robot's navigation behaviors, evaluated by prediction accuracy, confidence, and trust.
Load-bearing premise
Participants' responses to video clips of robot behavior accurately reflect how they would perceive and react during actual physical interactions in shared spaces.
What would settle it
A live physical interaction study using the same scenarios and measures that shows substantially lower prediction accuracy or trust for expressive motion than the video results.
Original abstract
Robots in shared spaces often move in ways that are difficult for people to interpret, placing the burden on humans to adapt. High-DoF robots exhibit motion that people read as expressive, intentionally or not, making it important to understand how such cues are perceived. We present an online video study evaluating how different signaling modalities, expressive motion, lights, text, and audio, shape people's ability to understand a quadruped robot's upcoming navigation actions (Boston Dynamics Spot). Across four common scenarios, we measure how each modality influences humans' (1) accuracy in predicting the robot's next navigation action, (2) confidence in that prediction, and (3) trust in the robot to act safely. The study tests how expressive motions compare to explicit channels, whether aligned multimodal cues enhance interpretability, and how conflicting cues affect user confidence and trust. We contribute initial evidence on the relative effectiveness of implicit versus explicit signaling strategies.
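The three measures named in the abstract (prediction accuracy, confidence, trust) lend themselves to a simple per-modality tabulation. The sketch below is only an illustration of that bookkeeping, not the authors' analysis pipeline: the `Response` record, its field names, the 1-7 rating scale, and the action labels are all assumptions introduced here, and the demo rows are made up.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class Response:
    """One participant answer for one clip (hypothetical schema)."""
    modality: str    # e.g. "motion", "lights", "text", "audio"
    predicted: str   # navigation action the participant predicted
    actual: str      # action the robot takes next in the clip
    confidence: int  # assumed 1-7 rating
    trust: int       # assumed 1-7 rating


def summarize(responses):
    """Group responses by modality and average the three measures."""
    by_modality = defaultdict(list)
    for r in responses:
        by_modality[r.modality].append(r)

    summary = {}
    for modality, rows in by_modality.items():
        summary[modality] = {
            "n": len(rows),
            # Accuracy is the share of predictions matching the actual action.
            "accuracy": mean(1.0 if r.predicted == r.actual else 0.0 for r in rows),
            "confidence": mean(r.confidence for r in rows),
            "trust": mean(r.trust for r in rows),
        }
    return summary


# Made-up rows purely to exercise the function; not data from the study.
demo = [
    Response("motion", "turn_left", "turn_left", 6, 5),
    Response("motion", "stop", "turn_left", 3, 4),
    Response("lights", "turn_left", "turn_left", 5, 5),
    Response("lights", "turn_left", "turn_left", 7, 6),
]
result = summarize(demo)
```

A table like `result` is only a descriptive starting point; the abstract does not say how the study aggregates responses across its four scenarios or across aligned and conflicting cue conditions.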
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript reports an online video study with a Boston Dynamics Spot quadruped that compares four signaling modalities—expressive motion, lights, text, and audio—across four navigation scenarios. It measures participants' accuracy in predicting the robot's next action, their confidence in those predictions, and their trust in the robot's safe behavior, while also testing effects of aligned versus conflicting multimodal cues.
Significance. If the relative-effectiveness findings hold under more realistic conditions, the work supplies initial empirical data on implicit versus explicit channels that could inform legible-motion design guidelines for robots operating in human-shared spaces.
major comments (2)
- [Methods / Discussion] The central claim that the video results inform real shared-space behavior rests on the untested assumption that offline ratings of pre-recorded clips generalize to live physical interactions; the manuscript does not report any validation against embodied or risk-bearing conditions (e.g., §4 or §5).
- [Results] Participant counts, exact statistical tests, effect sizes, and raw data or preregistration details are not provided in the abstract or visible sections, leaving the degree of support for modality comparisons unverifiable.
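To make the request for effect sizes concrete, here is a minimal sketch of the kind of summary that would let a reader judge whether two modalities are "comparable" in accuracy. It reports the difference in proportions with a Wald interval and Cohen's h. The function name and the counts in the demo call are hypothetical, and the calculation treats responses as independent, which a repeated-measures design would violate; it is not the test the manuscript uses.

```python
import math


def compare_accuracy(correct_a, n_a, correct_b, n_b, z=1.96):
    """Compare two accuracy rates (a minus b).

    Returns the raw difference, a Wald interval at the given z
    (1.96 is roughly 95%), and Cohen's h as an effect size.
    """
    p_a = correct_a / n_a
    p_b = correct_b / n_b
    diff = p_a - p_b

    # Standard error of a difference between two independent proportions.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    ci = (diff - z * se, diff + z * se)

    # Cohen's h: difference of arcsine-transformed proportions.
    h = 2 * math.asin(math.sqrt(p_a)) - 2 * math.asin(math.sqrt(p_b))
    return {"diff": diff, "ci": ci, "cohens_h": h}


# Hypothetical counts for illustration only; not results from the paper.
out = compare_accuracy(correct_a=90, n_a=100, correct_b=70, n_b=100)
```

An interval that sits entirely inside a pre-stated margin around zero supports a claim of comparable accuracy far better than a non-significant difference test does, which is why the interval is returned alongside the point estimate.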
minor comments (2)
- [Methods] Clarify the exact wording of the prediction questions and the timing of modality presentation in each video clip.
- [Discussion] Add a limitations paragraph explicitly addressing the absence of real-time timing, embodiment, and collision risk.
Simulated Author's Rebuttal
We thank the referee for their constructive comments on our manuscript. Below we provide point-by-point responses to the major comments and describe the revisions made.
- Referee: The central claim that the video results inform real shared-space behavior rests on the untested assumption that offline ratings of pre-recorded clips generalize to live physical interactions; the manuscript does not report any validation against embodied or risk-bearing conditions (e.g., §4 or §5).
  Authors: We agree that the study is limited to video-based evaluations and does not provide direct evidence from live physical interactions. The original manuscript already frames the results as initial evidence from an online video study rather than claiming direct generalization. To address this, we have revised the Discussion section (§5) to more explicitly discuss the limitations of video studies and the need for future embodied validation in risk-bearing conditions.
  Revision: yes
- Referee: Participant counts, exact statistical tests, effect sizes, and raw data or preregistration details are not provided in the abstract or visible sections, leaving the degree of support for modality comparisons unverifiable.
  Authors: The full manuscript reports participant counts, statistical tests, and effect sizes in the Results section. However, to improve visibility, we have revised the abstract to include these key details. Additionally, we have added explicit references to the preregistration and data availability in the abstract and methods section. We believe this makes the support for the comparisons verifiable.
  Revision: yes
Circularity Check
No circularity: purely empirical video study with direct participant data
The paper reports an online video-based user study measuring prediction accuracy, confidence, and trust from participant responses to pre-recorded robot navigation clips across modalities. No equations, derivations, fitted parameters, or self-citation chains appear in the provided text or abstract. All claims reduce directly to collected empirical responses rather than any constructed or self-referential quantities, satisfying the criteria for a self-contained non-circular empirical result.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Video stimuli sufficiently represent real robot behavior for intent perception
Received 2025-12-08; accepted 2026-01-12