An approach to combining video and speech with large language models in human-robot interaction
2 Pith papers cite this work.
Fields: cs.RO (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- From Technical Metrics to User Perception: A User Study of a Multimodal Human-Robot Interaction System for Object Detection and Grasping
  A user study found that 71% of 24 participants preferred an improved multimodal HRI grasping system over baseline, with significantly higher ratings on three perceptual scales after statistical correction.
- Ablation Study of Multimodal Perception, Language Grounding, and Control for Human-Robot Interaction in an Object Detection and Grasping Task
  An ablation study isolates the contributions of LLM choice, visual perception configuration, and motion controller to success rate and execution time in a human-robot grasping task.