Florence-2: Advancing a unified representation for a variety of vision tasks
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.RO (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
An ablation study isolates the contributions of LLM choice, visual perception configuration, and motion controller to success rate and execution time in a human-robot grasping task.
citing papers explorer
- EmbodiedLGR: Integrating Lightweight Graph Representation and Retrieval for Semantic-Spatial Memory in Robotic Agents
A hybrid semantic graph and retrieval-augmented system with parameter-efficient VLMs achieves state-of-the-art inference and querying speeds on embodied navigation tasks with competitive accuracy.
- Ablation Study of Multimodal Perception, Language Grounding, and Control for Human-Robot Interaction in an Object Detection and Grasping Task
An ablation study isolates the contributions of LLM choice, visual perception configuration, and motion controller to success rate and execution time in a human-robot grasping task.