An asynchronous architecture decouples incremental voxel-based mapping from VLM-based semantic enrichment to produce queryable open-vocabulary 3D scene graphs that match or exceed prior methods on segmentation and grounding benchmarks.
IEEE Robotics and Automation Letters 9(10), 8921–8928 (2024)
4 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (4)
verdicts: UNVERDICTED (4)
representative citing papers
FARM creates an open-vocabulary relational spatial memory that improves object retrieval recall by 164-224% over prior methods on 44k language queries across 67 scenes while running at 5-10 Hz.
SemanticXR introduces the first device-cloud system for real-time open-vocabulary semantic mapping and querying that organizes work around semantically identifiable objects to meet XR power, bandwidth, and memory limits.
Uses VLMs to detect instance concepts and LLMs to infer abstract relationships, assembling them into 3D scene graph forests that are evaluated on uHumans2 and ScanNet and tested in open-vocabulary retrieval on a Spot robot.
citing papers explorer
- FARM: Find Anything using Relational Spatial Memory
FARM creates an open-vocabulary relational spatial memory that improves object retrieval recall by 164-224% over prior methods on 44k language queries across 67 scenes while running at 5-10 Hz.
- From Pixels to Concepts: Growing Rich 3D Semantic Scene Graph Forests utilizing Foundation Models
Uses VLMs to detect instance concepts and LLMs to infer abstract relationships, assembling them into 3D scene graph forests that are evaluated on uHumans2 and ScanNet and tested in open-vocabulary retrieval on a Spot robot.