ShelfAware: Real-Time Semantic Localization in Quasi-Static Environments with Low-Cost Sensors

Alessandro Roncone; Ashutosh Naik; Bradley Hayes; Jake Brawer; Shivendra Agrawal

arxiv: 2512.09065 · v2 · pith:MKR77OCKnew · submitted 2025-12-09 · 💻 cs.RO · cs.AI

Shivendra Agrawal, Jake Brawer, Ashutosh Naik, Alessandro Roncone, Bradley Hayes

keywords: shelfaware, semantic, localization, dynamic, geometric, global, semantics, across

Many indoor workspaces are quasi-static: their global geometric layout is stable, but local semantics change continually, producing repetitive geometry, dynamic clutter, and perceptual noise that defeat standard vision-based localization. We present ShelfAware, a semantic particle filter for robust global localization that treats scene semantics as statistical evidence over object categories rather than as fixed-quantity landmarks. ShelfAware fuses a depth likelihood with a category-centric semantic similarity and uses a precomputed bank of semantic viewpoints to perform inverse semantic proposals inside Monte Carlo Localization (MCL), yielding fast, targeted hypothesis generation on low-cost, vision-only hardware. To demonstrate perception-agnostic scalability, we evaluate ShelfAware across two domains. In a rigorously controlled mock retail environment, ShelfAware achieves a 97% global localization success rate and maintains the highest tracking success (66%) across cart, wearable, and dynamic occlusion conditions. Furthermore, in a 3,500 sq. ft. operational grocery store, using an open-vocabulary vision pipeline, ShelfAware significantly outperforms both geometric and fixed-quantity semantic baselines. By modeling semantics distributionally and leveraging inverse proposals, ShelfAware resolves geometric aliasing, providing an infrastructure-free building block for mobile and assistive robots in dynamic real-world environments.
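
The fusion and proposal steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the likelihood forms, so the Gaussian depth model, the cosine similarity over category histograms, the weighting factor `alpha`, and every function and field name below are assumptions chosen only for illustration.

```python
import math

def depth_log_likelihood(expected, observed, sigma=0.2):
    """Assumed Gaussian log-likelihood of observed depth ranges (up to a constant)."""
    return sum(-0.5 * ((e - o) / sigma) ** 2 for e, o in zip(expected, observed))

def semantic_similarity(expected_hist, observed_hist):
    """Assumed cosine similarity between two category histograms (dicts)."""
    cats = set(expected_hist) | set(observed_hist)
    dot = sum(expected_hist.get(c, 0.0) * observed_hist.get(c, 0.0) for c in cats)
    ne = math.sqrt(sum(v * v for v in expected_hist.values()))
    no = math.sqrt(sum(v * v for v in observed_hist.values()))
    return dot / (ne * no) if ne > 0 and no > 0 else 0.0

def fused_weights(particles, observed_depth, observed_hist, alpha=5.0):
    """Normalized particle weights from depth plus semantic evidence.

    Each particle is a dict holding the depth ranges ('depth') and the
    category histogram ('hist') expected at its pose (hypothetical layout).
    """
    logw = [
        depth_log_likelihood(p["depth"], observed_depth)
        + alpha * semantic_similarity(p["hist"], observed_hist)
        for p in particles
    ]
    m = max(logw)  # subtract the max before exponentiating for stability
    w = [math.exp(l - m) for l in logw]
    total = sum(w)
    return [x / total for x in w]

def inverse_semantic_proposals(bank, observed_hist, k=1):
    """Return the k poses in a precomputed viewpoint bank that best match the observation."""
    ranked = sorted(
        bank,
        key=lambda entry: semantic_similarity(entry["hist"], observed_hist),
        reverse=True,
    )
    return [entry["pose"] for entry in ranked[:k]]

# Two poses with identical geometry (aliasing) but different shelf contents.
particles = [
    {"depth": [1.0, 2.0], "hist": {"cereal": 3, "soda": 1}},
    {"depth": [1.0, 2.0], "hist": {"pasta": 4}},
]
observed_depth = [1.0, 2.0]
observed_hist = {"cereal": 2, "soda": 1}
weights = fused_weights(particles, observed_depth, observed_hist)

bank = [
    {"pose": (0.0, 0.0, 0.0), "hist": {"pasta": 4}},
    {"pose": (5.0, 2.0, 1.57), "hist": {"cereal": 3, "soda": 1}},
]
proposals = inverse_semantic_proposals(bank, observed_hist, k=1)
```

In this toy case the depth term cannot separate the two particles, so the semantic term alone shifts weight toward the pose whose expected categories match the observation. Comparing category proportions rather than exact object counts is one way to read "statistical evidence over object categories", though the paper may define the similarity differently.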

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

GIST: Multimodal Knowledge Extraction and Spatial Grounding via Intelligent Semantic Topology
cs.AI 2026-04 unverdicted novelty 5.0

GIST extracts a semantically annotated 2D navigation topology from consumer mobile point clouds to improve spatial grounding for embodied AI in dense environments.