Recognition: unknown
Condition-Invariant Multi-View Place Recognition
read the original abstract
Visual place recognition is particularly challenging when places suffer changes in its appearance. Such changes are indeed common, e.g., due to weather, night/day or seasons. In this paper we leverage on recent research using deep networks, and explore how they can be improved by exploiting the temporal sequence information. Specifically, we propose 3 different alternatives (Descriptor Grouping, Fusion and Recurrent Descriptors) for deep networks to use several frames of a sequence. We show that our approaches produce more compact and best performing descriptors than single- and multi-view baselines in the literature in two public databases.
This paper has not been read by Pith yet.
Forward citations
Cited by 1 Pith paper
-
VidTAG: Temporally Aligned Video to GPS Geolocalization with Denoising Sequence Prediction at a Global Scale
VidTAG achieves fine-grained global video-to-GPS geolocalization via temporal frame alignment and denoising sequence refinement, reporting 20% gains at 1 km over GeoCLIP and 25% on CityGuessr68k.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.