arxiv: 1812.07976 · v4 · pith:ZIXHVO27new · submitted 2018-12-19 · 💻 cs.RO · cs.CV

MID-Fusion: Octree-based Object-Level Multi-Instance Dynamic SLAM

keywords: camera, dynamic, object, existing, object-level, pose, system, estimate

We propose a new multi-instance dynamic RGB-D SLAM system using an object-level octree-based volumetric representation. It provides robust camera tracking in dynamic environments while continuously estimating geometric, semantic, and motion properties for arbitrary objects in the scene. For each incoming frame, we perform instance segmentation to detect objects and refine mask boundaries using geometric and motion information. Meanwhile, we estimate the pose of each existing moving object using an object-oriented tracking method and robustly track the camera pose against the static scene. Based on the estimated camera pose and object poses, we associate segmented masks with existing models and incrementally fuse the corresponding colour, depth, semantic, and foreground object probabilities into each object model. In contrast to existing approaches, ours is the first system to generate an object-level dynamic volumetric map from a single RGB-D camera, which can be used directly for robotic tasks. Our method runs at 2-3 Hz on a CPU, excluding the instance segmentation part. We demonstrate its effectiveness by quantitatively and qualitatively testing it on both synthetic and real-world sequences.
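The incremental fusion step mentioned in the abstract can be illustrated with a weighted running average per voxel, a common choice in volumetric mapping. The sketch below is an assumption for illustration only: the abstract does not specify the update rule, and the `Voxel` class, the `fuse` function, and the `obs_weight` and `max_weight` parameters are hypothetical names, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Voxel:
    """Hypothetical per-voxel state for one object model (not from the paper)."""
    tsdf: float = 0.0        # truncated signed distance to the surface
    foreground: float = 0.0  # probability that the voxel belongs to the object
    weight: float = 0.0      # accumulated observation weight


def fuse(voxel: Voxel, tsdf_obs: float, fg_obs: float,
         obs_weight: float = 1.0, max_weight: float = 100.0) -> Voxel:
    """Fold one observation into the voxel using a weighted running average.

    Assumed update rule: new = (old * w + obs * w_obs) / (w + w_obs).
    The weight is capped so that old evidence does not dominate forever.
    """
    total = voxel.weight + obs_weight
    tsdf = (voxel.tsdf * voxel.weight + tsdf_obs * obs_weight) / total
    fg = (voxel.foreground * voxel.weight + fg_obs * obs_weight) / total
    return Voxel(tsdf=tsdf, foreground=fg, weight=min(total, max_weight))


# Usage: two frames observe the same voxel; the second frame's mask misses it.
v = fuse(Voxel(), tsdf_obs=0.5, fg_obs=1.0)
v = fuse(v, tsdf_obs=0.3, fg_obs=0.0)
```

After these two updates the voxel holds the mean of both observations. Capping the weight is one way to keep a model responsive when an object changes, though whether the system described here does so is not stated in the abstract.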

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. DetectFusion: Detecting and Segmenting Both Known and Unknown Dynamic Objects in Real-time SLAM

    cs.CV 2019-07 unverdicted novelty 6.0

    DetectFusion combines 2D object detection with 3D geometric segmentation to handle both known and unknown dynamic objects in real-time RGB-D SLAM at about 20 FPS.