Multi-Object Monocular SLAM for Dynamic Environments

Gokul B Nair¹ Swapnil Daga¹ Rahul Sajnani¹ Anirudha Ramesh¹ Junaid Ahmed Ansari² J. Krishna Murthy³ K. Madhava Krishna¹

¹ IIIT Hyderabad, India ² TCS Research and Innovation Labs, Kolkata, India ³ Mila - Quebec AI Institute, Montreal, Canada

In this paper, we tackle the problem of multibody SLAM from a monocular camera. The term multibody, implies that we track the motion of the camera, as well as that of other dynamic participants in the scene. The quintessential challenge in dynamic scenes is unobservability: it is not possible to unambiguously triangulate a moving object from a moving monocular camera. Existing approaches solve restricted variants of the problem, but the solutions suffer relative scale ambiguity (i.e., a family of infinitely many solutions exist for each pair of motions in the scene). We solve this rather intractable problem by leveraging single-view metrology, advances in deep learning, and category-level shape estimation. We propose a multi pose-graph optimization formulation, to resolve the relative and absolute scale factor ambiguities involved. This optimization helps us reduce the average error in trajectories of multiple bodies over real-world datasets, such as KITTI. To the best of our knowledge, our method is the first practical monocular multi-body SLAM system to perform dynamic multi-object and ego localization in a unified framework in metric scale.

Download paper