Understanding Bird's-Eye View of Road Semantics using an Onboard Camera

Yigit Baran Can, Alexander Liniger, Ozan Unal, Danda Pani Paudel, Luc Van Gool¹,²
¹ETH Zurich, ²KU Leuven
IEEE Robotics and Automation Letters 2022

Abstract: Autonomous navigation requires scene understanding of the action space in order to move or anticipate events. For agents moving on the ground plane, such as autonomous vehicles, this translates to scene understanding in the bird's-eye view (BEV). However, the onboard cameras of autonomous cars are customarily mounted horizontally for a better view of the surroundings. In this work, we study scene understanding in the form of online estimation of semantic BEV maps using the video input from a single onboard camera. We study three key aspects of this task: image-level understanding, BEV-level understanding, and the aggregation of temporal information. Based on these three pillars, we propose a novel architecture that combines all of them. In our extensive experiments, we demonstrate that the considered aspects are complementary to each other for BEV understanding. Furthermore, the proposed architecture significantly surpasses the current state of the art. The source code of our method is available here.
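To make the three pillars named in the abstract concrete, below is a minimal PyTorch sketch of how image-level feature extraction, a BEV-level representation, and temporal aggregation could be wired together. Everything here is our own illustrative assumption, not the authors' actual architecture: the module name BEVSemanticNet, the layer sizes, the adaptive-pooling stand-in for a learned image-to-BEV view transform, and the gated recurrent blend used for temporal aggregation are all hypothetical.

```python
import torch
import torch.nn as nn

class BEVSemanticNet(nn.Module):
    """Illustrative three-pillar BEV estimator (a sketch, not the paper's model).

    Pillars: (1) image-level understanding, (2) BEV-level understanding,
    (3) temporal aggregation over video frames from a single camera.
    """

    def __init__(self, num_classes=8, feat_dim=64, bev_hw=(32, 32)):
        super().__init__()
        # (1) Image-level understanding: a small convolutional encoder.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat_dim, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            # Crude stand-in for a learned image-to-BEV view transform.
            nn.AdaptiveAvgPool2d(bev_hw),
        )
        # (2) BEV-level understanding: refine features on the ground plane.
        self.bev_head = nn.Sequential(
            nn.Conv2d(feat_dim, feat_dim, 3, padding=1), nn.ReLU(),
        )
        # (3) Temporal aggregation: a gated blend of the running state
        # with the current frame's BEV features (GRU-like, simplified).
        self.gate = nn.Conv2d(2 * feat_dim, feat_dim, 3, padding=1)
        self.classifier = nn.Conv2d(feat_dim, num_classes, 1)

    def forward(self, frames):
        # frames: (T, B, 3, H, W) video clip from a single onboard camera.
        state = None
        for img in frames:
            bev = self.bev_head(self.encoder(img))  # per-frame BEV features
            if state is None:
                state = bev
            else:
                g = torch.sigmoid(self.gate(torch.cat([state, bev], dim=1)))
                state = g * bev + (1 - g) * state    # temporal aggregation
        return self.classifier(state)                # semantic BEV map logits


# Usage: a 4-frame clip, batch of 2, 128x128 input images.
clip = torch.randn(4, 2, 3, 128, 128)
logits = BEVSemanticNet()(clip)
print(logits.shape)  # torch.Size([2, 8, 32, 32])
```

The point of the sketch is the data flow the abstract describes: each frame is first understood at the image level, lifted to a BEV grid, refined there, and only then fused over time, so the three aspects can contribute complementary information.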