Real-Time 3D Object Detection From Point Cloud Through Foreground Segmentation

2021 
This paper targets real-time, lightweight, high-precision 3D object detection for autonomous driving. We propose a LIDAR-based 3D object detector built on foreground segmentation using a fully sparse convolutional network (FS23D). We design a sparse convolutional backbone network and a sparse convolutional detection head to use computing and memory resources efficiently and to accelerate inference. Instead of an anchor-based formulation, we cast detection as a foreground segmentation problem on the bird's-eye view: the sparse convolutional detection head predicts an objectness score and a bounding box at each active point on the sparse feature map. We design a new oriented-bounding-box coding method and corresponding loss functions. For each foreground active point, we predict the endpoints of two mutually perpendicular lines passing through that point and indirectly recover the object's oriented bounding box from these four endpoints. The object center, size, and orientation computed from this decoding serve as inputs to the loss functions during training. Experiments on the KITTI dataset show that our sparse backbone network is 2.2 times faster and requires 18.4 times fewer FLOPs than the dense backbone network. The loss functions based on the bounding-box coding improve average precision by 1.1% on BEV detection and 0.8% on 3D detection compared to training without these losses. Moreover, FS23D outperforms state-of-the-art LIDAR-based methods in speed and precision for both cars and cyclists.
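
The abstract only outlines the endpoint-based box coding, but the geometry can be made concrete. Below is a minimal NumPy sketch of one plausible decoding, assuming the first line runs along the box's heading and the second along its width, both passing through the active point q and terminating on the box boundary. Under that assumption the center falls at mid(p1, p2) + mid(p3, p4) - q, the length and width are the two line lengths, and the yaw is the direction of p2 - p1. All function and variable names here are illustrative, not taken from the paper.

    import numpy as np

    def decode_obb_from_endpoints(q, p1, p2, p3, p4):
        """Recover an oriented BEV box from the endpoints of two
        perpendicular lines through the active point q.

        Assumes p1->p2 spans the box along its heading and p3->p4
        spans it along its width, both clipped to the box boundary.
        """
        q, p1, p2, p3, p4 = map(np.asarray, (q, p1, p2, p3, p4))

        # Length and width are simply the two line lengths.
        length = np.linalg.norm(p2 - p1)
        width = np.linalg.norm(p4 - p3)

        # Heading follows the first line's direction.
        yaw = np.arctan2(p2[1] - p1[1], p2[0] - p1[0])

        # Each midpoint is centered along its own axis but keeps q's
        # offset along the other axis, so summing the two midpoint
        # offsets relative to q lands on the box center:
        #   center = mid(p1, p2) + mid(p3, p4) - q
        center = 0.5 * (p1 + p2) + 0.5 * (p3 + p4) - q

        return center, length, width, yaw

    # Example: an axis-aligned 4 x 2 box centered at the origin,
    # probed from the interior point q = (1.0, 0.5).
    q = np.array([1.0, 0.5])
    p1, p2 = np.array([-2.0, 0.5]), np.array([2.0, 0.5])  # along length
    p3, p4 = np.array([1.0, -1.0]), np.array([1.0, 1.0])  # along width
    center, l, w, yaw = decode_obb_from_endpoints(q, p1, p2, p3, p4)
    print(center, l, w, yaw)  # -> [0. 0.] 4.0 2.0 0.0

Because the center, size, and orientation are all differentiable functions of the four predicted endpoints, losses defined on these derived quantities can backpropagate to the endpoint predictions, which matches the abstract's description of the training losses.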