Real-Time Monocular Joint Perception Network for Autonomous Driving

2022 
Comprehensive and accurate perception of the real 3D world is the basis of autonomous driving. However, many perception methods focus on a single task or object type, and existing multi-task or multi-object methods struggle to balance accuracy against real-time performance. This paper presents a unified framework for concurrent dynamic multi-object joint perception, introducing a real-time monocular joint perception network termed MJPNet. In MJPNet, relative task weightings are learned automatically by a series of developed network branches. By training an end-to-end deep convolutional neural network with a shared feature encoder and multiple proposed decoding sub-branches, an object's 2D category and 3D position, pose, and size are reconstructed simultaneously and accurately. Moreover, effective information is transferred among subtasks via multi-stream learning, guaranteeing the accuracy of each task. Comprehensive evaluations on a benchmark of challenging image sequences demonstrate superior performance over various state-of-the-art methods in 2D detection and in 3D reconstruction of depth, lateral distance, orientation, and heading angle. Moreover, on the KITTI test set, MJPNet runs in real time (up to 15 fps), significantly outpacing published state-of-the-art visual detection methods. Accompanying video: https://youtu.be/Z-goToOlI94 .
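The abstract states that relative task weightings are learned automatically, but does not specify the mechanism. A common way to realize this in multi-task networks is homoscedastic-uncertainty weighting, where each task loss is scaled by a learnable log-variance parameter; the sketch below illustrates that idea under this assumption (the function name and parameters are illustrative, not from the paper):

```python
import math

def multitask_loss(task_losses, log_vars):
    """Combine per-task losses with learnable uncertainty weights,
    one common scheme for automatically balancing multi-task training.

    task_losses: raw loss values for each decoding sub-branch
                 (e.g. 2D category, 3D position, pose, size).
    log_vars:    learnable log-variance parameters s_i; the effective
                 weight of task i is exp(-s_i), and adding s_i back
                 penalizes driving the weight toward zero.
    """
    total = 0.0
    for loss, s in zip(task_losses, log_vars):
        total += math.exp(-s) * loss + s
    return total

# With all log-variances at 0, the weights are 1 and the combined
# loss is just the sum of the task losses.
combined = multitask_loss([1.0, 2.0], [0.0, 0.0])  # → 3.0
```

In practice the `log_vars` would be trainable parameters updated jointly with the network weights, so the optimizer itself discovers how to trade the subtasks off against one another.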