Fusion of images and point clouds for the semantic segmentation of large-scale 3D scenes based on deep learning

2018 
Abstract We address the semantic segmentation of large-scale 3D scenes by fusing 2D images and 3D point clouds. First, a Large-Scale and High-Resolution model (DVLSHR), based on DeepLab with a deep Visual Geometry Group (VGG16) backbone, is created and fine-tuned by training seven deep convolutional neural networks on four benchmark datasets. On the CityScapes validation set, DVLSHR achieves a 74.98% mean Pixel Accuracy (mPA) and a 64.17% mean Intersection over Union (mIoU), and can be adapted to segment our captured high-resolution images (2832 × 4256 pixels). Second, the preliminary segmentation results on the 2D images are mapped to the 3D point clouds according to the coordinate relationships between the images and the point clouds. Third, based on the mapping results, fine features of buildings are extracted directly from the 3D point clouds. Our experiments show that the proposed fusion method segments both local and global features efficiently and effectively.
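The second step, transferring per-pixel labels to 3D points via the image-to-point-cloud coordinate relationship, can be illustrated with a short sketch. The snippet below assumes a standard pinhole camera model with known intrinsics and extrinsics; the function name, parameters, and camera formulation are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def map_labels_to_points(points_world, label_image, K, R, t):
    """Assign per-pixel semantic labels to 3D points by projecting each
    point into the image (hypothetical pinhole formulation; the paper's
    exact coordinate relationship may differ).

    points_world : (N, 3) array of 3D point coordinates
    label_image  : (H, W) array of per-pixel class IDs from the 2D model
    K            : (3, 3) camera intrinsic matrix
    R, t         : (3, 3) rotation and (3,) translation, world -> camera
    """
    H, W = label_image.shape
    labels = np.full(len(points_world), -1, dtype=np.int32)  # -1 = unlabeled

    # World -> camera coordinates.
    pts_cam = points_world @ R.T + t
    in_front = pts_cam[:, 2] > 0  # keep only points in front of the camera

    # Camera -> pixel coordinates (perspective division).
    proj = pts_cam[in_front] @ K.T
    uv = proj[:, :2] / proj[:, 2:3]
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)

    # Keep projections inside the image bounds, then copy the pixel label.
    inside = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    idx = np.flatnonzero(in_front)[inside]
    labels[idx] = label_image[v[inside], u[inside]]
    return labels
```

Points that fall outside the image, or behind the camera, stay unlabeled here; a real pipeline would also need occlusion handling (e.g., z-buffering) so that hidden points do not inherit labels from surfaces in front of them.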