Weakly-supervised domain adaptation for built-up region segmentation in aerial and satellite imagery

2020 
Abstract: This paper proposes a novel domain adaptation algorithm to handle the challenges posed by satellite and aerial imagery, and demonstrates its effectiveness on the built-up region segmentation problem. Built-up area estimation is an important component in understanding the human impact on the environment, the effect of public policy, and urban population analysis in general. The diverse nature of aerial and satellite imagery (capturing different geographical locations, terrains and weather conditions) and the lack of labeled data covering this diversity make it difficult for machine learning algorithms to generalize on such tasks, especially across multiple domains. Re-training for a new domain is expensive in both computation and labor, mainly due to the cost of collecting the pixel-level labels required for the segmentation task. Domain adaptation algorithms have been proposed to enable algorithms trained on images of one domain (source) to work on images from another dataset (target). Unsupervised domain adaptation is a popular choice since it allows the trained model to adapt without requiring any ground-truth information for the target domain. However, because aerial and satellite imagery lacks the strong spatial context and structure of ground-level imagery, applying existing unsupervised domain adaptation methods results in sub-optimal adaptation. We thoroughly study the limitations of existing domain adaptation methods and propose a weakly-supervised adaptation strategy in which we assume image-level labels are available for the target domain. More specifically, we design a built-up area segmentation network (as an encoder-decoder) with an image classification head added to guide the adaptation. The devised system is able to address the problem of visual differences across multiple satellite and aerial imagery datasets, ranging from high resolution (HR) to very high resolution (VHR), by investigating the latent space as well as the structured output space.
A realistic and challenging HR dataset is created by hand-tagging 73.4 sq-km of Rwanda, capturing a variety of built-up structures over different terrain. The developed dataset is spatially rich compared to existing datasets and covers diverse built-up scenarios, including built-up areas in forests and deserts, mud houses, and tin and colored rooftops. Extensive experiments are performed by adapting from single-source domain datasets, such as the Massachusetts Buildings Dataset, to segment the target domain. We achieve high gains in IoU, ranging from 11.6% to 52%, over existing state-of-the-art methods.
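For readers unfamiliar with the metric, IoU (intersection over union) for a binary built-up mask can be sketched as below. This is a minimal illustration, not the authors' evaluation code: the function name `binary_iou`, the nested-list mask format, and the convention of scoring two empty masks as 1.0 are all assumptions made for the example.

```python
def binary_iou(pred, target):
    """IoU between two same-shaped binary masks given as nested lists of 0/1.

    Hypothetical helper for illustration; 1 marks a built-up pixel.
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1  # pixel is built-up in both masks
            if p or t:
                union += 1  # pixel is built-up in at least one mask
    # Assumed convention: two empty masks count as perfect agreement.
    return intersection / union if union else 1.0


# Toy 2x3 masks: 2 pixels overlap, 4 pixels are built-up in either mask.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
iou = binary_iou(pred, target)
```

On these toy masks the intersection is 2 pixels and the union is 4, so the IoU is 0.5.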