A Unified Efficient Pyramid Transformer for Semantic Segmentation
2021
Semantic segmentation is a challenging problem due to difficulties in
modeling context in complex scenes and class confusions along boundaries. Most
prior work focuses on either context modeling or boundary refinement alone, an
approach that is less generalizable in open-world scenarios. In this work, we
advocate a unified framework (UN-EPT) to segment objects by considering both
context information
and boundary artifacts. We first adapt a sparse sampling strategy to
incorporate the transformer-based attention mechanism for efficient context
modeling. In addition, a separate spatial branch is introduced to capture image
details for boundary refinement. The whole model can be trained in an
end-to-end manner. We demonstrate promising performance on three popular
benchmarks for semantic segmentation with a low memory footprint. Code will be
released soon.
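
To make the idea of sparse sampling for attention concrete, the following is a minimal sketch, not the UN-EPT implementation. It assumes each query attends to a small, randomly chosen subset of key positions instead of all of them, which cuts the per-query cost from the full number of keys to `num_samples`. The abstract does not say how sampling locations are chosen, so the random selection, the function name `sparse_attention`, and all parameters here are hypothetical stand-ins.

```python
import math
import random


def sparse_attention(queries, keys, values, num_samples, seed=0):
    """Scaled dot-product attention over a sampled subset of keys.

    Assumption: sampling positions are drawn uniformly at random per query.
    This is an illustrative stand-in, not the strategy used by UN-EPT.
    """
    rng = random.Random(seed)
    dim = len(queries[0])
    outputs = []
    for q in queries:
        # Pick `num_samples` distinct key positions for this query.
        idx = rng.sample(range(len(keys)), num_samples)
        # Scaled dot-product scores against the sampled keys only.
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(dim)
            for j in idx
        ]
        # Numerically stable softmax over the sampled scores.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the sampled value vectors.
        out = [
            sum(w * values[j][c] for w, j in zip(weights, idx))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Usage: 64 flattened feature positions with 4 channels each (self-attention).
rng = random.Random(0)
feats = [[rng.random() for _ in range(4)] for _ in range(64)]
out = sparse_attention(feats, feats, feats, num_samples=8)
print(len(out), len(out[0]))
```

With 64 positions and 8 samples per query, each query computes 8 scores rather than 64, which is where the memory saving in a sparse scheme of this kind would come from.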