A Unified Efficient Pyramid Transformer for Semantic Segmentation

Fangrui Zhu, Yi Zhu, Li Zhang, Chongruo Wu, Yanwei Fu, Mu Li

2021

Semantic segmentation is challenging because context is difficult to model in complex scenes and classes are easily confused along object boundaries. Most prior work focuses on either context modeling or boundary refinement alone, which generalizes poorly to open-world scenarios. In this work, we advocate a unified framework (UN-EPT) that segments objects by considering both context information and boundary artifacts. We first adapt a sparse sampling strategy to incorporate the transformer-based attention mechanism for efficient context modeling. In addition, we introduce a separate spatial branch that captures image details for boundary refinement. The whole model can be trained end to end. We demonstrate promising performance on three popular semantic segmentation benchmarks with a low memory footprint. Code will be released soon.
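
To make the idea of sparse sampling in attention concrete, the following is a minimal sketch and not the UN-EPT implementation. The abstract does not say how key positions are chosen, so this sketch assumes uniform random sampling. The function name `sparse_attention` and the parameters `num_samples` and `seed` are hypothetical. Each query attends to only `num_samples` keys instead of all of them.

```python
import math
import random


def sparse_attention(queries, keys, values, num_samples, seed=0):
    """Scaled dot-product attention over a sampled subset of keys.

    Assumption: key positions are drawn uniformly at random per query.
    A real model would likely choose them differently (e.g. learned).
    queries, keys: lists of equal-length float vectors.
    values: list of float vectors, one per key.
    """
    rng = random.Random(seed)
    dim = len(queries[0])
    out_dim = len(values[0])
    outputs = []
    for q in queries:
        # Pick a small set of key positions for this query.
        idx = rng.sample(range(len(keys)), num_samples)
        # Scaled dot-product scores against the sampled keys only.
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(dim)
            for j in idx
        ]
        # Numerically stable softmax over the sampled scores.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        # Weighted sum of the sampled values.
        outputs.append([
            sum(w / total * values[j][d] for w, j in zip(weights, idx))
            for d in range(out_dim)
        ])
    return outputs
```

With N queries and M keys, dense attention evaluates N × M scores, while this sketch evaluates N × `num_samples`. That reduction is the general reason sparse sampling lowers the memory footprint.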

Keywords:

Artificial intelligence
Data mining
Pyramid (image processing)
Memory footprint
Class (computer programming)
Transformer
Boundary (topology)
Segmentation
Context model
Context
Computer science
