Multi-task Learning for Newspaper Image Segmentation and Baseline Detection Using Attention-Based U-Net Architecture.

Anukriti Bansal, Prerana Mukherjee, Divyansh Joshi, Devashish Tripathi, Arun Singh

2021

In this work, we propose an end-to-end, language-agnostic U-Net framework based on multi-task learning for text block segmentation and baseline detection in document images. We improve on the baseline U-Net by adding attention layers to the skip connections between the contracting and expansive paths. The model's ability to generalize is also validated on handwritten images. We perform exhaustive experiments on the ICPR2020 challenge dataset and obtain test accuracies of 96.09% and 99.44% for baseline detection and text block segmentation, respectively, on the simple track, and 97.47% and 98.51% for baseline detection and text block segmentation, respectively, on the complex track. The source code is publicly available at https://github.com/divyanshjoshi/Attention-U-Net-Newspaper-Text-Block-Segmentation.
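
The abstract says attention layers sit on the skip connections but does not give their form. The following is a minimal sketch, assuming the additive attention gate commonly used in Attention U-Net variants: encoder (skip) features and decoder (gating) features are each projected by a 1x1 convolution, summed, passed through a ReLU, reduced to one score per pixel, and squashed by a sigmoid into a coefficient that rescales the skip features. The function name, parameter names, and toy weights are hypothetical and are not taken from the authors' repository.

```python
import math

def attention_gate(skip, gate, w_x, w_g, psi):
    """Additive attention gate applied independently at each pixel.

    skip: list of pixels, each a list of C_x encoder (skip) features
    gate: list of pixels, each a list of C_g decoder (gating) features,
          assumed already resampled to the same spatial size as `skip`
    w_x:  C_int x C_x weights (a 1x1 convolution over skip channels)
    w_g:  C_int x C_g weights (a 1x1 convolution over gate channels)
    psi:  C_int weights reducing the joint features to one score
    Returns (gated skip features, attention coefficients).
    """
    def matvec(w, v):
        return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]

    gated, alphas = [], []
    for x, g in zip(skip, gate):
        # Sum the two projections and apply ReLU.
        joint = [max(0.0, a + b) for a, b in zip(matvec(w_x, x), matvec(w_g, g))]
        # Reduce to a scalar score, then squash with a sigmoid.
        score = sum(p * j for p, j in zip(psi, joint))
        alpha = 1.0 / (1.0 + math.exp(-score))
        alphas.append(alpha)
        # Rescale every skip channel at this pixel by the same coefficient.
        gated.append([alpha * xi for xi in x])
    return gated, alphas

# Toy example: two pixels, two channels, hypothetical weights.
skip = [[1.0, 2.0], [0.5, -1.0]]
gate = [[0.0, 0.0], [3.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
gated, alphas = attention_gate(skip, gate, identity, identity, psi=[1.0, -1.0])
```

In a multi-task setup like the one described, the gated features would feed a shared decoder with one output head per task (text block mask and baseline mask). That wiring is also an assumption here, since the abstract does not describe it.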
