DP-LinkNet: A convolutional network for historical document image binarization

Cited by: 42
Authors
Xiong, Wei [1, 2]
Jia, Xiuhong [1]
Yang, Dichun [1]
Ai, Meihui [1]
Li, Lirong [1]
Wang, Song [2]
Affiliations
[1] Hubei Univ Technol, Sch Elect & Elect Engn, Wuhan 430068, Hubei, Peoples R China
[2] Univ South Carolina, Dept Comp Sci & Engn, Columbia, SC 29201 USA
Funding
National Natural Science Foundation of China;
Keywords
Degraded document image binarization; semantic segmentation; DP-LinkNet; encoder-decoder architecture; hybrid dilated convolution (HDC); spatial pyramid pooling (SPP); COMPETITION;
DOI
10.3837/tiis.2021.05.011
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Document image binarization is an important pre-processing step in document analysis and archiving. The state-of-the-art models for document image binarization are variants of encoder-decoder architectures, such as FCN (fully convolutional network) and U-Net. Despite their success, they still suffer from three limitations: (1) reduced feature map resolution due to consecutive strided pooling or convolutions, (2) multiple scales of target objects, and (3) reduced localization accuracy due to the built-in invariance of deep convolutional neural networks (DCNNs). To overcome these three challenges, we propose an improved semantic segmentation model, referred to as DP-LinkNet, which adopts the D-LinkNet architecture as its backbone, with the proposed hybrid dilated convolution (HDC) and spatial pyramid pooling (SPP) modules between the encoder and the decoder. Extensive experiments are conducted on recent document image binarization competition (DIBCO) and handwritten document image binarization competition (H-DIBCO) benchmark datasets. Results show that our proposed DP-LinkNet outperforms other state-of-the-art techniques by a large margin. Our implementation and the pre-trained models are available at https://github.com/beargolden/DP-LinkNet.
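The first limitation named in the abstract, and the reason a dilated-convolution module can help, can be checked with simple arithmetic. The sketch below is illustrative only and is not taken from the authors' implementation: the 256-pixel input, the five stride-2 stages, the 3x3 kernels, and the dilation rates 1, 2, 4, 8 are all assumptions chosen for the example. It uses the standard convolution output-size formula and the receptive-field sum for stacked stride-1 convolutions.

```python
def conv_out_size(n, kernel=3, stride=1, padding=0, dilation=1):
    """Spatial output size of a convolution along one dimension."""
    return (n + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def stacked_receptive_field(dilations, kernel=3):
    """Receptive field of stride-1 convolutions stacked in series."""
    rf = 1
    for d in dilations:
        rf += (kernel - 1) * d
    return rf


# Assumed encoder: five stride-2 stages halve the resolution each time.
strided_size = 256
for _ in range(5):
    strided_size = conv_out_size(strided_size, kernel=3, stride=2, padding=1)

# Assumed dilated stack: padding equal to the dilation rate keeps the size.
rates = [1, 2, 4, 8]  # hypothetical rates, not confirmed by the abstract
dilated_size = 256
for d in rates:
    dilated_size = conv_out_size(dilated_size, padding=d, dilation=d)

print(strided_size, dilated_size, stacked_receptive_field(rates))
```

Under these assumptions the strided path shrinks a 256-pixel side to 8 pixels, while the dilated stack keeps all 256 pixels and still covers a 31-pixel receptive field, which is the trade-off that motivates placing dilated convolutions between the encoder and the decoder.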
Pages: 1778-1797
Page count: 20
Related papers
50 in total
  • [21] Binarization of Color Historical Document Images Using Local Image Equalization and XDoG
    Roe, Edward
    Mello, Carlos A. B.
    2013 12TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), 2013, : 205 - 209
  • [22] MSIO: MultiSpectral Document Image BinarizatIOn
    Diem, Markus
    Hollaus, Fabian
    Sablatnig, Robert
    PROCEEDINGS OF 12TH IAPR WORKSHOP ON DOCUMENT ANALYSIS SYSTEMS, (DAS 2016), 2016, : 84 - 89
  • [23] A MULTISCALE OPERATOR FOR DOCUMENT IMAGE BINARIZATION
    Dorini, Leyza Baldo
    Leite, Neucimar Jeronimo
    VISAPP 2009: PROCEEDINGS OF THE FOURTH INTERNATIONAL CONFERENCE ON COMPUTER VISION THEORY AND APPLICATIONS, VOL 1, 2009, : 34 - 39
  • [24] A Hybrid Approach for Document Image Binarization
    Sakila, A.
    Vijayarani, S.
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INVENTIVE COMPUTING AND INFORMATICS (ICICI 2017), 2017, : 645 - 650
  • [25] Adaptive degraded document image binarization
    Gatos, B
    Pratikakis, I
    Perantonis, SJ
    PATTERN RECOGNITION, 2006, 39 (03) : 317 - 327
  • [26] Augment Document Image Binarization by Learning
    Zhu, Yuanping
    19TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1-6, 2008, : 1905 - 1908
  • [27] Combination of Document Image Binarization Techniques
    Su, Bolan
    Lu, Shijian
    Tan, Chew Lim
    11TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR 2011), 2011, : 22 - 26
  • [28] A Survey on Document Image Binarization Techniques
    Lokhande, Supriya Sunil
    Dawande, N. A.
    1ST INTERNATIONAL CONFERENCE ON COMPUTING COMMUNICATION CONTROL AND AUTOMATION ICCUBEA 2015, 2015, : 742 - 746
  • [29] Document Image Binarization Based on NFCM
    Tong Li-Jing
    Chen Kan
    Zhang Yan
    Fu Xiao-Ling
    Duan Jian-Yong
    PROCEEDINGS OF THE 2009 2ND INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, VOLS 1-9, 2009, : 1769 - 1773
  • [30] Improved binarization algorithm for document image
    Chen, Dan
    Zhang, Feng
    He, Guiming
    Jisuanji Gongcheng/Computer Engineering, 2003, 29 (13):