Degraded Document Image Binarization using Novel Background Estimation Technique

Cited by: 0
Authors
Jindal, Harshit [1]
Kumar, Manoj [1]
Tomar, Akhil [1]
Malik, Ayush [1]
Affiliations
[1] Delhi Technol Univ, Dept Comp Sci Engn, New Delhi, India
Keywords
Document Image Processing; Degraded Document Image Binarization; Thresholding; Background Estimation; Noise Removal; Otsu Thresholding; Bilateral Filtering
DOI
10.1109/I2CT51068.2021.9418084
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Over the past few decades, the use of scanned historical document images has increased dramatically, especially with the emergence of online libraries and standard benchmark datasets such as DIBCO. Historical documents are usually in very poor condition and contain degradations such as large ink stains, bleed-through, liquid spills, uneven background, spots, faded ink, and weak or thin text, all of which make binarization very difficult. In this paper, we propose an effective degraded document image binarization algorithm that performs accurate text segmentation. Our method first estimates the background using information from neighboring pixels and filter smoothing. The next step is background subtraction, which compensates for background distortions. The document is then segmented using Otsu thresholding, and the result is post-processed with labelled connected components to remove the remaining noise and maximize text content. Our method outperforms several existing and widely used binarization algorithms on F-measure, PSNR, DRD, and pseudo F-measure when evaluated on the H-DIBCO 2016 and H-DIBCO 2018 datasets, and it can very effectively detect faint characters in a document image.
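The pipeline outlined in the abstract (background estimation, background subtraction, Otsu thresholding, connected-component cleanup) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract does not specify the background estimator, so a local-maximum filter is swapped in as a stand-in, and the function names, the `radius` and `min_size` parameters, and the assumption of dark text on a lighter background are all hypothetical choices made for illustration.

```python
from collections import deque

def estimate_background(img, radius=2):
    """Stand-in estimator (assumption, not the paper's method): local maximum
    over a (2*radius+1)^2 window, assuming dark text on a lighter background."""
    h, w = len(img), len(img[0])
    bg = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            bg[y][x] = max(img[j][i] for j in ys for i in xs)
    return bg

def subtract_background(img, bg):
    """Difference image: large where a pixel is darker than its background."""
    return [[max(0, b - p) for p, b in zip(ri, rb)] for ri, rb in zip(img, bg)]

def otsu_threshold(img, levels=256):
    """Otsu's method: choose the threshold maximizing between-class variance."""
    hist = [0] * levels
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0
    sum0 = 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no pixels at or below t yet
        w1 = total - w0
        if w1 == 0:
            break  # all pixels are at or below t
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def remove_small_components(mask, min_size=2):
    """Drop 8-connected foreground components with fewer than min_size pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            comp, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:  # breadth-first flood fill labels one component
                cy, cx = queue.popleft()
                comp.append((cy, cx))
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(comp) >= min_size:
                for cy, cx in comp:
                    out[cy][cx] = 1
    return out

def binarize(img, radius=2, min_size=2):
    """Return a 0/1 mask in which 1 marks pixels classified as text."""
    diff = subtract_background(img, estimate_background(img, radius))
    t = otsu_threshold(diff)
    mask = [[1 if v > t else 0 for v in row] for row in diff]
    return remove_small_components(mask, min_size)
```

Here `img` is a list of rows of integer gray levels in 0..255. Thresholding the difference image rather than the raw image is what lets a single global Otsu threshold cope with an uneven background; a practical version would use an image-processing library and a smoother background model.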
Pages: 8
Related papers
50 in total
  • [1] Improved Degraded Document Image Binarization Using Median Filter for Background Estimation
    Khitas, Mehdi
    Ziet, Lahcene
    Bouguezel, Saad
    ELEKTRONIKA IR ELEKTROTECHNIKA, 2018, 24 (03) : 82 - 87
  • [2] Robust Document Image Binarization Technique for Degraded Document Images
    Su, Bolan
    Lu, Shijian
    Tan, Chew Lim
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2013, 22 (04) : 1408 - 1417
  • [3] Document image binarization using background estimation and stroke edges
    Shijian Lu
    Bolan Su
    Chew Lim Tan
    International Journal on Document Analysis and Recognition (IJDAR), 2010, 13 : 303 - 314
  • [4] Document image binarization using background estimation and stroke edges
    Lu, Shijian
    Su, Bolan
    Tan, Chew Lim
    INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 2010, 13 (04) : 303 - 314
  • [5] Historical document image binarization using background estimation and energy minimization
    Xiong, Wei
    Jia, Xiuhong
    Xu, Jingjing
    Xiong, Zijie
    Liu, Min
    Wang, Juan
    2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018, : 3716 - 3721
  • [6] Adaptive degraded document image binarization
    Gatos, B
    Pratikakis, I
    Perantonis, SJ
    PATTERN RECOGNITION, 2006, 39 (03) : 317 - 327
  • [7] Degraded document image binarization using structural symmetry of strokes
    Jia, Fuxi
    Shi, Cunzhao
    He, Kun
    Wang, Chunheng
    Xiao, Baihua
    PATTERN RECOGNITION, 2018, 74 : 225 - 240
  • [8] Gabor Filters for Degraded Document Image Binarization
    Sehad, Abdenour
    Chibani, Youcef
    Cheriet, Mohamed
    2014 14TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR), 2014, : 702 - 707
  • [9] Hybrid Binarization Technique for Degraded Document Images
    Ranganatha, D.
    Holi, Ganga
    2015 IEEE INTERNATIONAL ADVANCE COMPUTING CONFERENCE (IACC), 2015, : 893 - 898
  • [10] A Multistage Binarization Technique for the Degraded Document Images
    Mousa, Usama W. A.
    Abd El Munim, Hossam E.
    Khalil, Mahmoud I.
    PROCEEDINGS OF 2018 13TH INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND SYSTEMS (ICCES), 2018, : 332 - 337