Automatic Parameter Tuning of K-Means Algorithm for Document Binarization

Cited by: 6
Authors
Gattal, Abdeljalil [1]
Abbas, Faycel [2]
Laouar, Mohamed Ridda [1]
Affiliations
[1] Larbi Tebessi Univ, Dept Math & Comp Sci, Tebessa, Algeria
[2] Bouira Univ, Fac Sci & Appl Sci, Comp Sci Dept, LIMPAF Lab, Bouira, Algeria
Keywords
Document Binarization; K-Means algorithm; Automatic parameter tuning; H-DIBCO 2016 dataset
DOI
10.1145/3330089.3330124
Chinese Library Classification number
TP31 [Computer Software]
Subject classification codes
081202; 0835
Abstract
Document binarization is a primary processing step in a document recognition system. It aims to separate the foreground from the document background. In this paper, we propose an algorithm for binarizing degraded document images using the K-Means clustering algorithm with automatic parameter tuning. The K-Means algorithm classifies the document image into three classes: background, foreground, and noise. Experimental results on the H-DIBCO 2016 dataset show that our method is more robust than state-of-the-art methods.
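The three-class idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which features are clustered, how centroids are initialised, how the noise class is finally assigned, or how the automatic parameter tuning works. The sketch assumes grayscale pixel intensities as the only feature, an evenly spaced initialisation, and a hypothetical `noise_is_background` switch that decides where the middle cluster goes; the function names `kmeans_1d` and `binarize` are likewise illustrative.

```python
from typing import List, Sequence


def kmeans_1d(values: Sequence[float], k: int = 3, iters: int = 50) -> List[float]:
    """Plain Lloyd's k-means on scalar intensities; returns sorted centroids."""
    lo, hi = min(values), max(values)
    # Assumption: deterministic start, centroids spread evenly over the range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        sums = [0.0] * k
        counts = [0] * k
        for v in values:
            # Assign the value to its nearest centroid.
            j = min(range(k), key=lambda c: abs(v - centroids[c]))
            sums[j] += v
            counts[j] += 1
        # Recompute centroids; an empty cluster keeps its previous centroid.
        new = [sums[j] / counts[j] if counts[j] else centroids[j] for j in range(k)]
        if new == centroids:
            break
        centroids = new
    return sorted(centroids)


def binarize(image: List[List[int]], noise_is_background: bool = True) -> List[List[int]]:
    """Return 1 for pixels treated as foreground (ink) and 0 otherwise."""
    pixels = [p for row in image for p in row]
    centroids = kmeans_1d(pixels, k=3)
    result = []
    for row in image:
        out_row = []
        for p in row:
            label = min(range(3), key=lambda c: abs(p - centroids[c]))
            # Assumption: darkest cluster (0) is ink, middle cluster (1) is
            # noise, brightest cluster (2) is background.
            is_fg = label == 0 or (label == 1 and not noise_is_background)
            out_row.append(1 if is_fg else 0)
        result.append(out_row)
    return result


if __name__ == "__main__":
    # Tiny synthetic "page": bright paper, dark ink, two mid-gray stains.
    page = [[250, 245, 20], [130, 248, 15], [252, 125, 25]]
    for line in binarize(page):
        print(line)
```

How the middle cluster is handled is the choice that a tuning step would have to make; here it is exposed as a single flag so the effect on the output mask is easy to compare.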
Pages: 4