Unsupervised Text Binarization in Handwritten Historical Documents Using k-Means Clustering

Cited by: 0
Author
Kusetogullari, Huseyin [1]
Affiliation
[1] Blekinge Inst Technol, Dept Comp Sci & Engn, S-37141 Karlskrona, Sweden
Source
PROCEEDINGS OF SAI INTELLIGENT SYSTEMS CONFERENCE (INTELLISYS) 2016, VOL 2 | 2018 / Vol. 16
Keywords
Handwritten text binarization; Image processing; k-means clustering; Document images; IMAGE BINARIZATION; ENHANCEMENT; ALGORITHM
DOI
10.1007/978-3-319-56991-8_3
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose a novel technique for unsupervised text binarization in handwritten historical documents using k-means clustering. Text binarization faces several challenges, such as noise, faint characters, and bleed-through, which must be overcome to increase the correct detection rate. To address these problems, a preprocessing strategy is first used to enhance the contrast and improve faint characters, and a Gaussian Mixture Model (GMM) is used to ignore noise and other artifacts in the handwritten historical documents. The enhanced image is then normalized for use in the postprocessing part of the proposed method. The binarized handwritten image is obtained by partitioning the normalized pixel values into two clusters using k-means clustering with k = 2, assigning each normalized pixel to the cluster whose mean normalized pixel value is closest to that pixel's normalized intensity in Euclidean distance. Experimental results verify the effectiveness of the proposed approach.
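The final clustering step described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: it omits the contrast enhancement and GMM stages entirely, and the min-max normalization, the centroid initialization at the intensity extremes, and the function names `normalize` and `kmeans_binarize` are all assumptions made for illustration. It only shows how one-dimensional k-means with k = 2 might split normalized grayscale intensities into a dark and a bright cluster.

```python
def normalize(pixels):
    """Min-max scale grayscale intensities to [0, 1] (assumed normalization)."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return [0.0 for _ in pixels]
    return [(p - lo) / (hi - lo) for p in pixels]


def kmeans_binarize(pixels, max_iter=100):
    """Cluster normalized intensities with 1-D k-means, k = 2.

    Returns one label per pixel: 1 for the darker cluster (assumed to be
    ink) and 0 for the brighter cluster (assumed to be background).
    """
    values = normalize(pixels)
    # Hypothetical initialization: place the centroids at the two extremes.
    c_dark, c_bright = min(values), max(values)
    for _ in range(max_iter):
        # Assignment step: in 1-D the Euclidean distance is |v - centroid|.
        dark = [v for v in values if abs(v - c_dark) <= abs(v - c_bright)]
        bright = [v for v in values if abs(v - c_dark) > abs(v - c_bright)]
        # Update step: each centroid becomes the mean of its cluster.
        new_dark = sum(dark) / len(dark) if dark else c_dark
        new_bright = sum(bright) / len(bright) if bright else c_bright
        if new_dark == c_dark and new_bright == c_bright:
            break  # centroids stopped moving
        c_dark, c_bright = new_dark, new_bright
    return [1 if abs(v - c_dark) <= abs(v - c_bright) else 0 for v in values]


if __name__ == "__main__":
    # A made-up row of 8-bit grayscale values: low = dark ink, high = paper.
    row = [12, 20, 200, 215, 30, 190, 25, 210]
    print(kmeans_binarize(row))
```

A real document image would be flattened to a list of intensities first and the labels reshaped back to the image dimensions afterwards. Treating the darker cluster as text is itself an assumption that holds for dark ink on light paper.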
Pages: 23-32
Number of pages: 10
Related papers
50 items in total
  • [41] Text Grouping in Patent Analysis using Adaptive K-Means Clustering Algorithm
    Shanie, Tiara
    Suprijadi, Jadi
    Zulhanif
    STATISTICS AND ITS APPLICATIONS, 2017, 1827
  • [42] Soil data clustering by using K-means and fuzzy K-means algorithm
    Hot, Elma
    Popovic-Bugarin, Vesna
    2015 23RD TELECOMMUNICATIONS FORUM TELFOR (TELFOR), 2015, : 890 - 893
  • [43] Unsupervised Patterned Fabric Defect Detection using Texture Filtering and K-Means clustering
    Hamdi, Azhar A.
    Sayed, Mohammed S.
    Fouad, Mohamed M.
    Hadhoud, Mohiy M.
    PROCEEDINGS OF 2018 INTERNATIONAL CONFERENCE ON INNOVATIVE TRENDS IN COMPUTER ENGINEERING (ITCE' 2018), 2018, : 130 - 135
  • [44] Fully Unsupervised Clustering in Nonlinearly Separable Data Using Intelligent Kernel K-Means
    Handhayani, Teny
    Wasito, Ito
    2014 INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER SCIENCE AND INFORMATION SYSTEMS (ICACSIS), 2014, : 450 - 453
  • [45] Enhancement of Advanced Metering Infrastructure Performance Using Unsupervised K-Means Clustering Algorithm
    Molokomme, Daisy Nkele
    Chabalala, Chabalala S.
    Bokoro, Pitshou N.
    ENERGIES, 2021, 14 (09)
  • [46] Unsupervised segmentation of large scale spatial images using K-means clustering approach
    Luo, JC
    Ye, ZM
    Bhattacharya, P
    Proceedings of the Eighth IASTED International Conference on Intelligent Systems and Control, 2005, : 410 - 415
  • [47] Text Line Extraction in Handwritten Historical Documents
    Capobianco, Samuele
    Marinai, Simone
    DIGITAL LIBRARIES AND ARCHIVES, IRCDL 2017, 2017, 733 : 68 - 79
  • [48] On the performance of feature weighting K-means for text subspace clustering
    Jing, LP
    Ng, MK
    Xu, J
    Huang, JZX
    ADVANCES IN WEB-AGE INFORMATION MANAGEMENT, PROCEEDINGS, 2005, 3739 : 502 - 512
  • [49] An Application of K-Means Clustering for Improving Video Text Detection
    Aradhya, V. N. Manjunath
    Pavithra, M. S.
    INTELLIGENT INFORMATICS, 2013, 182 : 41 - +
  • [50] Design and application of a text clustering algorithm based on parallelized k-means clustering
    Wang H.
    Zhou C.
    Li L.
    Revue d'Intelligence Artificielle, 2019, 33 (06) : 453 - 460