Unsupervised Text Binarization in Handwritten Historical Documents Using k-Means Clustering

Cited by: 0
Authors
Kusetogullari, Huseyin [1]
Affiliations
[1] Blekinge Inst Technol, Dept Comp Sci & Engn, S-37141 Karlskrona, Sweden
Keywords
Handwritten text binarization; Image processing; k-means clustering; Document images; IMAGE BINARIZATION; ENHANCEMENT; ALGORITHM
DOI
10.1007/978-3-319-56991-8_3
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose a novel technique for unsupervised text binarization in handwritten historical documents using k-means clustering. Text binarization faces many challenges, such as noise, faint characters and bleed-through, and these must be overcome to increase the correct detection rate. To address these problems, a preprocessing strategy is first used to enhance the contrast and thereby improve faint characters, and a Gaussian Mixture Model (GMM) is used to ignore the noise and other artifacts in the handwritten historical documents. The enhanced image is then normalized for use in the postprocessing part of the proposed method. The binarized handwritten image is obtained by partitioning the normalized pixel values of the handwritten image into two clusters using k-means clustering with k = 2, and then assigning each normalized pixel to one of the two clusters according to the minimum Euclidean distance between the normalized pixel intensity and the mean normalized pixel value of each cluster. Experimental results verify the effectiveness of the proposed approach.
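The final clustering step described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the contrast enhancement and GMM stages are omitted, and the min-max normalization, the centroid initialization at 0 and 1, the stopping tolerance, and the function name `kmeans_binarize` are all assumptions made for illustration. The sketch also assumes that the darker cluster corresponds to ink.

```python
import numpy as np


def kmeans_binarize(image, max_iter=100, tol=1e-6):
    """Hypothetical helper: 1-D k-means (k = 2) on normalized intensities.

    Returns a boolean mask that is True for pixels in the darker cluster,
    which is assumed here to be the handwritten text.
    """
    pixels = np.asarray(image, dtype=np.float64)
    lo, hi = pixels.min(), pixels.max()
    if hi == lo:
        # A flat image has nothing to separate.
        return np.zeros(pixels.shape, dtype=bool)

    # Assumption: min-max normalization to [0, 1].
    norm = (pixels - lo) / (hi - lo)

    # Assumption: start the two centroids at the intensity extremes.
    centers = np.array([0.0, 1.0])
    for _ in range(max_iter):
        # In one dimension the Euclidean distance is the absolute difference.
        labels = np.abs(norm[..., None] - centers).argmin(axis=-1)
        new_centers = centers.copy()
        for k in range(2):
            members = norm[labels == k]
            if members.size:
                new_centers[k] = members.mean()
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break

    # Final assignment of each pixel to its nearest cluster mean.
    labels = np.abs(norm[..., None] - centers).argmin(axis=-1)
    return labels == centers.argmin()
```

Calling `kmeans_binarize(gray)` on a 2-D grayscale array returns a mask of the same shape. If a document has light ink on a dark background, the mask would need to be inverted.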
Pages: 23-32
Number of pages: 10