A New Approach to Extract Text from Images based on DWT and K-means Clustering

Cited by: 3
Authors
Ghai, Deepika [1]
Gera, Divya [1]
Jain, Neelu [1]
Affiliation
[1] PEC Univ Technol, ECE Dept, Sect 12, Chandigarh 160012, UT, India
Keywords
Text extraction; Texture features; DWT; K-means clustering; sliding window; voting decision; VIDEO; LOCALIZATION
DOI
10.1080/18756891.2016.1237189
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Text present in images provides important information for automatic annotation, indexing and retrieval, so its extraction is a well-known research area in computer vision. However, variations in orientation, alignment, font and size, together with low image contrast and complex backgrounds, make text extraction extremely challenging. In this paper, we propose a texture-based text extraction method that combines the discrete wavelet transform (DWT) with K-means clustering. First, edges are detected in the image using the DWT. Then, a small overlapping sliding window scans the high-frequency sub-bands, from which texture features of text and non-text regions are extracted. Based on these features, K-means clustering classifies the image into text, simple background and complex background clusters. Finally, a voting decision process and area-based filtering are used to locate text regions exactly. Experiments are carried out on the public ICDAR 2013 dataset and on our own dataset of English, Hindi and Punjabi text images, for different numbers of clusters. The proposed method gives promising results across these languages in terms of detection rate (DR), precision rate (PR) and recall rate (RR).
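The first three stages described in the abstract (DWT, overlapping sliding window over the high-frequency sub-bands, K-means into three clusters) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. Several choices are assumptions made only for illustration: a one-level Haar wavelet, the mean and standard deviation of absolute coefficients as texture features, a 4x4 window with step 2, evenly spaced deterministic initial centers, and the rule that the cluster with the largest center norm is treated as text. The helper names `haar_dwt2`, `window_features` and `kmeans` are hypothetical, and the voting decision and area-based filtering stages are omitted.

```python
import math

def haar_dwt2(img):
    """One-level 2-D Haar DWT of an even-sized grayscale image (list of rows)."""
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(img), 2):
        rows = ([], [], [], [])
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            rows[0].append((a + b + d + e) / 2.0)  # approximation (low frequency)
            rows[1].append((a + b - d - e) / 2.0)  # top/bottom difference
            rows[2].append((a - b + d - e) / 2.0)  # left/right difference
            rows[3].append((a - b - d + e) / 2.0)  # diagonal difference
        for band, row in zip((ll, lh, hl, hh), rows):
            band.append(row)
    return ll, lh, hl, hh

def window_features(bands, size=4, step=2):
    """Mean and standard deviation of |coefficients| per overlapping window."""
    feats, positions = [], []
    n_rows, n_cols = len(bands[0]), len(bands[0][0])
    for r in range(0, n_rows - size + 1, step):
        for c in range(0, n_cols - size + 1, step):
            vec = []
            for band in bands:
                vals = [abs(band[i][j]) for i in range(r, r + size)
                        for j in range(c, c + size)]
                mean = sum(vals) / len(vals)
                var = sum((v - mean) ** 2 for v in vals) / len(vals)
                vec += [mean, math.sqrt(var)]
            feats.append(vec)
            positions.append((r, c))
    return feats, positions

def kmeans(points, k=3, iters=20):
    """Lloyd's k-means (k >= 2) with evenly spaced initial centers (an assumption)."""
    lo = [min(col) for col in zip(*points)]
    hi = [max(col) for col in zip(*points)]
    centers = [[l + (h - l) * j / (k - 1) for l, h in zip(lo, hi)]
               for j in range(k)]
    labels = []
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [min(range(k), key=lambda j: sum((x - y) ** 2
                                                  for x, y in zip(p, centers[j])))
                  for p in points]
        # Update step: move each non-empty cluster's center to its mean.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Synthetic 24x24 image: flat background with a vertically striped 8x8 patch.
img = [[200 if 8 <= r < 16 and 8 <= c < 16 and c % 2 == 0 else 10
        for c in range(24)] for r in range(24)]
ll, lh, hl, hh = haar_dwt2(img)
feats, positions = window_features((lh, hl, hh))
labels, centers = kmeans(feats, k=3)
# Assumption: the cluster whose center has the largest norm is treated as text.
text_cluster = max(range(3), key=lambda j: sum(v * v for v in centers[j]))
text_windows = [pos for pos, lab in zip(positions, labels) if lab == text_cluster]
print(text_windows)  # window origins in sub-band coordinates
```

On this toy image the three clusters separate flat windows, windows that only partly overlap the striped patch, and the window fully inside it, which loosely mirrors the simple background, complex background and text grouping named in the abstract. Real images would need the later voting and filtering stages to turn window labels into text regions.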
Pages: 900-916
Page count: 17
Source
International Journal of Computational Intelligence Systems, 2016, 9: 900-916