A novel image text extraction method based on k-means clustering

Cited by: 20
Authors
Song, Yan [1]
Liu, Anan [1]
Pang, Lin [1]
Lin, Shouxun [1]
Zhang, Yongdong [1]
Tang, Sheng [1]
Affiliation
[1] Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100080, Peoples R China
DOI
10.1109/ICIS.2008.31
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Text in web pages, images and videos contains important clues for information indexing and retrieval. Most existing text extraction methods depend on the language type and text appearance. In this paper, a novel and universal method of image text extraction is proposed. A coarse-to-fine text location method is implemented: first, a multi-scale approach is adopted to locate text with different font sizes; second, projection profiles are used in the location refinement step. Color-based k-means clustering is adopted for text segmentation. Compared to the grayscale images used in most existing methods, color images are more suitable for clustering-based segmentation. The method treats corner points, edge points and other points equally, which solves the problem of handling multilingual text. Experimental results demonstrate that the best performance is obtained when k is 3. Comparative experimental results on a large number of images show that the method is accurate and robust in various conditions.
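The two steps named in the abstract, projection-profile refinement and color-based k-means segmentation with k = 3, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the color space, the distance measure, the initialization or the thresholds, so the sketch assumes RGB tuples, squared Euclidean distance, farthest-point initialization and a zero threshold. All function names (row_profile, col_profile, runs_above, init_centers, kmeans_colors) and the toy data are hypothetical.

```python
def row_profile(mask):
    """Horizontal projection profile: count of set pixels in each row."""
    return [sum(row) for row in mask]

def col_profile(mask):
    """Vertical projection profile: count of set pixels in each column."""
    return [sum(col) for col in zip(*mask)]

def runs_above(profile, threshold):
    """Return (start, end) index pairs, end exclusive, where profile > threshold."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs

def sq_dist(p, q):
    """Squared Euclidean distance between two color tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def init_centers(pixels, k):
    """Farthest-point initialization (an assumption, chosen to be deterministic)."""
    centers = [pixels[0]]
    while len(centers) < k:
        centers.append(max(pixels, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans_colors(pixels, k=3, iters=20):
    """Plain Lloyd's k-means on color tuples; returns (centers, labels)."""
    centers = init_centers(pixels, k)
    labels = [0] * len(pixels)
    for _ in range(iters):
        # Assignment step: each pixel goes to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in pixels]
        # Update step: each center moves to the mean color of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(ch) / len(members) for ch in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged: assignments can no longer change
        centers = new_centers
    return centers, labels

# Toy binary mask of candidate text pixels (1 = candidate).
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
text_rows = runs_above(row_profile(mask), 0)
text_cols = runs_above(col_profile(mask), 0)

# Toy pixels from a located text block: background, text and dark contour colors.
pixels = [
    (20, 40, 200), (25, 45, 205), (18, 38, 198),
    (250, 250, 250), (245, 248, 252),
    (10, 10, 10), (12, 8, 11),
]
centers, labels = kmeans_colors(pixels, k=3)
```

The profiles narrow a coarse text box to the rows and columns that actually contain candidate pixels, and the clustering then groups the pixels inside that box by color. How the text cluster is chosen among the three is not stated in the abstract, so the sketch stops at the label assignment.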
Pages: 185-190
Page count: 6