A novel image text extraction method based on k-means clustering

Cited by: 20
Authors
Song, Yan [1]
Liu, Anan [1]
Pang, Lin [1]
Lin, Shouxun [1]
Zhang, Yongdong [1]
Tang, Sheng [1]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100080, Peoples R China
DOI
10.1109/ICIS.2008.31
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Text in web pages, images, and videos contains important clues for information indexing and retrieval. Most existing text extraction methods depend on the language type and the appearance of the text. In this paper, a novel and universal method of image text extraction is proposed. A coarse-to-fine text location method is implemented: first, a multi-scale approach is adopted to locate text with different font sizes; second, projection profiles are used in the location refinement step. Color-based k-means clustering is adopted for text segmentation. Compared with the grayscale images used in most existing methods, color images are more suitable for clustering-based segmentation. Because the clustering treats corner points, edge points, and all other points equally, it handles multilingual text. Experimental results show that the best performance is obtained when k is 3. Comparative experiments on a large number of images show that the method is accurate and robust under various conditions.
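The color-based clustering step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `segment_by_color`, the farthest-point initialization, the fixed iteration count, and the synthetic test image are all assumptions made for illustration. Only the use of k-means on color values and the choice k = 3 come from the abstract.

```python
import numpy as np

def segment_by_color(image, k=3, iters=10):
    """Cluster the RGB pixels of an H x W x 3 image into k color groups.

    Hypothetical helper: plain Lloyd iterations with a deterministic
    farthest-point initialization (an assumption, not from the paper).
    Returns an H x W label map and the k cluster centers.
    """
    pixels = image.reshape(-1, 3).astype(float)

    # Farthest-point initialization: start from the first pixel, then
    # repeatedly add the pixel farthest from all centers chosen so far.
    centers = [pixels[0]]
    for _ in range(k - 1):
        d = np.min([((pixels - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(pixels[d.argmax()])
    centers = np.array(centers)

    for _ in range(iters):
        # Squared Euclidean distance from every pixel to every center (N x k).
        dists = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean color of its members.
        for j in range(k):
            members = pixels[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)

    return labels.reshape(image.shape[:2]), centers

# Synthetic stand-in for a located text block: gray background,
# a red "stroke" region, and a black "shadow" region.
demo = np.full((20, 40, 3), 200, dtype=np.uint8)
demo[5:15, 5:20] = (255, 0, 0)
demo[5:15, 22:35] = (0, 0, 0)

label_map, centers = segment_by_color(demo, k=3)
```

With k = 3 the three color regions of the synthetic image end up in three separate clusters. Deciding which cluster corresponds to the text is not described in the abstract, so the sketch leaves that choice to the caller.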
Pages: 185-190
Page count: 6