Character string extraction from color documents

Cited by: 35
Authors
Hase, H
Shinokawa, T
Yoneda, M
Suen, CY
Affiliations
[1] Toyama Univ, Fac Engn, Dept Intellectual Info Sys Eng, Toyama 9308555, Japan
[2] Toyama Natl Coll Maritime Technol, Toyama, Japan
[3] Concordia Univ, Ctr Pattern Recognit & Machine Intelligence, Montreal, PQ H3G 1M8, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
color document; character string extraction; color segmentation; multi-stage relaxation; conflict resolution; likelihood of a character string;
DOI
10.1016/S0031-3203(00)00081-9
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
A new algorithm for extracting character strings from color documents is proposed. We first divide a full-color image into several representative binary color images. Character strings are then nominated from each binary image using multi-stage relaxation. However, the nominated strings are not always characters: they may be part of the background, concatenated holes of characters, dotted lines, etc. Consequently, when the nominated strings of all binary images are superimposed, some strings overlap each other. We therefore select the appropriate strings among them using the likelihood of a character string and two kinds of conflict resolution. In the experiments, we used color images such as magazine covers and posters. After color segmentation and multi-stage relaxation were applied, many character strings were nominated, and adequate strings were then selected. Finally, we show the experimental results and discuss some problems of extracting character strings from a color document. (C) 2001 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.
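The first step in the abstract, dividing a full-color image into several representative binary color images, can be illustrated with a minimal sketch. The abstract does not say how the representative colors are chosen, so the coarse quantization below is an assumption standing in for the paper's color segmentation. The function names `representative_colors` and `binary_layers` and the parameters `k` and `step` are hypothetical.

```python
from collections import Counter

def representative_colors(image, k=2, step=64):
    """Pick the k most frequent coarsely quantized colors.

    Assumption: coarse quantization stands in for the paper's
    color segmentation, which the abstract does not specify.
    """
    counts = Counter(
        tuple((c // step) * step + step // 2 for c in pixel)
        for row in image
        for pixel in row
    )
    return [color for color, _ in counts.most_common(k)]

def binary_layers(image, palette):
    """Return one binary image per palette color.

    A pixel is 1 in the layer whose palette color is nearest to it
    (squared Euclidean distance in RGB) and 0 in every other layer.
    """
    def nearest(pixel):
        return min(
            range(len(palette)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(pixel, palette[i])),
        )

    labels = [[nearest(p) for p in row] for row in image]
    return [
        [[1 if lab == i else 0 for lab in row] for row in labels]
        for i in range(len(palette))
    ]

# Tiny 2x3 example: a red vertical stroke on a white background.
RED, WHITE = (250, 10, 10), (255, 255, 255)
image = [
    [WHITE, RED, WHITE],
    [WHITE, RED, WHITE],
]
palette = representative_colors(image, k=2)
layers = binary_layers(image, palette)
```

Each resulting layer is a binary image that a later stage could search for candidate strings. The relaxation and conflict-resolution stages are not sketched here because the abstract gives no detail on them.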
Pages: 1349-1365
Page count: 17