Character string extraction from color documents

Cited by: 35
Authors
Hase, H
Shinokawa, T
Yoneda, M
Suen, CY
Affiliations
[1] Toyama Univ, Fac Engn, Dept Intellectual Info Sys Eng, Toyama 9308555, Japan
[2] Toyama Natl Coll Maritime Technol, Toyama, Japan
[3] Concordia Univ, Ctr Pattern Recognit & Machine Intelligence, Montreal, PQ H3G 1M8, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
color document; character string extraction; color segmentation; multi-stage relaxation; conflict resolution; likelihood of a character string;
DOI
10.1016/S0031-3203(00)00081-9
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A new algorithm for the extraction of character strings from color documents is proposed. We first divide a full color image into several representative binary color images. Then, character strings are nominated from each binary image by using multi-stage relaxation. However, the nominated strings are not always characters: they may be part of the background, concatenated holes of characters, dotted lines, etc. Moreover, when the nominated strings of all binary images are superimposed, some strings overlap each other. We therefore select the appropriate strings from them using the likelihood of a character string and two kinds of conflict resolution. In the experiments, we used color images such as magazine covers and posters. After applying color segmentation and the multi-stage relaxation, many character strings were nominated. Next, some adequate strings were selected. Finally, we show the experimental results and discuss some problems of extracting character strings from a color document. (C) 2001 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.
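The pipeline outlined in the abstract can be illustrated with a small sketch. This is not the authors' method: the abstract does not specify the color segmentation, the multi-stage relaxation, the likelihood measure, or the two kinds of conflict resolution. As stand-ins, the sketch assumes nearest-palette-color assignment for splitting the image into binary images, and a greedy rule that keeps the higher-likelihood candidate whenever two candidate bounding boxes overlap. The names `split_into_binary_images`, `Candidate`, and `resolve_conflicts`, and the likelihood values themselves, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def split_into_binary_images(image: List[List[Pixel]],
                             palette: List[Pixel]) -> List[List[List[int]]]:
    """Return one binary mask per palette color.

    Assumption: each pixel is assigned to its nearest palette color
    (squared RGB distance); the paper's segmentation may differ.
    """
    masks = [[[0] * len(row) for row in image] for _ in palette]
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            nearest = min(
                range(len(palette)),
                key=lambda i: sum((a - b) ** 2
                                  for a, b in zip(pixel, palette[i])),
            )
            masks[nearest][y][x] = 1
    return masks


@dataclass(frozen=True)
class Candidate:
    """A nominated string: bounding box, score, and source binary image."""
    box: Tuple[int, int, int, int]  # (x0, y0, x1, y1), upper bounds exclusive
    likelihood: float               # hypothetical score, higher is better
    layer: int                      # index of the binary image it came from


def overlaps(a: Candidate, b: Candidate) -> bool:
    """True if the two bounding boxes share any area."""
    ax0, ay0, ax1, ay1 = a.box
    bx0, by0, bx1, by1 = b.box
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1


def resolve_conflicts(candidates: List[Candidate]) -> List[Candidate]:
    """Greedy stand-in for conflict resolution: visit candidates in
    descending likelihood and drop any that overlap one already kept."""
    kept: List[Candidate] = []
    for cand in sorted(candidates, key=lambda c: c.likelihood, reverse=True):
        if not any(overlaps(cand, k) for k in kept):
            kept.append(cand)
    return kept
```

In this sketch, candidates from different binary images compete only through their bounding boxes and scores, which is enough to show why superimposing the per-color results requires a selection step.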
Pages: 1349-1365
Number of pages: 17
Related papers
50 in total
  • [1] Extraction of characters from color documents
    Kasuga, H.
    Okamoto, M.
    Yamamoto, H.
    Proceedings of SPIE - The International Society for Optical Engineering, 2000, 3967 : 278 - 285
  • [2] Extraction of characters from color documents
    Kasuga, H
    Okamoto, M
    Yamamoto, H
    DOCUMENT RECOGNITION AND RETRIEVAL VII, 2000, 3967 : 278 - 285
  • [3] Character pattern extraction from documents with complex backgrounds
    Goto H.
    Aso H.
    International Journal on Document Analysis and Recognition, 2002, 4 (04) : 258 - 268
  • [4] Character extraction from documents using wavelet maxima
    Hwang, WL
    Chang, F
    WAVELET APPLICATIONS IN SIGNAL AND IMAGE PROCESSING IV, PTS 1 AND 2, 1996, 2825 : 1003 - 1015
  • [5] Character extraction from documents using wavelet maxima
    Hwang, WL
    Chang, F
    IMAGE AND VISION COMPUTING, 1998, 16 (05) : 307 - 315
  • [6] Character pattern extraction from colorful documents with complex backgrounds
    Goto, H
    Aso, H
    16TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL III, PROCEEDINGS, 2002, : 180 - 183
  • [7] Character string extraction on unconstrained maps
    Xu, JZ
    AM/FM INTERNATIONAL CONFERENCE XIX, PROCEEDINGS - THRIVING IN AN AGE OF COMPETITION, 1996, : 29 - 35
  • [8] Character String Extraction from Scene Images by Eliminating Non-character Elements
    Takagi, Noboru
    Chen, Jianjun
    2014 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC), 2014, : 3685 - 3690
  • [9] Text string extraction from images of colour-printed documents
    Suen, HM
    Wang, JF
    IEE PROCEEDINGS-VISION IMAGE AND SIGNAL PROCESSING, 1996, 143 (04) : 210 - 216
  • [10] A Robust Technique for Character String Extraction from Complex Document Images
    Chen, Yen-Lin
    INTERNATIONAL SYMPOSIUM OF INFORMATION TECHNOLOGY 2008, VOLS 1-4, PROCEEDINGS: COGNITIVE INFORMATICS: BRIDGING NATURAL AND ARTIFICIAL KNOWLEDGE, 2008, : 1742 - 1750