A COMPLETE SYSTEM FOR DETECTION AND RECOGNITION OF TEXT IN GRAPHICAL DOCUMENTS USING BACKGROUND INFORMATION

Cited by: 0
Authors
Roy, Partha Pratim [1]
Lladós, Josep [1]
Pal, Umapada [2]
Affiliations
[1] Univ Autonoma Barcelona, Comp Vis Ctr, E-08193 Barcelona, Spain
[2] Indian Stat Inst, Kolkata, India
Keywords
Graphics Recognition; Optical Character Recognition; Convex Hull; Skeleton Analysis
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Automatic retrieval of text and symbols in graphical documents (maps, engineering drawings) is challenging because the text lines are usually not parallel to each other. They are often multi-oriented and curved, following the graphical curves they annotate. Text and symbols also frequently touch or overlap graphical components (rivers, streets, border lines), which compounds the problem. OCR of such documents requires extracting the individual text lines and their constituent words and characters. In this paper, we propose a methodology for extracting individual text lines from such complex graphical documents and an approach for recognizing the extracted characters. The methodology is based on the foreground and background information of the text components; the background information is handled using the water reservoir concept and the convex hull. A Support Vector Machine (SVM) based classifier is applied to recognize multi-font, multi-scale and multi-oriented characters. Circular ring and convex hull features are used together with angular information of the character contour pixels to make the features rotation and scale invariant.
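The abstract does not give the feature definition, so the following is only a minimal sketch of why a circular-ring descriptor can be rotation and scale invariant. It assumes a character is given as a list of contour points, and the names `ring_histogram` and `transform` are hypothetical, not taken from the paper. The sketch counts contour points in concentric rings around the centroid, with radii normalized by the largest radius; it omits the convex hull and angular information that the authors combine with the rings, as well as the SVM classifier.

```python
import math


def ring_histogram(points, n_rings=4):
    """Fraction of contour points falling in each concentric ring.

    Rings are centred on the centroid and scaled by the largest
    point-to-centroid distance, so rotating, scaling or translating
    the shape should leave the histogram unchanged.
    """
    if not points:
        raise ValueError("need at least one contour point")
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    radii = [math.hypot(x - cx, y - cy) for x, y in points]
    r_max = max(radii)
    hist = [0] * n_rings
    for r in radii:
        if r_max == 0:
            idx = 0  # degenerate shape: every point sits on the centroid
        else:
            # Normalized radius in [0, 1]; the outermost point is
            # clamped into the last ring.
            idx = min(int(n_rings * r / r_max), n_rings - 1)
        hist[idx] += 1
    return [count / len(points) for count in hist]


def transform(points, angle_deg, scale, dx=0.0, dy=0.0):
    """Rotate about the origin, scale uniformly, then translate."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    return [
        (scale * (x * cos_a - y * sin_a) + dx,
         scale * (x * sin_a + y * cos_a) + dy)
        for x, y in points
    ]


# Illustrative "L"-shaped contour (made-up data, not from the paper).
l_shape = [(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (1, 0), (2, 0)]
print(ring_histogram(l_shape))
print(ring_histogram(transform(l_shape, 37, 2.5, dx=10, dy=-3)))
```

The two printed histograms should agree, because every radius is measured from the centroid and divided by the largest radius. Points lying very close to a ring boundary can switch rings under floating-point rounding, so a practical descriptor would likely need soft binning or a tolerance.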
Pages: 209 / +
Page count: 3