Statistical modeling for the detection, localization and extraction of text from heterogeneous textual images using combined feature scheme

Cited by: 0
Authors
Chitrakala Gopalan, D. Manjula
Affiliations
[1] Anna University, Department of Computer Science and Engineering, Easwari Engineering College
[2] Anna University, Department of Computer Science and Engineering, College of Engineering
Source
Signal Image and Video Processing, 2011, 5 (02)
Keywords
Text extraction; Nonsubsampled contourlet transform; Gray level run length matrix; Caption text; Scene text; Document image
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Discriminating between the text and non-text regions of an image is a complex and challenging task. In contrast to caption text, scene text can have any orientation and may be distorted by perspective projection. Moreover, it is often affected by variations in scene and camera parameters such as illumination and focus. These variations make it extremely difficult to design a unified text extraction method for different kinds of images. This paper proposes a unified statistical approach for extracting text from hybrid textual images (images containing both scene text and caption text) and from document images with variations in text, using features carefully selected by a multi-level feature priority (MLFP) algorithm. In combination, the selected features are found to be a good choice of feature vector for discriminating between text and non-text regions in scene text, caption text and document images, and the proposed system is robust to illumination, transformation/perspective projection, font size and radially changing/angular text. The MLFP feature selection algorithm is evaluated with three common machine learning algorithms: a decision tree inducer (C4.5), a naive Bayes classifier and an instance-based k-nearest neighbour learner. Its effectiveness is shown by comparison with three feature selection methods on a benchmark dataset. The proposed text extraction system is compared with the edge-based method, the connected component method and the texture-based method and shows encouraging results. Its major applications include preprocessing for optical character recognition, multimedia processing, mobile robot navigation, vehicle license detection and recognition, page segmentation and text-based image indexing.
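The keywords name the gray level run length matrix (GLRLM) as one of the texture descriptors, but the abstract does not say how it is computed or which run-length statistics are used. The following is only a minimal illustrative sketch of a generic horizontal GLRLM with two standard statistics (short run emphasis and long run emphasis). The function names, the toy image and the choice of direction are assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter


def glrlm_horizontal(image):
    """Count horizontal runs in a 2-D list of gray levels.

    Returns a Counter mapping (gray_level, run_length) -> number of runs.
    Hypothetical helper for illustration; only the 0-degree direction is used.
    """
    runs = Counter()
    for row in image:
        if not row:
            continue
        current, length = row[0], 1
        for pixel in row[1:]:
            if pixel == current:
                length += 1  # the current run continues
            else:
                runs[(current, length)] += 1  # close the finished run
                current, length = pixel, 1
        runs[(current, length)] += 1  # close the last run of the row
    return runs


def short_run_emphasis(runs):
    """Mean of 1 / length^2 over all runs; large for fine textures."""
    total = sum(runs.values())
    if total == 0:
        return 0.0
    return sum(count / (length * length)
               for (_, length), count in runs.items()) / total


def long_run_emphasis(runs):
    """Mean of length^2 over all runs; large for coarse, uniform regions."""
    total = sum(runs.values())
    if total == 0:
        return 0.0
    return sum(count * length * length
               for (_, length), count in runs.items()) / total


if __name__ == "__main__":
    # Toy 3x4 image with three gray levels (assumed data, not from the paper).
    toy_image = [
        [0, 0, 1, 1],
        [0, 1, 1, 1],
        [2, 2, 2, 2],
    ]
    toy_runs = glrlm_horizontal(toy_image)
    print(dict(toy_runs))
    print(short_run_emphasis(toy_runs), long_run_emphasis(toy_runs))
```

In a text/non-text classifier, statistics such as these would typically be computed per image block and placed in the feature vector alongside other descriptors. Which statistics the MLFP algorithm actually retains is not stated in this record.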
Pages: 165 - 183
Page count: 18
Related papers
  • [1] Statistical modeling for the detection, localization and extraction of text from heterogeneous textual images using combined feature scheme
    Gopalan, Chitrakala
    Manjula, D.
    SIGNAL IMAGE AND VIDEO PROCESSING, 2011, 5 (02) : 165 - 183
  • [2] Multi Level Feature Priority algorithm based text extraction from heterogeneous and hybrid textual images
    Chitrakala, Gopalan
    Manjula, D.
    INTERNATIONAL JOURNAL OF SIGNAL AND IMAGING SYSTEMS ENGINEERING, 2009, 2 (04) : 183 - 195
  • [3] Text Extraction from Scene Images using Statistical Distributions
    Ghoshal, Ranjit
    Roy, Anandarup
    Parui, Swapan K.
    2012 THIRD INTERNATIONAL CONFERENCE ON EMERGING APPLICATIONS OF INFORMATION TECHNOLOGY (EAIT), 2012, : 187 - 190
  • [4] Text Detection in Medical Images Using Local Feature Extraction and Supervised Learning
    Ma, Yu
    Wang, Yuanyuan
    2015 12TH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (FSKD), 2015, : 953 - 958
  • [5] Text detection in images using texture feature from Strokes
    Zhu, Caifeng
    Wang, Weiqiang
    Ning, Qianhui
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2006, PROCEEDINGS, 2006, 4261 : 295 - +
  • [6] A novel statistical feature extraction method for textual images: Optical font recognition
    Bataineh, Bilal
    Abdullah, Siti Norul Huda Sheikh
    Omar, Khairuddin
    EXPERT SYSTEMS WITH APPLICATIONS, 2012, 39 (05) : 5470 - 5477
  • [7] Extraction of text lines and text blocks on document images based on statistical modeling
    Chen, S
    Haralick, RM
    Phillips, IT
    INTERNATIONAL JOURNAL OF IMAGING SYSTEMS AND TECHNOLOGY, 1996, 7 (04) : 343 - 356
  • [8] Extraction of text lines and text blocks on document images based on statistical modeling
    Chen, S
    Haralick, RM
    Phillips, IT
    DOCUMENT RECOGNITION III, 1996, 2660 : 138 - 149
  • [9] TEXT DETECTION IN NATURAL SCENE IMAGES BY HIERARCHICAL LOCALIZATION AND GROWING OF TEXTUAL COMPONENTS
    Ding, Wenjun
    Shan, Susu
    Su, Feng
    2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2017, : 775 - 780
  • [10] An investigation on feature and text extraction from images using image recognition in Android
    Panchal, Brijeshkumar Y.
    Chauhan, Gaurang
    Panchal, Sandipkumar R.
    Chaudhari, Urvashi M.
    MATERIALS TODAY-PROCEEDINGS, 2022, 51 : 798 - 802