Script Identification for Printed and Handwritten Indian Documents: An Empirical Study of Different Feature Classifier Combinations

Cited by: 5
Authors
Rani, Rajneesh [1 ]
Dhir, Renu [1 ]
Kakkar, Deepti [2 ]
Sharma, Nonita [1 ]
Affiliations
[1] Dr BR Ambedkar Natl Inst Technol, Dept Comp Sci & Engn, Jalandhar 144011, Punjab, India
[2] Dr BR Ambedkar Natl Inst Technol, Dept Elect & Commun Engn, Jalandhar 144011, Punjab, India
Keywords
Script identification; page level; texture features; machine learning; Gabor; wavelet; invariant texture features; rotation
DOI
10.1142/S0219467821400118
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
The identification of the script in a document page image is the first step for an OCR system that processes multi-script documents. In a multilingual, multi-script world, a document processing system whose OCR needs human involvement to select the appropriate OCR package is undesirable and inefficient. Developing robust and efficient methods for automatic script identification is therefore of major importance for automatic document processing in a multilingual, multi-script environment. The basic objective is thus to devise intuitive methods that are straightforward to implement without compromising efficiency. The aim of this work is to evaluate state-of-the-art feature extraction and classification techniques for automatic script identification of printed and handwritten documents and to propose the best combination of the two.
Pages: 21
Related papers
50 in total
  • [31] Word-wise script identification from Indian documents
    Sinha, S
    Pal, U
    Chaudhuri, BB
    DOCUMENT ANALYSIS SYSTEMS VI, PROCEEDINGS, 2004, 3163 : 310 - 321
  • [32] Comparison of Different Classifiers for Script Identification from Handwritten Document
    Obaidullah, Sk Md
    Roy, Kaushik
    Das, Nibaran
    2013 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMPUTING AND CONTROL (ISPCC), 2013,
  • [33] Fractal-Based System for Arabic/Latin, Printed/Handwritten Script Identification
    Ben Moussa, S.
    Zahour, A.
    Benabdelhafid, A.
    Alimi, A. M.
    19TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1-6, 2008, : 2643 - 2646
  • [34] Script Identification from Printed Indian Document Images and Performance Evaluation Using Different Classifiers
    Obaidullah, Sk Md
    Mondal, Anamika
    Das, Nibaran
    Roy, Kaushik
    APPLIED COMPUTATIONAL INTELLIGENCE AND SOFT COMPUTING, 2014, 2014
  • [35] A Texture based approach to Word-level Script Identification from Multi-script Handwritten Documents
    Singh, Pawan Kumar
    Khan, Aparajita
    Sarkar, Ram
    Nasipuri, Mita
    2014 6TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMMUNICATION NETWORKS, 2014, : 228 - 232
  • [36] A system for word-wise handwritten script identification for Indian postal automation
    Roy, K
    Banerjee, A
    Pal, U
    PROCEEDINGS OF THE IEEE INDICON 2004, 2004, : 266 - 271
  • [37] Feature Selection Using Harmony Search for Script Identification from Handwritten Document Images
    Singh, Pawan Kumar
    Das, Supratim
    Sarkar, Ram
    Nasipuri, Mita
    JOURNAL OF INTELLIGENT SYSTEMS, 2018, 27 (03) : 465 - 488
  • [38] Novel script line identification method for script normalization and feature extraction in on-line handwritten whiteboard note recognition
    Schenk, Joachim
    Lenz, Johannes
    Rigoll, Gerhard
    PATTERN RECOGNITION, 2009, 42 (12) : 3383 - 3393
  • [39] A study of different kinds of degradation in printed Gurmukhi script
    Jindal, M. K.
    Sharma, R. K.
    Lehal, G. S.
    ICCTA 2007: INTERNATIONAL CONFERENCE ON COMPUTING: THEORY AND APPLICATIONS, PROCEEDINGS, 2007, : 538 - +
  • [40] Handwritten Indic Script Identification from Document Images-A Statistical Comparison of Different Attribute Selection Techniques in Multi-classifier Environment
    Obaidullah, Sk Md
    Halder, Chayan
    Das, Nibaran
    Roy, Kaushik
    PROCEEDINGS OF THE SECOND INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATION TECHNOLOGIES, IC3T 2015, VOL 3, 2016, 381 : 491 - 500