An effective feature set for enhancing printed Tamil character recognition

Cited by: 1
Authors
Shafana, M. S. [1, 2]
Ragel, R. G. [3]
Kumara, T. N. [4]
Affiliations
[1] South Eastern Univ Sri Lanka, Dept Informat & Commun Technol, Fac Technol, Univ Pk, Oluvil, Sri Lanka
[2] Univ Peradeniya, Post Grad Inst Sci, Peradeniya, Sri Lanka
[3] Univ Peradeniya, Dept Comp Engn, Fac Engn, Peradeniya, Sri Lanka
[4] Western Sydney Univ, MARCS Inst Brain Behav & Dev, Sydney, NSW, Australia
Keywords
Basic features; feature extraction; OCR; OVA SVM; Tamil character recognition; UDT SVM
DOI
10.4038/jnsfsr.v49i2.9466
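Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract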
The selection of features for extraction and the classification step are essential factors in achieving high performance in character recognition. The feature extraction process produces feature vectors that describe the shape and characteristics of each pattern so that it can be identified uniquely. Many feature extraction and classification approaches are available for Tamil and other languages, but there is still room to identify a better set of features that yields a higher recognition rate in Optical Character Recognition (OCR) of printed Tamil text. This research aims to produce an efficient set of features that increases accuracy and reduces runtime, improving on the best existing OCR system for classifying isolated printed Tamil characters. The proposed set of features is evaluated on a large dataset using a One-versus-All (OVA) Support Vector Machine (SVM). Two pools of feature vectors are created from the features used in this study: basic, density, histogram of oriented gradients (HOG), and transition features. Compared with the current best approach, the test results for Pool 1 give better recognition accuracy, 94.87% for OVA SVM and 97.07% for the Unbalanced Decision Tree (UDT) SVM, but do not improve recognition speed. The results for Pool 2 give both better recognition accuracy, 94.30% for OVA SVM and 96.35% for UDT SVM, and a higher recognition speed than the selected best OCR approach. The proposed set of features improves the recognition rate by 2.57-3.14% on OVA SVM and 3.22-3.94% on UDT SVM.
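
The abstract names density and transition features without defining them. As a rough illustration only, the sketch below computes one common form of each on a tiny binary glyph: zone density as the fraction of ink pixels per grid cell, and transitions as the count of ink/background changes along each row. The function names, the 2 x 2 grid, and the toy glyph are assumptions made for this example and are not taken from the paper, whose exact feature definitions are not given here.

```python
from typing import List

# Binary glyph image: 1 = ink, 0 = background (an assumed representation).
Image = List[List[int]]


def zone_density(img: Image, zones: int = 2) -> List[float]:
    """Fraction of ink pixels in each cell of a zones x zones grid."""
    h, w = len(img), len(img[0])
    feats = []
    for zr in range(zones):
        for zc in range(zones):
            r0, r1 = zr * h // zones, (zr + 1) * h // zones
            c0, c1 = zc * w // zones, (zc + 1) * w // zones
            cells = [img[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            feats.append(sum(cells) / len(cells))
    return feats


def row_transitions(img: Image) -> List[int]:
    """Number of 0->1 or 1->0 changes between neighbouring pixels in each row."""
    return [sum(a != b for a, b in zip(row, row[1:])) for row in img]


def feature_vector(img: Image) -> List[float]:
    """Concatenate both feature groups into a single vector."""
    return zone_density(img) + [float(t) for t in row_transitions(img)]


# Hypothetical 4 x 4 ring-shaped glyph used only for demonstration.
glyph = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]

print(feature_vector(glyph))
```

A vector like this would then be passed to a classifier such as an OVA SVM. Pooling several feature groups, as the abstract describes, amounts to concatenating their vectors in the same way.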
Pages: 195-208
Number of pages: 14