OCR of printed Telugu text with high recognition accuracies

Authors
Lakshmi, C. Vasantha [1]
Jain, Ritu [1]
Patvardhan, C. [1]
Affiliations
[1] Dayalbagh Educ Inst, Agra 282005, Uttar Pradesh, India
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Telugu is one of the oldest and most widely spoken languages of India, with more than 66 million speakers, mainly in South India. The development of Optical Character Recognition (OCR) systems for Telugu text is an area of current research. OCR of Indian scripts is considerably more complicated than OCR of Roman script because of the very large number of combinations of characters and modifiers. Basic symbols are identified as the unit of recognition in Telugu script, and edge histograms are used in a feature-based recognition scheme for these basic symbols. During recognition, it is observed that in many cases the recognizer incorrectly outputs a very similar-looking symbol. Special logic and algorithms based on simple structural features are developed to improve recognition accuracy considerably without much additional computational effort. It is shown that a recognition accuracy of 98.5% can be achieved on laser-quality prints with such a procedure.
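The abstract does not specify how the edge histograms are computed, so the following is only a minimal sketch of one common variant, not the authors' method. It assumes each basic symbol arrives as a small 2-D binary array, splits it into a grid of cells, and accumulates gradient-orientation histograms per cell; recognition is then a nearest-template lookup. The names `edge_histogram` and `classify`, the `bins` and `grid` parameters, and the toy bar-shaped glyphs are all hypothetical, and the structural-feature disambiguation step mentioned in the abstract is not modelled here.

```python
import numpy as np

def edge_histogram(glyph, bins=4, grid=2):
    """Return a normalised edge-orientation histogram for a 2-D glyph image.

    Assumption: this is a generic edge histogram, not the paper's exact feature.
    """
    img = np.asarray(glyph, dtype=float)
    gy, gx = np.gradient(img)                 # derivatives along rows, columns
    mag = np.hypot(gx, gy)                    # edge strength per pixel
    ang = np.mod(np.arctan2(gy, gx), np.pi)   # orientation folded into [0, pi)
    idx = np.minimum((ang / np.pi * bins).astype(int), bins - 1)

    h, w = img.shape
    row_edges = np.linspace(0, h, grid + 1).astype(int)
    col_edges = np.linspace(0, w, grid + 1).astype(int)
    feats = []
    for r0, r1 in zip(row_edges[:-1], row_edges[1:]):
        for c0, c1 in zip(col_edges[:-1], col_edges[1:]):
            cell_idx = idx[r0:r1, c0:c1].ravel()
            cell_mag = mag[r0:r1, c0:c1].ravel()
            # Magnitude-weighted count of orientations inside this cell.
            feats.append(np.bincount(cell_idx, weights=cell_mag, minlength=bins))
    vec = np.concatenate(feats)
    total = vec.sum()
    return vec / total if total > 0 else vec

def classify(glyph, templates):
    """Return the label of the template whose feature vector is closest."""
    feat = edge_histogram(glyph)
    return min(templates, key=lambda label: np.linalg.norm(feat - templates[label]))

# Toy stand-ins for basic symbols: a vertical and a horizontal bar.
vertical = np.zeros((8, 8)); vertical[:, 3:5] = 1
horizontal = np.zeros((8, 8)); horizontal[3:5, :] = 1
templates = {"vertical": edge_histogram(vertical),
             "horizontal": edge_histogram(horizontal)}

# A slightly shifted vertical bar plays the role of an unseen sample.
shifted = np.zeros((8, 8)); shifted[:, 4:6] = 1
predicted = classify(shifted, templates)
```

In a scheme like the one the abstract describes, a second pass using simple structural features would then be applied only to symbols that such a feature-based matcher tends to confuse, which keeps the extra computation small.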
Pages: 786 / +
Number of pages: 3