OCR of printed Telugu text with high recognition accuracies

Authors
Lakshmi, C. Vasantha [1]
Jain, Ritu [1]
Patvardhan, C. [1]
Affiliation
[1] Dayalbagh Educ Inst, Agra 282005, Uttar Pradesh, India
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Telugu is one of the oldest and most popular languages of India, spoken by more than 66 million people, mainly in South India. Development of Optical Character Recognition (OCR) systems for Telugu text is an area of current research. OCR of Indian scripts is much more complicated than OCR of Roman script because of the very large number of combinations of characters and modifiers. Basic symbols are identified as the unit of recognition in Telugu script. Edge histograms are used in a feature-based recognition scheme for these basic symbols. During recognition, it is observed that in many cases the recognizer incorrectly outputs a very similar-looking symbol. Special logic and algorithms based on simple structural features are developed to improve recognition accuracy considerably without much additional computational effort. It is shown that a recognition accuracy of 98.5% can be achieved on laser-quality prints with this procedure.
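The abstract does not define the edge histogram or the classifier, so the following is only a minimal sketch of one common reading: a normalized histogram of edge orientations, matched against stored prototypes by L1 distance. The function names (edge_histogram, classify), the 4-bin quantization, the central-difference gradient, and the toy 5x5 images are all assumptions made for illustration and are not taken from the paper.

```python
import math

def edge_histogram(img, bins=4):
    """Normalized histogram of edge orientations over interior pixels.

    img is a binary image given as a list of equal-length rows (0 or 1).
    Orientation is folded into [0, pi) and quantized into `bins` bins.
    This is an assumed formulation, not the paper's definition.
    """
    hist = [0.0] * bins
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            # Central differences approximate the local gradient.
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            if gx == 0 and gy == 0:
                continue  # flat region, no edge here
            angle = math.atan2(gy, gx) % math.pi
            hist[int(angle / math.pi * bins) % bins] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def classify(img, prototypes):
    """Return the label whose prototype histogram is closest in L1 distance."""
    feats = edge_histogram(img)
    return min(
        prototypes,
        key=lambda label: sum(abs(a - b) for a, b in zip(feats, prototypes[label])),
    )

# Toy symbols: a vertical stroke and a horizontal stroke.
VERTICAL = [[0, 0, 1, 0, 0] for _ in range(5)]
HORIZONTAL = [[0] * 5, [0] * 5, [1] * 5, [0] * 5, [0] * 5]

# Hypothetical prototype set built from the toy symbols.
PROTOTYPES = {
    "vertical": edge_histogram(VERTICAL),
    "horizontal": edge_histogram(HORIZONTAL),
}

if __name__ == "__main__":
    print(classify(VERTICAL, PROTOTYPES))
```

A histogram of this kind discards position, so visually similar symbols can map to nearly identical feature vectors. That is consistent with the confusions the abstract reports and with its use of additional structural features as a second check, which this sketch does not attempt to reproduce.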
Pages: 786 / +
Page count: 3