HMM-Based Lexicon-Driven and Lexicon-Free Word Recognition for Online Handwritten Indic Scripts

Cited by: 74
Authors
Bharath, A. [1]
Madhvanath, Sriganesh [2]
Affiliations
[1] Genesys Telecom Labs, Madras, Tamil Nadu, India
[2] Hewlett Packard Labs, Bangalore 560030, Karnataka, India
Keywords
Online handwriting recognition; word recognition; lexicon driven; lexicon free; bag of symbols; symbol order variation; Devanagari; Tamil
DOI
10.1109/TPAMI.2011.234
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Research on recognizing online handwritten words in Indic scripts is at an early stage compared to that for Latin and Oriental scripts. In this paper, we address this problem specifically for two major Indic scripts: Devanagari and Tamil. In contrast to previous approaches, the techniques we propose are largely data driven and script independent. We propose two different techniques for word recognition based on Hidden Markov Models (HMMs): lexicon driven and lexicon free. The lexicon-driven technique models each word in the lexicon as a sequence of symbol HMMs according to a standard symbol writing order derived from the phonetic representation. The lexicon-free technique uses a novel Bag-of-Symbols representation of the handwritten word that is independent of symbol order and allows rapid pruning of the lexicon. On handwritten Devanagari word samples featuring both standard and nonstandard symbol writing orders, a combination of lexicon-driven and lexicon-free recognizers significantly outperforms either of them used in isolation. In contrast, most Tamil word samples feature the standard symbol order, and the lexicon-driven recognizer outperforms the lexicon-free one as well as their combination. The best recognition accuracies obtained for 20,000-word lexicons are 87.13 percent for Devanagari when the two recognizers are combined, and 91.8 percent for Tamil using the lexicon-driven technique.
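The lexicon-pruning idea behind the Bag-of-Symbols representation can be illustrated with a minimal sketch. This is not the authors' implementation: the symbol labels, the toy lexicon, the multiset distance, and the `max_distance` threshold are all assumptions chosen for illustration. The sketch only shows why an order-independent representation tolerates nonstandard symbol writing orders while still discarding most lexicon entries cheaply.

```python
from collections import Counter


def bag_of_symbols(symbols):
    """Return an order-independent multiset of symbol labels."""
    return Counter(symbols)


def bag_distance(a, b):
    """Count the symbols that must be added or removed to turn bag a into bag b."""
    return sum(((a - b) + (b - a)).values())


def prune_lexicon(recognized_symbols, lexicon, max_distance=1):
    """Keep only lexicon words whose symbol bag is close to the recognized bag.

    `lexicon` maps each word to its symbols in standard writing order
    (a hypothetical structure assumed for this sketch).
    """
    query = bag_of_symbols(recognized_symbols)
    return [
        word
        for word, symbols in lexicon.items()
        if bag_distance(query, bag_of_symbols(symbols)) <= max_distance
    ]


# Hypothetical toy lexicon with made-up symbol labels.
lexicon = {
    "word_a": ["ka", "i", "ta"],
    "word_b": ["ka", "ta", "aa"],
    "word_c": ["ma", "na"],
}

# The symbols arrive in a nonstandard order, yet word_a still matches exactly.
print(prune_lexicon(["i", "ka", "ta"], lexicon, max_distance=0))
```

In a combined setup of the kind the abstract describes, a shortlist like this could then be rescored by a lexicon-driven recognizer; how the two recognizers are actually combined is not specified in this record.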
Pages: 670-682
Page count: 13
Related papers
37 in total
  • [1] Lexicon-free, novel segmentation of online handwritten Indic words.
    Sundaram, Suresh
    Ramakrishnan, A. G.
    [J]. 11TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR 2011), 2011, : 1175 - 1179
  • [2] A PHOC Decoder for Lexicon-free Handwritten Word Recognition
    Sfikas, Giorgos
    Retsinas, George
    Gatos, Basilis
    [J]. 2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), VOL 1, 2017, : 513 - 518
  • [3] Lexicon-driven handwritten word recognition using Choquet fuzzy integral
    Shi, HC
    Gader, PD
    [J]. INFORMATION INTELLIGENCE AND SYSTEMS, VOLS 1-4, 1996, : 412 - 417
  • [4] HMM-based Indic handwritten word recognition using zone segmentation
    Roy, Partha Pratim
    Bhunia, Ayan Kumar
    Das, Ayan
    Dey, Prasenjit
    Pal, Umapada
    [J]. PATTERN RECOGNITION, 2016, 60 : 1057 - 1075
  • [5] Lexicon-driven handwritten word recognition using optimal linear combinations of order statistics
    Chen, WT
    Gader, P
    Shi, HC
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1999, 21 (01) : 77 - 82
  • [6] Lexicon-free handwritten word spotting using character HMMs
    Fischer, Andreas
    Keller, Andreas
    Frinken, Volkmar
    Bunke, Horst
    [J]. PATTERN RECOGNITION LETTERS, 2012, 33 (07) : 934 - 942
  • [7] Handwritten word recognition using lexicon free and lexicon directed word recognition algorithms
    Shridhar, M
    Houle, G
    Kimura, F
    [J]. PROCEEDINGS OF THE FOURTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, VOLS 1 AND 2, 1997, : 861 - 865
  • [8] A distributed scheme for lexicon-driven handwritten word recognition and its application to large vocabulary problems
    Koerich, AL
    Sabourin, R
    Suen, CY
    [J]. SIXTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, PROCEEDINGS, 2001, : 660 - 664
  • [9] Lexicon-driven handwritten character string recognition for Japanese address reading
    Liu, CL
    Koga, M
    Fujisawa, H
    [J]. SIXTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, PROCEEDINGS, 2001, : 877 - 881