Hypothesis Preservation Approach to Scene Text Recognition with Weighted Finite-State Transducer

Cited by: 9
Authors
Yamazoe, Takafumi [1]
Etoh, Minoru [1]
Yoshimura, Takeshi [1]
Tsujino, Kousuke [1]
Affiliation
[1] NTT DOCOMO, Serv & Solut Dev Dept & Res Labs, Tokyo 2398536, Japan
Keywords
scene text; natural scene; character recognition; text extraction; WFST; Kanji character
DOI
10.1109/ICDAR.2011.80
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper shows that Weighted Finite-State Transducers (WFSTs) eliminate much of the large-scale ambiguity in scene text recognition, especially for Japanese Kanji characters. The proposed method consists of two WFSTs, WFST-OCR and WFST-Lexicon. WFST-OCR holds the multiple hypotheses produced by error-prone text location, character segmentation, and character recognition. WFST-Lexicon, applied next through its convolution with WFST-OCR, resolves these hypotheses. Together the two WFSTs integrate conventional OCR and post-processing into a single process. The benefit of the proposed method is that all ambiguities are held as WFST data and resolved in one integrated step; the system outputs text that is statistically consistent with both the segmentation possibilities and the given language model. An experimental system demonstrates practical performance despite the hypothesis complexity inherent in the ICDAR test set and in Kanji text.
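The idea of keeping every OCR hypothesis and letting a lexicon choose among them can be illustrated with a small sketch. The code below is not the authors' implementation and uses no WFST toolkit: it assumes a hand-written character lattice (standing in for WFST-OCR), a weighted word list (standing in for WFST-Lexicon), and a lowest-total-cost search in place of transducer combination followed by a best-path search. All names, characters, and weights are hypothetical.

```python
import heapq


def best_reading(edges, start, final, lexicon):
    """Cheapest lattice path whose characters spell a sequence of lexicon words.

    edges:   iterable of (src, dst, char, cost) arcs, one per OCR hypothesis.
    lexicon: dict mapping word -> non-negative cost (lower means more likely).
    Returns (total_cost, words), or None if no path spells lexicon words.
    """
    # Every prefix of every word, so dead-end spellings are pruned early.
    prefixes = {w[:i] for w in lexicon for i in range(len(w) + 1)}
    arcs = {}
    for src, dst, ch, cost in edges:
        arcs.setdefault(src, []).append((dst, ch, cost))

    # Search state: (cost so far, lattice node, partial word, completed words).
    heap = [(0.0, start, "", ())]
    seen = set()
    while heap:
        cost, node, partial, words = heapq.heappop(heap)
        if node == final and partial == "":
            return cost, list(words)
        if (node, partial) in seen:
            continue
        seen.add((node, partial))
        for dst, ch, arc_cost in arcs.get(node, []):
            grown = partial + ch
            if grown not in prefixes:
                continue  # no lexicon word starts this way
            # Keep spelling the current word.
            heapq.heappush(heap, (cost + arc_cost, dst, grown, words))
            # Or close the word here and pay its lexicon cost.
            if grown in lexicon:
                total = cost + arc_cost + lexicon[grown]
                heapq.heappush(heap, (total, dst, "", words + (grown,)))
    return None


# Hypothetical lattice for a sign reading "TAXI". Nodes 0..4 are cut points.
edges = [
    (0, 1, "T", 0.2), (0, 1, "I", 0.9),   # recognition alternatives
    (1, 2, "A", 0.3),
    (2, 3, "X", 0.6), (2, 3, "K", 0.5),   # the recognizer slightly prefers "K"
    (3, 4, "I", 0.4), (3, 4, "l", 0.3),   # and "l" over "I"
    (2, 4, "N", 0.8),                     # segmentation alternative: one wide glyph
]
lexicon = {"TAXI": 0.5, "TAN": 2.0}       # assumed word costs

result = best_reading(edges, 0, 4, lexicon)
print(result)
```

In this toy lattice the locally cheapest characters spell "TAKl", which is not a word. Because no hypothesis is discarded before the lexicon is consulted, the search still reaches "TAXI", whose combined recognition and lexicon cost is lower than that of the competing segmentation "TAN".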
Pages: 359-363
Page count: 5