Hypothesis Preservation Approach to Scene Text Recognition with Weighted Finite-State Transducer

Cited by: 9
Authors
Yamazoe, Takafumi [1]
Etoh, Minoru [1]
Yoshimura, Takeshi [1]
Tsujino, Kousuke [1]
Affiliations
[1] NTT DOCOMO, Serv & Solut Dev Dept & Res Labs, Tokyo 2398536, Japan
Keywords
scene text; natural scene; character recognition; text extraction; WFST; Kanji character
DOI
10.1109/ICDAR.2011.80
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper shows that using a Weighted Finite-State Transducer (WFST) eliminates much of the large-scale ambiguity in scene text recognition, especially for Japanese Kanji characters. The proposed method consists of two WFSTs, WFST-OCR and WFST-Lexicon. WFST-OCR holds the multiple hypotheses produced by error-prone text location, character segmentation, and character recognition processes. WFST-Lexicon is then convolved with WFST-OCR to resolve these hypotheses. Together, the WFSTs integrate conventional OCR and post-processing into a single process. The benefit of the proposed method is that all ambiguities are held as WFST data and resolved in one integrated step; the system outputs text that is statistically consistent with the segmentation possibilities and the given language model. An experimental system demonstrates practical performance despite the hypothesis complexity inherent in the ICDAR test set and in Kanji character texts.
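The idea of keeping every OCR hypothesis and letting a lexicon pick among them in a single search can be illustrated with a small sketch. This is not the authors' implementation and uses no WFST library: it is a plain-Python toy, assuming a hypothetical lattice of character hypotheses (edges with integer costs, where one edge may span two segments to model a merged-character hypothesis) and a hypothetical word-cost lexicon. The function name `best_reading`, the costs, and the example words are all invented for illustration.

```python
import heapq

def best_reading(edges, start, final, lexicon):
    """Cheapest lattice path whose characters spell a sequence of lexicon words.

    edges:   list of (src_node, dst_node, char, cost) OCR hypotheses
    lexicon: dict mapping word -> cost (a stand-in for a language model)
    Returns (total_cost, [words]) or None if no path matches the lexicon.
    """
    # Every prefix of every word, so dead-end hypotheses are pruned early.
    prefixes = {w[:i] for w in lexicon for i in range(len(w) + 1)}
    outgoing = {}
    for src, dst, ch, cost in edges:
        outgoing.setdefault(src, []).append((dst, ch, cost))

    # Search state pairs a lattice node with the partially spelled word,
    # which plays the role of a state in the combined machine.
    heap = [(0, start, "", ())]
    seen = set()
    while heap:
        total, node, partial, words = heapq.heappop(heap)
        if node == final and partial == "":
            return total, list(words)
        if (node, partial) in seen:
            continue
        seen.add((node, partial))
        for dst, ch, cost in outgoing.get(node, []):
            cand = partial + ch
            if cand not in prefixes:
                continue  # no lexicon word starts this way
            heapq.heappush(heap, (total + cost, dst, cand, words))
            if cand in lexicon:  # close the word and pay its lexicon cost
                heapq.heappush(
                    heap, (total + cost + lexicon[cand], dst, "", words + (cand,))
                )
    return None

# Hypothetical lattice: "c"+"l" over two segments competes with "d" over both.
edges = [
    (0, 1, "c", 4), (1, 2, "l", 3), (0, 2, "d", 9),
    (2, 3, "o", 2), (2, 3, "0", 1),
    (3, 4, "g", 3), (3, 4, "9", 2),
]
lexicon = {"dog": 10, "clog": 30}
result = best_reading(edges, 0, 4, lexicon)
print(result)  # (24, ['dog'])
```

In this toy, the cheapest path by OCR cost alone would spell "cl09", which is not a word; searching the lattice and the lexicon jointly selects "dog" instead, because the segmentation and recognition alternatives are all still available when the lexicon is applied.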
Pages: 359-363
Number of pages: 5
Related papers
50 in total
  • [1] Learning a Discriminative Weighted Finite-State Transducer for Speech Recognition
    Lehr, Maider
    Shafran, Izhak
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (05) : 1360 - 1367
  • [2] Weighted Finite-State Transducer Approach to German Compound Words Reconstruction for Speech Recognition
    Shamraev, Nickolay
    Batalshchikov, Alexander
    Zulkarneev, Mikhail
    Repalov, Sergey
    Shirokova, Anna
    [J]. 2015 ARTIFICIAL INTELLIGENCE AND NATURAL LANGUAGE AND INFORMATION EXTRACTION, SOCIAL MEDIA AND WEB SEARCH FRUCT CONFERENCE (AINL-ISMW FRUCT), 2015 : 96 - 101
  • [3] Shallow Fusion of Weighted Finite-State Transducer and Language Model for Text Normalization
    Bakhturina, Evelina
    Zhang, Yang
    Ginsburg, Boris
    [J]. INTERSPEECH 2022, 2022 : 491 - 495
  • [4] The design principles of a weighted finite-state transducer library
    Mohri, M
    Pereira, F
    Riley, M
    [J]. THEORETICAL COMPUTER SCIENCE, 2000, 231 (01) : 17 - 32
  • [5] A rational design for a weighted finite-state transducer library
    Mohri, M
    Pereira, F
    Riley, M
    [J]. AUTOMATA IMPLEMENTATION, 1998, 1436 : 144 - 158
  • [6] Juicer: A weighted finite-state transducer speech decoder
    Moore, Darren
    Dines, John
    Doss, Mathew Magimai
    Vepa, Jithendra
    Cheng, Octavian
    Hain, Thomas
    [J]. MACHINE LEARNING FOR MULTIMODAL INTERACTION, 2006, 4299 : 285 - +
  • [7] Weighted finite-state transducers in speech recognition
    Mohri, M
    Pereira, F
    Riley, M
    [J]. COMPUTER SPEECH AND LANGUAGE, 2002, 16 (01) : 69 - 88
  • [8] OpenFst: A general and efficient weighted finite-state transducer library
    Allauzen, Cyril
    Riley, Michael
    Schalkwyk, Johan
    Skut, Wojciech
    Mohri, Mehryar
    [J]. IMPLEMENTATION AND APPLICATION OF AUTOMATA, 2007, 4783 : 11 - +
  • [9] A study of biasing technical terms in medical speech recognition using weighted finite-state transducer
    Kojima, Atsushi
    [J]. ACOUSTICAL SCIENCE AND TECHNOLOGY, 2022, 43 (01) : 66 - 68
  • [10] A Weighted Finite-State Transducer (WFST)-based Language Model for Online Indic Script Handwriting Recognition
    Chowdhury, Suhan
    Garain, Utpal
    Chattopadhyay, Tanushyam
    [J]. 11TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR 2011), 2011 : 599 - 602