Hypothesis Preservation Approach to Scene Text Recognition with Weighted Finite-State Transducer

Cited by: 9

Authors:

Yamazoe, Takafumi [1]

Etoh, Minoru [1]

Yoshimura, Takeshi [1]

Tsujino, Kousuke [1]

Affiliations:

[1] NTT DOCOMO, Serv & Solut Dev Dept & Res Labs, Tokyo 2398536, Japan

Source:

11TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR 2011) | 2011

Keywords:

scene text; natural scene; character recognition; text extraction; WFST; Kanji character;

DOI:

10.1109/ICDAR.2011.80

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper shows that the use of Weighted Finite-State Transducers (WFSTs) significantly reduces large-scale ambiguity in scene text recognition, especially for Japanese Kanji characters. The proposed method consists of two WFSTs, called WFST-OCR and WFST-Lexicon. WFST-OCR holds the multiple hypotheses caused by errors in the text location, character segmentation, and character recognition processes. The subsequent WFST-Lexicon, through its convolution with WFST-OCR, resolves these hypotheses. Together, the WFSTs integrate conventional OCR and post-processing into a single process. The benefit of the proposed method is that all ambiguities are held as WFST data and resolved in one integrated step; the system outputs texts that are statistically consistent with the segmentation possibilities and the given language model. An experimental system demonstrates practical performance despite the hypothesis complexity inherent in the ICDAR test set and in Kanji character texts.
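The idea of keeping every OCR hypothesis and resolving them in one step against a lexicon can be illustrated with a small sketch. It is not the authors' implementation: the lattice, the costs, the toy lexicon, and the names `OCR_ARCS`, `LEXICON`, `build_trie`, and `decode` are all hypothetical, and a plain-Python lowest-cost search over (lattice state, lexicon trie node) pairs stands in for combining WFST-OCR with WFST-Lexicon. Costs are assumed to be negative log probabilities, so lower is better.

```python
import heapq

# Hypothetical OCR lattice: (source state, target state, character, cost).
# Parallel arcs model recognition ambiguity ("o" vs "0"); the arc 0 -> 2
# models segmentation ambiguity (the glyphs "c" + "l" may really be one "d").
OCR_ARCS = [
    (0, 1, "c", 0.4), (1, 2, "l", 0.3), (0, 2, "d", 1.0),
    (2, 3, "o", 0.2), (2, 3, "0", 0.1),
    (3, 4, "g", 0.3), (3, 4, "9", 0.2),
]
# Hypothetical lexicon with unigram costs (assumed, not from the paper).
LEXICON = {"dog": 0.5, "clog": 3.0, "cog": 2.0}
ACCEPT = -1  # marker for "a complete lexicon word has been accepted"


def build_trie(lexicon):
    """Build a trie acceptor: children[node][char] -> node, plus word-end costs."""
    children, final_cost = [{}], {}
    for word, cost in lexicon.items():
        node = 0
        for ch in word:
            if ch not in children[node]:
                children.append({})
                children[node][ch] = len(children) - 1
            node = children[node][ch]
        final_cost[node] = cost
    return children, final_cost


def decode(ocr_arcs, start, final, lexicon):
    """Return (text, cost) of the cheapest lattice path that spells a lexicon word."""
    children, final_cost = build_trie(lexicon)
    out_arcs = {}
    for src, dst, ch, cost in ocr_arcs:
        out_arcs.setdefault(src, []).append((dst, ch, cost))
    heap = [(0.0, start, 0, "")]  # (cost so far, lattice state, trie node, text)
    settled = set()
    while heap:
        cost, state, node, text = heapq.heappop(heap)
        if node == ACCEPT:
            return text, cost
        if (state, node) in settled:
            continue
        settled.add((state, node))
        # A path is complete only if the lattice and the lexicon both end here.
        if state == final and node in final_cost:
            heapq.heappush(heap, (cost + final_cost[node], state, ACCEPT, text))
        # Follow only the OCR arcs that the lexicon trie also allows.
        for dst, ch, arc_cost in out_arcs.get(state, []):
            child = children[node].get(ch)
            if child is not None:
                heapq.heappush(heap, (cost + arc_cost, dst, child, text + ch))
    return None  # no lattice path spells a lexicon word


best = decode(OCR_ARCS, start=0, final=4, lexicon=LEXICON)
```

In this toy lattice the cheapest path without a lexicon spells "cl09", which is not a word. The joint search keeps both the "cl" and "d" segmentations alive until the lexicon cost is added, and then selects "dog".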


Pages: 359-363

Number of pages: 5

