Japanese Personal Name and Location Search for Spoken Utterances by Using Hierarchical Language Model of Speech Recognition

Cited by: 0

Authors:

Hu, Xinhui [1]

Wu, Youzheng [1]

Kashioka, Hideki [1]

Affiliations:

[1] Natl Inst Informat & Commun Technol, Seika, Kyoto 6190228, Japan

Source:

RECENT ADVANCES OF ASIAN LANGUAGE PROCESSING TECHNOLOGIES | 2008

Keywords:

spoken document retrieval; OOV; hierarchical language model; confusion network

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Subject classification code:

0812

Abstract:

We propose a new scheme for searching spoken documents for Japanese personal and location names, the two main sources of out-of-vocabulary (OOV) words. For recognition and indexing, we use a hierarchical language model composed of two independently trained layers. Retrieval experiments on a Japanese spontaneous speech corpus show that retrieval performance for OOV words improves significantly, while that for in-vocabulary (IV) words is not greatly affected. Further, retrieval using confusion networks performs better than retrieval using the 1-best recognition results, particularly for OOV words.


Pages: 193-198

Page count: 6
