Japanese Personal Name and Location Search for Spoken Utterances by Using Hierarchical Language Model of Speech Recognition

Cited by: 0
Authors
Hu, Xinhui [1]
Wu, Youzheng [1]
Kashioka, Hideki [1]
Affiliations
[1] Natl Inst Informat & Commun Technol, Seika, Kyoto 6190228, Japan
Keywords
spoken document retrieval; OOV; hierarchical language model; confusion network
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
We propose a new scheme for searching spoken documents for Japanese personal and location names, the two main sources of out-of-vocabulary (OOV) words. For recognition and indexing, we use a hierarchical language model composed of two independently trained layers. Retrieval experiments on a Japanese spontaneous speech corpus show that retrieval performance for OOV words improves significantly, while performance for in-vocabulary (IV) words is not greatly affected. Further, retrieval based on confusion networks outperforms retrieval based on the 1-best recognition results, particularly for OOV words.
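The contrast between 1-best and confusion-network retrieval mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the slot-list data structure, the function names, the romanized words, the posterior values, and the 0.1 threshold are all assumptions chosen for illustration only.

```python
from typing import Dict, List, Tuple

# Assumed representation: a confusion network is a list of slots, and each
# slot maps competing word hypotheses to posterior probabilities.
ConfusionNetwork = List[Dict[str, float]]


def one_best(cn: ConfusionNetwork) -> List[str]:
    """Keep only the highest-posterior word in every slot."""
    return [max(slot, key=slot.get) for slot in cn]


def search_one_best(cn: ConfusionNetwork, term: str) -> List[int]:
    """Return slot indices where the 1-best transcript equals the term."""
    return [i for i, word in enumerate(one_best(cn)) if word == term]


def search_confusion_network(
    cn: ConfusionNetwork, term: str, threshold: float = 0.1
) -> List[Tuple[int, float]]:
    """Return (slot index, posterior) for every slot that lists the term
    with a posterior at or above the (hypothetical) threshold."""
    return [
        (i, slot[term])
        for i, slot in enumerate(cn)
        if slot.get(term, 0.0) >= threshold
    ]


# Toy network with made-up posteriors: the location name "tokyo" is a
# lower-ranked alternative in slot 2, so it is absent from the 1-best output.
toy_cn: ConfusionNetwork = [
    {"watashi": 0.9, "watashii": 0.1},
    {"wa": 1.0},
    {"kyoto": 0.5, "tokyo": 0.25, "kyoko": 0.25},
    {"ni": 0.75, "no": 0.25},
]

hits_1best = search_one_best(toy_cn, "tokyo")
hits_cn = search_confusion_network(toy_cn, "tokyo")
```

In this toy case the 1-best search misses the query term, while the confusion-network search recovers it from a competing hypothesis, which is one plausible reason lattice-like indexes help with names that the recognizer ranks poorly.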
Pages: 193-198
Number of pages: 6