Improving Spoken Document Retrieval by Unsupervised Language Model Adaptation Using Utterance-based Web Search

Cited by: 0
Authors
Herms, Robert [1]
Ritter, Marc [1]
Wilhelm-Stein, Thomas [1]
Eibl, Maximilian [1]
Affiliations
[1] Tech Univ Chemnitz, Chemnitz, Germany
Keywords
language modeling; unsupervised adaptation; out-of-vocabulary; spoken document retrieval
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Information retrieval systems facilitate the search for annotated audiovisual documents from different corpora. One of the main problems, especially in broadcast news, is recognizing domain-specific vocabulary such as names, brands, and technical terms when using general language models (LMs). Our approach consists of two steps that address the out-of-vocabulary (OOV) problem in order to improve spoken document retrieval performance. In the first step, we separate the transcript produced by a speech recognizer into blocks. Keywords are extracted from each transcribed utterance of a block and used to search for web resources in an unsupervised manner, which yields adaptation data. These data are used to perform a block-specific adaptation of a general pronunciation dictionary and a general LM. In the second step, the adapted dictionary and LM for a given block are used in the speech recognizer to improve vocabulary coverage and, at the same time, to account for the perplexity of the corresponding block. We evaluate this strategy on a dataset of summarized German broadcast news. Our experimental results show improvements of up to 11.7% in mean average precision (MAP) over 18 different topics and 7.5% in word error rate (WER) in comparison to the base LM.
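The first step described in the abstract (blocks, per-utterance keywords, web search, collection of new vocabulary) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state the block size, the keyword extraction method, or the search engine, so the fixed-size blocks, the "longest non-stopword tokens" heuristic, the tiny `STOPWORDS` set, and the `search` callable are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, Iterable

# Assumption: a tiny illustrative stopword set, not a complete German list.
STOPWORDS = {"der", "die", "das", "und", "in", "von", "mit", "ist", "ein", "eine"}


def split_into_blocks(utterances: list[str], block_size: int) -> list[list[str]]:
    """Group consecutive transcribed utterances into fixed-size blocks."""
    return [utterances[i:i + block_size]
            for i in range(0, len(utterances), block_size)]


def extract_keywords(utterance: str, max_keywords: int = 3) -> list[str]:
    """Crude heuristic (assumption): keep the longest non-stopword tokens."""
    tokens = [t.lower() for t in utterance.split() if t.lower() not in STOPWORDS]
    unique = list(dict.fromkeys(tokens))  # drop duplicates, keep first-seen order
    return sorted(unique, key=len, reverse=True)[:max_keywords]


def collect_adaptation_words(
    block: list[str],
    search: Callable[[str], Iterable[str]],
    vocabulary: set[str],
) -> Counter:
    """Query `search` once per utterance and count retrieved words that are
    missing from the general vocabulary (OOV candidates for this block)."""
    counts: Counter = Counter()
    for utterance in block:
        keywords = extract_keywords(utterance)
        if not keywords:
            continue  # nothing to query for this utterance
        for text in search(" ".join(keywords)):
            counts.update(w for w in text.lower().split() if w not in vocabulary)
    return counts
```

Here `search` stands in for whatever web search backend is available; it takes a query string and returns the retrieved texts. The resulting counts would then feed the block-specific adaptation of the pronunciation dictionary and LM, which is outside the scope of this sketch.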
Pages: 1430-1433
Number of pages: 4