Unsupervised Language Model Adaptation by Data Selection for Speech Recognition

Cited by: 2
Authors
Khassanov, Yerbolat [1]
Chong, Tze Yuang [1]
Bigot, Benjamin [1]
Chng, Eng Siong [1]
Affiliations
[1] Nanyang Technol Univ, Rolls Royce NTU Corp Lab, Singapore, Singapore
Funding
National Research Foundation, Singapore;
Keywords
Language model adaptation; Unsupervised adaptation; Data selection; Speech recognition;
DOI
10.1007/978-3-319-54472-4_48
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we present a language model (LM) adaptation framework based on data selection to improve the recognition accuracy of automatic speech recognition systems. Previous approaches to LM adaptation usually require additional data to adapt the existing background LM. In this work, we propose a novel two-pass decoding approach that uses no additional data but instead selects relevant data from the existing background corpus used to train the background LM. The motivation is that the background corpus consists of data from different domains, and as such, the LM trained on it is generic and not discriminative. To make the LM more discriminative, we select sentences from the background corpus that are similar in some linguistic characteristics to the utterances recognized in the first pass and use them to train a new LM, which is employed during second-pass decoding. We examine the use of n-gram and bag-of-words features as the linguistic characteristics for the selection criteria. Evaluated on the 11 talks in the test set of the TED-LIUM corpus, the proposed adaptation framework produced an LM that reduced the word error rate by up to 10% relative and the perplexity by up to 47% relative. When the LM was adapted for each talk individually, a further word error rate reduction was achieved.
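The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the similarity measure or the selection threshold, so the cosine similarity over bag-of-words counts, the `top_k` cutoff, and all function names below are assumptions made for illustration, not the authors' implementation. The `relative_reduction` helper only shows how a "relative" reduction is computed.

```python
from collections import Counter
from math import sqrt


def bag_of_words(text):
    """Map lowercased whitespace tokens to their counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (assumed measure)."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_sentences(first_pass_hypotheses, background_corpus, top_k):
    """Return the top_k background sentences most similar to the first-pass output.

    The selected sentences would then be used to train the adapted LM
    for second-pass decoding (LM training itself is not shown here).
    """
    query = bag_of_words(" ".join(first_pass_hypotheses))
    scored = [(cosine(query, bag_of_words(s)), s) for s in background_corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sentence for _, sentence in scored[:top_k]]


def relative_reduction(baseline, adapted):
    """Relative reduction of a metric such as WER or perplexity."""
    return (baseline - adapted) / baseline


# Hypothetical toy data, not taken from the paper.
hypotheses = ["the engine turbine failed"]
corpus = [
    "the turbine engine was repaired",
    "stock markets fell sharply",
    "engine failed during the test",
]
selected = select_sentences(hypotheses, corpus, top_k=2)
```

With this toy input, the sentence about stock markets shares no words with the first-pass hypothesis and is left out of the selection. An n-gram variant would replace `bag_of_words` with counts over word n-grams; the rest of the sketch stays the same.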
Pages: 508-517
Number of pages: 10