Language model adaptation using WFST-based speaking-style translation

Authors
Hori, T [1]
Willett, D [1]
Minami, Y [1]
Affiliations
[1] NTT Corp, NTT Commun Sci Labs, Speech Open Lab, Seika, Kyoto, Japan
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
This paper describes a new approach to language model adaptation for speech recognition, based on the statistical framework of speech translation. The main idea is to compose a weighted finite-state transducer (WFST) that translates sentence styles from in-domain to out-of-domain. This makes it possible to integrate language models of different speaking styles or dialects, and even of different vocabularies. The WFST is built by combining the in-domain and out-of-domain models through the translation, where each model and the translation itself are expressed as WFSTs. We apply this technique to building language models for spontaneous speech recognition from large written-style corpora. In experiments on a 20k-word Japanese spontaneous speech recognition task with a small in-domain corpus, the approach achieves a 2.9% absolute improvement in word error rate over the in-domain model.
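The composition step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes epsilon-free transducers, tropical-semiring weights (costs that add along a path), and a toy word-for-word style translation. The dictionary layout, the function names `compose` and `best_cost`, and all words and weights are hypothetical choices made for illustration. Only a translation transducer composed with a written-style model is shown; the further combination with the in-domain model described in the abstract is not reproduced.

```python
from collections import deque

# A WFST here is a plain dict (illustrative layout, not a library format):
#   "start":  start state
#   "finals": {state: final cost}
#   "arcs":   {state: [(input label, output label, cost, next state), ...]}

def compose(a, b):
    """Epsilon-free composition; costs add (tropical semiring)."""
    start = (a["start"], b["start"])
    arcs, finals = {}, {}
    queue, seen = deque([start]), {start}
    while queue:
        state = queue.popleft()
        qa, qb = state
        arcs[state] = []
        if qa in a["finals"] and qb in b["finals"]:
            finals[state] = a["finals"][qa] + b["finals"][qb]
        for ilabel, mid_a, w1, next_a in a["arcs"].get(qa, []):
            for mid_b, olabel, w2, next_b in b["arcs"].get(qb, []):
                if mid_a == mid_b:  # output of a must match input of b
                    nxt = (next_a, next_b)
                    arcs[state].append((ilabel, olabel, w1 + w2, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return {"start": start, "finals": finals, "arcs": arcs}

def best_cost(fst, words):
    """Lowest total cost over paths whose input labels spell `words`."""
    frontier = {fst["start"]: 0.0}
    for word in words:
        nxt = {}
        for state, cost in frontier.items():
            for ilabel, _olabel, w, dest in fst["arcs"].get(state, []):
                if ilabel == word and cost + w < nxt.get(dest, float("inf")):
                    nxt[dest] = cost + w
        frontier = nxt
    totals = [c + fst["finals"][s] for s, c in frontier.items()
              if s in fst["finals"]]
    return min(totals) if totals else float("inf")

# Toy style translation T: spoken-style word -> written-style word (assumed costs).
T = {"start": 0, "finals": {0: 0.0}, "arcs": {0: [
    ("yeah", "yes", 0.5, 0),
    ("yes", "yes", 0.0, 0),
    ("ok", "okay", 0.5, 0),
]}}

# Toy written-style language model G as a one-state acceptor (assumed costs).
G = {"start": 0, "finals": {0: 0.0}, "arcs": {0: [
    ("yes", "yes", 1.0, 0),
    ("okay", "okay", 2.0, 0),
]}}

# T composed with G scores spoken-style input using the written-style model.
TG = compose(T, G)
print(best_cost(TG, ["yeah", "ok"]))  # 4.0 under these toy weights
```

In this sketch the composed transducer accepts spoken-style words on its input side, so a model trained on written text can assign costs to spoken-style word sequences. A word with no translation arc, such as a filler missing from `T`, receives infinite cost.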
Pages: 228-231
Number of pages: 4
Related papers
50 in total
  • [1] Language model adaptation using WFST-based speaking-style translation
    Hori, Takaaki
    Willett, Daniel
    Minami, Yasuhiro
    ICASSP IEEE Int Conf Acoust Speech Signal Process Proc, (228-231):
  • [2] A WFST-based Log-linear Framework for Speaking-style Transformation
    Neubig, Graham
    Mori, Shinsuke
    Kawahara, Tatsuya
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 1503 - 1506
  • [3] Automatic Transcription of Lecture Speech using Language Model Based on Speaking-Style Transformation of Proceeding Texts
    Akita, Yuya
    Watanabe, Makoto
    Kawahara, Tatsuya
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 2323 - 2326
  • [4] Tied-State Mixture Language Model for WFST-based Speech Recognition
    Yamamoto, Hitoshi
    Dixon, Paul R.
    Matsuda, Shigeki
    Hori, Chiori
    Kashioka, Hideki
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 174 - 177
  • [5] Topic-independent speaking-style transformation of language model for spontaneous speech recognition
    Akita, Yuya
    Kawahara, Tatsuya
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 33 - +
  • [6] ROBUSTNESS OF PHONEME-BASED HMMS AGAINST SPEAKING-STYLE VARIATIONS
    MATSUOKA, T
    SHIKANO, K
    IEICE TRANSACTIONS ON COMMUNICATIONS ELECTRONICS INFORMATION AND SYSTEMS, 1991, 74 (07): : 1761 - 1767
  • [7] WFST-BASED STRUCTURAL CLASSIFICATION INTEGRATING DNN ACOUSTIC FEATURES AND RNN LANGUAGE FEATURES FOR SPEECH RECOGNITION
    Quoc Truong Do
    Nakamura, Satoshi
    Delcroix, Marc
    Hori, Takaaki
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4959 - 4963
  • [8] Score normalization-based speaking-style variation robust speaker recognition
    State Key Laboratory of Intelligent Technology and Systems, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
    Unknown
    Qinghua Daxue Xuebao, 2009, SUPPL. 1 (1278-1282):
  • [9] Large Vocabulary Continuous Speech Recognition Using WFST-based Linear Classifier for Structured Data
    Watanabe, Shinji
    Hori, Takaaki
    Nakamura, Atsushi
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 346 - 349
  • [10] FEATURE BASED ADAPTATION FOR SPEAKING STYLE SYNTHESIS
    Wu, Xixin
    Sun, Lifa
    Kang, Shiyin
    Liu, Songxiang
    Wu, Zhiyong
    Liu, Xunying
    Meng, Helen
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5304 - 5308