Acoustic models of the elderly for large-vocabulary continuous speech recognition

Cited by: 22
Authors
Baba, A [1]
Yoshizawa, S
Yamada, M
Lee, A
Shikano, K
Affiliations
[1] Labs Image Informat Sci & Technol, Ikoma 6300101, Japan
[2] Matsushita Elect Works Ltd, Kadoma, Osaka 5718686, Japan
[3] Matsushita Elect Ind Co Ltd, Kyoto 6190237, Japan
[4] Nara Inst Sci & Technol, Grad Sch Informat Sci, Ikoma 6300101, Japan
Keywords
elderly; large-vocabulary continuous speech recognition; acoustic model; speaker adaptation
DOI
10.1002/ecjb.20101
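Chinese Library Classification
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline classification codes
0808; 0809
Abstract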
Large-vocabulary continuous speech recognition systems have recently come into widespread use, encouraging the application of speech recognition techniques to various problems. One factor that adversely affects the performance of such systems is a mismatch between the acoustic properties of the user's speech and the acoustic model. The acoustic model is generally constructed from the speech of young or middle-aged adults. A mismatch therefore arises between the model and the acoustic properties of elderly speech, which may degrade the recognition rate. In this study, a large-scale elderly speech database (200 sentences x 301 subjects) is used to train the acoustic model, and the resulting elderly acoustic model is evaluated with a large-vocabulary continuous speech recognition system. In the experiments, the word recognition rate improved by 3 to 5% compared with the results of an acoustic model trained on young or middle-aged adult speech, namely the JNAS speech database (150 sentences x 260 subjects, average age 28.6 years). It is also verified experimentally that, in speaker adaptation to elderly speech, the recognition rate improves further when the adaptation starts from an acoustic model trained on elderly speech. (C) 2004 Wiley Periodicals, Inc.
Pages: 49-57
Number of pages: 9
Related papers
50 in total
  • [1] Large-Vocabulary Continuous Speech Recognition Systems
    Saon, George
    Chien, Jen-Tzung
    [J]. IEEE SIGNAL PROCESSING MAGAZINE, 2012, 29 (06) : 18 - 33
  • [2] A review of large-vocabulary continuous-speech recognition
    Young, S
    [J]. IEEE SIGNAL PROCESSING MAGAZINE, 1996, 13 (05) : 45 - 57
  • [3] Large-Vocabulary Continuous Speech Recognition of Lhasa Tibetan
    Li, Guanyu
    Yu, Hongzhi
    [J]. COMPUTER AND INFORMATION TECHNOLOGY, 2014, 519-520 : 802 - 806
  • [4] Large-vocabulary continuous speech recognition: Advances and applications
    Gauvain, JL
    Lamel, L
    [J]. PROCEEDINGS OF THE IEEE, 2000, 88 (08) : 1181 - 1200
  • [5] A large-vocabulary continuous speech recognition system for Hindi
    Kumar, M
    Rajput, N
    Verma, A
    [J]. IBM JOURNAL OF RESEARCH AND DEVELOPMENT, 2004, 48 (5-6) : 703 - 715
  • [6] Combining spectral representations for large-vocabulary continuous speech recognition
    Garau, Giulia
    Renals, Steve
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2008, 16 (03) : 508 - 518
  • [7] ON LARGE-VOCABULARY SPEAKER-INDEPENDENT CONTINUOUS SPEECH RECOGNITION
    LEE, KF
    [J]. SPEECH COMMUNICATION, 1988, 7 (04) : 375 - 379
  • [8] Large-vocabulary speech recognition algorithms
    Padmanabhan, M
    Picheny, M
    [J]. COMPUTER, 2002, 35 (04) : 42 - +
  • [9] SPEECH RECOGNITION FOR LARGE-VOCABULARY SYSTEMS
    JACOB, B
    ANDREOBRECHT, R
    [J]. JOURNAL DE PHYSIQUE IV, 1994, 4 (C5) : 489 - 492
  • [10] Unsupervised training of acoustic models for large vocabulary continuous speech recognition
    Wessel, F
    Ney, H
    [J]. ASRU 2001: IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, CONFERENCE PROCEEDINGS, 2001, : 307 - 310