DNN-based Feature Transformation for Speech Recognition Using Throat Microphone

Cited by: 0
Authors
Lin, Shengke [1]
Tsunakawa, Takashi [1]
Nishida, Masafumi [1]
Nishimura, Masafumi [1]
Affiliations
[1] Shizuoka Univ, Grad Sch Integrated Sci & Technol, Shizuoka, Japan
Source
2017 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC 2017) | 2017
Keywords
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
In this paper, we focus on using a throat microphone as a noise-robust device, because its signal is much less affected by surrounding noise than the signal of a conventional acoustic microphone. However, a throat microphone can record only a narrow frequency band, and its characteristics differ from those of an acoustic microphone. As a result, speech recognition performance degrades greatly when a throat microphone is simply substituted for a conventional acoustic microphone. To overcome this problem, we propose a deep neural network (DNN)-based feature transformation method combined with model adaptation. We conducted a continuous digit recognition experiment. The results showed that the proposed method reduced the word error rate (WER) obtained with the throat microphone from 41.4% to 17.6%.
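To put the reported figures in context, the following is a minimal sketch of how a word error rate and the relative reduction between two WER values can be computed. It assumes the conventional definition of WER (word-level edit distance divided by the number of reference words); this record does not state which scoring tool the authors used. The function names `word_error_rate` and `relative_reduction` and the example digit strings are hypothetical and serve illustration only.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count.

    Assumes the conventional Levenshtein-based definition; the scoring
    procedure used in the paper is not given in this record.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline, improved):
    """Fraction by which `improved` lowers `baseline`."""
    return (baseline - improved) / baseline


# Hypothetical digit strings: one substitution and one deletion
# against five reference words.
example_wer = word_error_rate("one two three four five", "one too three five")

# WER values reported in the abstract, in percent.
reduction = relative_reduction(41.4, 17.6)
print(f"example WER: {example_wer:.2f}")
print(f"relative WER reduction: {100 * reduction:.1f}%")
```

Under this definition, the change from 41.4% to 17.6% is an absolute reduction of 23.8 points, or a relative reduction of about 57.5%.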
Pages: 596-599
Number of pages: 4