DNN-based Feature Transformation for Speech Recognition Using Throat Microphone

Authors
Lin, Shengke [1]
Tsunakawa, Takashi [1]
Nishida, Masafumi [1]
Nishimura, Masafumi [1]
Affiliations
[1] Shizuoka Univ, Grad Sch Integrated Sci & Technol, Shizuoka, Japan
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
In this paper, we focus on using a throat microphone as a noise-robust input device, because its signal is much less affected by surrounding noise than that of a conventional acoustic microphone. However, a throat microphone can record only a narrow frequency band, and its characteristics differ from those of an acoustic microphone. As a result, speech recognition performance degrades greatly when a throat microphone is simply substituted for a conventional acoustic microphone. To overcome this problem, we propose a deep neural network (DNN)-based feature transformation method combined with model adaptation. We conducted a continuous digit recognition experiment. The results show that the proposed method reduced the word error rate (WER) obtained with the throat microphone from 41.4% to 17.6%.
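The abstract does not specify the network architecture, the input features, or the training targets. As a rough illustration only, the sketch below assumes one common form of feature transformation: a small regression network trained to map throat-microphone feature frames to time-aligned acoustic-microphone feature frames. The data is synthetic, and every name, dimension, and hyperparameter here (`throat_feats`, `acoustic_feats`, 13-dimensional frames, one hidden layer) is hypothetical rather than taken from the paper. The last lines compute the relative WER reduction implied by the two figures quoted in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical paired data: each row is one time-aligned feature frame.
# Real systems would use features such as MFCCs; these are synthetic.
n_frames, dim, hidden = 512, 13, 32
throat_feats = rng.standard_normal((n_frames, dim))
synthetic_map = rng.standard_normal((dim, dim)) * 0.5
acoustic_feats = np.tanh(throat_feats @ synthetic_map)  # synthetic targets

# One-hidden-layer regression network (assumed architecture, not the paper's).
W1 = rng.standard_normal((dim, hidden)) * 0.1
b1 = np.zeros(hidden)
W2 = rng.standard_normal((hidden, dim)) * 0.1
b2 = np.zeros(dim)


def forward(x):
    """Return the hidden activations and the transformed features."""
    h = np.tanh(x @ W1 + b1)
    return h, h @ W2 + b2


def mse(pred, target):
    """Mean squared error over all frames and feature dimensions."""
    return float(np.mean((pred - target) ** 2))


initial_loss = mse(forward(throat_feats)[1], acoustic_feats)

# Full-batch gradient descent on the MSE between transformed throat
# features and the acoustic-microphone targets.
learning_rate = 0.2
for _ in range(500):
    h, pred = forward(throat_feats)
    grad_pred = 2.0 * (pred - acoustic_feats) / pred.size
    grad_W2 = h.T @ grad_pred
    grad_b2 = grad_pred.sum(axis=0)
    grad_h = (grad_pred @ W2.T) * (1.0 - h ** 2)  # derivative of tanh
    grad_W1 = throat_feats.T @ grad_h
    grad_b1 = grad_h.sum(axis=0)
    W1 -= learning_rate * grad_W1
    b1 -= learning_rate * grad_b1
    W2 -= learning_rate * grad_W2
    b2 -= learning_rate * grad_b2

final_loss = mse(forward(throat_feats)[1], acoustic_feats)
print(f"MSE before training: {initial_loss:.4f}, after: {final_loss:.4f}")

# Relative WER reduction implied by the figures reported in the abstract.
wer_before, wer_after = 41.4, 17.6
relative_reduction = (wer_before - wer_after) / wer_before
print(f"Relative WER reduction: {relative_reduction:.1%}")
```

In such a setup the transformed features would be passed to the recognizer in place of the raw throat-microphone features. The drop from 41.4% to 17.6% corresponds to a relative WER reduction of about 57.5%.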
Pages: 596-599
Page count: 4
Related papers
50 in total
  • [21] Comparing Fusion Models for DNN-Based Audiovisual Continuous Speech Recognition
    Abdelaziz, Ahmed Hussen
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (03) : 475 - 484
  • [22] DNN-Based Arabic Speech Synthesis
    Amrouche, Aissa
    Bentrcia, Youssouf
    Boubakeur, Khadidja Nesrine
    Abed, Ahcene
    2022 9TH INTERNATIONAL CONFERENCE ON ELECTRICAL AND ELECTRONICS ENGINEERING (ICEEE 2022), 2022, : 378 - 382
  • [23] Knowledge Distillation for Throat Microphone Speech Recognition
    Suzuki, Takahito
    Ogata, Jun
    Tsunakawa, Takashi
    Nishida, Masafumi
    Nishimura, Masafumi
    INTERSPEECH 2019, 2019, : 461 - 465
  • [24] Improving Throat Microphone Speech Recognition by Joint Analysis of Throat and Acoustic Microphone Recordings
    Erzin, Engin
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2009, 17 (07) : 1316 - 1324
  • [25] Effects of microphone mounting location and gender on accuracy in speech recognition using a throat microphone
    Konuma, Y.
    Asakura, T.
    JASA EXPRESS LETTERS, 2023, 3 (09):
  • [26] DNN-BASED SPEECH PRESENCE PROBABILITY ESTIMATION FOR MULTI-FRAME SINGLE-MICROPHONE SPEECH ENHANCEMENT
    Tammen, Marvin
    Fischer, Doerte
    Meyer, Bernd T.
    Doclo, Simon
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 191 - 195
  • [27] An Investigation of DNN-Based Speech Synthesis Using Speaker Codes
    Hojo, Nobukatsu
    Ijima, Yusuke
    Mizuno, Hideyuki
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2278 - 2282
  • [28] DNN-based Speech Synthesis Using Abundant Tags of Spontaneous Speech Corpus
    Yamashita, Yuki
    Koriyama, Tomoki
    Saito, Yuki
    Takamichi, Shinnosuke
    Ijima, Yusuke
    Masumura, Ryo
    Saruwatari, Hiroshi
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 6438 - 6443
  • [29] SPATIAL DIFFUSENESS FEATURES FOR DNN-BASED SPEECH RECOGNITION IN NOISY AND REVERBERANT ENVIRONMENTS
    Schwarz, Andreas
    Huemmer, Christian
    Maas, Roland
    Kellermann, Walter
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4380 - 4384
  • [30] A DNN-Based Accurate Masking Using Significant Feature Sets
    Sivapatham, Shoba
    Goel, Pankaj
    Burra, Srikanth
    Sooraksa, Pitikhate
    Kar, Asutosh
    2022 20TH INTERNATIONAL CONFERENCE ON ICT AND KNOWLEDGE ENGINEERING (ICT&KE), 2022, : 11 - 16