DNN-based Feature Transformation for Speech Recognition Using Throat Microphone

Authors
Lin, Shengke [1]
Tsunakawa, Takashi [1]
Nishida, Masafumi [1]
Nishimura, Masafumi [1]
Affiliations
[1] Shizuoka Univ, Grad Sch Integrated Sci & Technol, Shizuoka, Japan
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
In this paper, we focus on using a throat microphone as a noise-robust device, because its signal is much less affected by surrounding noise than that of a conventional acoustic microphone. However, a throat microphone records only a narrow frequency band, and its characteristics differ from those of an acoustic microphone. As a result, speech recognition performance degrades greatly when a throat microphone is simply substituted for a conventional acoustic microphone. To overcome this problem, we propose a deep neural network (DNN)-based feature transformation method combined with model adaptation. In a continuous digit recognition experiment, the proposed method reduced the word error rate (WER) with the throat microphone from 41.4% to 17.6%.
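The abstract reports its result as a word error rate. As a rough illustration only, the sketch below computes WER with the standard word-level edit distance and the relative reduction implied by the two reported figures. The function names and the example digit strings are hypothetical and not taken from the paper, and the authors' actual scoring tool is not stated in this record.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction of the baseline error removed by the improved system."""
    return (baseline - improved) / baseline


# Hypothetical digit-string example: one substitution and one deletion in five words.
example_wer = word_error_rate("one two three four five", "one two tree four")

# WER figures reported in the abstract, in percent.
reduction = relative_reduction(41.4, 17.6)
print(f"example WER: {example_wer:.2f}")
print(f"relative WER reduction: {reduction:.1%}")
```

Under these definitions, the reported change from 41.4% to 17.6% is an absolute drop of 23.8 percentage points, which is a relative reduction of roughly 57%.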
Pages: 596-599
Page count: 4
Related papers
50 in total
  • [1] Bottleneck feature-mediated DNN-based feature mapping for throat microphone speech recognition
    Suzuki, Takahito
    Ogata, Jun
    Tsunakawa, Takashi
    Nishida, Masafumi
    Nishimura, Masafumi
    2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 1738 - 1741
  • [2] DNN-BASED SPEECH RECOGNITION FOR GLOBALPHONE LANGUAGES
    Tachbelie, Martha Yifiru
    Abulimiti, Ayimunishagu
    Abate, Solomon Teferra
    Schultz, Tanja
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 8269 - 8273
  • [3] DNN-based Feature Enhancement using Joint Training Framework for Robust Multichannel Speech Recognition
    Lee, Kang Hyun
    Kang, Tae Gyoon
    Kang, Woo Hyun
    Kim, Nam Soo
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3027 - 3031
  • [4] DNN-Based Feature Enhancement Using DOA-Constrained ICA for Robust Speech Recognition
    Lee, Ho-Yong
    Cho, Ji-Won
    Kim, Minook
    Park, Hyung-Min
    IEEE SIGNAL PROCESSING LETTERS, 2016, 23 (08) : 1091 - 1095
  • [5] DNN-based feature enhancement using joint training framework for robust multichannel speech recognition
    Lee, Kang Hyun
    Kang, Tae Gyoon
    Kang, Woo Hyun
    Kim, Nam Soo
    Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2016, 08-12-September-2016 : 3027 - 3031
  • [6] Throat Microphone Speech Recognition using MFCC
    Vijayan, Amritha
    Mathai, Bipil Mary
    Valsalan, Karthik
    Johnson, Riyanka Raji
    Mathew, Lani Rachel
    Gopakumar, K.
    2017 INTERNATIONAL CONFERENCE ON NETWORKS & ADVANCES IN COMPUTATIONAL TECHNOLOGIES (NETACT), 2017, : 392 - 395
  • [7] DNN-Based Acoustic Modeling for Russian Speech Recognition Using Kaldi
    Kipyatkova, Irina
    Karpov, Alexey
    SPEECH AND COMPUTER, 2016, 9811 : 246 - 253
  • [8] DNN-Based Semantic Rescoring Models for Speech Recognition
    Illina, Irina
    Fohr, Dominique
    TEXT, SPEECH, AND DIALOGUE, TSD 2021, 2021, 12848 : 357 - 370
  • [9] DNN-BASED DISTRIBUTED MULTICHANNEL MASK ESTIMATION FOR SPEECH ENHANCEMENT IN MICROPHONE ARRAYS
    Furnon, Nicolas
    Serizel, Romain
    Illina, Irina
    Essid, Slim
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 4672 - 4676
  • [10] ON USING HETEROGENEOUS DATA FOR VEHICLE-BASED SPEECH RECOGNITION: A DNN-BASED APPROACH
    Feng, Xue
    Richardson, Brigitte
    Amman, Scott
    Glass, James
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4385 - 4389