DNN-based Feature Transformation for Speech Recognition Using Throat Microphone

Cited by: 0

Authors:

Lin, Shengke [1]

Tsunakawa, Takashi [1]

Nishida, Masafumi [1]

Nishimura, Masafumi [1]

Affiliations:

[1] Shizuoka Univ, Grad Sch Integrated Sci & Technol, Shizuoka, Japan

Source:

2017 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC 2017) | 2017

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

In this paper, we focus on using a throat microphone as a noise-robust input device, because its signal is much less affected by surrounding noise than that of a conventional acoustic microphone. However, a throat microphone can record only narrow frequency bands, and its characteristics differ from those of an acoustic microphone. Speech recognition performance therefore degrades greatly when a throat microphone is simply substituted for a conventional acoustic microphone. To overcome this problem, we propose a deep neural network (DNN)-based feature transformation method combined with model adaptation. In a continuous digit recognition experiment, the proposed method reduced the word error rate (WER) for the throat microphone from 41.4% to 17.6%.
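For readers unfamiliar with the metric, the sketch below shows how a word error rate is conventionally computed (word-level edit distance divided by the number of reference words) and what the reported change from 41.4% to 17.6% amounts to in relative terms. The function names and the digit strings are illustrative assumptions and are not taken from the paper, which may have used a different scoring tool.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the processed reference prefix and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction of the baseline error removed by the improved system."""
    return (baseline - improved) / baseline


if __name__ == "__main__":
    # Hypothetical digit strings, for illustration only.
    print(word_error_rate("one two three four", "one too three"))
    # WER figures as reported in the abstract.
    print(round(relative_reduction(41.4, 17.6), 3))
```

Under this definition, the change reported in the abstract corresponds to a relative WER reduction of roughly 57%.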

Pages: 596-599

Number of pages: 4
