DNN-based Feature Transformation for Speech Recognition Using Throat Microphone

Cited by: 0

Authors:

Lin, Shengke [1]

Tsunakawa, Takashi [1]

Nishida, Masafumi [1]

Nishimura, Masafumi [1]

Affiliations:

[1] Shizuoka University, Graduate School of Integrated Science and Technology, Shizuoka, Japan

Source:

2017 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC 2017) | 2017

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP301 [Theory and methods];

Discipline classification code:

081202;

Abstract:

In this paper, we focus on using a throat microphone as a noise-robust input device, because its signal is much less affected by surrounding noise than that of a conventional acoustic microphone. However, a throat microphone can record only narrow frequency bands, and its characteristics differ from those of an acoustic microphone. Speech recognition performance therefore degrades greatly when a throat microphone is simply substituted for a conventional acoustic microphone. To overcome this problem, we propose a deep neural network (DNN)-based feature transformation method combined with model adaptation. In a continuous digit recognition experiment, the proposed method reduced the word error rate (WER) for throat microphone speech from 41.4% to 17.6%.
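To put the reported figures in context, the sketch below shows how WER is conventionally computed (word-level edit distance divided by the number of reference words) and what the change from 41.4% to 17.6% means as a relative reduction. The function names and the example digit strings are illustrative assumptions, not taken from the paper, and the scoring tool the authors actually used is not stated in the abstract.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Conventional WER: (substitutions + deletions + insertions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction of the baseline error rate removed by the improved system."""
    return (baseline - improved) / baseline


# Hypothetical digit-string example: one substitution and one deletion over 4 words.
example_wer = word_error_rate("one two three four", "one too three")
print(f"example WER: {example_wer:.1%}")  # prints "example WER: 50.0%"

# WER figures reported in the abstract (in percent).
rel = relative_reduction(41.4, 17.6)
print(f"relative WER reduction: {rel:.1%}")  # prints "relative WER reduction: 57.5%"
```

Under these definitions, the reported improvement is an absolute reduction of 23.8 percentage points, or roughly 57.5% relative to the baseline.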

Citation

Pages: 596-599

Number of pages: 4
