An evaluation of adaptive beamformer based on average speech spectrum for noisy speech recognition

Cited by: 0

Authors:

Nishiura, T [1]

Nakayama, M [1]

Nakamura, S [1]

Affiliations:

[1] ATR Spoken Language Translat Res Labs, Kyoto 6190288, Japan

Source:

2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I | 2003

Keywords:

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Distant-talking speech recognition in noisy environments is indispensable for self-moving robots and tele-conference systems. However, background noise and room reverberation seriously degrade sound-capture quality in real acoustic environments. A microphone array is an effective means of capturing distant-talking speech. Kaneda et al. proposed AMNOR (Adaptive Microphone-array for NOise Reduction) as an adaptive beamformer for capturing desired distant signals in noisy environments. Although AMNOR has been proven effective, it can be further improved if the spectral characteristics of the desired distant signal are known in advance. We therefore regarded speech as the desired distant signal and designed an AMNOR based on the average speech spectrum. This paper focuses on the performance of AMNOR based on the average speech spectrum for distant-talking speech capture and recognition. Evaluation experiments in real acoustic environments confirmed that using an AMNOR based on the average speech spectrum improved ASR (Automatic Speech Recognition) performance by 5-10% in noisy environments. In addition, the proposed AMNOR provides better noise reduction performance than the conventional AMNOR.

Pages: 668 - 671

Page count: 4

[1] An evaluation of adaptive beamformer based on average speech spectrum for noisy speech recognition
Nishiura, T
Nakayama, M
Nakamura, S
[J]. 2003 INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOL III, PROCEEDINGS, 2003, : 209 - 212
[2] Noisy speech recognition based on speech enhancement
Wang, Xia
Tang, Hongmei
Zhao, Xiaoqun
[J]. SNPD 2007: EIGHTH ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING, AND PARALLEL/DISTRIBUTED COMPUTING, VOL 3, PROCEEDINGS, 2007, : 713 - +
[3] SPEECH RECOGNITION WITH NO SPEECH OR WITH NOISY SPEECH
Krishna, Gautam
Tran, Co
Yu, Jianguo
Tewfik, Ahmed H.
[J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 1090 - 1094
[4] Effect of Steering Vector Estimation on MVDR Beamformer for Noisy Speech Recognition
Sun, Xingwei
Wang, Ziteng
Xia, Risheng
Li, Junfeng
Yan, Yonghong
[J]. 2018 IEEE 23RD INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING (DSP), 2018,
[5] Advancing Speech Recognition With No Speech Or With Noisy Speech
Krishna, Gautam
Tran, Co
Carnahan, Mason
Tewfik, Ahmed
[J]. 2019 27TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2019,
[6] Noise Reduction and Evaluation of Speech Recognition Performance by the Adaptive Beamformer with Four Microphone for Wearable Speech Translation Device
Hayashida, Kohhei
Nishikawa, Tsuyoki
Nomura, Kazuya
Kanamori, Takeo
Aoyama, Takanori
[J]. 2017 IEEE 6TH GLOBAL CONFERENCE ON CONSUMER ELECTRONICS (GCCE), 2017,
[7] Noise reduction and evaluation of speech recognition performance by the adaptive beamformer with four microphone for wearable speech translation device
[J]. 2017, Institute of Electrical and Electronics Engineers Inc., United States (2017-January):
[8] A PROGRESSIVE LEARNING APPROACH TO ADAPTIVE NOISE AND SPEECH ESTIMATION FOR SPEECH ENHANCEMENT AND NOISY SPEECH RECOGNITION
Nian, Zhaoxu
Tu, Yan-Hui
Du, Jun
Lee, Chin-Hui
[J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6913 - 6917
[9] Near-field adaptive beamformer for robust speech recognition
McCowan, IA
Moore, DC
Sridharan, S
[J]. DIGITAL SIGNAL PROCESSING, 2002, 12 (01) : 87 - 106
[10] Perceptual speech modeling for noisy speech recognition
Wu, CH
Chiu, YH
Lim, H
[J]. 2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 385 - 388
