A Dual-Channel Three-Stage Model for DoA and Speech Enhancement

Citations: 0

Authors:

Wu, Meng-Hsuan [1,2]

Shen, Yih-Liang [1]

Chou, Hsuan-Cheng [1]

Shih, Bo-Wun [2]

Chi, Tai-Shih [1]

Affiliations:

[1] Natl Yang Ming Chiao Tung Univ, Dept Elect & Elect Engn, Hsinchu, Taiwan

[2] Realtek Semicond Corp, Hsinchu, Taiwan

Source:

2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC | 2023

Keywords:

TIME-FREQUENCY MASKING; DEEP NEURAL-NETWORKS; ALGORITHM;

DOI:

10.1109/APSIPAASC58517.2023.10317282

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

During the pandemic, teleconferencing became a necessity in daily life. This drives demand for an integrated system that can not only enhance speech effectively but also localize the speaker for video enhancement. In this paper, we propose a neural-network-based composite system that integrates a direction-of-arrival (DoA) estimator and a neural beamformer for dual-channel speech enhancement. The proposed system accomplishes both tasks at the same time using the sound signals received by two microphones. The estimated DoA is converted into a spatial-angle-related feature, which provides complementary information to boost the performance of the neural beamformer. The proposed system is evaluated in simulated far-field conditions with reverberation and noise. Simulation results demonstrate that the proposed system outperforms stand-alone baseline systems on each of the two tasks and achieves results comparable to the best stand-alone models on each task.
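The abstract does not say what form the spatial-angle-related feature takes. As an illustration only, the sketch below uses one common choice for two-microphone arrays: the cosine of the difference between the observed inter-channel phase difference and the phase difference expected for a far-field source at the estimated DoA. The function names, the 343 m/s speed of sound, and the microphone spacing are assumptions made for this example and are not taken from the paper.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for illustration


def expected_phase_difference(doa_deg, freq_hz, mic_distance_m):
    """Phase difference (radians) between two mics for a far-field plane wave.

    doa_deg is measured from the array axis; 90 degrees is broadside.
    """
    delay = mic_distance_m * math.cos(math.radians(doa_deg)) / SPEED_OF_SOUND
    return 2.0 * math.pi * freq_hz * delay


def angle_feature(x1, x2, doa_deg, freqs_hz, mic_distance_m):
    """Per-bin cosine similarity between observed and expected phase difference.

    x1, x2: complex STFT values of one frame at each mic (hypothetical inputs).
    Values near 1 mean the bin is consistent with a source at doa_deg.
    """
    feats = []
    for a, b, f in zip(x1, x2, freqs_hz):
        observed = cmath.phase(a * b.conjugate())
        expected = expected_phase_difference(doa_deg, f, mic_distance_m)
        feats.append(math.cos(observed - expected))
    return feats


def simulate_frame(doa_deg, freqs_hz, mic_distance_m):
    """Noise-free toy frame: mic 2 receives a delayed copy of mic 1."""
    x1 = [1.0 + 0.0j for _ in freqs_hz]
    x2 = [
        cmath.exp(-1j * expected_phase_difference(doa_deg, f, mic_distance_m))
        for f in freqs_hz
    ]
    return x1, x2


if __name__ == "__main__":
    freqs = [500.0, 1000.0, 2000.0]
    x1, x2 = simulate_frame(60.0, freqs, 0.05)
    print(angle_feature(x1, x2, 60.0, freqs, 0.05))   # matched DoA
    print(angle_feature(x1, x2, 120.0, freqs, 0.05))  # mismatched DoA
```

In this toy setup the feature is close to 1 in every bin when the steering angle matches the simulated source direction and drops when it does not, which is the kind of directional cue a neural beamformer could use as an extra input.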

Pages: 1064-1068

Page count: 5
