A Dual-Channel Three-Stage Model for DoA and Speech Enhancement

Cited by: 0
Authors
Wu, Meng-Hsuan [1, 2]
Shen, Yih-Liang [1]
Chou, Hsuan-Cheng [1]
Shih, Bo-Wun [2]
Chi, Tai-Shih [1]
Affiliations
[1] Natl Yang Ming Chiao Tung Univ, Dept Elect & Elect Engn, Hsinchu, Taiwan
[2] Realtek Semicond Corp, Hsinchu, Taiwan
Keywords
TIME-FREQUENCY MASKING; DEEP NEURAL-NETWORKS; ALGORITHM;
DOI
10.1109/APSIPAASC58517.2023.10317282
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
During the pandemic, teleconferencing has become a necessity in our daily lives. This drives the demand for an integrated system that can not only enhance speech effectively but also localize the speaker for video enhancement. In this paper, we propose a neural-network-based composite system that integrates a direction-of-arrival (DoA) estimator and a neural beamformer for dual-channel speech enhancement. The proposed system accomplishes both tasks at the same time using sound signals received from two microphones. The estimated DoA is converted into a spatial-angle-related feature, which provides complementary information that boosts the performance of the neural beamformer. The proposed system is evaluated in simulated far-field conditions with reverberation and noise. Simulation results demonstrate that the proposed system outperforms stand-alone baseline systems on either of the two tasks and achieves results comparable to the best stand-alone models on either task.
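The abstract does not say how the estimated DoA is turned into a spatial-angle-related feature. As an illustration only, the sketch below uses one formulation that is common in the multi-channel enhancement literature: the cosine of the mismatch between the observed inter-microphone phase difference and the phase difference expected for the estimated DoA. The function names, the two-microphone far-field geometry, the sign convention, and the speed-of-sound constant are all assumptions made for this example and are not taken from the paper.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for room-temperature air


def target_phase_difference(doa_deg, freq_hz, mic_distance_m):
    """Expected phase difference (radians) between two microphones for a
    far-field plane wave arriving from doa_deg, measured from the array
    axis pointing from microphone 2 towards microphone 1 (hypothetical
    convention chosen for this sketch)."""
    delay_s = mic_distance_m * math.cos(math.radians(doa_deg)) / SPEED_OF_SOUND
    return 2.0 * math.pi * freq_hz * delay_s


def angle_feature(x1, x2, doa_deg, freq_hz, mic_distance_m):
    """Cosine of the mismatch between the observed inter-channel phase
    difference and the one expected for doa_deg.

    x1, x2 are complex STFT coefficients of the two channels at a single
    time-frequency bin. Values near 1 suggest the bin is dominated by
    sound arriving from the given direction."""
    observed = cmath.phase(x1 * x2.conjugate())
    expected = target_phase_difference(doa_deg, freq_hz, mic_distance_m)
    return math.cos(observed - expected)


# Simulate one bin of a source at 60 degrees: channel 2 is a delayed copy
# of channel 1, so its phase lags by the expected phase difference.
freq_hz, mic_distance_m = 1000.0, 0.05
x1 = complex(1.0, 0.5)
lag = target_phase_difference(60.0, freq_hz, mic_distance_m)
x2 = x1 * cmath.exp(-1j * lag)

matched = angle_feature(x1, x2, 60.0, freq_hz, mic_distance_m)
mismatched = angle_feature(x1, x2, 120.0, freq_hz, mic_distance_m)
```

In this toy setup the feature is largest when the assumed direction matches the simulated source, which is the kind of complementary cue a downstream beamformer could consume alongside the spectral input.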
Pages: 1064-1068
Number of pages: 5
Related papers
50 in total
  • [1] Dual-channel speech enhancement by superdirective beamforming
    Lotter, Thomas
    Vary, Peter
    [J]. EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING, 2006, 2006 (1): 1-14
  • [2] Dual-Channel Speech Enhancement by Superdirective Beamforming
    Thomas Lotter
    Peter Vary
    [J]. EURASIP Journal on Advances in Signal Processing, 2006
  • [3] Three-stage hybrid neural beamformer for multi-channel speech enhancement
    Kuang, Kelan
    Yang, Feiran
    Li, Junfeng
    Yang, Jun
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2023, 153 (06): 3378-3389
  • [4] A Dual-channel Speech Enhancement Method for Cellular Communication
    Nabi, Wahbi
    Aloui, Noureddine
    Cherif, Adnane
    [J]. 2018 4TH INTERNATIONAL CONFERENCE ON ADVANCED TECHNOLOGIES FOR SIGNAL AND IMAGE PROCESSING (ATSIP), 2018,
  • [5] Dual-channel speech intelligibility enhancement based on the psychoacoustics
    Lee, Sang-Hoon
    Jeong, Hong
    [J]. LECTURE NOTES IN SIGNAL SCIENCE, INTERNET AND EDUCATION (SSIP'07/MIV'07/DIWEB'07), 2007: 83-+
  • [6] Three-stage hybrid system for speech signal enhancement
    Kirubagari, B.
    Palanivel, S.
    [J]. INTERNATIONAL JOURNAL OF SIGNAL AND IMAGING SYSTEMS ENGINEERING, 2015, 8 (1-2): 123-136
  • [7] Improved Particle Swarm Optimization for Dual-Channel Speech Enhancement
    Asl, Laleh Badri
    Nezhad, Vahid Majid
    [J]. 2010 INTERNATIONAL CONFERENCE ON SIGNAL ACQUISITION AND PROCESSING: ICSAP 2010, PROCEEDINGS, 2010: 13-17
  • [8] Dual-channel DNN-based Speech Enhancement for Smartphones
    Martin-Donas, Juan M.
    Gomez, Angel M.
    Lopez-Espejo, Ivan
    Peinado, Antonio M.
    [J]. 2017 IEEE 19TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP), 2017,
  • [9] Dual-Channel Speech Enhancement Using Neural Network Adaptive Beamforming
    Jiang, Tao
    Liu, Hongqing
    Shuai, Chenhao
    Wang, Mingtian
    Zhou, Yi
    Gan, Lu
    [J]. COMMUNICATIONS AND NETWORKING (CHINACOM 2021), 2022: 497-506
  • [10] DUAL-CHANNEL ITERATIVE SPEECH ENHANCEMENT WITH CONSTRAINTS ON AN AUDITORY-BASED SPECTRUM
    NANDKUMAR, S
    HANSEN, JHL
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1995, 3 (01): 22-34