A Mask Estimation Method Integrating Data Field Model for Speech Enhancement

Cited by: 0
Authors
Wang, Xianyun [1 ]
Bao, Changchun [1 ]
Bao, Feng [2 ]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing 100124, Peoples R China
[2] Univ Auckland, Dept Elect & Comp Engn, Auckland 1142, New Zealand
Funding
National Natural Science Foundation of China;
Keywords
Data field; CASA; Ratio mask; Speech enhancement; NOISE;
DOI
10.21437/Interspeech.2017-271
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Most approaches based on computational auditory scene analysis (CASA) use the ideal binary mask (IBM) as the target for noise reduction. In practice, however, the IBM can almost never be obtained exactly, and errors in its estimate can severely violate the smooth evolution of speech, because energy is removed from many speech-dominated time-frequency (T-F) units. To reduce this error, the ideal ratio mask (IRM), estimated by modeling the spatial dependencies of the speech spectrum, is used as the target mask instead, because a predicted ratio mask is less sensitive to estimation error than a predicted binary mask. In this paper, we introduce a data field (DF) to model the spatial dependencies of the cochleagram and obtain the ratio mask. First, initial T-F units of noise and speech are obtained from the noisy speech. Next, the forms of the noise and speech potentials are calculated. The optimal potentials, which reflect the respective potential-field distributions, are then obtained using the optimal influence factors of speech and noise. Finally, the speech and noise potentials are used to obtain the ratio mask. Experimental results show that the proposed method achieves better speech quality than the reference methods.
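To make the distinction between the two target masks concrete, the following is a minimal sketch of the commonly used textbook definitions of the IBM and IRM, computed from per-unit speech and noise energies. It is not the data-field method of the paper. The function names, the local criterion `lc_db`, the exponent `beta`, and the toy energy values are all assumptions chosen for illustration.

```python
import numpy as np

def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """Return 1.0 where the local SNR exceeds lc_db (assumed threshold), else 0.0."""
    eps = np.finfo(float).eps  # avoids division by zero and log of zero
    snr_db = 10.0 * np.log10((speech_energy + eps) / (noise_energy + eps))
    return (snr_db > lc_db).astype(float)

def ideal_ratio_mask(speech_energy, noise_energy, beta=0.5):
    """Return the soft mask (S / (S + N)) ** beta, with values in [0, 1]."""
    eps = np.finfo(float).eps
    return (speech_energy / (speech_energy + noise_energy + eps)) ** beta

# Toy 2x2 "cochleagram" energies (hypothetical values, not from the paper).
speech = np.array([[4.0, 1.0], [0.0, 9.0]])
noise = np.array([[1.0, 1.0], [2.0, 1.0]])

ibm = ideal_binary_mask(speech, noise)
irm = ideal_ratio_mask(speech, noise, beta=1.0)
print(ibm)
print(irm)
```

In this toy case the unit with equal speech and noise energy is zeroed entirely by the binary mask but keeps half its energy under the ratio mask, which illustrates why a misclassified unit costs more with a binary decision than with a soft one.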
Pages: 1904-1908
Number of pages: 5
Related papers
50 in total
  • [1] IRM estimation based on data field of cochleagram for speech enhancement
    Wang Xianyun
    Feng Bao
    Bao Changchun
    [J]. SPEECH COMMUNICATION, 2018, 97 : 19 - 31
  • [2] Auditory Mask Estimation by RPCA for Monaural Speech Enhancement
    Shi, Wenhua
    Zhang, Xiongwei
    Zou, Xia
    Han, Wei
    Min, Gang
    [J]. 2017 16TH IEEE/ACIS INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE (ICIS 2017), 2017, : 179 - 184
  • [3] Mask estimation method in the spherical harmonic domain used by adaptive beamforming for speech enhancement
    Ke, Yuxuan
    Li, Jian
    Peng, Renhua
    Zheng, Chengshi
    Li, Xiaodong
    [J]. Shengxue Xuebao/Acta Acustica, 2021, 46 (01): : 67 - 80
  • [4] Integrating Binary Mask Estimation With MRF Priors of Cochleagram for Speech Separation
    Liang, Shan
    Liu, Wenju
    Jiang, Wei
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2012, 19 (10) : 627 - 630
  • [5] Eigenvector-Based Speech Mask Estimation for Multi-Channel Speech Enhancement
    Pfeifenberger, Lukas
    Zoehrer, Matthias
    Pernkopf, Franz
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (12) : 2162 - 2172
  • [6] Speech Enhancement Method with Geometric Phase Estimation By Incorporating MIXMAX Model
    Wang, Xianyun
    Bao, Changchun
    [J]. 2016 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2016,
  • [7] Mask estimation incorporating phase-sensitive information for speech enhancement
    Wang, Xianyun
    Bao, Changchun
    [J]. APPLIED ACOUSTICS, 2019, 156 : 101 - 112
  • [8] A Dual Microphone Speech Enhancement Method with A Smoothing Parameter Mask
    Jiang, Yi
    Liu, Runsheng
    [J]. 2017 10TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, BIOMEDICAL ENGINEERING AND INFORMATICS (CISP-BMEI), 2017,
  • [9] Method for Spectral Enhancement by Binary Mask for Speech Recognition Enhancement Under Noise Environment
    Choi, Gab-Keun
    Kim, Soon-Hyob
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, 2010, 29 (07): : 468 - 474
  • [10] Variance based time-frequency mask estimation for unsupervised speech enhancement
    Nasir Saleem
    Muhammad Irfan Khattak
    Gunawan Witjaksono
    Gulzar Ahmad
    [J]. Multimedia Tools and Applications, 2019, 78 : 31867 - 31891