A CONVEX OPTIMIZATION APPROACH FOR TIME-FREQUENCY MASK ESTIMATION

Cited by: 0

Authors:

Bao, Feng [1]

Abdulla, Waleed H. [1]

Affiliations:

[1] Univ Auckland, Elect & Comp Engn Dept, 20 Symond St, Auckland 1010, New Zealand

Source:

2017 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA) | 2017

Keywords:

Computational auditory scene analysis (CASA); Ideal binary mask (IBM); Convex optimization; Speech enhancement; SPEECH; NOISE; ENHANCEMENT;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

In this paper, we propose a new time-frequency mask estimation method for computational auditory scene analysis (CASA) based on convex optimization of the binary mask. In the proposed method, the pitch estimation and segment segregation steps of conventional CASA are completely replaced by convex optimization of the speech power. The objective function of the speech power used for the convex optimization is built from the cross-correlation between the power spectra of noisy speech and noise in each channel of a Gammatone filterbank. The speech power is then estimated by the gradient descent method. The time-frequency units dominated by speech and by noise are labeled by comparing the powers of the noisy speech, the estimated speech, and the noise. Erroneous local masks are removed by using the Teager energy of the estimated speech and by time-frequency unit smoothing. Results for the average segmental signal-to-noise ratio improvement, the HIT-False Alarm rate, and a subjective test show that the proposed method outperforms the reference methods.
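The labeling and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the objective function, so the convex optimization itself is not reproduced here; the sketch assumes the speech and noise power estimates are already available. All function names and the `lc_db` threshold are hypothetical, the mask rule is the common local-SNR criterion for binary masks rather than the authors' exact comparison, and the Teager energy uses the standard discrete operator x[n]^2 - x[n-1]x[n+1].

```python
import numpy as np


def binary_mask(speech_power, noise_power, lc_db=0.0):
    """Label a time-frequency unit 1 when its local SNR exceeds lc_db (assumed rule)."""
    eps = 1e-12  # avoids division by zero and log of zero
    speech_power = np.asarray(speech_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    snr_db = 10.0 * np.log10((speech_power + eps) / (noise_power + eps))
    return (snr_db > lc_db).astype(int)


def teager_energy(x):
    """Standard discrete Teager energy operator; the two edge samples are set to 0."""
    x = np.asarray(x, dtype=float)
    psi = np.zeros_like(x)
    psi[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    return psi


def hit_minus_fa(estimated, ideal):
    """HIT rate minus false-alarm rate of an estimated mask against the ideal mask."""
    estimated = np.asarray(estimated).astype(bool)
    ideal = np.asarray(ideal).astype(bool)
    hit = (estimated & ideal).sum() / max(ideal.sum(), 1)
    fa = (estimated & ~ideal).sum() / max((~ideal).sum(), 1)
    return hit - fa


# Toy values for illustration only; they are not data from the paper.
speech = [[4.0, 1.0], [0.5, 9.0]]
noise = [[1.0, 2.0], [1.0, 1.0]]
mask = binary_mask(speech, noise)
score = hit_minus_fa(mask, [[1, 0], [0, 1]])
```

In this sketch a unit is kept when the estimated speech power exceeds the noise power (0 dB threshold). The method in the paper additionally uses the noisy-speech power, Teager energy, and smoothing to remove erroneous local masks, which the sketch does not attempt.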


Pages: 31-35

Page count: 5
