Enhancement and Noise Statistics Estimation for Non-Stationary Voiced Speech

Cited by: 8

Authors:

Norholm, Sidsel Marie [1]

Jensen, Jesper Rindom [1]

Christensen, Mads Graesboll [1]

Affiliation:

[1] Aalborg Univ, AD MT, Audio Anal Lab, Dept Architecture Design & Media Technol, DK-9000 Aalborg, Denmark

Source:

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2016, Vol. 24, No. 04

Keywords:

Chirp model; harmonic signal model; non-stationary speech; speech enhancement; ALGORITHM; SIGNALS;

DOI:

10.1109/TASLP.2016.2514492

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

In this paper, single-channel speech enhancement in the time domain is considered. We address the problem of modelling non-stationary speech by describing the voiced speech parts with a harmonic linear chirp model instead of the traditional harmonic model. The speech signal is therefore not assumed to be stationary; instead, the fundamental frequency can vary linearly within each frame. The linearly constrained minimum variance (LCMV) filter and the amplitude and phase estimation (APES) filter are derived in this framework and compared to the harmonic versions of the same filters. Simulations on synthetic and speech signals show that the chirp versions of the filters perform better than their harmonic counterparts in terms of output signal-to-noise ratio (SNR) and signal reduction factor. For synthetic signals, the output SNR of the harmonic chirp APES based filter is 3 dB higher than that of the harmonic APES based filter at an input SNR of 10 dB, and at the same time the signal reduction factor is decreased. For speech signals, the increase is 1.5 dB, along with a decrease in the signal reduction factor of 0.7. As an implicit part of the APES filter, a noise covariance matrix estimate is obtained. We suggest using this estimate in combination with other filters such as the Wiener filter. The performance of the Wiener filter and the LCMV filter is compared using the APES noise covariance matrix estimate and a power spectral density (PSD) based noise covariance matrix estimate. It is shown that the APES covariance matrix works well in combination with the Wiener filter, and that the PSD based covariance matrix works well in combination with the LCMV filter.
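To make the abstract's two central ideas concrete, the following is a minimal sketch, not the authors' implementation. It assumes a common complex form of the harmonic linear chirp model, in which harmonic l has phase l(ω0·n + k·n²/2), so that the fundamental frequency ω0 + k·n varies linearly within the frame, and it uses the textbook LCMV solution h = R⁻¹Z(ZᴴR⁻¹Z)⁻¹1. The function names, the parameter values, and the white-noise covariance are illustrative assumptions and do not come from the paper.

```python
import numpy as np


def harmonic_chirp_basis(omega0, k, num_harmonics, frame_len):
    """Columns are the harmonics exp(j*l*(omega0*n + 0.5*k*n**2)), l = 1..L.

    Assumed model: k is the chirp rate of the fundamental, so k = 0 gives
    the traditional (stationary) harmonic model.
    """
    n = np.arange(frame_len)
    harmonics = np.arange(1, num_harmonics + 1)
    phase = np.outer(omega0 * n + 0.5 * k * n**2, harmonics)
    return np.exp(1j * phase)


def lcmv_filter(R, Z):
    """Textbook LCMV filter: minimise h^H R h subject to Z^H h = 1."""
    Rinv_Z = np.linalg.solve(R, Z)            # R^{-1} Z
    gram = Z.conj().T @ Rinv_Z                # Z^H R^{-1} Z
    ones = np.ones(Z.shape[1], dtype=complex)
    return Rinv_Z @ np.linalg.solve(gram, ones)


# Illustrative values only (hypothetical, not taken from the paper).
frame_len, num_harmonics = 20, 3
Z = harmonic_chirp_basis(omega0=0.3, k=1e-3,
                         num_harmonics=num_harmonics, frame_len=frame_len)
alpha = np.array([1.0, 0.5, 0.25], dtype=complex)   # harmonic amplitudes
clean = Z @ alpha                                    # noise-free frame
R = 0.5 * np.eye(frame_len)                          # assumed white-noise covariance
h = lcmv_filter(R, Z)
filtered = np.vdot(h, clean)                         # h^H applied to the clean frame
```

Because the constraints force every chirp harmonic through with unit gain, applying `h` to the noise-free frame returns the sum of the amplitudes in `alpha`; the filter's remaining degrees of freedom go toward minimising the output noise power for the given `R`.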


Pages: 645-658

Page count: 14

