Gamma modeling of speech power and its on-line estimation for statistical speech enhancement

Cited by: 5

Authors:

Dat, TH [1]

Takeda, K

Itakura, F

Affiliations:

[1] Nagoya Univ, Grad Sch Informat Sci, Nagoya, Aichi 4648603, Japan

[2] Meijo Univ, Grad Sch Informat Engn, Nagoya, Aichi 4688502, Japan

Source:

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS | 2006 / Vol. E89D / No. 03

Keywords:

speech enhancement; speech recognition; gamma modeling; fourth-order moment; MMSE; MAP; spectral magnitude; power; log-spectral magnitude;

DOI:

10.1093/ietisy/e89-d.3.1040

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

This study shows the effectiveness of using a gamma distribution in the speech power domain as a more general prior distribution for model-based speech enhancement approaches. This model is a superset of the conventional Gaussian model of the complex spectrum and provides more accurate prior modeling when the optimal parameters are estimated. We develop a method to adapt the modeled distribution parameters from each actual noisy speech signal in a frame-by-frame manner. Next, we derive and investigate minimum mean square error (MMSE) and maximum a posteriori probability (MAP) estimation in different domains, namely the speech spectral magnitude, the generalized power, and its logarithm, using the proposed gamma modeling. Finally, a comparative evaluation of the MAP and MMSE filters is conducted. Whereas MMSE estimation tends to become more complicated with more general prior distributions, MAP estimation is given in closed-form expressions and is therefore suitable for implementation. The adaptive estimation of the modeled distribution parameters provides more accurate prior modeling; this is the principal merit of the proposed method and the reason for its better performance. Based on the experiments, MAP estimation is recommended for its high efficiency and low complexity. Among the MAP-based systems, estimation in the log-magnitude domain is shown to be the best for speech recognition, whereas estimation in the power domain is superior for noise reduction.

Pages: 1040 - 1049

Number of pages: 10
