Gamma modeling of speech power and its on-line estimation for statistical speech enhancement

Cited by: 5
Authors
Dat, TH [1]
Takeda, K
Itakura, F
Affiliations
[1] Nagoya Univ, Grad Sch Informat Sci, Nagoya, Aichi 4648603, Japan
[2] Meijo Univ, Grad Sch Informat Engn, Nagoya, Aichi 4688502, Japan
Source
Keywords
speech enhancement; speech recognition; gamma modeling; fourth-order moment; MMSE; MAP; spectral magnitude; power; log-spectral magnitude
DOI
10.1093/ietisy/e89-d.3.1040
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This study shows the effectiveness of using a gamma distribution in the speech power domain as a more general prior distribution for model-based speech enhancement approaches. This model is a superset of the conventional Gaussian model of the complex spectrum and provides more accurate prior modeling when the optimal parameters are estimated. We develop a method to adapt the modeled distribution parameters from each actual noisy speech signal in a frame-by-frame manner. Next, we derive and investigate the minimum mean square error (MMSE) and maximum a posteriori probability (MAP) estimators in different domains, namely the speech spectral magnitude, the generalized power, and its logarithm, using the proposed gamma modeling. Finally, a comparative evaluation of the MAP and MMSE filters is conducted. Whereas the MMSE estimators tend to become more complicated under more general prior distributions, the MAP estimators are given in closed form and are therefore suitable for implementation. The adaptive estimation of the modeled distribution parameters provides more accurate prior modeling; this is the principal merit of the proposed method and the reason for its better performance. Based on the experiments, the MAP estimation is recommended for its high efficiency and low complexity. Among the MAP-based systems, estimation in the log-magnitude domain is shown to be the best for speech recognition, whereas estimation in the power domain is superior for noise reduction.
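The abstract's statement that the gamma model in the power domain contains the conventional Gaussian model of the complex spectrum can be checked numerically: the power of a zero-mean complex Gaussian coefficient is exponentially distributed, which is a gamma distribution with shape 1. The following is a minimal sketch, not the authors' on-line estimator. It assumes clean (noise-free) synthetic samples and a plain method-of-moments fit, and the names `gamma_moment_fit` and `complex_gaussian_power` are hypothetical.

```python
import random


def gamma_moment_fit(power):
    """Method-of-moments gamma fit to power samples; returns (shape, scale).

    The variance of the power depends on the fourth-order moment of the
    spectral magnitude, since power = magnitude ** 2.
    """
    n = len(power)
    mean = sum(power) / n
    var = sum((p - mean) ** 2 for p in power) / n
    shape = mean * mean / var  # gamma: mean = shape * scale
    scale = var / mean         # gamma: var  = shape * scale ** 2
    return shape, scale


def complex_gaussian_power(n, sigma, rng):
    """Power |X|^2 of n zero-mean complex Gaussian coefficients.

    Real and imaginary parts are independent with standard deviation sigma.
    """
    return [rng.gauss(0.0, sigma) ** 2 + rng.gauss(0.0, sigma) ** 2
            for _ in range(n)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Conventional Gaussian model: the fitted shape should be close to 1.
gauss_power = complex_gaussian_power(50_000, 1.0, rng)
shape_gauss, scale_gauss = gamma_moment_fit(gauss_power)

# A more peaked power distribution (assumed shape 0.5, scale 4.0) that the
# Gaussian model cannot represent but the gamma model can.
peaky_power = [rng.gammavariate(0.5, 4.0) for _ in range(50_000)]
shape_peaky, scale_peaky = gamma_moment_fit(peaky_power)

print("Gaussian spectrum: shape=%.3f scale=%.3f" % (shape_gauss, scale_gauss))
print("Peaked power:      shape=%.3f scale=%.3f" % (shape_peaky, scale_peaky))
```

With clean data the fitted shape separates the two cases directly. The method described in the abstract has to adapt such parameters from noisy speech frame by frame, which this sketch does not attempt.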
Pages: 1040-1049
Number of pages: 10