DEEP MIXTURE DENSITY NETWORK FOR STATISTICAL MODEL-BASED FEATURE ENHANCEMENT

Cited by: 0

Authors:

Kinoshita, Keisuke [1]

Delcroix, Marc [1]

Ogawa, Atsunori [1]

Higuchi, Takuya [1]

Nakatani, Tomohiro [1]

Affiliation:

[1] NTT Corp, NTT Commun Sci Labs, Tokyo, Japan

Source:

2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2017

Keywords:

Mixture density network; model-based feature enhancement; conditional density; vector Taylor series; speech recognition

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

We propose a novel framework designed to extend conventional deep neural network (DNN)-based feature enhancement approaches. In general, the conventional DNN-based feature enhancement framework aims to map an input noisy observation to clean speech or a binary/soft mask in a deterministic way, assuming that there is a one-to-one mapping between the input and the output without any uncertainty. However, when we consider the general feature enhancement problem to be an ill-posed inverse problem where the mapping cannot be uniquely determined given an input signal, the assumption in the conventional approaches is not theoretically correct and potentially limits the performance of DNN-based feature enhancement. To overcome this problem, this paper proposes utilizing a mixture density network (MDN), which is a neural network that maps an input feature to a set of Gaussian mixture model (GMM) parameters representing the distribution of a target variable. By estimating the distribution of the clean speech feature based on the MDN, we are now able to explicitly consider the uncertainty in the parameter estimation. Then, we further utilize the estimated GMM to obtain a refined clean speech estimate in the framework of statistical model-based feature enhancement. In this paper, after detailing the proposed framework and the MDN, we show mathematically and experimentally how the MDN appropriately models the uncertainty information. We also show that the proposed method can outperform a conventional DNN-based feature enhancement method.
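
The abstract describes an MDN as a network whose outputs are read as GMM parameters for the target variable. The following is a minimal sketch of that output interpretation only, not the authors' implementation: it assumes diagonal covariances, a softmax for the mixture weights, and an exponential for the variances, and the function names (`split_mdn_output`, `gmm_neg_log_likelihood`, `gmm_mean`) and the layout of the raw output vector are hypothetical choices made for illustration.

```python
import numpy as np

def split_mdn_output(raw, n_components, dim):
    """Interpret a raw network output vector as diagonal-GMM parameters.

    Assumed (hypothetical) layout: [K logits | K*D means | K*D log-variances].
    """
    k, d = n_components, dim
    assert raw.shape == (k + 2 * k * d,)
    logits = raw[:k]
    means = raw[k:k + k * d].reshape(k, d)
    log_vars = raw[k + k * d:].reshape(k, d)
    # Softmax keeps the mixture weights positive and summing to one.
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # Exponentiation keeps the variances strictly positive.
    variances = np.exp(log_vars)
    return weights, means, variances

def gmm_neg_log_likelihood(target, weights, means, variances):
    """Negative log-likelihood of one target vector under the diagonal GMM."""
    # Per-component Gaussian log-density, summed over feature dimensions.
    log_density = -0.5 * np.sum(
        np.log(2.0 * np.pi * variances) + (target - means) ** 2 / variances,
        axis=1,
    )
    log_joint = np.log(weights) + log_density
    # Log-sum-exp over components for numerical stability.
    m = log_joint.max()
    return -(m + np.log(np.sum(np.exp(log_joint - m))))

def gmm_mean(weights, means):
    """Mean of the mixture: the weighted average of the component means."""
    return weights @ means
```

Under these assumptions, the negative log-likelihood would serve as the training loss, and the per-component variances are what carry the uncertainty that a deterministic point estimate discards.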

Citation

Pages: 251-255

Page count: 5
