Speech Enhancement Using Dynamical Variational AutoEncoder

Cited by: 0
Author
Do, Hao D. [1]
Affiliation
[1] FPT Univ, Ho Chi Minh City, Vietnam
Keywords
speech enhancement; dynamical variational autoencoder; generative model
DOI
10.1007/978-981-99-5837-5_21
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This research addresses speech enhancement with a generative model. Many existing solutions are trained on a fixed set of interference or noise types and struggle to extract speech from a mixture that contains unfamiliar noise. We use a class of generative models called the Dynamical Variational AutoEncoder (DVAE), which combines generative and temporal modeling to analyze the speech signal. Models in this class focus on the behavior of the speech signal, then extract and enhance the speech. We also design a new architecture in the DVAE class, named Bi-RVAE, which is simpler than the other models yet achieves good results. Experimental results show that the DVAE class, including our proposed design, recovers high-quality speech. This class of models could be used to enhance the speech signal before it is passed to the main processing models.
Pages: 247-258
Number of pages: 12