Speech Enhancement Using Dynamical Variational AutoEncoder

Cited by: 0
Authors
Do, Hao D. [1]
Affiliation
[1] FPT Univ, Ho Chi Minh City, Vietnam
Keywords
speech enhancement; dynamical variational autoencoder; generative model
DOI
10.1007/978-981-99-5837-5_21
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This research addresses speech enhancement with a generative model. Many existing solutions are trained on a fixed set of interference or noise types and struggle to extract speech from a mixture that contains unfamiliar noise. We use a class of generative models called Dynamical Variational AutoEncoders (DVAEs), which combine generative and temporal modeling to analyze the speech signal. Models in this class focus on the behavior of the speech signal, then extract and enhance the speech. We also design a new architecture in the DVAE class, named Bi-RVAE, which is simpler than the other models yet still achieves good results. Experimental results show that the DVAE class, including our proposed design, recovers high-quality speech. Models in this class could be used to enhance the speech signal before it is passed to the central processing models.
Pages: 247-258
Number of pages: 12