VARIATIONAL AUTOENCODER FOR SPEECH ENHANCEMENT WITH A NOISE-AWARE ENCODER

Cited by: 21
Authors
Fang, Huajian [1, 2]
Carbajal, Guillaume [1 ]
Wermter, Stefan [2 ]
Gerkmann, Timo [1 ]
Affiliations
[1] Univ Hamburg, Signal Proc SP, Hamburg, Germany
[2] Univ Hamburg, Knowledge Technol WTM, Hamburg, Germany
Keywords
speech enhancement; generative model; variational autoencoder; semi-supervised learning
DOI
10.1109/ICASSP39728.2021.9414060
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Recently, a generative variational autoencoder (VAE) has been proposed for speech enhancement to model speech statistics. However, this approach only uses clean speech in the training phase, making the estimation particularly sensitive to noise presence, especially in low signal-to-noise ratios (SNRs). To increase the robustness of the VAE, we propose to include noise information in the training phase by using a noise-aware encoder trained on noisy-clean speech pairs. We evaluate our approach on real recordings of different noisy environments and acoustic conditions using two different noise datasets. We show that our proposed noise-aware VAE outperforms the standard VAE in terms of overall distortion without increasing the number of model parameters. At the same time, we demonstrate that our model is capable of generalizing to unseen noise conditions better than a supervised feedforward deep neural network (DNN). Furthermore, we demonstrate the robustness of the model performance to a reduction of the noisy-clean speech training data size.
Pages: 676-680
Number of pages: 5
Related papers
50 in total
  • [1] NADiffuSE: Noise-aware Diffusion-based Model for Speech Enhancement
    Wang, Wen
    Yang, Dongchao
    Ye, Qichen
    Cao, Bowen
    Zou, Yuexian
    [J]. 2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023, : 2416 - 2423
  • [2] NAAGN: Noise-aware Attention-gated Network for Speech Enhancement
    Deng, Feng
    Jiang, Tao
    Wang, Xiao-Rui
    Zhang, Chen
    Li, Yan
    [J]. INTERSPEECH 2020, 2020, : 2457 - 2461
  • [3] A Noise-aware Enhancement Method for Underexposed Images
    Chien, Chien-Cheng
    Kinoshita, Yuma
    Kiya, Hitoshi
    [J]. 2019 4TH IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS - ASIA (IEEE ICCE-ASIA 2019), 2019, : 131 - 134
  • [4] Vector-quantized Variational Autoencoder for Phase-aware Speech Enhancement
    Tuan Vu Ho
    Quoc Huy Nguyen
    Akagi, Masato
    Unoki, Masashi
    [J]. INTERSPEECH 2022, 2022, : 176 - 180
  • [5] A RECURRENT VARIATIONAL AUTOENCODER FOR SPEECH ENHANCEMENT
    Leglaive, Simon
    Alameda-Pineda, Xavier
    Girin, Laurent
    Horaud, Radu
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 371 - 375
  • [6] POSTFILTERING USING AN ADVERSARIAL DENOISING AUTOENCODER WITH NOISE-AWARE TRAINING
    Tawara, Naohiro
    Tanabe, Hikari
    Kobayashi, Tetsunori
    Fujieda, Masaru
    Katagiri, Kazuhiro
    Yazu, Takashi
    Ogawa, Tetsuji
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 3282 - 3286
  • [8] Speech-enhanced and Noise-aware Networks for Robust Speech Recognition
    Lee, Hung-Shin
    Chen, Pin-Yuan
    Cheng, Yao-Fei
    Tsao, Yu
    Wang, Hsin-Min
    [J]. 2022 13TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2022, : 145 - 149
  • [9] A Noise-Aware Memory-Attention Network Architecture for Regression-Based Speech Enhancement
    Wang, Yu-Xuan
    Du, Jun
    Chai, Li
    Lee, Chin-Hui
    Pan, Jia
    [J]. INTERSPEECH 2020, 2020, : 4501 - 4505
  • [10] Speech Enhancement Using Dynamical Variational AutoEncoder
    Do, Hao D.
    [J]. INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2023, PT II, 2023, 13996 : 247 - 258