LEARNING DISENTANGLED FEATURE REPRESENTATIONS FOR SPEECH ENHANCEMENT VIA ADVERSARIAL TRAINING

Cited by: 8
Authors
Hou, Nana [1]
Xu, Chenglin [2]
Chng, Eng Siong [1, 3]
Li, Haizhou [2, 4]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
[2] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore
[3] Nanyang Technol Univ, Temasek Labs, Singapore, Singapore
[4] Univ Bremen, Machine Listening Lab, Bremen, Germany
Funding
National Research Foundation, Singapore
Keywords
Disentangled feature learning; adversarial training; speech enhancement
DOI
10.1109/ICASSP39728.2021.9413512
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Neural speech enhancement degrades significantly in the face of unseen noise. To address this mismatch, we propose to learn noise-agnostic feature representations through disentanglement learning, which removes the unspecified noise factor while keeping the specified factors of variation associated with clean speech. Specifically, a discriminator module, referred to as the disentangler, is introduced to distinguish the noise type. With an adversarial training strategy, a gradient reversal layer seeks to disentangle the noise factor and remove it from the feature representation. Experimental results show that the proposed approach achieves 5.8% and 5.2% relative improvements over the best baseline in terms of perceptual evaluation of speech quality (PESQ) and segmental signal-to-noise ratio (SSNR), respectively. The ablation study indicates that the proposed disentangler module is also effective in other encoder-decoder-like structures.
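The gradient reversal idea in the abstract can be illustrated with a toy sketch. The abstract does not describe the actual network, so everything below is an assumption made for illustration: a scalar encoder, decoder, and noise classifier (the "disentangler"), a squared-error enhancement loss, a binary cross-entropy noise-type loss, and the hypothetical names `GradientReversal`, `toy_gradients`, and `lam`. It is not the authors' implementation.

```python
import math


class GradientReversal:
    """Identity in the forward pass; multiplies gradients by -lam in the backward pass."""

    def __init__(self, lam=1.0):
        self.lam = lam  # hypothetical reversal strength

    def forward(self, h):
        return h

    def backward(self, grad_h):
        return -self.lam * grad_h


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def toy_gradients(x, clean, noise_label, w_enc, w_dec, w_cls, grl):
    """Gradients for one scalar example (toy model, assumed for illustration).

    Returns (gradient for the encoder weight, gradient for the classifier weight).
    """
    h = w_enc * x                          # encoder feature
    est = w_dec * h                        # decoder estimate of clean speech
    p = sigmoid(w_cls * grl.forward(h))    # predicted probability of noise type 1

    # Enhancement loss (est - clean)^2, differentiated with respect to h.
    d_enh_dh = 2.0 * (est - clean) * w_dec

    # Binary cross-entropy on the noise type: d(loss)/d(logit) = p - label.
    d_cls_dlogit = p - noise_label
    d_cls_dh = d_cls_dlogit * w_cls

    # The classifier itself is trained normally to recognize the noise type.
    grad_w_cls = d_cls_dlogit * h

    # The encoder receives the classifier gradient with its sign flipped,
    # pushing the feature h to carry less noise-type information.
    grad_w_enc = (d_enh_dh + grl.backward(d_cls_dh)) * x
    return grad_w_enc, grad_w_cls


if __name__ == "__main__":
    for lam in (0.0, 1.0):
        g_enc, g_cls = toy_gradients(1.0, 0.5, 1.0, 1.0, 1.0, 1.0, GradientReversal(lam))
        print(f"lam={lam}: encoder grad={g_enc:.4f}, classifier grad={g_cls:.4f}")
```

With `lam=0.0` the encoder is driven by the enhancement loss alone; with `lam>0` the reversed classifier gradient is added, while the classifier's own gradient is unchanged. In a deep learning framework the same effect is usually obtained with a custom autograd function rather than hand-written gradients.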
Pages: 666-670
Page count: 5
Related papers
50 in total
  • [1] Disentangled Feature Learning for Noise-Invariant Speech Enhancement
    Bae, Soo Hyun
    Choi, Inkyu
    Kim, Nam Soo
    [J]. APPLIED SCIENCES-BASEL, 2019, 9 (11)
  • [2] Learning Interpretable Disentangled Representations Using Adversarial VAEs
    Sarhan, Mhd Hasan
    Eslami, Abouzar
    Navab, Nassir
    Albarqouni, Shadi
    [J]. DOMAIN ADAPTATION AND REPRESENTATION TRANSFER AND MEDICAL IMAGE LEARNING WITH LESS LABELS AND IMPERFECT DATA, DART 2019, MIL3ID 2019, 2019, 11795 : 37 - 44
  • [3] Adversarial Learning of Disentangled and Generalizable Representations of Visual Attributes
    Oldfield, James
    Panagakis, Yannis
    Nicolaou, Mihalis A.
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (08) : 3498 - 3509
  • [4] LEARNING DISENTANGLED FEATURE REPRESENTATIONS FOR ANOMALY DETECTION
    Lee, Wei-Yu
    Wang, Yu-Chiang Frank
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020 : 2156 - 2160
  • [5] Achieving Robustness in the Wild via Adversarial Mixing with Disentangled Representations
    Gowal, Sven
    Qin, Chongli
    Huang, Po-Sen
    Cemgil, Taylan
    Dvijotham, Krishnamurthy
    Mann, Timothy
    Kohli, Pushmeet
    [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020 : 1208 - 1217
  • [6] Manipulating Voice Attributes by Adversarial Learning of Structured Disentangled Representations
    Benaroya, Laurent
    Obin, Nicolas
    Roebel, Axel
    [J]. ENTROPY, 2023, 25 (02)
  • [7] An Adversarial Neuro-Tensorial Approach for Learning Disentangled Representations
    Wang, Mengjiao
    Shu, Zhixin
    Cheng, Shiyang
    Panagakis, Yannis
    Samaras, Dimitris
    Zafeiriou, Stefanos
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2019, 127 (6-7) : 743 - 762
  • [8] An Adversarial Neuro-Tensorial Approach for Learning Disentangled Representations
    Mengjiao Wang
    Zhixin Shu
    Shiyang Cheng
    Yannis Panagakis
    Dimitris Samaras
    Stefanos Zafeiriou
    [J]. International Journal of Computer Vision, 2019, 127 : 743 - 762
  • [9] Adversarial Training Helps Transfer Learning via Better Representations
    Deng, Zhun
    Zhang, Linjun
    Vodrahalli, Kailas
    Kawaguchi, Kenji
    Zou, James
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [10] Domain Adversarial Training for Speech Enhancement
    Hou, Nana
    Xu, Chenglin
    Chng, Eng Siong
    Li, Haizhou
    [J]. 2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019 : 667 - 672