Speech Enhancement Using Forked Generative Adversarial Networks with Spectral Subtraction

Cited by: 7
Authors
Lin, Ju [1]
Niu, Sufeng [2]
Wei, Zice [1]
Lan, Xiang [1]
van Wijngaarden, Adriaan J. [3]
Smith, Melissa C. [1]
Wang, Kuang-Ching [1]
Affiliations
[1] Clemson Univ, Dept Elect & Comp Engn, Clemson, SC 29634 USA
[2] LinkedIn Inc, Mountain View, CA USA
[3] Nokia, Nokia Bell Labs, Murray Hill, NJ USA
Source
INTERSPEECH 2019
Keywords
speech enhancement; generative adversarial network; log-power spectra; noise
DOI
10.21437/Interspeech.2019-2954
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
Speech enhancement techniques that use a generative adversarial network (GAN) can effectively suppress noise while allowing models to be trained end-to-end. However, such techniques operate directly on time-domain waveforms, which are often high-dimensional and require extensive computation. This paper proposes a novel GAN-based speech enhancement method, referred to as S-ForkGAN, that operates on log-power spectra rather than on time-domain speech waveforms and uses a forked GAN structure to extract both speech and noise information. Operating on log-power spectra makes it possible to include conventional spectral subtraction techniques seamlessly, and the parameter space typically has a lower dimension. The performance of S-ForkGAN is assessed for automatic speech recognition (ASR) using the TIMIT data set and a wide range of noise conditions. It is shown that S-ForkGAN outperforms existing GAN-based techniques and that it has a lower complexity.
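As an illustration of the two signal-processing ingredients named in the abstract, the sketch below computes a log-power spectrum for one frame and applies classic power spectral subtraction with a spectral floor. It is not the S-ForkGAN method and makes no claim about how the paper combines these steps. The function names, the natural logarithm, the `eps` and `floor` values, and the synthetic test tones are all assumptions chosen for the example. A naive DFT keeps it dependency-free; a real system would use an FFT library.

```python
import cmath
import math


def power_spectrum(frame):
    """Power spectrum |X[k]|^2 of one real-valued frame, for k = 0..N/2.

    Uses a naive O(N^2) DFT so that the sketch needs only the standard library.
    """
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def log_power_spectrum(power, eps=1e-10):
    """Natural-log power spectrum; eps (an assumed constant) avoids log(0)."""
    return [math.log(p + eps) for p in power]


def spectral_subtraction(noisy_power, noise_power, floor=0.01):
    """Subtract a per-bin noise power estimate from the noisy power spectrum.

    Each bin is floored at `floor` times the noisy power so that the result
    stays positive (floor=0.01 is an assumed, illustrative value).
    """
    return [max(p - n, floor * p) for p, n in zip(noisy_power, noise_power)]


# Synthetic example (hypothetical data): a unit tone in bin 4 stands in for
# speech, and a weaker tone in bin 10 stands in for noise.
frame_len = 32
clean = [math.cos(2 * math.pi * 4 * t / frame_len) for t in range(frame_len)]
noise = [0.5 * math.cos(2 * math.pi * 10 * t / frame_len) for t in range(frame_len)]
noisy = [c + v for c, v in zip(clean, noise)]

noisy_power = power_spectrum(noisy)
noise_power = power_spectrum(noise)  # assumes the noise spectrum is known exactly

enhanced_power = spectral_subtraction(noisy_power, noise_power)
enhanced_lps = log_power_spectrum(enhanced_power)
```

In this toy setup the noise estimate is exact, so the noise bin drops to the spectral floor while the speech bin is left essentially unchanged. With a real recording the noise spectrum would have to be estimated, for example from frames without speech.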
Pages: 3163-3167
Page count: 5