Speech Enhancement Using Forked Generative Adversarial Networks with Spectral Subtraction

Cited by: 7
Authors
Lin, Ju [1]
Niu, Sufeng [2]
Wei, Zice [1]
Lan, Xiang [1]
van Wijngaarden, Adriaan J. [3]
Smith, Melissa C. [1]
Wang, Kuang-Ching [1]
Affiliations
[1] Clemson Univ, Dept Elect & Comp Engn, Clemson, SC 29634 USA
[2] LinkedIn Inc, Mountain View, CA USA
[3] Nokia, Nokia Bell Labs, Murray Hill, NJ USA
Source
Keywords
speech enhancement; generative adversarial network; log-power spectra; noise
DOI
10.21437/Interspeech.2019-2954
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
Speech enhancement techniques that use a generative adversarial network (GAN) can effectively suppress noise while allowing models to be trained end-to-end. However, such techniques operate directly on time-domain waveforms, which are often high-dimensional and require extensive computation. This paper proposes a novel GAN-based speech enhancement method, referred to as S-ForkGAN, that operates on log-power spectra rather than on time-domain speech waveforms, and uses a forked GAN structure to extract both speech and noise information. By operating on log-power spectra, one can seamlessly include conventional spectral subtraction techniques, and the parameter space typically has a lower dimension. The performance of S-ForkGAN is assessed for automatic speech recognition (ASR) using the TIMIT data set and a wide range of noise conditions. It is shown that S-ForkGAN outperforms existing GAN-based techniques and that it has a lower complexity.
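The abstract's two signal-processing ingredients, log-power spectra and conventional spectral subtraction, can be illustrated with a minimal sketch. This is not the S-ForkGAN model and is not taken from the paper: the function names, the naive DFT, the `floor` parameter, and the synthetic 16-sample frame are all assumptions chosen for illustration, and the subtraction shown is the textbook power-domain rule with a spectral floor.

```python
import cmath
import math


def power_spectrum(frame):
    """One-sided power spectrum |X[k]|^2 of a real frame (naive DFT, for clarity)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        x_k = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(x_k) ** 2)
    return spectrum


def log_power_spectrum(frame, eps=1e-10):
    """Natural log of the power spectrum; eps avoids log(0) on empty bins."""
    return [math.log(p + eps) for p in power_spectrum(frame)]


def spectral_subtraction(noisy_lps, noise_lps, floor=0.01):
    """Subtract a noise power estimate per bin, keeping at least floor * noisy power.

    Inputs and output are log-power spectra; the subtraction itself is done
    in the linear power domain. The floor value is an illustrative assumption.
    """
    enhanced = []
    for y, n in zip(noisy_lps, noise_lps):
        noisy_pow = math.exp(y)
        noise_pow = math.exp(n)
        clean_pow = max(noisy_pow - noise_pow, floor * noisy_pow)
        enhanced.append(math.log(clean_pow))
    return enhanced


# Synthetic example (hypothetical data): a "speech" tone in bin 2 plus a
# weaker "noise" tone in bin 5, using a 16-sample frame.
N = 16
speech = [math.cos(2 * math.pi * 2 * t / N) for t in range(N)]
noise = [0.1 * math.cos(2 * math.pi * 5 * t / N) for t in range(N)]
noisy = [s + v for s, v in zip(speech, noise)]

noisy_lps = log_power_spectrum(noisy)
noise_lps = log_power_spectrum(noise)  # assumes a noise-only estimate is available
enhanced = spectral_subtraction(noisy_lps, noise_lps)

print("frame length:", N, "-> spectral bins:", len(noisy_lps))
print("bin 5 log-power before/after:", noisy_lps[5], enhanced[5])
```

In this toy case the 16-sample frame maps to 9 one-sided spectral bins, which hints at why a spectral representation can be lower-dimensional than the waveform. The noise estimate here comes from a known noise-only frame; in practice it would have to be estimated.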
Pages: 3163-3167
Number of pages: 5
Related papers
50 in total
  • [41] Modeling Feature Representations for Affective Speech Using Generative Adversarial Networks
    Sahu, Saurabh
    Gupta, Rahul
    Espy-Wilson, Carol
    [J]. IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2022, 13 (02) : 1098 - 1110
  • [42] Data augmentation using generative adversarial networks for robust speech recognition
    Qian, Yanmin
    Hu, Hu
    Tan, Tian
    [J]. SPEECH COMMUNICATION, 2019, 114 : 1 - 9
  • [43] VSEGAN: VISUAL SPEECH ENHANCEMENT GENERATIVE ADVERSARIAL NETWORK
    Xu, Xinmeng
    Wang, Yang
    Xu, Dongxiang
    Peng, Yiyuan
    Zhang, Cong
    Jia, Jie
    Chen, Binbin
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7307 - 7311
  • [44] GSC Based Speech Enhancement with Generative Adversarial Network
    Zhou, Yao
    Bao, Changchun
    Cheng, Rui
    [J]. 2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 901 - 906
  • [45] Tamil Speech Enhancement Using Non-Linear Spectral Subtraction
    Prabhakaran, G.
    Indra, J.
    Kasthuri, N.
    [J]. 2014 INTERNATIONAL CONFERENCE ON COMMUNICATIONS AND SIGNAL PROCESSING (ICCSP), 2014,
  • [46] A modified A priori SNR for speech enhancement using spectral subtraction rules
    Hasan, MK
    Salahuddin, S
    Khan, MR
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2004, 11 (04) : 450 - 453
  • [47] Reduction of Noise for Speech Signal Enhancement Using Spectral Subtraction Method
    Saldanha, Jennifer C.
    Shruthi, O. R.
    [J]. PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE (ICIS), 2016, : 44 - 47
  • [48] Alaryngeal speech enhancement using minimum statistics approach to spectral subtraction
    Azarnoush, Hamed
    Mir, Faraz
    Agaian, Sos
    Jamshidi, Mo
    Shadaram, Mehdi
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON SYSTEM OF SYSTEMS ENGINEERING, VOLS 1 AND 2, 2007, : 625 - 629
  • [49] BScGAN: DEEP BACKGROUND SUBTRACTION WITH CONDITIONAL GENERATIVE ADVERSARIAL NETWORKS
    Bakkay, M. C.
    Rashwan, H. A.
    Salmane, H.
    Khoudour, L.
    Puig, D.
    Ruichek, Y.
    [J]. 2018 25TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2018, : 4018 - 4022
  • [50] An efficient speech enhancement method using Kalman filter and spectral subtraction
    Kao, CC
    Lai, YT
    [J]. PROCEEDINGS OF THE 2004 IEEE ASIA-PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS, VOL 1 AND 2: SOC DESIGN FOR UBIQUITOUS INFORMATION TECHNOLOGY, 2004, : 181 - 184