Speech Enhancement Based On Spectrogram Conditional Generative Adversarial Networks

Cited by: 0
Authors
Han, Ru [1]
Liu, Jianming [1]
Wang, Mingwen [1]
Affiliations
[1] Jiangxi Normal Univ, Sch Comp & Informat Engn, Nanchang, Jiangxi, Peoples R China
Keywords
Speech recognition; Spectrogram-GAN; spatial transformation network; small sample; data enhancement;
DOI
10.1117/12.2557256
Chinese Library Classification
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
Voice is the main means of communicating and sharing information with others, and it brings great convenience to human life. Existing speech recognition classifiers suffer considerable performance degradation under environmental noise and accents. Most of these problems can be mitigated by training on large amounts of data. However, collecting large numbers of high-quality datasets in real life is time-consuming and expensive. To address this problem, this paper proposes a data enhancement method suitable for extending small-sample speech image datasets. S-GAN is used to generate datasets that conform to the real distribution of samples, and the GAN-train and GAN-test methods are used to evaluate the quality and diversity of the generated images. Meanwhile, a spatial transformation network (STN) is combined with a CNN framework to extract the informative part of the data for classification. The results show that this method can significantly improve the classification accuracy of speech recognition and lay a foundation for small-sample data enhancement.
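The GAN-train and GAN-test evaluation named in the abstract can be illustrated with a minimal sketch. It assumes the common definitions of the two scores: GAN-train fits a classifier on generated samples and measures accuracy on real samples, while GAN-test fits on real samples and measures accuracy on generated ones. The record does not give the paper's exact procedure, so everything below is hypothetical: the nearest-centroid classifier stands in for the paper's STN and CNN classifier, and the 2-D Gaussian blobs stand in for real and S-GAN-generated spectrograms.

```python
import numpy as np

def fit_centroids(x, y):
    """Hypothetical stand-in classifier: one mean vector per class label."""
    labels = np.unique(y)
    centroids = np.stack([x[y == c].mean(axis=0) for c in labels])
    return labels, centroids

def accuracy(model, x, y):
    """Fraction of samples whose nearest centroid carries the true label."""
    labels, centroids = model
    # Distance from every sample to every class centroid.
    dists = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
    pred = labels[dists.argmin(axis=1)]
    return float((pred == y).mean())

def gan_train_score(x_gen, y_gen, x_real_test, y_real_test):
    """Train on generated data, evaluate on held-out real data."""
    return accuracy(fit_centroids(x_gen, y_gen), x_real_test, y_real_test)

def gan_test_score(x_real_train, y_real_train, x_gen, y_gen):
    """Train on real data, evaluate on generated data."""
    return accuracy(fit_centroids(x_real_train, y_real_train), x_gen, y_gen)

def make_blobs(rng, n_per_class, spread):
    """Toy two-class data; an assumption used only to make the sketch runnable."""
    centers = np.array([[0.0, 0.0], [5.0, 5.0]])
    x = np.concatenate(
        [c + spread * rng.standard_normal((n_per_class, 2)) for c in centers]
    )
    y = np.repeat(np.arange(len(centers)), n_per_class)
    return x, y

rng = np.random.default_rng(0)
x_real_train, y_real_train = make_blobs(rng, 100, 0.5)
x_real_test, y_real_test = make_blobs(rng, 100, 0.5)
x_gen, y_gen = make_blobs(rng, 100, 0.5)  # placeholder for generator output

train_score = gan_train_score(x_gen, y_gen, x_real_test, y_real_test)
test_score = gan_test_score(x_real_train, y_real_train, x_gen, y_gen)
print(f"GAN-train: {train_score:.3f}  GAN-test: {test_score:.3f}")
```

Under these definitions, a low GAN-train score suggests the generated set lacks diversity or realism, while a GAN-test score well above the real-data validation accuracy can indicate that the generator has memorized the training set.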
Pages: 10