An Overview of Speech Enhancement Based on Deep Learning Techniques

Cited by: 1
Authors
Jannu, Chaitanya [1]
Vanambathina, Sunny Dayal [1]
Affiliations
[1] AP Univ, VIT, Sch Elect Engn, Vijayawada 522237, Andhra Pradesh, India
Keywords
Deep Neural Networks (DNNs); speech enhancement (SE); features; noisy speech; neural network; training targets; NEURAL-NETWORKS; CONVOLUTIONAL NETWORK; DILATED CONVOLUTIONS; SELF-ATTENTION; NOISE; INTELLIGIBILITY; SEPARATION; FEATURES; BACKPROPAGATION; ALGORITHMS;
DOI
10.1142/S0219467825500019
Chinese Library Classification number
TP31 [Computer Software]
Subject classification codes
081202; 0835
Abstract
Recent years have seen a significant number of studies in the area of speech enhancement. This review looks at several speech enhancement methods as well as the role of Deep Neural Networks (DNNs) in speech enhancement. Speech transmissions are frequently distorted by ambient noise, background noise, and reverberation. Processing methods such as the Short-time Fourier Transform, Short-time Autocorrelation, and Short-time Energy (STE) can be used to enhance speech. To reduce noise in speech, features such as Mel-Frequency Cepstral Coefficients (MFCCs), the Logarithmic Power Spectrum (LPS), and Gammatone Frequency Cepstral Coefficients (GFCCs) can be extracted and fed to a DNN. DNNs are essential to speech enhancement because they build models from large amounts of training data; the quality of the enhanced speech is then evaluated using performance metrics. This study examines a variety of speech enhancement methods that have appeared since deep learning publications on the topic began in 1993. It provides a thorough examination of the neural network topologies, training algorithms, activation functions, training targets, acoustic features, and databases employed for speech enhancement, gathered from articles published between 1993 and 2022.
Pages: 51