Convolutional Transformer based Local and Global Feature Learning for Speech Enhancement

Citations: 0
Authors
Jannu, Chaitanya [1]
Vanambathina, Sunny Dayal [1]
Affiliations
[1] VIT AP Univ, Sch Elect Engn, Amaravati, India
Keywords
Convolutional neural network; recurrent neural network; speech enhancement; multi-head attention; two-stage convolutional transformer; feed-forward network; NEURAL-NETWORK; DILATED CONVOLUTIONS; RECOGNITION;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Speech enhancement (SE) is an important method for improving speech quality and intelligibility in noisy environments where received speech is severely distorted by noise. An efficient speech enhancement system relies on accurately modelling the long-term dependencies of noisy speech. Deep learning has benefited greatly from transformers, in which multi-head attention (MHA) models long-term dependencies more efficiently by using sequence similarity. Transformers frequently outperform recurrent neural network (RNN) and convolutional neural network (CNN) models in many tasks while utilizing parallel processing. In this paper we propose a two-stage convolutional transformer for speech enhancement in the time domain. The transformer considers global information and supports parallel computing, resulting in a reduction of long-term noise. Unlike the two-stage transformer neural network (TSTNN), the proposed work uses different transformer structures for the intra- and inter-transformers to extract the local as well as global features of noisy speech. Moreover, a CNN module is added to the transformer so that short-term noise can be reduced more effectively, based on the ability of CNNs to extract local information. The experimental findings demonstrate that the proposed model outperforms other existing models in terms of short-time objective intelligibility (STOI) and perceptual evaluation of speech quality (PESQ).
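The abstract does not give implementation details, so the following is only a minimal NumPy sketch of the general idea it describes: attention within each chunk of the signal (intra, local context), a convolution for short-term structure, then attention across chunks (inter, global context). Everything here is an assumption made for illustration and is not the authors' model: the function names `self_attention`, `conv_along_time` and `two_stage_block` are hypothetical, the attention is single-head with no learned projections, and the smoothing kernel is a fixed stand-in for a trained CNN module.

```python
import numpy as np


def self_attention(x):
    """Scaled dot-product self-attention over the second-to-last axis.

    x has shape (..., T, D). The input is used directly as queries, keys
    and values (assumption: no learned projections, single head).
    """
    d = x.shape[-1]
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(d)        # (..., T, T)
    scores = scores - scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ x                                       # (..., T, D)


def conv_along_time(x, kernel):
    """Zero-padded 1-D cross-correlation along axis 1 of x with shape (S, K, D).

    The same odd-length kernel is applied to every feature channel.
    """
    pad = len(kernel) // 2
    padded = np.pad(x, ((0, 0), (pad, pad), (0, 0)))
    length = x.shape[1]
    return sum(w * padded[:, i:i + length, :] for i, w in enumerate(kernel))


def two_stage_block(chunks, kernel):
    """Hypothetical intra/inter block on chunks of shape (S, K, D).

    S = number of chunks, K = frames per chunk, D = feature dimension.
    """
    # Intra stage: attention inside each chunk, then a local convolution.
    intra = chunks + self_attention(chunks)
    intra = intra + conv_along_time(intra, kernel)
    # Inter stage: attention across chunks at the same in-chunk position.
    swapped = np.swapaxes(intra, 0, 1)                       # (K, S, D)
    inter = swapped + self_attention(swapped)
    return np.swapaxes(inter, 0, 1)                          # back to (S, K, D)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noisy_chunks = rng.standard_normal((4, 8, 16))
    enhanced = two_stage_block(noisy_chunks, np.array([0.25, 0.5, 0.25]))
    print(enhanced.shape)
```

Swapping the chunk and frame axes is what lets the same attention routine serve both stages; a real model would use separate learned layers for each stage, which is the distinction from TSTNN that the abstract emphasizes.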
Pages: 731 - 743
Number of pages: 13
Related papers
50 in total
  • [41] MIMO Speech Compression and Enhancement Based on Convolutional Denoising Autoencoder
    Li, You-Jin
    Wang, Syu-Siang
    Tsao, Yu
    Su, Borching
    2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2021, : 1245 - 1250
  • [42] Local and Global Discriminative Learning for Unsupervised Feature Selection
    Du, Liang
    Shen, Zhiyong
    Li, Xuan
    Zhou, Peng
    Shen, Yi-Dong
    2013 IEEE 13TH INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2013, : 131 - 140
  • [43] Local to Global Feature Learning for Salient Object Detection
    Feng, Xuelu
    Zhou, Sanping
    Zhu, Zixin
    Wang, Le
    Hua, Gang
    PATTERN RECOGNITION LETTERS, 2022, 162 : 81 - 88
  • [44] Wavelet Based Edge Feature Enhancement for Convolutional Neural Networks
    De Silva, D. D. N.
    Fernando, S.
    Piyatilake, I. T. S.
    Karunarathne, A. V. S.
    ELEVENTH INTERNATIONAL CONFERENCE ON MACHINE VISION (ICMV 2018), 2019, 11041
  • [45] SSLCT: A Convolutional Transformer for Synthetic Speech Localization
    Bhagtani, Kratika
    Yadav, Amit Kumar Singh
    Bestagini, Paolo
    Delp, Edward J.
    2024 IEEE 7TH INTERNATIONAL CONFERENCE ON MULTIMEDIA INFORMATION PROCESSING AND RETRIEVAL, MIPR 2024, 2024, : 134 - 140
  • [46] Improved Transformer-Based Dual-Path Network with Amplitude and Complex Domain Feature Fusion for Speech Enhancement
    Ye, Moujia
    Wan, Hongjie
    ENTROPY, 2023, 25 (02)
  • [47] Improved Facial Expression Recognition Algorithm Based on Local Feature Enhancement and Global Information Association
    Chen, Zixuan
    Yan, Lingyu
    Wang, Hairu
    Adamyk, Bogdan
    ELECTRONICS, 2024, 13 (14)
  • [48] A transformer-based model with feature compensation and local information enhancement for end-to-end pest detection
    Liu, Honglin
    Zhan, Yongzhao
    Sun, Jun
    Mao, Qirong
    Wu, Tongwang
    COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2025, 231
  • [49] Video Object Segmentation Algorithm Based on Multi-scale Feature Enhancement and Global-Local Feature Aggregation
    Hou, Zhiqiang
    Dong, Jiale
    Ma, Sugang
    Wang, Chenxu
    Yang, Xiaobao
    Wang, Yunchen
    Dianzi Yu Xinxi Xuebao/Journal of Electronics and Information Technology, 2024, 46 (11): 4198 - 4207
  • [50] Local and Global Feature Based Explainable Feature Envy Detection
    Yin, Xin
    Shi, Chongyang
    Zhao, Shuxin
    2021 IEEE 45TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2021), 2021, : 942 - 951