Speech Enhancement Using MLP-Based Architecture With Convolutional Token Mixing Module and Squeeze-and-Excitation Network

Citations: 0
Authors
Song, Hyungchan [1]
Kim, Minseung [1]
Shin, Jong Won [1]
Affiliations
[1] Gwangju Inst Sci & Technol, Sch Elect Engn & Comp Sci, Gwangju 61005, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Speech enhancement; local and global information; low computational complexity;
DOI
10.1109/ACCESS.2022.3221440
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
The Conformer has shown impressive performance for speech enhancement by exploiting both local and global contextual information, although it has high computational complexity and many parameters. Recently, multi-layer perceptron (MLP)-based models such as MLP-Mixer and gMLP have demonstrated comparable performance with much lower computational complexity in computer vision. These models showed that all-MLP architectures may perform as well as more advanced structures, but the nature of the MLP limits their application to variable-length inputs such as speech and audio. In this paper, we propose the cgMLP-SE model, a gMLP-based architecture with convolutional token-mixing modules and a squeeze-and-excitation network that utilizes both local and global contextual information, as in the Conformer. Specifically, the token-mixing modules in gMLP are replaced by convolutional layers, squeeze-and-excitation network-based gating is applied on top of the convolutional gating module, and additional feed-forward layers are added so that the cgMLP-SE module has a macaron-like structure sandwiched between feed-forward layers, like a Conformer block. Experimental results on the TIMIT-DNS noise dataset and the Voice Bank-DEMAND dataset showed that the proposed method achieved speech quality and intelligibility similar to those of the Conformer with a smaller model size and lower computational complexity.
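The structure described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give layer sizes, normalization, activations, or the exact placement of the squeeze-and-excitation gate, so every name, shape, and hyperparameter below (`depthwise_conv1d`, `se_gate`, `macaron_block`, `init_params`, the ReLU feed-forward layers, the half-step residuals, the omitted layer normalization) is an assumption made for illustration. The point it shows is that a convolution along time, unlike a fixed-size token-mixing matrix, works for any number of frames.

```python
import numpy as np

def depthwise_conv1d(v, kernel):
    """Depthwise 1-D convolution along time with 'same' zero padding.
    v: (T, C) frames by channels; kernel: (K, C) with odd K."""
    K = kernel.shape[0]
    pad = K // 2
    vp = np.pad(v, ((pad, pad), (0, 0)))
    out = np.zeros_like(v)
    for k in range(K):
        out += vp[k:k + v.shape[0]] * kernel[k]
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def se_gate(v, w1, w2):
    """Squeeze-and-excitation: average over time (global context),
    then rescale each channel by a learned gate in (0, 1)."""
    s = v.mean(axis=0)                          # squeeze: (C,)
    e = sigmoid(np.maximum(s @ w1, 0.0) @ w2)   # excitation: (C,)
    return v * e

def conv_gating_unit(h, kernel, w1, w2):
    """Split channels in two; one half gates the other after a
    convolution over time (local context) and SE rescaling."""
    u, v = np.split(h, 2, axis=1)
    v = se_gate(depthwise_conv1d(v, kernel), w1, w2)
    return u * v

def feed_forward(x, wa, wb):
    return np.maximum(x @ wa, 0.0) @ wb

def macaron_block(x, p):
    """Feed-forward, gating module, feed-forward, each with a residual.
    Layer normalization is omitted for brevity (assumption)."""
    x = x + 0.5 * feed_forward(x, p["ff1_a"], p["ff1_b"])
    h = x @ p["proj_in"]                                        # (T, 2H)
    h = conv_gating_unit(h, p["kernel"], p["se_w1"], p["se_w2"])  # (T, H)
    x = x + h @ p["proj_out"]
    x = x + 0.5 * feed_forward(x, p["ff2_a"], p["ff2_b"])
    return x

def init_params(C=8, H=16, K=5, r=4, seed=0):
    """Random illustrative weights; all sizes are hypothetical."""
    rng = np.random.default_rng(seed)
    g = lambda *shape: rng.standard_normal(shape) * 0.1
    return {
        "ff1_a": g(C, 4 * C), "ff1_b": g(4 * C, C),
        "ff2_a": g(C, 4 * C), "ff2_b": g(4 * C, C),
        "proj_in": g(C, 2 * H), "proj_out": g(H, C),
        "kernel": g(K, H),
        "se_w1": g(H, H // r), "se_w2": g(H // r, H),
    }

# The same weights process utterances of different lengths.
params = init_params()
rng = np.random.default_rng(1)
for T in (50, 80):
    y = macaron_block(rng.standard_normal((T, 8)), params)
    print(T, y.shape)
```

None of the parameter shapes depends on the number of frames `T`, which is the property the abstract says a plain MLP token mixer lacks.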
Pages: 119283-119289
Number of pages: 7