Two-Stage Refinement of Magnitude and Complex Spectra for Real-Time Speech Enhancement

Cited by: 3

Authors:

Lee, Jinyoung [1]

Kang, Hong-Goo [1]

Affiliations:

[1] Yonsei Univ, Dept Elect & Elect Engn, Seoul 03722, South Korea

Source:

IEEE SIGNAL PROCESSING LETTERS | 2022 / Vol. 29

Keywords:

Convolution; Speech enhancement; Noise measurement; Estimation; Training; Time-frequency analysis; Kernel; Complex spectra refinement; high-level feature transfer; magnitude spectral masking; real-time speech enhancement; two-stage network; NOISE;

DOI:

10.1109/LSP.2022.3215100

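Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Discipline classification codes: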

0808 ; 0809 ;

Abstract:

In this letter, we propose a two-stage network for performing speech enhancement that predicts magnitude spectra in the first stage and complex spectra in the second stage. To maximize the model's performance at each stage, we propose two convolutional modules: magnitude spectral masking (MSM) and complex spectra refinement (CSR). Each module is designed to take into account the specific characteristics of the signal type it handles. The MSM estimates multiplicative masks to remove noise in the magnitude component of the convolutional features, and the CSR refines the complex component of the convolutional features using additive features. By using these modules, our proposed two-stage enhancement model shows higher performance than previously proposed state-of-the-art algorithms. In addition, the number of parameters of our model is only 2.63 million, and it can operate in real time thanks to its causal characteristics and low computational complexity.
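The contrast the abstract draws between multiplicative masking and additive refinement can be illustrated with a toy sketch. This is not the authors' model: in the letter, MSM and CSR are convolutional modules that act on learned convolutional features, whereas the hypothetical functions `apply_magnitude_mask` and `apply_additive_refinement` below act directly on a few hand-written spectral bins, and the mask and residual values are made up rather than predicted by a network.

```python
import cmath

def apply_magnitude_mask(noisy_spec, mask):
    """Multiplicative step: scale each bin's magnitude by a mask value.

    Mask values are clamped to [0, 1] (an assumption of this sketch),
    and the noisy phase is left unchanged.
    """
    masked = []
    for x, m in zip(noisy_spec, mask):
        m = min(max(m, 0.0), 1.0)
        masked.append(cmath.rect(abs(x) * m, cmath.phase(x)))
    return masked

def apply_additive_refinement(spec, residual):
    """Additive step: add a complex correction to each bin.

    Unlike a real-valued mask, an additive complex term can change
    both the magnitude and the phase of a bin.
    """
    return [x + r for x, r in zip(spec, residual)]

# Hypothetical inputs: two complex bins, a mask, and a complex residual.
noisy = [3 + 4j, 1 + 0j]
mask = [0.5, 2.0]            # the second value is clamped to 1.0
residual = [0.1 - 0.2j, -0.5 + 0j]

stage1 = apply_magnitude_mask(noisy, mask)
stage2 = apply_additive_refinement(stage1, residual)
```

In this sketch the first step can only shrink magnitudes along the noisy phase, while the second step is free to move each bin anywhere in the complex plane, which is one way to read the abstract's remark that each module is matched to the signal type it handles.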

Pages: 2188-2192

Page count: 5
