A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech

Cited by: 31

Authors:

Valin, Jean-Marc [1]

Isik, Umut [2]

Phansalkar, Neerad [2]

Giri, Ritwik [2]

Helwani, Karim [2]

Krishnaswamy, Arvindh [2]

Affiliations:

[1] Amazon Web Services, Toronto, ON, Canada

[2] Amazon Web Services, Seattle, WA, USA

Source:

INTERSPEECH 2020 | 2020

Keywords:

speech enhancement; pitch filtering; postfilter; masking; noise

DOI:

10.21437/Interspeech.2020-2730

Chinese Library Classification (CLC) number:

R36 [Pathology]; R76 [Otorhinolaryngology]

Discipline classification code:

100104; 100213

Abstract:

Over the past few years, speech enhancement methods based on deep learning have greatly surpassed traditional methods based on spectral subtraction and spectral estimation. Many of these new techniques operate directly in the short-time Fourier transform (STFT) domain, resulting in a high computational complexity. In this work, we propose PercepNet, an efficient approach that relies on human perception of speech by focusing on the spectral envelope and on the periodicity of the speech. We demonstrate high-quality, real-time enhancement of fullband (48 kHz) speech with less than 5% of a CPU core.


Pages: 2482-2486

Number of pages: 5
