A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech

Cited by: 31
Authors
Valin, Jean-Marc [1]
Isik, Umut [2]
Phansalkar, Neerad [2]
Giri, Ritwik [2]
Helwani, Karim [2]
Krishnaswamy, Arvindh [2]
Affiliations
[1] Amazon Web Services, Toronto, ON, Canada
[2] Amazon Web Services, Seattle, WA, USA
Keywords
speech enhancement; pitch filtering; postfilter; masking; noise
DOI
10.21437/Interspeech.2020-2730
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
Over the past few years, speech enhancement methods based on deep learning have greatly surpassed traditional methods based on spectral subtraction and spectral estimation. Many of these new techniques operate directly in the short-time Fourier transform (STFT) domain, resulting in a high computational complexity. In this work, we propose PercepNet, an efficient approach that relies on human perception of speech by focusing on the spectral envelope and on the periodicity of the speech. We demonstrate high-quality, real-time enhancement of fullband (48 kHz) speech with less than 5% of a CPU core.
Pages: 2482-2486
Page count: 5
Related papers
50 in total
  • [1] A Perceptually Motivated Approach for Low-Complexity Speech Semantic Communication
    Chen, Xiaojiao
    Wang, Jing
    Xu, Liang
    Huang, Jingxuan
    Fei, Zesong
    [J]. IEEE INTERNET OF THINGS JOURNAL, 2024, 11 (12): 22054-22065
  • [2] Speech Enhancement with Perceptually-motivated Optimization and Dual Transformations
    Wan, Xucheng
    Liu, Kai
    Du, Ziqing
    Zhou, Huan
    [J]. PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022: 461-467
  • [3] PERCEPTUALLY-MOTIVATED ENVIRONMENT-SPECIFIC SPEECH ENHANCEMENT
    Su, Jiaqi
    Finkelstein, Adam
    Jin, Zeyu
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019: 7015-7019
  • [4] A perceptually-motivated low-complexity instantaneous linear channel normalization technique applied to speaker verification
    Poblete, Victor
    Espic, Felipe
    King, Simon
    Stern, Richard M.
    Huenupan, Fernando
    Fredes, Josue
    Yoma, Nestor Becerra
    [J]. COMPUTER SPEECH AND LANGUAGE, 2015, 31 (01): 1-27
  • [5] A perceptually motivated approach for speech enhancement
    Hu, Y
    Loizou, PC
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2003, 11 (05): 457-465
  • [6] Improving low-complexity and real-time DeepFilterNet2 for personalized speech enhancement
    Shanghai Engineering Research Center of Intelligent Education and Bigdata, Shanghai Normal University, Shanghai 200234, China
    Unknown
    [J]. Int J Speech Technol, 2024, 2 (299-306): 299-306
  • [7] Distributed multichannel speech enhancement based on perceptually-motivated Bayesian estimators of the spectral amplitude
    Trawicki, Marek B.
    Johnson, Michael T.
    [J]. IET SIGNAL PROCESSING, 2013, 7 (04): 337-344
  • [9] LOW-COMPLEXITY, REAL-TIME JOINT NEURAL ECHO CONTROL AND SPEECH ENHANCEMENT BASED ON PERCEPNET
    Valin, Jean-Marc
    Tenneti, Srikanth
    Helwani, Karim
    Isik, Umut
    Krishnaswamy, Arvindh
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021: 7133-7137
  • [10] Speech enhancement using Bayesian estimators of the perceptually-motivated short-time spectral amplitude (STSA) with Chi speech priors
    Trawicki, Marek B.
    Johnson, Michael T.
    [J]. SPEECH COMMUNICATION, 2014, 57: 101-113