A Real-Time Speech Enhancement Processor for Hearing Aids in 28-nm CMOS

Cited by: 0
Authors
Park, Sungjin [1 ,2 ]
Lee, Sunwoo [1 ,2 ]
Park, Jeongwoo [3 ]
Choi, Hyeong-Seok [4 ]
Lee, Kyogu [5 ,6 ]
Jeon, Dongsuk [1 ,2 ]
Affiliations
[1] Seoul Natl Univ, Res Inst Convergence Sci, Dept Intelligence & Informat, Seoul, South Korea
[2] Seoul Natl Univ, Interuniv Semicond Res Ctr, Seoul, South Korea
[3] Sungkyunkwan Univ, Dept Elect & Comp Engn, Suwon 16419, South Korea
[4] ElevenLabs, New York, NY USA
[5] Seoul Natl Univ, Dept Intelligence & Informat, Seoul 08826, South Korea
[6] Seoul Natl Univ, Interdisciplinary Program Artificial Intelligence, Seoul 08826, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Neural networks; Hearing aids; Real-time systems; Convolution; Optimization; Decoding; Computational modeling; Digital hearing aids; multiplier-less processing element (PE); neural network processor; reconfigurable architecture; speech enhancement (SE); PERFORMANCE; DEVICES;
DOI
10.1109/JSSC.2024.3460426
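Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code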
0808 ; 0809 ;
Abstract
Speech enhancement (SE) plays a key role in many audio-related applications by removing noise and enhancing the quality of human voice. Recent deep learning-based approaches provide high-quality SE, but real-time processing of those algorithms is challenging in resource-constrained devices due to high computational complexity. In this article, we present an energy-efficient real-time SE processor aimed at hearing aids. To implement high-quality SE with a very limited power budget, various algorithm and hardware optimization techniques are proposed. Our SE algorithm adaptively allocates computational resources to each region in the input feature domain depending on their importance, reducing overall computations by 29.7%. Along with 4-bit channel-wise logarithmic quantization, the processor adopts a reconfigurable multiplier-less processing element (PE) that supports both pre-/post-processing and neural network layers, resulting in a 21.5% area reduction. In addition, the design employs efficient scheduling and input buffering schemes to reduce on-chip memory access by 70.8%. Fabricated in a 28-nm CMOS process, our design consumes only 740 µW at 2.5 MHz with a total latency of 39.96 ms, satisfying the real-time processing constraints. In addition, our approach demonstrated higher SE quality than prior art in both objective and subjective evaluations.
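The abstract pairs 4-bit channel-wise logarithmic quantization with a multiplier-less PE. As a rough illustration of why the two fit together, the sketch below quantizes one channel's weights to signed powers of two and then replaces each multiplication with a bit shift. This is not the paper's design: the bit split (1 sign bit plus 3 exponent bits), the per-channel exponent range, the zero handling, and the names `log_quantize_channel`, `shift_multiply`, and `shift_dot` are all assumptions made for this example.

```python
import math

def log_quantize_channel(weights, bits=4):
    """Map each weight in one channel to (sign, exp), meaning sign * 2**exp.

    Assumed format: 1 sign bit + (bits - 1) exponent bits, with the
    exponent range anchored at the channel's largest magnitude.
    Zero weights get sign 0 (a simplification; a real format would
    need to reserve a code for zero).
    """
    n_exp = 2 ** (bits - 1)                      # number of exponent codes
    max_abs = max((abs(w) for w in weights), default=0.0)
    if max_abs == 0.0:
        return [(0, 0) for _ in weights]
    e_max = round(math.log2(max_abs))            # per-channel top exponent
    e_min = e_max - n_exp + 1
    codes = []
    for w in weights:
        if w == 0.0:
            codes.append((0, e_min))
            continue
        e = round(math.log2(abs(w)))             # nearest power of two in log domain
        e = min(max(e, e_min), e_max)            # clamp to the channel's range
        codes.append((1 if w > 0 else -1, e))
    return codes

def shift_multiply(x, sign, exp):
    """Compute an integer activation x times sign * 2**exp using only shifts."""
    shifted = x << exp if exp >= 0 else x >> -exp
    return sign * shifted

def shift_dot(activations, codes):
    """Dot product of integer activations with log-quantized weights."""
    return sum(shift_multiply(x, s, e) for x, (s, e) in zip(activations, codes))

codes = log_quantize_channel([0.5, -0.25, 0.0, 0.3])
result = shift_dot([8, 8, 8, 8], codes)
```

Because every weight is a power of two, the inner loop needs only shifts and adds, which is the general idea behind avoiding hardware multipliers; right shifts here truncate low-order bits, so a real design would also have to choose a fixed-point format for the activations.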
Pages: 14
Related papers
50 in total
  • [31] A Gain Enhancement Structure Using 28-nm CMOS Process for V-band Power Amplifier Applications
    Chuang, Kai-Jie
    Bai, Wei-Ting
    Chen, Yu-Chun
    Lin, Wen-Jie
    Tsai, Jeng-Han
    Huang, Tian-Wei
    2021 IEEE INTERNATIONAL SYMPOSIUM ON RADIO-FREQUENCY INTEGRATION TECHNOLOGY (RFIT), 2021,
  • [32] 5G-IoT Cloud based Demonstration of Real-Time Audio-Visual Speech Enhancement for Multimodal Hearing-aids
    Gupta, Ankit
    Bishnu, Abhijeet
    Gogate, Mandar
    Dashtipour, Kia
    Arslan, Tughrul
    Adeel, Ahsan
    Hussain, Amir
    Ratnarajah, Tharmalingam
    Sellathurai, Mathini
    INTERSPEECH 2023, 2023, : 686 - 687
  • [33] A 40 nm 144 mW VLSI Processor for Real-Time 60-kWord Continuous Speech Recognition
    He, Guangji
    Sugahara, Takanobu
    Miyamoto, Yuki
    Fujinaga, Tsuyoshi
    Noguchi, Hiroki
    Izumi, Shintaro
    Kawaguchi, Hiroshi
    Yoshimoto, Masahiko
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2012, 59 (08) : 1656 - 1666
  • [34] A 56-GS/s 8-bit Time-Interleaved ADC With ENOB and BW Enhancement Techniques in 28-nm CMOS
    Sun, Kexu
    Wang, Guanhua
    Zhang, Qing
    Elahmadi, Salam
    Gui, Ping
    IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2019, 54 (03) : 821 - 833
  • [35] Cache Resiliency Techniques for a Low-Voltage RISC-V Out-of-Order Processor in 28-nm CMOS
    Chiu, Pi-Feng
    Celio, Christopher
    Asanovic, Krste
    Nikolic, Borivoje
    Patterson, David
    IEEE SOLID-STATE CIRCUITS LETTERS, 2018, 1 (12) : 229 - 232
  • [36] Real-time Speech Enhancement with GCC-NMF
    Wood, Sean U. N.
    Rouat, Jean
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 2665 - 2669
  • [37] REAL-TIME SPEECH ENHANCEMENT USING EQUILIBRIATED RNN
    Takeuchi, Daiki
    Yatabe, Kohei
    Koizumi, Yuma
    Oikawa, Yasuhiro
    Harada, Noboru
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 851 - 855
  • [38] DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement
    Schroeter, Hendrik
    Escalante-B, Alberto N.
    Rosenkranz, Tobias
    Maier, Andreas
    INTERSPEECH 2023, 2023, : 2008 - 2009
  • [39] Real-Time Contrast Enhancement to Improve Speech Recognition
    Alexander, Joshua M.
    Jenison, Rick L.
    Kluender, Keith R.
    PLOS ONE, 2011, 6 (09):
  • [40] A Real-Time Convolutional Neural Network Based Speech Enhancement for Hearing Impaired Listeners Using Smartphone
    Bhat, Gautam S.
    Shankar, Nikhil
    Reddy, Chandan K. A.
    Panahi, Issa M. S.
    IEEE ACCESS, 2019, 7 : 78421 - 78433