A Real-Time Speech Enhancement Processor for Hearing Aids in 28-nm CMOS

Cited by: 0
Authors
Park, Sungjin [1, 2]
Lee, Sunwoo [1, 2]
Park, Jeongwoo [3]
Choi, Hyeong-Seok [4]
Lee, Kyogu [5, 6]
Jeon, Dongsuk [1, 2]
Affiliations
[1] Seoul Natl Univ, Res Inst Convergence Sci, Dept Intelligence & Informat, Seoul, South Korea
[2] Seoul Natl Univ, Interuniv Semicond Res Ctr, Seoul, South Korea
[3] Sungkyunkwan Univ, Dept Elect & Comp Engn, Suwon 16419, South Korea
[4] ElevenLabs, New York, NY USA
[5] Seoul Natl Univ, Dept Intelligence & Informat, Seoul 08826, South Korea
[6] Seoul Natl Univ, Interdisciplinary Program Artificial Intelligence, Seoul 08826, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Neural networks; Hearing aids; Real-time systems; Convolution; Optimization; Decoding; Computational modeling; Digital hearing aids; multiplier-less processing element (PE); neural network processor; reconfigurable architecture; speech enhancement (SE); PERFORMANCE; DEVICES;
DOI
10.1109/JSSC.2024.3460426
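Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code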
0808 ; 0809 ;
Abstract
Speech enhancement (SE) plays a key role in many audio-related applications by removing noise and enhancing the quality of human voice. Recent deep learning-based approaches provide high-quality SE, but real-time processing of those algorithms is challenging in resource-constrained devices due to high computational complexity. In this article, we present an energy-efficient real-time SE processor aimed at hearing aids. To implement high-quality SE with a very limited power budget, various algorithm and hardware optimization techniques are proposed. Our SE algorithm adaptively allocates computational resources to each region in the input feature domain depending on its importance, reducing overall computations by 29.7%. Along with 4-bit channel-wise logarithmic quantization, the processor adopts a reconfigurable multiplier-less processing element (PE) that supports both pre-/post-processing and neural network layers, resulting in a 21.5% area reduction. In addition, the design employs efficient scheduling and input buffering schemes to reduce on-chip memory access by 70.8%. Fabricated in a 28-nm CMOS process, our design consumes only 740 μW at 2.5 MHz with a total latency of 39.96 ms, satisfying the real-time processing constraints. In addition, our approach demonstrated higher SE quality than prior art in both objective and subjective evaluations.
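The abstract pairs logarithmic quantization with a multiplier-less PE but does not spell out the number format. As a rough illustration of why the two fit together, the sketch below assumes a generic power-of-two scheme: each weight is stored as 1 sign bit plus a 3-bit exponent, with one power-of-two scale per channel. The bit split, the zero handling, and the function names (`quantize_channel`, `shift_add_dot`, `dequantize`) are assumptions for illustration, not the paper's design.

```python
import math

EXP_BITS = 3                    # assumed split: 1 sign bit + 3 exponent bits = 4 bits
MAX_EXP = (1 << EXP_BITS) - 1   # largest exponent code, 7


def quantize_channel(weights):
    """Quantize one channel so that w ~= sign * 2**(scale_exp - e), e in 0..MAX_EXP."""
    max_abs = max(abs(w) for w in weights)
    # Per-channel scale: the smallest power of two that covers the largest magnitude.
    scale_exp = math.ceil(math.log2(max_abs)) if max_abs > 0 else 0
    codes = []
    for w in weights:
        sign = -1 if w < 0 else 1
        if w == 0:
            e = MAX_EXP         # this sketch has no zero code; use the smallest magnitude
        else:
            e = round(scale_exp - math.log2(abs(w)))
            e = min(max(e, 0), MAX_EXP)
        codes.append((sign, e))
    return scale_exp, codes


def shift_add_dot(activations, codes):
    """Dot product of integer activations and quantized weights using shifts and adds only.

    The accumulator is in units of 2**(scale_exp - MAX_EXP).
    """
    acc = 0
    for x, (sign, e) in zip(activations, codes):
        term = x << (MAX_EXP - e)   # multiplying by a power of two is a shift
        acc += term if sign > 0 else -term
    return acc


def dequantize(acc, scale_exp):
    """Convert the integer accumulator back to a real value."""
    return acc * 2.0 ** (scale_exp - MAX_EXP)


scale_exp, codes = quantize_channel([0.5, -0.25, 0.125, 1.0])
acc = shift_add_dot([4, 8, 16, 2], codes)
result = dequantize(acc, scale_exp)
```

Because every weight is a signed power of two, each product reduces to a bit shift, which is the general reason a PE built around such weights needs no hardware multiplier.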
Pages: 14