Modeling human auditory perception for noise-robust speech recognition

Cited by: 0
Authors
Lee, SY [1]
Affiliation
[1] Korea Adv Inst Sci & Technol, Dept BioSyst, Taejon 305701, South Korea
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Several bio-inspired models of human auditory perception are reported for robust speech recognition in real-world noisy environments. The developed mathematical models of the human auditory pathway are integrated into a speech recognition system with three components: (1) the nonlinear feature extraction model from the cochlea to the auditory cortex, (2) the binaural processing model at the superior olivary complex, and (3) the top-down attention model from the higher brain to the cochlea. Unsupervised Independent Component Analysis (ICA) shows that some auditory feature extraction and binaural processing mechanisms follow information theory with sparse representation. The ICA-based features resemble the frequency-limited features extracted by the cochlea, as well as the more complex time-frequency features found in the inferior colliculus and auditory cortex. The top-down attention model shows how pre-acquired knowledge in the brain filters out irrelevant features or fills in missing features in the sensory data. Top-down attention and bottom-up binaural processing are combined into a single system for high-noise cases. This auditory model requires extensive computation, and several VLSI implementations have been developed for real-time applications. Experimental results demonstrate much better recognition performance in real-world noisy environments.
Pages: PL72 / PL74
Number of pages: 3