Speech discrimination based on multiscale spectro-temporal modulations

Authors
Mesgarani, N [1]
Shamma, S [1]
Slaney, M [1]
Affiliation
[1] Univ Maryland, Neural Syst Lab, College Pk, MD 20742 USA
Keywords
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
A novel approach to content-based audio classification is presented, based on multiscale spectro-temporal modulation features extracted using a model of the auditory cortex. The task is to discriminate speech from non-speech, which consists of animal vocalizations, music, and environmental sounds. The system's generalization to signals with high levels of additive noise and reverberation is evaluated and compared with two existing approaches. The results demonstrate the advantages of the auditory model over the other two systems, especially at low SNRs and high reverberation.
Pages: 601-604
Number of pages: 4