DERIVING SPECTRO-TEMPORAL PROPERTIES OF HEARING FROM SPEECH DATA

Authors
Ondel, Lucas [1, 3]
Li, Ruizhi [1]
Sell, Gregory [1, 2]
Hermansky, Hynek [1, 2, 3]
Affiliations
[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA
[2] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD USA
[3] Brno Univ Technol, FIT, Ctr Excellence IT4I, Brno, Czech Republic
Source
2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2019
Funding
National Science Foundation (USA);
Keywords
perception; spectro-temporal; auditory; deep learning; RECOGNITION;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
Human hearing and human speech are intrinsically tied together, as the properties of speech almost certainly developed in order to be heard by human ears. As a result of this connection, it has been shown that certain properties of human hearing are mimicked within data-driven systems that are trained to understand human speech. In this paper, we further explore this phenomenon by measuring the spectro-temporal responses of data-derived filters in a front-end convolutional layer of a deep network trained to classify the phonemes of clean speech. The analyses show that the filters do indeed exhibit spectro-temporal responses similar to those measured in mammals, and also that the filters exhibit an additional level of frequency selectivity, similar to the processing pipeline assumed within the Articulation Index.
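The abstract does not say how the spectro-temporal responses of the filters are measured. As a rough illustration only, one common way to inspect a 2D (frequency by time) kernel is to take its 2D Fourier transform and read off the dominant temporal and spectral modulation. The sketch below assumes a kernel laid out as channels by frames and a 100 Hz frame rate; the function names, the frame rate, and the synthetic kernel are all hypothetical and are not taken from the paper.

```python
import numpy as np

def modulation_response(kernel, frame_rate_hz=100.0, n_fft=(64, 64)):
    """Magnitude of the zero-padded 2D FFT of a (channels x frames) kernel.

    frame_rate_hz is an assumed value, not one reported in the abstract.
    """
    mag = np.abs(np.fft.fftshift(np.fft.fft2(kernel, s=n_fft)))
    # Spectral modulation axis in cycles per frequency channel.
    spectral_axis = np.fft.fftshift(np.fft.fftfreq(n_fft[0], d=1.0))
    # Temporal modulation axis in Hz.
    temporal_axis = np.fft.fftshift(
        np.fft.fftfreq(n_fft[1], d=1.0 / frame_rate_hz)
    )
    return mag, spectral_axis, temporal_axis

def peak_modulation(kernel, frame_rate_hz=100.0, n_fft=(64, 64)):
    """Return (|spectral modulation|, |temporal modulation|) at the peak."""
    mag, s_ax, t_ax = modulation_response(kernel, frame_rate_hz, n_fft)
    i, j = np.unravel_index(np.argmax(mag), mag.shape)
    return abs(s_ax[i]), abs(t_ax[j])

# Synthetic stand-in for a learned filter: 8 channels x 16 frames,
# flat across channels, oscillating with a period of 8 frames in time.
frames = np.arange(16)
toy_kernel = np.tile(np.cos(2 * np.pi * frames / 8.0), (8, 1))

spectral_peak, temporal_peak = peak_modulation(toy_kernel)
print(spectral_peak, temporal_peak)
```

With a real network, `toy_kernel` would be replaced by one slice of the front-end convolutional layer's weights, and the frame rate by that of the input features.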
Pages: 411-415
Page count: 5
Related papers
50 in total
  • [21] A spectro-temporal modulation index (STMI) for assessment of speech intelligibility
    Elhilali, M
    Chi, T
    Shamma, SA
    SPEECH COMMUNICATION, 2003, 41 (2-3) : 331 - 348
  • [22] Spectro-temporal processing of speech - An information-theoretic framework
    Christiansen, Thomas U.
    Dau, Torsten
    Greenberg, Steven
    HEARING - FROM SENSORY PROCESSING TO PERCEPTION, 2007, : 517 - 523
  • [23] The impact of exploiting spectro-temporal context in computational speech segregation
    Bentsen, Thomas
    Kressner, Abigail A.
    Dau, Torsten
    May, Tobias
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2018, 143 (01): : 248 - 259
  • [24] Spectro-Temporal Deep Features for Disordered Speech Assessment and Recognition
    Geng, Mengzhe
    Liu, Shansong
    Yu, Jianwei
    Xie, Xurong
    Hu, Shoukang
    Ye, Zi
    Jin, Zengrui
    Liu, Xunying
    Meng, Helen
    INTERSPEECH 2021, 2021, : 4793 - 4797
  • [25] Methods for capturing spectro-temporal modulations in automatic speech recognition
    Kleinschmidt, M
    ACTA ACUSTICA UNITED WITH ACUSTICA, 2002, 88 (03) : 416 - 422
  • [26] A Spectro-Temporal Glimpsing Index (STGI) for Speech Intelligibility Prediction
    Edraki, Amin
    Chan, Wai-Yip
    Jensen, Jesper
    Fogerty, Daniel
    INTERSPEECH 2021, 2021, : 206 - 210
  • [27] Bioinspired sparse spectro-temporal representation of speech for robust classification
    Martinez, C.
    Goddard, J.
    Milone, D.
    Rufiner, H.
    COMPUTER SPEECH AND LANGUAGE, 2012, 26 (05): : 336 - 348
  • [28] Speech Intelligibility Prediction Using Spectro-Temporal Modulation Analysis
    Edraki, Amin
    Chan, Wai-Yip
    Jensen, Jesper
    Fogerty, Daniel
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 210 - 225
  • [29] Spectro-Temporal Directional Derivative Features for Automatic Speech Recognition
    Gibson, James
    Van Segbroeck, Maarten
    Ortega, Antonio
    Georgiou, Panayiotis
    Narayanan, Shrikanth
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 872 - 875
  • [30] Comparing the influence of spectro-temporal integration in computational speech segregation
    Bentsen, Thomas
    May, Tobias
    Kressner, Abigail A.
    Dau, Torsten
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3324 - 3328