Monaural speech intelligibility and detection in maskers with varying amounts of spectro-temporal speech features

Cited by: 28
Authors
Schubotz, Wiebke
Brand, Thomas
Kollmeier, Birger
Ewert, Stephan D. [1]
Affiliations
[1] Carl von Ossietzky Univ Oldenburg, Med Phys, D-26111 Oldenburg, Germany
Keywords
COMODULATION MASKING RELEASE; HEARING-IMPAIRED LISTENERS; INFORMATIONAL MASKING; FLUCTUATING NOISE; FREQUENCY-SELECTIVITY; SIMULTANEOUS TALKERS; RECEPTION THRESHOLD; PERCEPTION; INDEX; SEPARATION;
DOI
10.1121/1.4955079
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Speech intelligibility is strongly affected by the presence of maskers. Depending on the spectro-temporal structure of the masker and its similarity to the target speech, different masking aspects can occur, which are typically referred to as energetic, amplitude modulation, and informational masking. In this study, speech intelligibility and speech detection were measured in maskers that vary systematically in the time-frequency domain from steady-state noise to a single interfering talker. Male and female target speech was used in combination with maskers based on speech for the same or different gender. Observed data were compared to predictions of the speech intelligibility index, extended speech intelligibility index, multi-resolution speech-based envelope-power-spectrum model, and the short-time objective intelligibility measure. The different models served as analysis tools to help distinguish between the different masking aspects. Comparison shows that overall masking can to a large extent be explained by short-term energetic masking. However, the other masking aspects (amplitude modulation and informational masking) influence speech intelligibility as well. Additionally, all models showed considerable deviations from the data. Therefore, the current study provides a benchmark for further evaluation of speech prediction models. (C) 2016 Acoustical Society of America.
Pages: 524-540
Number of pages: 17