Comparing the influence of spectro-temporal integration in computational speech segregation

Cited by: 2
Authors
Bentsen, Thomas [1]
May, Tobias [1]
Kressner, Abigail A. [1]
Dau, Torsten [1]
Affiliations
[1] Tech Univ Denmark, Hearing Syst Grp, DK-2800 Lyngby, Denmark
Keywords
computational speech segregation; binary masks; supervised learning; spectro-temporal integration; INTELLIGIBILITY; NOISE; PERCEPTION; ALGORITHM; MASKING;
DOI
10.21437/Interspeech.2016-1025
Chinese Library Classification
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
The goal of computational speech segregation systems is to automatically segregate a target speaker from interfering maskers. Typically, these systems include a feature extraction stage in the front-end and a classification stage in the back-end. A spectro-temporal integration strategy can be applied in either the front-end, using the so-called delta features, or in the back-end, using a second classifier that exploits the posterior probability of speech from the first classifier across a spectro-temporal window. This study systematically analyzes the influence of such stages on segregation performance, the error distributions, and intelligibility predictions. Results indicated that it could be problematic to exploit context in the back-end, even though such a spectro-temporal integration stage improves the segregation performance. Also, the results emphasized the potential need for a single metric that comprehensively predicts computational segregation performance and correlates well with intelligibility. The outcome of this study could help to identify the most effective spectro-temporal integration strategy for computational segregation systems.
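The two integration strategies named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `delta_features` and `posterior_context`, the use of simple first-order differences, the window half-widths, and the edge padding are all assumptions made for illustration. The first function appends differences across time and frequency to a feature map (front-end integration); the second collects, for each time-frequency unit, the first-stage speech posteriors in a surrounding window, which a second classifier could take as input (back-end integration).

```python
import numpy as np


def delta_features(features):
    """Front-end integration (illustrative): stack the raw features with
    first-order differences across time and across frequency channels.

    features: array of shape (n_channels, n_frames)
    returns:  array of shape (3, n_channels, n_frames)
    """
    # Prepending the first frame/channel makes the first difference zero
    # and keeps the output the same shape as the input.
    d_time = np.diff(features, axis=1, prepend=features[:, :1])
    d_freq = np.diff(features, axis=0, prepend=features[:1, :])
    return np.stack([features, d_time, d_freq], axis=0)


def posterior_context(posteriors, half_f=1, half_t=1):
    """Back-end integration (illustrative): for every time-frequency unit,
    gather the first-stage posteriors inside a spectro-temporal window.

    posteriors: array of shape (n_channels, n_frames)
    returns:    array of shape (n_channels, n_frames, window_size),
                window_size = (2*half_f + 1) * (2*half_t + 1)
    """
    n_c, n_t = posteriors.shape
    # Edge padding is an assumption; it repeats border values.
    padded = np.pad(posteriors, ((half_f, half_f), (half_t, half_t)), mode="edge")
    windows = [
        padded[df:df + n_c, dt:dt + n_t]
        for df in range(2 * half_f + 1)
        for dt in range(2 * half_t + 1)
    ]
    return np.stack(windows, axis=-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    post = rng.random((31, 100))  # hypothetical posterior map: 31 channels, 100 frames
    print(delta_features(post).shape)     # (3, 31, 100)
    print(posterior_context(post).shape)  # (31, 100, 9)
```

In such a setup, the context vectors from `posterior_context` would be passed to a second-stage classifier, whose output is then thresholded to estimate the binary mask.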
Pages: 3324-3328
Number of pages: 5
Related papers
50 items in total
  • [1] The impact of exploiting spectro-temporal context in computational speech segregation
    Bentsen, Thomas
    Kressner, Abigail A.
    Dau, Torsten
    May, Tobias
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2018, 143 (01) : 248 - 259
  • [2] Aging and Spectro-Temporal Integration of Speech
    Grose, John H.
    Porter, Heather L.
    Buss, Emily
    TRENDS IN HEARING, 2016, 20
  • [3] Development of spectro-temporal features of speech in children
    Gautam S.
    Singh L.
    Gautam, Sumanlata (suman.gautam82@gmail.com), 1600, Springer Science and Business Media, LLC (20): : 543 - 551
  • [4] SPECTRO-TEMPORAL NEURAL FACTORIZATION FOR SPEECH DEREVERBERATION
    Chien, Jen-Tzung
    Kuo, Kuan-Ting
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5449 - 5453
  • [5] Localized spectro-temporal cepstral analysis of speech
    Bouvrie, Jake
    Ezzat, Tony
    Poggio, Tomaso
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4733 - 4736
  • [6] Speaker sex effects on temporal and spectro-temporal measures of speech
    Herrmann, Frank
    Cunningham, Stuart P.
    Whiteside, Sandra P.
    JOURNAL OF THE INTERNATIONAL PHONETIC ASSOCIATION, 2014, 44 (01) : 59 - 74
  • [7] Comparing Different Flavors of Spectro-Temporal Features for ASR
    Meyer, Bernd T.
    Ravuri, Suman V.
    Schaedler, Marc Rene
    Morgan, Nelson
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 1276 - +
  • [8] Spectro-Temporal Sparsity Characterization for Dysarthric Speech Detection
    Kodrasi, Ina
    Bourlard, Herve
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 1210 - 1222
  • [9] Spectro-Temporal Representation of Speech for Intelligibility Assessment of Dysarthria
    Chandrashekar, H. M.
    Karjigi, Veena
    Sreedevi, N.
    IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2020, 14 (02) : 390 - 399
  • [10] Spectro-Temporal Modulations for Robust Speech Emotion Recognition
    Yeh, Lan-Ying
    Chi, Tai-Shih
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 789 - 792