Comparing the influence of spectro-temporal integration in computational speech segregation

Cited by: 2
Authors
Bentsen, Thomas [1]
May, Tobias [1]
Kressner, Abigail A. [1]
Dau, Torsten [1]
Affiliations
[1] Tech Univ Denmark, Hearing Syst Grp, DK-2800 Lyngby, Denmark
Keywords
computational speech segregation; binary masks; supervised learning; spectro-temporal integration; INTELLIGIBILITY; NOISE; PERCEPTION; ALGORITHM; MASKING
DOI
10.21437/Interspeech.2016-1025
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
The goal of computational speech segregation systems is to automatically segregate a target speaker from interfering maskers. Typically, these systems include a feature extraction stage in the front-end and a classification stage in the back-end. A spectro-temporal integration strategy can be applied either in the front-end, using so-called delta features, or in the back-end, using a second classifier that exploits the posterior probability of speech from the first classifier across a spectro-temporal window. This study systematically analyzes the influence of such stages on segregation performance, error distributions, and intelligibility predictions. Results indicated that exploiting context in the back-end could be problematic, even though such a spectro-temporal integration stage improves segregation performance. The results also emphasized the potential need for a single metric that comprehensively predicts computational segregation performance and correlates well with intelligibility. The outcome of this study could help to identify the most effective spectro-temporal integration strategy for computational segregation systems.
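The two integration strategies named in the abstract can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the (channels, frames) array layout, the use of first-order differences as delta features, and the window size. In particular, `smooth_and_threshold` is only a stand-in that averages posteriors over the window; the abstract describes a trained second classifier at that point.

```python
import numpy as np


def add_delta_features(features):
    """Front-end context (assumed form): append first-order differences.

    features: array of shape (n_channels, n_frames).
    Returns an array of shape (3, n_channels, n_frames) holding the
    original features, deltas across time, and deltas across frequency.
    """
    # Difference to the previous frame; the first frame gets a delta of 0.
    delta_time = np.diff(features, axis=1, prepend=features[:, :1])
    # Difference to the lower channel; the lowest channel gets a delta of 0.
    delta_freq = np.diff(features, axis=0, prepend=features[:1, :])
    return np.stack([features, delta_time, delta_freq])


def posterior_context(posteriors, half_freq=1, half_time=1):
    """Back-end context: collect first-stage speech posteriors in a window.

    posteriors: array of shape (n_channels, n_frames) with values in [0, 1].
    Returns shape (n_channels, n_frames, window_size), where each
    time-frequency unit carries the posteriors of its neighbours.
    Edges are handled by repeating the border values.
    """
    n_channels, n_frames = posteriors.shape
    padded = np.pad(
        posteriors, ((half_freq, half_freq), (half_time, half_time)), mode="edge"
    )
    windows = [
        padded[df:df + n_channels, dt:dt + n_frames]
        for df in range(2 * half_freq + 1)
        for dt in range(2 * half_time + 1)
    ]
    return np.stack(windows, axis=-1)


def smooth_and_threshold(posteriors, threshold=0.5):
    """Stand-in for the second classifier: average the window, then threshold.

    Returns a boolean mask (True = unit labelled as speech-dominated).
    """
    return posterior_context(posteriors).mean(axis=-1) > threshold
```

In a supervised system, the output of `posterior_context` would serve as the input vectors for a second-stage classifier rather than being averaged; the averaging is kept here only so the sketch produces a binary mask without any training.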
Pages: 3324-3328
Number of pages: 5