Comparing the influence of spectro-temporal integration in computational speech segregation

Cited by: 2

Authors:

Bentsen, Thomas [1]

May, Tobias [1]

Kressner, Abigail A. [1]

Dau, Torsten [1]

Affiliations:

[1] Tech Univ Denmark, Hearing Syst Grp, DK-2800 Lyngby, Denmark

Source:

17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES | 2016

Keywords:

computational speech segregation; binary masks; supervised learning; spectro-temporal integration; INTELLIGIBILITY; NOISE; PERCEPTION; ALGORITHM; MASKING

DOI:

10.21437/Interspeech.2016-1025

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification code:

070206; 082403

Abstract:

The goal of computational speech segregation systems is to automatically segregate a target speaker from interfering maskers. Typically, these systems include a feature extraction stage in the front-end and a classification stage in the back-end. A spectro-temporal integration strategy can be applied either in the front-end, using so-called delta features, or in the back-end, using a second classifier that exploits the posterior probability of speech from the first classifier across a spectro-temporal window. This study systematically analyzes the influence of such stages on segregation performance, error distributions, and intelligibility predictions. The results indicated that exploiting context in the back-end could be problematic, even though such a spectro-temporal integration stage improves segregation performance. The results also emphasized the potential need for a single metric that comprehensively predicts computational segregation performance and correlates well with intelligibility. The outcome of this study could help to identify the most effective spectro-temporal integration strategy for computational segregation systems.
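The two integration strategies named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `add_delta_features` and `posterior_context`, the use of simple first-order differences as delta features, the rectangular window, and the edge padding are all assumptions made for illustration only. The sketch only builds the inputs; the classifiers themselves are left out.

```python
import numpy as np


def add_delta_features(features):
    """Front-end integration (assumed form): append first-order differences.

    features: array of shape (n_channels, n_frames), one value per
    time-frequency (T-F) unit.
    Returns an array of shape (3, n_channels, n_frames) holding the
    original features, the time deltas, and the frequency deltas.
    """
    features = np.asarray(features, dtype=float)
    # Difference to the previous frame; the first frame gets zero.
    delta_time = np.diff(features, axis=1, prepend=features[:, :1])
    # Difference to the lower neighbouring channel; the lowest channel gets zero.
    delta_freq = np.diff(features, axis=0, prepend=features[:1, :])
    return np.stack([features, delta_time, delta_freq])


def posterior_context(posteriors, half_freq=1, half_time=1):
    """Back-end integration (assumed form): gather first-stage posteriors
    in a rectangular window around every T-F unit.

    posteriors: array of shape (n_channels, n_frames) with the posterior
    probability of speech from a first classifier.
    Returns an array of shape (n_channels, n_frames, window_size), where
    window_size = (2 * half_freq + 1) * (2 * half_time + 1). Each vector
    along the last axis could serve as input to a second classifier.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    n_channels, n_frames = posteriors.shape
    # Replicate border values so that every unit has a full window.
    padded = np.pad(
        posteriors,
        ((half_freq, half_freq), (half_time, half_time)),
        mode="edge",
    )
    patches = []
    for df in range(2 * half_freq + 1):
        for dt in range(2 * half_time + 1):
            patches.append(padded[df:df + n_channels, dt:dt + n_frames])
    return np.stack(patches, axis=-1)
```

In this sketch the front-end variant enlarges the feature vector seen by a single classifier, whereas the back-end variant leaves the features untouched and instead gives a second classifier access to neighbouring first-stage decisions.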


Pages: 3324-3328

Number of pages: 5

