SINGLE-CHANNEL SPEAKER DIARIZATION BASED ON SPATIAL FEATURES

Cited by: 0

Authors:

Hu, Mathieu [1]

Parada, Pablo Peso [2]

Sharma, Dushyant [2]

Doclo, Simon [3,4]

van Waterschoot, Toon [5]

Brookes, Mike [1]

Naylor, Patrick A. [1]

Affiliations:

[1] Univ London Imperial Coll Sci Technol & Med, Dept Elect & Elect Engn, London SW7 2AZ, England

[2] Nuance Commun Inc, Voicemail To Text Res, Marlow, Bucks, England

[3] Carl von Ossietzky Univ Oldenburg, Dept Med Phys & Acoust, D-26111 Oldenburg, Germany

[4] Carl von Ossietzky Univ Oldenburg, Cluster Excellence Hearing4All, D-26111 Oldenburg, Germany

[5] Katholieke Univ Leuven, Dept Elect Engn ESAT STADIUS ETC, Leuven, Belgium

Source:

2015 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA) | 2015

Keywords:

Speaker diarization; direct-to-reverberant ratio; spatial acoustic features

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification code:

070206; 082403

Abstract:

Speaker diarization has gained much importance over the past five years in helping overcome key challenges faced by automatic meeting transcription systems. Current state-of-the-art algorithms can only utilize spatial information when multi-microphone recordings are available. In this paper, we propose the novel use of reverberation as a source of spatial information obtained from single-channel recordings to perform speaker diarization. The proposed system is shown to reduce speaker classification errors by 34% when compared with current MFCC-based single-channel systems.


Pages: 5
