Purity algorithms for speaker diarization of meetings data

Cited by: 0

Authors:

Anguera, Xavier [1]

Wooters, Chuck [1]

Hernando, Javier [1]

Affiliation:

[1] ICSI, Berkeley, CA 94704 USA

Source:

2006 IEEE International Conference on Acoustics, Speech and Signal Processing, Vols 1-13 | 2006

DOI: not available

Chinese Library Classification:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

When performing speaker diarization, it is common to use an agglomerative clustering approach in which the acoustic data is first split into small pieces and pairs of clusters are then merged until a stopping point is reached. With a purely agglomerative clustering technique, a cluster cannot be split into two once it has been formed. Therefore, errors caused by multiple speakers being assigned to one cluster can be common. Furthermore, clusters often contain non-speech frames, which creates problems when deciding which two clusters to merge and when to stop the clustering. In this paper, we present two algorithms that aim to purify the clusters. The first assigns conflicting speech segments to a new cluster, and the second detects and eliminates non-speech frames when comparing two clusters. We show improvements of over 18% relative using three datasets from the most current Rich Transcription (RT) evaluations.
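
The bottom-up procedure the abstract describes (start from small pieces, merge the closest pair, stop at some criterion) can be illustrated with a minimal Python sketch. This is not the authors' system: the 1-D toy features, the distance between cluster means, and the `stop_distance` threshold are all assumptions chosen only to make the idea concrete. It also shows why impure clusters are a problem: once two pieces are merged, nothing in the loop can separate them again.

```python
from itertools import combinations
from statistics import mean


def agglomerative_cluster(segments, stop_distance):
    """Greedy bottom-up clustering of toy 1-D features.

    segments: list of lists of floats, one list per initial small piece.
    stop_distance: hypothetical stopping threshold (an assumption of this
        sketch); merging stops once the closest pair of clusters is
        farther apart than this value.
    """
    clusters = [list(s) for s in segments]

    def distance(pair):
        # Toy cluster distance: absolute difference of cluster means.
        a, b = pair
        return abs(mean(clusters[a]) - mean(clusters[b]))

    while len(clusters) > 1:
        # Pick the closest pair of clusters (i < j by construction).
        i, j = min(combinations(range(len(clusters)), 2), key=distance)
        if distance((i, j)) > stop_distance:
            break  # stopping point reached
        # Merge cluster j into cluster i; merged clusters are never split.
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


result = agglomerative_cluster([[0.0, 0.1], [0.2], [5.0], [5.1, 5.2]], 1.0)
```

In this toy run the four initial pieces collapse into two clusters, one near 0 and one near 5. A purification step of the kind the abstract proposes would act between merges, which this sketch deliberately leaves out.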


Pages: 1025-1028

Page count: 4

