Acoustic beamforming for speaker diarization of meetings

Cited by: 274

Authors:

Anguera, Xavier [1]

Wooters, Chuck

Hernando, Javier

Affiliations:

[1] Telefon ID, Madrid 28043, Spain

[2] Univ Politecn Cataluna, E-08028 Barcelona, Spain

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2007, Vol. 15, No. 7

Keywords:

acoustic beamforming; meeting processing; speaker diarization; speaker segmentation and clustering;

DOI:

10.1109/TASL.2007.902460

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

When performing speaker diarization on recordings from meetings, multiple microphones of different qualities are usually available and distributed around the meeting room. Although several approaches have been proposed in recent years to take advantage of multiple microphones, they are either too computationally expensive and not easily scalable or they cannot outperform the simpler case of using the best single microphone. In this paper, the use of classic acoustic beamforming techniques is proposed together with several novel algorithms to create a complete frontend for speaker diarization in the meeting room domain. New techniques we are presenting include blind reference-channel selection, two-step time delay of arrival (TDOA) Viterbi postprocessing, and a dynamic output signal weighting algorithm, together with using such TDOA values in the diarization to complement the acoustic information. Tests on speaker diarization show a 25% relative improvement on the test set compared to using a single most centrally located microphone. Additional experimental results show improvements using these techniques in a speech recognition task.
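The abstract names TDOA estimation and classic beamforming without giving details. As a rough illustration only, the sketch below estimates each channel's delay against a reference channel and then averages the aligned channels (plain delay-and-sum). It assumes GCC-PHAT for the delay estimate, a fixed reference channel, equal channel weights, and a circular shift for alignment; none of these choices is stated in the abstract, and the function names `gcc_phat_delay` and `delay_and_sum` are hypothetical. The blind reference-channel selection, Viterbi postprocessing, and dynamic weighting described above are not implemented here.

```python
import numpy as np


def gcc_phat_delay(sig, ref, max_lag):
    """Estimate the delay of `sig` relative to `ref`, in samples (GCC-PHAT).

    A positive result means `sig` lags behind `ref`.
    """
    n = len(sig) + len(ref)                 # zero-pad to avoid wrap-around
    sig_f = np.fft.rfft(sig, n=n)
    ref_f = np.fft.rfft(ref, n=n)
    cross = sig_f * np.conj(ref_f)          # cross-power spectrum
    cross /= np.abs(cross) + 1e-12          # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)
    # Reorder so that the lags run from -max_lag to +max_lag.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    return int(np.argmax(cc)) - max_lag


def delay_and_sum(channels, ref_index=0, max_lag=50):
    """Align every channel to the reference channel and average them.

    `channels` is a sequence of equal-length 1-D arrays. Equal weights and a
    circular shift (np.roll) are simplifying assumptions of this sketch.
    """
    ref = np.asarray(channels[ref_index], dtype=float)
    out = np.zeros(len(ref))
    delays = []
    for ch in channels:
        ch = np.asarray(ch, dtype=float)
        d = gcc_phat_delay(ch, ref, max_lag)
        delays.append(d)
        out += np.roll(ch, -d)              # advance by d samples to align
    return out / len(channels), delays
```

In practice the delays would be estimated on short overlapping windows rather than on the whole recording, so that they can follow changes of speaker; the per-window delays are the kind of TDOA values the abstract proposes feeding into the diarization alongside the acoustic features.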

Pages: 2011-2022

Number of pages: 12
