Fusion of Acoustic and Prosodic Features for Speaker Clustering

Cited by: 0

Authors:

Zibert, Janez [1]

Mihelic, France [2]

Affiliations:

[1] Univ Primorska, Primorska Inst Nat Sci & Technol, Muzejski Trg 2, SI-6000 Koper, Slovenia

[2] Univ Ljubljana, Fac Elect Engn, Ljubljana 61000, Slovenia

Source:

TEXT, SPEECH AND DIALOGUE, PROCEEDINGS | 2009 / Vol. 5729

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This work focuses on speaker clustering methods used in speaker diarization systems. The purpose of speaker clustering is to group together segments that belong to the same speaker. It is usually applied in the last stage of the speaker-diarization process. We concentrate on developing suitable representations of speaker segments for clustering and explore different similarity measures for joining speaker segments. We implement two competing systems. The first is a standard approach that uses bottom-up agglomerative clustering with the Bayesian Information Criterion (BIC) as the merging criterion. The second is a fusion speaker clustering system in which the speaker segments are modeled by both acoustic and prosodic representations. The idea is to additionally model the speakers' prosodic characteristics and add them to the basic acoustic information estimated from the speaker segments. We construct 10 basic prosodic features derived from the energy of the audio signals, the estimated pitch contours, and the recognized voiced and unvoiced regions in speech. In this way we incorporate higher-level information into the representations of the speaker segments, which leads to improved clustering of the segments when speakers have similar acoustic characteristics or when acoustic conditions are poor.
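
The baseline named in the abstract, bottom-up agglomerative clustering with BIC as the merging criterion, can be illustrated with a minimal sketch. The abstract does not give the authors' exact formulation, so this uses the commonly cited delta-BIC between full-covariance Gaussian segment models. The function names `delta_bic` and `bic_cluster`, the penalty weight `lam`, and the small covariance regularizer are assumptions made for illustration only.

```python
import numpy as np


def delta_bic(x, y, lam=1.0):
    """Delta-BIC for two segments (frames x dims), each modeled by one
    full-covariance Gaussian. Negative values favor merging the segments.
    `lam` is an assumed penalty weight, not a value from the paper."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = np.vstack([x, y])
    d = z.shape[1]

    def logdet(a):
        # Maximum-likelihood covariance with a small ridge for stability.
        cov = np.atleast_2d(np.cov(a, rowvar=False, bias=True))
        return np.linalg.slogdet(cov + 1e-6 * np.eye(d))[1]

    # Extra parameters of the two-model hypothesis: one mean and one covariance.
    penalty = 0.5 * lam * (d + 0.5 * d * (d + 1)) * np.log(len(z))
    gain = 0.5 * (len(z) * logdet(z) - len(x) * logdet(x) - len(y) * logdet(y))
    return gain - penalty


def bic_cluster(segments, lam=1.0):
    """Greedy bottom-up clustering: repeatedly merge the pair with the lowest
    delta-BIC and stop once no pair has a negative delta-BIC."""
    data = [np.asarray(s, dtype=float) for s in segments]
    clusters = [[i] for i in range(len(data))]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                score = delta_bic(data[i], data[j], lam)
                if best is None or score < best[0]:
                    best = (score, i, j)
        score, i, j = best
        if score >= 0:
            break  # every remaining pair is better explained by two models
        clusters[i] += clusters[j]
        data[i] = np.vstack([data[i], data[j]])
        del clusters[j], data[j]
    return clusters
```

Each entry of `segments` is assumed to be an array of per-frame feature vectors for one speaker segment; the result is a list of clusters, each holding the indices of the segments assigned to it. A fusion system of the kind the abstract describes would combine an acoustic score like this with one computed on prosodic features, but the record does not say how the authors combine them.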


Pages: 210 / +

Number of pages: 3

