BINSEG: An Efficient Speaker-based Segmentation Technique

Cited by: 0

Authors:

Zdansky, Jindrich [1]

Affiliations:

[1] Tech Univ Liberec, Dept Elect & Signal Proc, Liberec 46117 1, Czech Republic

Source:

INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5 | 2006

Keywords:

speaker change detection; acoustic segmentation;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper we present a new, efficient approach to speaker-based audio stream segmentation. It employs the binary segmentation technique, which is well known from mathematical statistics. Because hypothesis testing is an integral part of this technique, we compare two well-founded approaches (Maximum Likelihood and Informational) and one commonly used approach (BIC difference) for deriving speaker-change test statistics. Based on the results of this comparison, we propose both off-line and on-line speaker change detection algorithms (including an effective training procedure) that combine high accuracy with low computational cost. In simulated tests with artificially mixed data, the on-line algorithm identified 95.7% of all speaker changes with a precision of 96.9%. In tests on 30 hours of real broadcast news (in 9 languages), the average recall was 74.4% and the precision 70.3%.
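The abstract names the ingredients (binary segmentation plus a speaker-change test statistic such as the BIC difference) but does not give the algorithm itself. The following is only a minimal sketch of how those two pieces generally fit together, not the paper's method. It assumes a one-dimensional feature stream modelled by a single Gaussian per segment. The function names, the `min_len` guard, and the `penalty_weight` tuning factor are all hypothetical choices made for illustration.

```python
import math
import random


def _log_var(xs):
    """Log of the maximum-likelihood variance of xs (floored to avoid log(0))."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return math.log(max(var, 1e-12))


def delta_bic(xs, t, penalty_weight=1.0):
    """BIC difference between 'change at t' and 'no change' for 1-D Gaussians.

    Positive values favour a change point at index t.
    """
    n = len(xs)
    gain = 0.5 * (n * _log_var(xs)
                  - t * _log_var(xs[:t])
                  - (n - t) * _log_var(xs[t:]))
    # A 1-D Gaussian has 2 free parameters (mean and variance).
    penalty = 0.5 * penalty_weight * 2 * math.log(n)
    return gain - penalty


def binary_segmentation(xs, start=0, min_len=10, penalty_weight=1.0):
    """Recursively split xs at the best-scoring point while the test accepts a change.

    Returns a sorted list of change-point indices relative to the full stream.
    """
    n = len(xs)
    if n < 2 * min_len:
        return []
    candidates = range(min_len, n - min_len + 1)
    best_t = max(candidates, key=lambda t: delta_bic(xs, t, penalty_weight))
    if delta_bic(xs, best_t, penalty_weight) <= 0:
        return []  # hypothesis test rejects a change in this segment
    left = binary_segmentation(xs[:best_t], start, min_len, penalty_weight)
    right = binary_segmentation(xs[best_t:], start + best_t, min_len, penalty_weight)
    return left + [start + best_t] + right


def make_demo_signal(seed=0):
    """Synthetic stream (an assumption for the demo): 150 samples around 0, then 150 around 5."""
    rng = random.Random(seed)
    return ([rng.gauss(0.0, 1.0) for _ in range(150)]
            + [rng.gauss(5.0, 1.0) for _ in range(150)])
```

Calling `binary_segmentation(make_demo_signal(), penalty_weight=3.0)` returns the detected change indices for the synthetic stream. A real system would operate on multidimensional acoustic features (for example cepstral vectors with full covariance matrices), and this naive version rescans each segment for every candidate split, so it is quadratic in segment length.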

Citation

Pages: 2182-2185

Number of pages: 4

50 items in total

[1] DISTBIC: A speaker-based segmentation for audio data indexing
Delacourt, P
Wellekens, CJ
[J]. SPEECH COMMUNICATION, 2000, 32 (1-2) : 111 - 126
[2] A two-level method for unsupervised speaker-based audio segmentation
Zhang, Shilei
Zhang, Shuwu
Xu, Bo
[J]. 18TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 4, PROCEEDINGS, 2006, : 298 - +
[3] Hybrid speaker-based segmentation system using model-level clustering
Kim, HG
Ertelt, D
Sikora, T
[J]. 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 745 - 748
[4] A SPEAKER-BASED APPROACH TO ASPECT
SMITH, C
[J]. LINGUISTICS AND PHILOSOPHY, 1986, 9 (01) : 97 - 115
[5] Audio data indexing : use of second-order statistics for speaker-based segmentation
Delacourt, P
Wellekens, C
[J]. IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA COMPUTING AND SYSTEMS, PROCEEDINGS VOL 2, 1999, : 959 - 963
[6] Is a speaker-based pragmatics possible? Or how can a hearer infer a speaker's commitment?
Moeschler, Jacques
[J]. JOURNAL OF PRAGMATICS, 2013, 48 (01) : 84 - 97
[7] Computationally efficient and robust BIC-based speaker segmentation
Kotti, Margarita
Benetos, Emmanouil
Kotropoulos, Constantine
[J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2008, 16 (05) : 920 - 933
[8] Negation in Modern Greek revisited: selecting between two speaker-based accounts
Veloudis, Ioannis
[J]. FOLIA LINGUISTICA, 2023, 57 (03) : 689 - 721
[9] DuG: Dual speaker-based acoustic gesture recognition for humanoid robot control
Ai, Haojun
Tang, Kaifeng
Han, Liangliang
Wang, Yifeng
Zhang, Sheng
[J]. INFORMATION SCIENCES, 2019, 504 : 84 - 94
[10] An Iterative Speaker Re-Diarization Scheme for Improving Speaker-Based Entity Extraction in Multimedia Archives
Ghaemmaghami, Houman
Dean, David
Sridharan, Sridha
[J]. 15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 577 - 581
