BINSEG: An Efficient Speaker-based Segmentation Technique

Authors
Zdansky, Jindrich [1]
Affiliations
[1] Tech Univ Liberec, Dept Elect & Signal Proc, Liberec 46117 1, Czech Republic
Keywords
speaker change detection; acoustic segmentation
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we present a new, efficient approach to speaker-based audio stream segmentation. It employs the binary segmentation technique, which is well known from mathematical statistics. Because hypothesis testing is an integral part of this technique, we compare two well-founded approaches (Maximum Likelihood and Informational) and one commonly used approach (BIC difference) for deriving speaker-change test statistics. Based on the results of this comparison, we propose both off-line and on-line speaker change detection algorithms (including an effective training procedure) that offer high accuracy at low computational cost. In simulated tests with artificially mixed data, the on-line algorithm identified 95.7% of all speaker changes with a precision of 96.9%. In tests on 30 hours of real broadcast news (in 9 languages), the average recall was 74.4% and the precision 70.3%.
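For readers unfamiliar with the two ingredients named in the abstract, the following is a minimal, generic sketch of off-line binary segmentation driven by a BIC-difference test. It is not the algorithm of the paper: the full-covariance Gaussian model, the penalty weight `lam`, the minimum segment length `margin`, and the function names are all assumptions made for illustration.

```python
import numpy as np


def _logdet_cov(x):
    # Log-determinant of the maximum-likelihood covariance of the frames in x.
    cov = np.atleast_2d(np.cov(x, rowvar=False, bias=True))
    cov = cov + 1e-6 * np.eye(cov.shape[0])  # small ridge for numerical safety
    return np.linalg.slogdet(cov)[1]


def delta_bic(x, t, lam=1.0):
    # BIC difference between "one Gaussian for x" and "two Gaussians split at t".
    # Positive values favour a change at frame t. lam is an assumed tuning weight.
    n, d = x.shape
    gain = 0.5 * (n * _logdet_cov(x)
                  - t * _logdet_cov(x[:t])
                  - (n - t) * _logdet_cov(x[t:]))
    penalty = 0.5 * lam * (d + d * (d + 1) / 2) * np.log(n)
    return gain - penalty


def binary_segmentation(x, start=0, margin=20, lam=1.0):
    # Find the best single split; if the test accepts it, recurse on both halves.
    # Returns a sorted list of change positions as indices into the full stream.
    n = len(x)
    if n < 2 * margin:
        return []  # too short to estimate two covariances reliably
    candidates = range(margin, n - margin + 1)
    scores = [delta_bic(x, t, lam) for t in candidates]
    best = int(np.argmax(scores))
    if scores[best] <= 0:
        return []  # no change hypothesis accepted in this segment
    t = candidates[best]
    left = binary_segmentation(x[:t], start, margin, lam)
    right = binary_segmentation(x[t:], start + t, margin, lam)
    return left + [start + t] + right
```

Calling `binary_segmentation(features)` on an `(n_frames, n_dims)` array of acoustic features returns candidate speaker-change frame indices. Raising `lam` makes the test more conservative, which generally trades recall for precision.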
Pages: 2182-2185
Number of pages: 4