Extending the Task of Diarization to Speaker Attribution

被引:0
|
作者
Ghaemmaghami, Houman [1 ]
Dean, David [1 ]
Vogt, Robbie [1 ]
Sridharan, Sridha [1 ]
机构
[1] Queensland Univ Technol, Speech & Audio Res Lab, Brisbane, Qld 4001, Australia
关键词
speaker attribution; diarization; clustering; cross likelihood ratio; joint factor analysis;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
In this paper we extend the concept of speaker annotation within a single-recording, or speaker diarization, to a collection wide approach we call speaker attribution. Accordingly, speaker attribution is the task of clustering expectantly homogenous intersession clusters obtained using diarization according to common cross-recording identities. The result of attribution is a collection of spoken audio across multiple recordings attributed to speaker identities. In this paper, an attribution system is proposed using mean-only MAP adaptation of a combined-gender UBM to model clusters from a perfect diarization system, as well as a JFA-based system with session variability compensation. The normalized cross-likelihood ratio is calculated for each pair of clusters to construct an attribution matrix and the complete linkage algorithm is employed to conduct clustering of the inter-session clusters. A matched cluster purity and coverage of 87.1% was obtained on the NIST 2008 SRE corpus.
引用
收藏
页码:1056 / 1059
页数:4
相关论文
共 50 条
  • [1] Towards a complete Binary Key System for the Speaker Diarization Task
    Delgado, Hector
    Fredouille, Corinne
    Serrano, Javier
    15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 572 - 576
  • [2] Leveraging speaker attribute information using multi task learning for speaker verification and diarization
    Luu, Chau
    Bell, Peter
    Renals, Steve
    INTERSPEECH 2021, 2021, : 491 - 495
  • [3] Incorporation of the ASR Output in Speaker Segmentation and Clustering within the Task of Speaker Diarization of Broadcast Streams
    Silovsky, Jan
    Zdansky, Jindrich
    Nouza, Jan
    Cerva, Petr
    Prazak, Jan
    2012 IEEE 14TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP), 2012, : 118 - 123
  • [4] SPEAKER DIARIZATION THROUGH SPEAKER EMBEDDINGS
    Rouvier, Mickael
    Bousquet, Pierre-Michel
    Favre, Benoit
    2015 23RD EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2015, : 2082 - 2086
  • [5] SPEAKER DIARIZATION WITH LSTM
    Wang, Quan
    Downey, Carlton
    Wan, Li
    Mansfield, Philip Andrew
    Moreno, Ignacio Lopez
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5239 - 5243
  • [6] Multimodal Speaker Diarization
    Noulas, Athanasios
    Englebienne, Gwenn
    Krose, Ben J. A.
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2012, 34 (01) : 79 - 93
  • [7] The Domain Mismatch Problem in the Broadcast Speaker Attribution Task
    Vinals, Ignacio
    Ortega, Alfonso
    Miguel, Antonio
    Lleida, Eduardo
    APPLIED SCIENCES-BASEL, 2021, 11 (18):
  • [8] Trainable Speaker Diarization
    Aronowitz, Hagai
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 2021 - 2024
  • [9] TSUP Speaker Diarization System for Conversational Short-phrase Speaker Diarization Challenge
    Pang, Bowen
    Zhao, Huan
    Zhang, Gaosheng
    Yang, Xiaoyue
    Sun, Yang
    Zhang, Li
    Wang, Qing
    Xie, Lei
    2022 13TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2022, : 502 - 506
  • [10] New Advances in Speaker Diarization
    Aronowitz, Hagai
    Zhu, Weizhong
    Suzuki, Masayuki
    Kurata, Gakuto
    Hoory, Ron
    INTERSPEECH 2020, 2020, : 279 - 283