SUPERVISED ONLINE DIARIZATION WITH SAMPLE MEAN LOSS FOR MULTI-DOMAIN DATA

Cited by: 0
Authors
Fini, Enrico [1]
Brutti, Alessio [2]
Affiliations
[1] PerVoice Spa, Trento, Italy
[2] Fdn Bruno Kessler, Trento, Italy
Keywords
Speaker diarization; x-vectors; clustering; supervised learning; recurrent neural networks
DOI
10.1109/icassp40776.2020.9053477
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Recently, a fully supervised speaker diarization approach (UIS-RNN) was proposed that models speakers using multiple instances of a parameter-sharing recurrent neural network. In this paper we propose qualitative modifications to the model that significantly improve the learning efficiency and the overall diarization performance. In particular, we introduce a novel loss function, which we call Sample Mean Loss, and we present a better model of speaker-turn behaviour by deriving an analytical expression for the probability of a new speaker joining the conversation. In addition, we demonstrate that our model can be trained on fixed-length speech segments, removing the need for speaker change information in inference. Using x-vectors as input features, we evaluate our proposed approach on the multi-domain dataset employed in the DIHARD-II challenge: our online method improves on the original UIS-RNN and achieves performance similar to an offline agglomerative clustering baseline using PLDA scoring.
Pages: 7134-7138
Number of pages: 5