IMPROVED LARGE-MARGIN SOFTMAX LOSS FOR SPEAKER DIARISATION

Citations: 0
Authors
Fathullah, Y. [1 ]
Zhang, C. [1 ]
Woodland, P. C. [1 ]
Affiliations
[1] Univ Cambridge, Engn Dept, Cambridge, England
Keywords
Speaker diarisation; speaker embeddings; large-margin softmax; overlapping speech; diarization
DOI
10.1109/icassp40776.2020.9053373
Chinese Library Classification (CLC)
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Speaker diarisation systems nowadays use embeddings generated from speech segments at a bottleneck layer, and these embeddings need to be discriminative for unseen speakers. It is well known that large-margin training can improve generalisation to unseen data, and it is widely used in such open-set problems. This paper therefore introduces a general approach to the large-margin softmax loss, without any approximations, to improve the quality of speaker embeddings for diarisation. Furthermore, a novel and simple way to stabilise training when the large-margin softmax is used is proposed. Finally, to combat the effect of overlapping speech, different training margins are used to reduce the negative effect that overlapping speech has on creating discriminative embeddings. Experiments on the AMI meeting corpus show that the use of large-margin softmax significantly improves the speaker error rate (SER). Using all hyper-parameters of the loss in a unified way gave further improvements, reaching a relative SER reduction of 24.6% over the baseline. Training overlapping and single-speaker speech samples with different margins gave the best result, an overall relative SER reduction of 29.5% over the baseline.
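The abstract describes training speaker embeddings with a large-margin softmax and applying different margins to overlapping and single-speaker segments. The PyTorch sketch below illustrates the general idea only: it assumes an additive-angular-margin style formulation, and the class name LargeMarginSoftmax, the scale value, and the choice of a smaller margin for overlapped samples are illustrative assumptions, not the exact loss or hyper-parameters used in the paper.

# Minimal sketch of a large-margin softmax loss with per-sample margins.
# The additive-angular-margin form and all names/values here are assumptions
# for illustration, not the paper's exact formulation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LargeMarginSoftmax(nn.Module):
    def __init__(self, embed_dim, num_speakers, scale=30.0):
        super().__init__()
        # Speaker "prototype" weights, compared to embeddings via cosine similarity.
        self.weight = nn.Parameter(torch.randn(num_speakers, embed_dim))
        self.scale = scale

    def forward(self, embeddings, labels, margins):
        # embeddings: (B, D); labels: (B,); margins: (B,) per-sample angular margin.
        cos = F.linear(F.normalize(embeddings), F.normalize(self.weight))  # (B, N)
        theta = torch.acos(cos.clamp(-1.0 + 1e-7, 1.0 - 1e-7))
        # Add the margin only to the target-speaker angle.
        target = F.one_hot(labels, cos.size(1)).bool()
        theta_m = torch.where(target, theta + margins.unsqueeze(1), theta)
        logits = self.scale * torch.cos(theta_m)
        return F.cross_entropy(logits, labels)

# Usage sketch: assume a smaller margin for overlapped segments, a larger one otherwise.
loss_fn = LargeMarginSoftmax(embed_dim=128, num_speakers=500)
emb = torch.randn(8, 128, requires_grad=True)
labels = torch.randint(0, 500, (8,))
is_overlap = torch.randint(0, 2, (8,)).bool()
margins = torch.where(is_overlap, torch.tensor(0.1), torch.tensor(0.3))
loss = loss_fn(emb, labels, margins)
loss.backward()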
Pages: 7104 - 7108
Number of pages: 5
Related Papers
50 in total
  • [21] Large-Margin Determinantal Point Processes
    Chao, Wei-Lun
    Gong, Boqing
    Grauman, Kristen
    Sha, Fei
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2015, : 191 - 200
  • [22] Deep Large-Margin Rank Loss for Multi-Label Image Classification
    Ma, Zhongchen
    Li, Zongpeng
    Zhan, Yongzhao
    MATHEMATICS, 2022, 10 (23)
  • [23] Fine-Grained Image Classification Using Modified DCNNs Trained by Cascaded Softmax and Generalized Large-Margin Losses
    Shi, Weiwei
    Gong, Yihong
    Tao, Xiaoyu
    Cheng, De
    Zheng, Nanning
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (03) : 683 - 694
  • [24] Multicategory large-margin unified machines
    Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States
    [Authors not specified]
    J. Mach. Learn. Res., 2013: 1349 - 1386
  • [25] Loss-Scaled Large-Margin Gaussian Mixture Models for Speech Emotion Classification
    Yun, Sungrack
    Yoo, Chang D.
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (02) : 585 - 598
  • [26] Large-margin feature selection for monotonic classification
    Hu, Qinghua
    Pan, Weiwei
    Song, Yanping
    Yu, Daren
    KNOWLEDGE-BASED SYSTEMS, 2012, 31 : 8 - 18
  • [27] Robust large-margin learning in hyperbolic space
    Weber, Melanie
    Zaheer, Manzil
    Rawat, Ankit Singh
    Menon, Aditya
    Kumar, Sanjiv
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [28] Large-Margin Classification in Infinite Neural Networks
    Cho, Youngmin
    Saul, Lawrence K.
    NEURAL COMPUTATION, 2010, 22 (10) : 2678 - 2697
  • [29] Contrapositive Margin Softmax Loss for Face Verification
    Xu, Dongxue
    Zhao, Qijun
    PROCEEDINGS OF ICRCA 2018: 2018 THE 3RD INTERNATIONAL CONFERENCE ON ROBOTICS, CONTROL AND AUTOMATION / ICRMV 2018: 2018 THE 3RD INTERNATIONAL CONFERENCE ON ROBOTICS AND MACHINE VISION, 2018, : 190 - 194
  • [30] Large-margin representation learning for texture classification
    de Matos, Jonathan
    de Oliveira, Luiz Eduardo Soares
    Britto Junior, Alceu de Souza
    Koerich, Alessandro Lameiras
    PATTERN RECOGNITION LETTERS, 2023, 170 : 39 - 47