IMPROVED LARGE-MARGIN SOFTMAX LOSS FOR SPEAKER DIARISATION

Times Cited: 0
Authors
Fathullah, Y. [1]
Zhang, C. [1]
Woodland, P. C. [1]
Affiliations
[1] Univ Cambridge, Engn Dept, Cambridge, England
Keywords
Speaker diarisation; speaker embeddings; large-margin softmax; overlapping speech; DIARIZATION;
DOI
10.1109/icassp40776.2020.9053373
Chinese Library Classification
O42 [Acoustics];
Subject Classification Codes
070206; 082403;
Abstract
Speaker diarisation systems nowadays generate speaker embeddings from speech segments at a bottleneck layer, and these embeddings need to be discriminative for unseen speakers. It is well known that large-margin training can improve generalisation to unseen data, and it is widely used in such open-set problems. This paper therefore introduces a general approach to the large-margin softmax loss, without any approximations, to improve the quality of speaker embeddings for diarisation. Furthermore, a novel and simple way to stabilise training with the large-margin softmax is proposed. Finally, to combat the effect of overlapping speech, different training margins are used to reduce the negative effect that overlapping speech has on creating discriminative embeddings. Experiments on the AMI meeting corpus show that the use of large-margin softmax significantly improves the speaker error rate (SER). Using all hyper-parameters of the loss in a unified way gave further improvements, reaching a relative SER reduction of 24.6% over the baseline. The best result, a 29.5% relative SER reduction over the baseline, was obtained by training overlapping and single-speaker speech samples with different margins.
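To make the idea of margin-based training concrete, below is a minimal NumPy sketch of one common large-margin variant, the additive-margin (AM-)softmax, in which a fixed margin is subtracted from the target-class cosine logit before the cross-entropy is computed. The paper's own loss is a more general large-margin softmax formulation without approximations, so this additive variant, the function name, and all parameter values here are illustrative assumptions, not the authors' exact method.

```python
import numpy as np

def am_softmax_loss(embeddings, weights, labels, scale=30.0, margin=0.2):
    """Illustrative additive-margin softmax loss (not the paper's exact loss).

    embeddings: (N, D) speaker embeddings; weights: (C, D) class weights;
    labels: (N,) integer speaker indices.
    """
    # Cosine logits between L2-normalised embeddings and class weights.
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    logits = scale * (e @ w.T)

    # Subtract the margin from the target-class logit only, which forces
    # the target cosine to exceed competitors by at least the margin.
    rows = np.arange(len(labels))
    logits[rows, labels] -= scale * margin

    # Numerically stable cross-entropy via log-sum-exp.
    m = logits.max(axis=1, keepdims=True)
    lse = np.log(np.exp(logits - m).sum(axis=1)) + m[:, 0]
    return float(np.mean(lse - logits[rows, labels]))
```

The per-segment `margin` argument is where a scheme like the paper's could plug in: segments labelled as overlapping speech would be trained with a different (e.g. smaller) margin than single-speaker segments, so that ambiguous overlapped frames do not distort the embedding space as strongly.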
Pages: 7104-7108
Page count: 5