IMPROVED LARGE-MARGIN SOFTMAX LOSS FOR SPEAKER DIARISATION

Citations: 0
Authors
Fathullah, Y. [1 ]
Zhang, C. [1 ]
Woodland, P. C. [1 ]
Affiliations
[1] Univ Cambridge, Engn Dept, Cambridge, England
Keywords
Speaker diarisation; speaker embeddings; large-margin softmax; overlapping speech; diarization
DOI
10.1109/icassp40776.2020.9053373
Chinese Library Classification (CLC)
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Speaker diarisation systems nowadays use embeddings generated from speech segments at a bottleneck layer, and these embeddings need to be discriminative for unseen speakers. It is well known that large-margin training can improve generalisation to unseen data, and it is widely used in such open-set problems. This paper therefore introduces a general approach to the large-margin softmax loss, without any approximations, to improve the quality of speaker embeddings for diarisation. Furthermore, a novel and simple way to stabilise training when the large-margin softmax is used is proposed. Finally, to combat the effect of overlapping speech, different training margins are used to reduce the negative effect that overlapping speech has on creating discriminative embeddings. Experiments on the AMI meeting corpus show that the use of large-margin softmax significantly improves the speaker error rate (SER). Using all hyper-parameters of the loss in a unified way gave further improvements, reaching a relative SER reduction of 24.6% over the baseline. Training overlapping and single-speaker speech samples with different margins gave the best result, an overall relative SER reduction of 29.5% over the baseline.
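The abstract describes training speaker embeddings with a large-margin softmax and applying different margins to overlapping and single-speaker segments. The PyTorch sketch below illustrates the general idea only: it assumes an additive-angular-margin style formulation, and the class name LargeMarginSoftmax, the scale value, and the choice of a smaller margin for overlapped samples are illustrative assumptions, not the exact loss or hyper-parameters used in the paper.

# Minimal sketch of a large-margin softmax loss with per-sample margins.
# The additive-angular-margin form and all names/values here are assumptions
# for illustration, not the paper's exact formulation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LargeMarginSoftmax(nn.Module):
    def __init__(self, embed_dim, num_speakers, scale=30.0):
        super().__init__()
        # Speaker "prototype" weights, compared to embeddings via cosine similarity.
        self.weight = nn.Parameter(torch.randn(num_speakers, embed_dim))
        self.scale = scale

    def forward(self, embeddings, labels, margins):
        # embeddings: (B, D); labels: (B,); margins: (B,) per-sample angular margin.
        cos = F.linear(F.normalize(embeddings), F.normalize(self.weight))  # (B, N)
        theta = torch.acos(cos.clamp(-1.0 + 1e-7, 1.0 - 1e-7))
        # Add the margin only to the target-speaker angle.
        target = F.one_hot(labels, cos.size(1)).bool()
        theta_m = torch.where(target, theta + margins.unsqueeze(1), theta)
        logits = self.scale * torch.cos(theta_m)
        return F.cross_entropy(logits, labels)

# Usage sketch: assume a smaller margin for overlapped segments, a larger one otherwise.
loss_fn = LargeMarginSoftmax(embed_dim=128, num_speakers=500)
emb = torch.randn(8, 128, requires_grad=True)
labels = torch.randint(0, 500, (8,))
is_overlap = torch.randint(0, 2, (8,)).bool()
margins = torch.where(is_overlap, torch.tensor(0.1), torch.tensor(0.3))
loss = loss_fn(emb, labels, margins)
loss.backward()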
Pages: 7104 - 7108
Number of pages: 5
Related Papers
50 in total
  • [21] Large-Margin Determinantal Point Processes
    Chao, Wei-Lun
    Gong, Boqing
    Grauman, Kristen
    Sha, Fei
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2015, : 191 - 200
  • [22] Deep Large-Margin Rank Loss for Multi-Label Image Classification
    Ma, Zhongchen
    Li, Zongpeng
    Zhan, Yongzhao
    MATHEMATICS, 2022, 10 (23)
  • [23] Fine-Grained Image Classification Using Modified DCNNs Trained by Cascaded Softmax and Generalized Large-Margin Losses
    Shi, Weiwei
    Gong, Yihong
    Tao, Xiaoyu
    Cheng, De
    Zheng, Nanning
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (03) : 683 - 694
  • [24] Multicategory large-margin unified machines
    Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States
    [Authors not specified]
    J. Mach. Learn. Res., 2013: 1349 - 1386
  • [25] Loss-Scaled Large-Margin Gaussian Mixture Models for Speech Emotion Classification
    Yun, Sungrack
    Yoo, Chang D.
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (02) : 585 - 598
  • [26] Large-margin feature selection for monotonic classification
    Hu, Qinghua
    Pan, Weiwei
    Song, Yanping
    Yu, Daren
    KNOWLEDGE-BASED SYSTEMS, 2012, 31 : 8 - 18
  • [27] Robust large-margin learning in hyperbolic space
    Weber, Melanie
    Zaheer, Manzil
    Rawat, Ankit Singh
    Menon, Aditya
    Kumar, Sanjiv
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [28] Large-Margin Classification in Infinite Neural Networks
    Cho, Youngmin
    Saul, Lawrence K.
    NEURAL COMPUTATION, 2010, 22 (10) : 2678 - 2697
  • [29] Contrapositive Margin Softmax Loss for Face Verification
    Xu, Dongxue
    Zhao, Qijun
    PROCEEDINGS OF ICRCA 2018: 2018 THE 3RD INTERNATIONAL CONFERENCE ON ROBOTICS, CONTROL AND AUTOMATION / ICRMV 2018: 2018 THE 3RD INTERNATIONAL CONFERENCE ON ROBOTICS AND MACHINE VISION, 2018, : 190 - 194
  • [30] Large-margin representation learning for texture classification
    de Matos, Jonathan
    de Oliveira, Luiz Eduardo Soares
    Britto Junior, Alceu de Souza
    Koerich, Alessandro Lameiras
    PATTERN RECOGNITION LETTERS, 2023, 170 : 39 - 47