OPTIMIZING NEURAL NETWORK EMBEDDINGS USING A PAIR-WISE LOSS FOR TEXT-INDEPENDENT SPEAKER VERIFICATION

Authors
Dhamyal, Hira [1]
Zhou, Tianyan [1]
Raj, Bhiksha [1]
Singh, Rita [1]
Affiliations
[1] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
Keywords
quartet loss; embeddings; neural-networks; speaker verification; DISCRIMINANT-ANALYSIS
DOI
10.1109/asru46091.2019.9003794
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a new loss function, the "quartet" loss, for better optimization of neural networks on matching tasks. In such tasks the network's embeddings are the key component, so optimizing the network to produce better embeddings is critical. The embeddings must be class-discriminative, with minimal intra-class variation and maximal inter-class variation, even for unseen classes, so that the network generalizes well. The quartet loss explicitly computes the distance metric between pairs of inputs and widens the gap between the similarity-score distribution of same-class pairs and that of different-class pairs. We evaluate on the speaker verification task and demonstrate the performance of the loss on our proposed neural network.
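The abstract does not give the loss formula, so the following is only a minimal sketch of the general idea it describes: score one same-class pair and one different-class pair of embeddings, then penalize the case where the same-class score does not exceed the different-class score by some margin. The function names, the choice of cosine similarity, the hinge form, and the `margin` value are all assumptions made for illustration and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pairwise_margin_loss(same_pair, diff_pair, margin=0.5):
    """Hypothetical pair-wise hinge loss over four embeddings.

    same_pair: two embeddings from the same speaker.
    diff_pair: two embeddings from different speakers.
    The loss is zero once the same-speaker similarity exceeds the
    different-speaker similarity by at least `margin` (an assumed value).
    """
    s_same = cosine_similarity(*same_pair)
    s_diff = cosine_similarity(*diff_pair)
    return max(0.0, margin - (s_same - s_diff))


# Well-separated case: identical same-speaker embeddings,
# orthogonal different-speaker embeddings.
good = pairwise_margin_loss(([1.0, 0.0], [1.0, 0.0]),
                            ([1.0, 0.0], [0.0, 1.0]))

# Badly separated case: the roles are swapped, so the loss is positive.
bad = pairwise_margin_loss(([1.0, 0.0], [0.0, 1.0]),
                           ([1.0, 0.0], [1.0, 0.0]))
```

In this sketch, minimizing the loss over many sampled groups of four embeddings pushes the two similarity-score distributions apart, which is the effect the abstract attributes to the quartet loss.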
Pages: 742-748
Page count: 7