OPTIMIZING NEURAL NETWORK EMBEDDINGS USING A PAIR-WISE LOSS FOR TEXT-INDEPENDENT SPEAKER VERIFICATION

被引：0

作者：

Dhamyal, Hira ^{[1
]}

Zhou, Tianyan ^{[1
]}

Raj, Bhiksha ^{[1
]}

Singh, Rita ^{[1
]}

机构：

[1] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA

来源：

2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019) | 2019年

关键词：

quartet loss; embeddings; neural-networks; speaker verification; DISCRIMINANT-ANALYSIS;

D O I：

10.1109/asru46091.2019.9003794

中图分类号：

TP18 [人工智能理论];

学科分类号：

081104 ; 0812 ; 0835 ; 1405 ;

摘要：

This paper proposes a new loss function called the "quartet" loss for the better optimization of the neural networks for matching tasks. For such tasks, where neural network embeddings are the key component, the optimization of the network for better embeddings is critical. The embeddings are required to be class discriminative, resulting in minimal inter-class variation and maximal intra-class variation even for unseen classes for better generalization of the network. The quartet loss explicitly computes the distance metric between pairs of inputs and increases the gap between the similarity score distributions between the same class pairs and the different class pairs. We evaluate on the speaker verification task and demonstrate the performance of the loss on our proposed neural network.

引用

页码：742 / 748

页数：7

共 50 条

[1] Deep Neural Network Embeddings for Text-Independent Speaker Verification
Snyder, David
Garcia-Romero, Daniel
Povey, Daniel
Khudanpur, Sanjeev
[J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 999 - 1003
[2] Text-Independent Speaker Verification Based on Triplet Convolutional Neural Network Embeddings
Zhang, Chunlei
Koishida, Kazuhito
Hansen, John H. L.
[J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (09) : 1633 - 1644
[3] Deep Neural Network Embeddings with Gating Mechanisms for Text-Independent Speaker Verification
You, Lanhua
Guo, Wu
Dai, Li-Rong
Du, Jun
[J]. INTERSPEECH 2019, 2019, : 1168 - 1172
[4] Deeply Fused Speaker Embeddings for Text-Independent Speaker Verification
Bhattacharya, Gautam
Alam, Jahangir
Gupta, Vishwa
Kenny, Patrick
[J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3588 - 3592
[5] Group-based speaker embeddings for text-independent speaker verification
Jung, Youngmoon
Eom, Youngsik
Lee, Yeonghyeon
Kim, Hoirin
[J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, 2021, 40 (05): : 496 - 502
[6] Self-Attentive Speaker Embeddings for Text-Independent Speaker Verification
Zhu, Yingke
Ko, Tom
Snyder, David
Mak, Brian
Povey, Daniel
[J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3573 - 3577
[7] Deep Speaker Embeddings with Convolutional Neural Network on Supervector for Text-Independent Speaker Recognition
Cai, Danwei
Cai, Zexin
Li, Ming
[J]. 2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 1478 - 1482
[8] Text-independent speaker verification using predictive neural networks
Finan, RA
Sapeluk, AT
Damper, RI
[J]. FIFTH INTERNATIONAL CONFERENCE ON ARTIFICIAL NEURAL NETWORKS, 1997, (440): : 274 - 279
[9] Bayesian Self-Attentive Speaker Embeddings for Text-Independent Speaker Verification
Zhu, Yingke
Mak, Brian
[J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 1000 - 1012
[10] Online text-independent speaker verification system using Autoassociative Neural Network models
Kishore, SP
Yegnanarayana, B
Gangashetty, SV
[J]. IJCNN'01: INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, PROCEEDINGS, 2001, : 1548 - 1553

← 1 2 3 4 5 →