Automatic text-independent speaker verification using convolutional deep belief network

被引：0

作者：

Rakhmanenko, I. A. ^{[1
]}

Shelupanov, A. A. ^{[1
]}

Kostyuchenko, E. Y. ^{[1
]}

机构：

[1] Tomsk State Univ Control Syst & Radioelect, Prospect Lenina 40, Tomsk 634050, Russia

来源：

COMPUTER OPTICS | 2020年 / 44卷 / 04期

关键词：

speaker recognition; speaker verification; Gaussian mixture models; GMM-UBM system; speech features; speech processing; deep learning; neural networks; pattern recognition; RECOGNITION;

D O I：

10.18287/2412-6179-CO-621

中图分类号：

O43 [光学];

学科分类号：

070207 ; 0803 ;

摘要：

This paper is devoted to the use of the convolutional deep belief network as a speech feature extractor for automatic text-independent speaker verification. The paper describes the scope and problems of automatic speaker verification systems. Types of modern speaker verification systems and types of speech features used in speaker verification systems are considered. The structure and learning algorithm of convolutional deep belief networks is described. The use of speech features extracted from three layers of a trained convolution deep belief network is proposed. Experimental studies of the proposed features were performed on two speech corpora: own speech corpus including audio recordings of 50 speakers and TIMIT speech corpus including audio recordings of 630 speakers. The accuracy of the proposed features was assessed using different types of classifiers. Direct use of these features did not increase the accuracy compared to the use of traditional spectral speech features, such as mel-frequency cepstral coefficients. However, the use of these features in the classifiers ensemble made it possible to achieve a reduction of the equal error rate to 0.21% on 50-speaker speech corpus and to 0.23% on the TIMIT speech corpus.

引用

下载

页码：596 / +

页数：12

共 50 条

[1] Deep Neural Network Embeddings for Text-Independent Speaker Verification
Snyder, David
Garcia-Romero, Daniel
Povey, Daniel
Khudanpur, Sanjeev
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 999 - 1003
[2] Text-Independent Speaker Verification Based on Triplet Convolutional Neural Network Embeddings
Zhang, Chunlei
Koishida, Kazuhito
Hansen, John H. L.
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (09) : 1633 - 1644
[3] Deep Speaker Feature Learning for Text-independent Speaker Verification
Li, Lantian
Chen, Yixiang
Shi, Zing
Tang, Zhiyuan
Wang, Dong
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1542 - 1546
[4] Deep Speaker Embeddings with Convolutional Neural Network on Supervector for Text-Independent Speaker Recognition
Cai, Danwei
Cai, Zexin
Li, Ming
2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 1478 - 1482
[5] Text-Independent Speaker Verification with Dual Attention Network
Li, Jingyu
Lee, Tan
INTERSPEECH 2020, 2020, : 956 - 960
[6] TEMPORAL DYNAMIC CONVOLUTIONAL NEURAL NETWORK FOR TEXT-INDEPENDENT SPEAKER VERIFICATION AND PHONEMIC ANALYSIS
Kim, Seong-Hu
Nam, Hyeonuk
Park, Yong-Hwa
2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 6742 - 6746
[7] Deep Neural Network Embeddings with Gating Mechanisms for Text-Independent Speaker Verification
You, Lanhua
Guo, Wu
Dai, Li-Rong
Du, Jun
INTERSPEECH 2019, 2019, : 1168 - 1172
[8] A tutorial on text-independent speaker verification
Bimbot, F. (bimbot@irisa.fr), 1600, Hindawi Publishing Corporation (2004):
[9] A tutorial on text-independent speaker verification
Bimbot, F
Bonastre, JF
Fredouille, C
Gravier, G
Magrin-Chagnolleau, I
Meignier, S
Merlin, T
Ortega-García, J
Petrovska-Delacrétaz, D
Reynolds, DA
EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING, 2004, 2004 (04) : 430 - 451
[10] A Tutorial on Text-Independent Speaker Verification
Frédéric Bimbot
Jean-François Bonastre
Corinne Fredouille
Guillaume Gravier
Ivan Magrin-Chagnolleau
Sylvain Meignier
Teva Merlin
Javier Ortega-García
Dijana Petrovska-Delacrétaz
Douglas A. Reynolds
EURASIP Journal on Advances in Signal Processing, 2004

← 1 2 3 4 5 →