Automatic text-independent speaker verification using convolutional deep belief network

Authors
Rakhmanenko, I. A. [1]
Shelupanov, A. A. [1]
Kostyuchenko, E. Y. [1]
Affiliations
[1] Tomsk State Univ Control Syst & Radioelect, Prospect Lenina 40, Tomsk 634050, Russia
Keywords
speaker recognition; speaker verification; Gaussian mixture models; GMM-UBM system; speech features; speech processing; deep learning; neural networks; pattern recognition; RECOGNITION
DOI
10.18287/2412-6179-CO-621
Chinese Library Classification number
O43 [Optics]
Discipline classification code
070207; 0803
Abstract
This paper examines the use of a convolutional deep belief network as a speech feature extractor for automatic text-independent speaker verification. The paper describes the scope and problems of automatic speaker verification systems. Types of modern speaker verification systems and the types of speech features they use are considered. The structure and learning algorithm of convolutional deep belief networks are described. The use of speech features extracted from three layers of a trained convolutional deep belief network is proposed. Experimental studies of the proposed features were performed on two speech corpora: the authors' own speech corpus, which includes audio recordings of 50 speakers, and the TIMIT speech corpus, which includes audio recordings of 630 speakers. The accuracy of the proposed features was assessed using different types of classifiers. Direct use of these features did not increase accuracy compared with traditional spectral speech features such as mel-frequency cepstral coefficients. However, using these features in a classifier ensemble reduced the equal error rate to 0.21% on the 50-speaker speech corpus and to 0.23% on the TIMIT speech corpus.
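The abstract reports results as an equal error rate (EER) but does not say how it was computed. As a rough illustration only, the sketch below shows one common way to estimate EER from verification scores: sweep a decision threshold and take the point where the false acceptance rate and false rejection rate are closest. The function name `equal_error_rate`, the convention that a higher score means "more likely the claimed speaker", and the toy scores are all assumptions made for this example, not details taken from the paper.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Estimate the EER by sweeping a threshold over all observed scores.

    Assumption: a trial is accepted when its score is >= the threshold.
    Returns (eer, threshold) at the point where FAR and FRR are closest.
    """
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best = None
    for t in thresholds:
        # False rejection rate: genuine trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance rate: impostor trials scored at or above it.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, t)
    return best[1], best[2]


# Toy, made-up scores purely for illustration.
genuine = [0.2, 0.6, 0.7, 0.8]
impostor = [0.1, 0.3, 0.4, 0.65]
eer, threshold = equal_error_rate(genuine, impostor)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With only a handful of trials the estimate is coarse; evaluations on corpora of the size mentioned above would typically use many more target and impostor trials.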
Pages: 596 / +
Page count: 12