Automatic text-independent speaker verification using convolutional deep belief network

Cited by: 0
Authors
Rakhmanenko, I. A. [1]
Shelupanov, A. A. [1]
Kostyuchenko, E. Y. [1]
Affiliation
[1] Tomsk State Univ Control Syst & Radioelect, Prospect Lenina 40, Tomsk 634050, Russia
Keywords
speaker recognition; speaker verification; Gaussian mixture models; GMM-UBM system; speech features; speech processing; deep learning; neural networks; pattern recognition; RECOGNITION
DOI
10.18287/2412-6179-CO-621
Chinese Library Classification
O43 [Optics]
Subject classification codes
070207; 0803
Abstract
This paper examines the use of a convolutional deep belief network as a speech feature extractor for automatic text-independent speaker verification. It describes the scope and problems of automatic speaker verification systems and considers the types of modern speaker verification systems and the speech features they use. The structure and learning algorithm of convolutional deep belief networks are described. The paper proposes using speech features extracted from three layers of a trained convolutional deep belief network. The proposed features were evaluated experimentally on two speech corpora: the authors' own corpus, with audio recordings of 50 speakers, and the TIMIT corpus, with audio recordings of 630 speakers. Their accuracy was assessed with several types of classifiers. Used directly, these features did not improve accuracy over traditional spectral speech features such as mel-frequency cepstral coefficients. However, using them in a classifier ensemble reduced the equal error rate to 0.21% on the 50-speaker corpus and to 0.23% on the TIMIT corpus.
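The equal error rate quoted in the abstract is the operating point at which the false acceptance rate (impostor trials accepted) equals the false rejection rate (genuine trials rejected). The following is a minimal sketch of how such a value could be estimated from verification scores. It is not the authors' evaluation code: the function name `equal_error_rate`, the "accept when score >= threshold" convention, and the example scores are all assumptions made for illustration.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Estimate the equal error rate (EER) from two lists of scores.

    Assumption: a trial is accepted when its score is >= the threshold,
    so higher scores mean "more likely the claimed speaker".
    Returns (eer, threshold) at the candidate threshold where the false
    acceptance and false rejection rates are closest.
    """
    # Every observed score is a candidate decision threshold.
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best = None
    for t in thresholds:
        # False acceptance rate: impostor trials wrongly accepted.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        # False rejection rate: genuine trials wrongly rejected.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of FAR and FRR at the closest crossing.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical scores, for illustration only.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine, impostor)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With only a handful of trials the two error curves may not cross exactly, which is why the sketch reports the midpoint at the closest threshold; evaluation toolkits typically interpolate between thresholds instead.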
Pages: 596 / +
Number of pages: 12