Speaker Identification Using Entropygrams and Convolutional Neural Networks

Cited by: 3

Authors:

Camarena-Ibarrola, Antonio [1]

Figueroa, Karina [2]

Garcia, Jonathan [2]

Affiliations:

[1] Univ Michoacana, Fac Ingn Elect, Div Estudios Postgrad, Morelia 58000, Michoacan, Mexico

[2] Univ Michoacana, Fac Ciencias Fis Matemat, Morelia 58000, Michoacan, Mexico

Source:

ADVANCES IN SOFT COMPUTING, MICAI 2020, PT I | 2020 / Vol. 12468

Keywords:

Speaker identification; Spectral entropy; Convolutional Neural Network

DOI:

10.1007/978-3-030-60884-2_2

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Speaker identification consists of discovering the identity of an individual from a captured speech signal, and it is still an open problem. Recent advances in deep learning in combination with spectrograms encouraged us to propose the use of entropygrams in combination with convolutional neural networks (CNNs). We extract the entropygrams of specific words uttered by the individual whose identity needs to be found among those known to the system. An entropygram is an image that shows how the information contained in the speech signal is distributed along frequency and how that distribution evolves in time. By extracting the entropygram from the speech signal we effectively transform the problem into an image recognition task, and CNNs are known to be very useful for image recognition. In our experiments we used a collection of 21 young Mexican speakers of both genders and confirmed our hypothesis that entropygrams can successfully be used instead of spectrograms for speaker identification using CNNs. We also experimented with noisy speech and found that, as images representing the speakers to be identified with a CNN, entropygrams outperform spectrograms.
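
The abstract does not give the exact computation of an entropygram. The following is only a minimal sketch of one plausible reading: the Shannon entropy of the normalized power spectrum inside each frequency band, computed frame by frame, which yields a band-by-time image. The function name `entropygram`, the equal-width linear bands, and the default frame length, hop, and band count are all assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def entropygram(signal, frame_len=512, hop=256, n_bands=16):
    """Hypothetical helper: per-band spectral entropy over time.

    Returns an array of shape (n_bands, n_frames). Assumes `signal` is a
    1-D NumPy array with at least `frame_len` samples.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Power spectrum of every frame: shape (n_frames, frame_len // 2 + 1).
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Assumption: equal-width linear bands (the paper may use another scale).
    bands = np.array_split(np.arange(power.shape[1]), n_bands)
    image = np.empty((n_bands, n_frames))
    for b, idx in enumerate(bands):
        p = power[:, idx]
        # Normalize each frame's band power into a probability distribution.
        p = p / (p.sum(axis=1, keepdims=True) + 1e-12)
        # Shannon entropy in bits; the small constant avoids log2(0).
        image[b] = -(p * np.log2(p + 1e-12)).sum(axis=1)
    return image
```

Under these assumptions, a band dominated by a single spectral peak gives an entropy near zero, while a band with flat, noise-like content approaches the logarithm of the number of bins in that band. The resulting 2-D array could then be fed to an image classifier in the same way a spectrogram would be.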

Pages: 23-34

Number of pages: 12

50 entries in total

[1] SPEAKER IDENTIFICATION AND CLUSTERING USING CONVOLUTIONAL NEURAL NETWORKS
Lukic, Yanick
Vogt, Carlo
Durr, Oliver
Stadelmann, Thilo
[J]. 2016 IEEE 26TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2016,
[2] Text-Independent Speaker Identification Using Formants and Convolutional Neural Networks
Camarena-Ibarrola, Antonio
Reynoso, Miguel
Figueroa, Karina
[J]. ADVANCES IN SOFT COMPUTING (MICAI 2021), PT II, 2021, 13068 : 108 - 119
[3] Speaker identification using neural networks
Pawar, RV
Kajave, PP
Mali, SN
[J]. ENFORMATIKA, VOL 7: IEC 2005 PROCEEDINGS, 2005, : 429 - 433
[4] Speaker Identification using Neural Networks
Pawar, R. V.
Kajave, P. P.
Mali, S. N.
[J]. PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY, VOL 7, 2005, 7 : 429 - 433
[5] Speaker Diarization Using Deep Recurrent Convolutional Neural Networks for Speaker Embeddings
Cyrta, Pawel
Trzcinski, Tomasz
Stokowiec, Wojciech
[J]. INFORMATION SYSTEMS ARCHITECTURE AND TECHNOLOGY, PT I, 2018, 655 : 107 - 117
[6] Speaker identification using Neural Networks on an FPGA
Trujillo-Romero, F.
Caballero-Morales, S. O.
[J]. 2012 IEEE NINTH ELECTRONICS, ROBOTICS AND AUTOMOTIVE MECHANICS CONFERENCE (CERMA 2012), 2012, : 197 - 202
[7] Speaker identification from voice using neural networks
Biswas, B
Konar, A
[J]. JOURNAL OF SCIENTIFIC & INDUSTRIAL RESEARCH, 2002, 61 (08) : 599 - 606
[8] 3D CONVOLUTIONAL NEURAL NETWORKS BASED SPEAKER IDENTIFICATION AND AUTHENTICATION
Liao, Jianguo
Wang, Shilin
Zhang, Xingxuan
Liu, Gongshen
[J]. 2018 25TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2018, : 2042 - 2046
[9] Coevolutionary approach to speaker identification using neural networks
He, XM
Hu, GR
Tan, ZH
[J]. 2000 5TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I-III, 2000, : 1572 - 1575
[10] Speaker verification and identification using gamma neural networks
Wang, C
Xu, DX
Principe, JC
[J]. 1997 IEEE INTERNATIONAL CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, 1997, : 2085 - 2088
