Speaker Identification Using Entropygrams and Convolutional Neural Networks

Cited by: 3
Authors
Camarena-Ibarrola, Antonio [1]
Figueroa, Karina [2]
Garcia, Jonathan [2]
Affiliations
[1] Univ Michoacana, Fac Ingn Elect, Div Estudios Postgrad, Morelia 58000, Michoacan, Mexico
[2] Univ Michoacana, Fac Ciencias Fis Matemat, Morelia 58000, Michoacan, Mexico
Keywords
Speaker identification; Spectral entropy; Convolutional Neural Network
DOI
10.1007/978-3-030-60884-2_2
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Speaker identification consists of discovering the identity of an individual from a captured speech signal, and it remains an open problem. Recent advances in deep learning combined with spectrograms encouraged us to propose the use of entropygrams in combination with convolutional neural networks. We extract the entropygrams of specific words uttered by the individual whose identity must be found among those known to the system. An entropygram is an image that shows how the information contained in the speech signal is distributed across frequency and how that distribution evolves over time. By extracting the entropygram from the speech signal, we effectively transform the problem into an image recognition task, and Convolutional Neural Networks (CNN) are known to be very useful for image recognition. In our experiments we used a collection of 21 young Mexican speakers of both genders and confirmed our hypothesis that entropygrams can successfully be used instead of spectrograms for speaker identification using CNN. We also experimented with noisy speech and found that entropygrams outperform spectrograms as the image representation of the speakers to be identified using CNN.
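The abstract does not spell out how an entropygram is computed, so the following is only an illustrative sketch of the general idea (entropy per frequency band, tracked over time), not the authors' formulation. It assumes a plain Shannon entropy of the normalized power spectrum inside each band of each short-time frame; the function name `entropygram` and the parameters `frame_len`, `hop`, and `n_bands` are hypothetical choices made for this example.

```python
import numpy as np

def entropygram(signal, frame_len=256, hop=128, n_bands=16):
    """Return an (n_bands, n_frames) array of per-band spectral entropy in bits.

    Assumption: entropy is the Shannon entropy of the normalized power
    spectrum within each band; the paper may estimate it differently.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    # Slice the signal into overlapping, Hann-windowed frames.
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Power spectrum of every frame: shape (n_frames, frame_len // 2 + 1).
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Split the frequency bins into n_bands contiguous groups.
    bands = np.array_split(np.arange(power.shape[1]), n_bands)
    image = np.zeros((n_bands, n_frames))
    for b, idx in enumerate(bands):
        p = power[:, idx]
        # Normalize within the band so each frame's values sum to about 1.
        p = p / (p.sum(axis=1, keepdims=True) + 1e-12)
        image[b] = -(p * np.log2(p + 1e-12)).sum(axis=1)
    return image

# Usage with synthetic noise standing in for a recorded word.
rng = np.random.default_rng(0)
img = entropygram(rng.standard_normal(8000))
```

The resulting 2-D array can be treated like a grayscale image (rows are frequency bands, columns are time frames), which is the kind of input an image-classification CNN expects.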
Pages: 23-34
Number of pages: 12