Efficient speaker identification using spectral entropy

Cited by: 7
Authors
Luque-Suarez, Fernando [1]
Camarena-Ibarrola, Antonio [2]
Chavez, Edgar [1]
Affiliations
[1] CICESE, Ensenada, Baja California, Mexico
[2] Univ Michoacana, Morelia, Michoacan, Mexico
Keywords
Speaker recognition; Speaker identification; Entropygrams; Recognition
DOI
10.1007/s11042-018-7035-9
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In voice recognition, the two main problems are speech recognition (what was said) and speaker recognition (who was speaking). The usual method for speaker recognition is to postulate a model in which the speaker's identity corresponds to the model's parameters, whose estimation can be time-consuming when the number of candidate speakers is large. In this paper, we model the speaker as a high-dimensional point cloud of entropy-based features extracted from the speech signal. The method allows indexing and can therefore manage large databases. We experimentally assessed identification quality on a publicly available database formed by extracting audio from a collection of YouTube videos of 1,000 different speakers. With 20-second audio excerpts, we identified a speaker with 97% accuracy when the recording environment was not controlled, and with 99% accuracy in controlled recording environments.
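The idea of treating a speaker as a cloud of entropy-based feature points can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no algorithmic details, so the framing parameters, the equal-width band layout, the function names `band_entropy_features` and `identify`, and the brute-force nearest-neighbor vote (standing in for a real index) are all assumptions made for illustration.

```python
import numpy as np


def band_entropy_features(signal, frame_len=512, hop=256, n_bands=8):
    """Return one row per frame: Shannon entropy (bits) of the power
    spectrum inside each band. Band layout is an assumption, not the paper's."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    feats = np.empty((n_frames, n_bands))
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        # Drop the DC bin and split the rest into equal-width bands.
        bands = np.array_split(power[1:], n_bands)
        for b, p in enumerate(bands):
            p = p / (p.sum() + 1e-12)  # normalize to a probability mass function
            feats[i, b] = -np.sum(p * np.log2(p + 1e-12))
    return feats


def identify(query, enrolled):
    """Each query point votes for the speaker that owns its nearest enrolled
    point; brute-force search is used here in place of an index."""
    labels = list(enrolled)
    votes = dict.fromkeys(labels, 0)
    for q in query:
        dists = [np.min(np.linalg.norm(enrolled[name] - q, axis=1)) for name in labels]
        votes[labels[int(np.argmin(dists))]] += 1
    return max(votes, key=votes.get)


# Usage with synthetic data (hypothetical, for shape checking only).
rng = np.random.default_rng(0)
features = band_entropy_features(rng.normal(size=16000))
cloud_a = rng.normal(0.0, 1.0, size=(50, 8))
cloud_b = rng.normal(10.0, 1.0, size=(50, 8))
winner = identify(rng.normal(10.0, 1.0, size=(20, 8)), {"a": cloud_a, "b": cloud_b})
```

Because every frame becomes an independent point in a metric space, the brute-force search in `identify` could in principle be swapped for a nearest-neighbor index, which is the property the abstract credits for scaling to large databases.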
Pages: 16803-16815
Page count: 13