Efficient speaker identification using spectral entropy

Cited by: 7
Authors
Luque-Suarez, Fernando [1]
Camarena-Ibarrola, Antonio [2]
Chavez, Edgar [1]
Affiliations
[1] CICESE, Ensenada, Baja California, Mexico
[2] Univ Michoacana, Morelia, Michoacan, Mexico
Keywords
Speaker recognition; Speaker identification; Entropygrams; RECOGNITION
DOI
10.1007/s11042-018-7035-9
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In voice recognition, the two main problems are speech recognition (what was said) and speaker recognition (who was speaking). The usual method for speaker recognition is to postulate a model in which the speaker identity corresponds to the model parameters, whose estimation can be time-consuming when the number of candidate speakers is large. In this paper, we model the speaker as a high-dimensional point cloud of entropy-based features extracted from the speech signal. The method allows indexing, and hence it can manage large databases. We experimentally assessed the quality of the identification with a publicly available database formed by extracting audio from a collection of YouTube videos of 1,000 different speakers. With 20-second audio excerpts, we were able to identify a speaker with 97% accuracy when the recording environment is not controlled, and with 99% accuracy for controlled recording environments.
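The abstract does not spell out how the entropy-based features are computed, so the following is only a minimal sketch of one common way to obtain a per-frame, per-band spectral entropy vector, which would turn a recording into a cloud of points. The function name `spectral_entropy_frames`, the frame length, the hop size, the number of bands, and the equal-width band split are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def spectral_entropy_frames(signal, frame_len=512, hop=256, n_bands=8):
    """Return an (n_frames, n_bands) array of Shannon entropies in bits.

    Hypothetical helper: each row is one point of the "point cloud".
    All parameter defaults are illustrative assumptions.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)
    # Number of full frames that fit in the signal (0 if it is too short).
    n_frames = max(0, 1 + (len(signal) - frame_len) // hop)
    feats = np.zeros((n_frames, n_bands))
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        # Equal-width bands are an assumption; a perceptual scale could be used.
        for b, band in enumerate(np.array_split(power, n_bands)):
            total = band.sum()
            if total <= 0:
                continue  # silent band: leave the entropy at 0
            p = band / total          # normalize power to a distribution
            p = p[p > 0]              # avoid log2(0)
            feats[i, b] = -np.sum(p * np.log2(p))
    return feats

# Usage: noise spreads energy across a band, a pure tone concentrates it.
rng = np.random.default_rng(0)
noise = rng.standard_normal(16000)
tone = np.sin(2 * np.pi * (16 / 512) * np.arange(16000))
noise_feats = spectral_entropy_frames(noise)
tone_feats = spectral_entropy_frames(tone)
```

In this sketch a flat, noise-like band yields an entropy close to the logarithm of the number of bins in the band, while a band dominated by a single spectral peak yields a value close to zero. The resulting feature vectors could then be stored in a nearest-neighbor index, which is one plausible reading of the abstract's remark that the method allows indexing.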
Pages: 16803-16815
Number of pages: 13