Efficient speaker identification using spectral entropy

Cited by: 0

Authors:

Fernando Luque-Suárez

Antonio Camarena-Ibarrola

Edgar Chávez

Affiliations:

[1] CICESE

[2] Universidad Michoacana

Source:

Multimedia Tools and Applications | 2019 / Vol. 78

Keywords:

Speaker recognition; Speaker identification; Entropygrams

DOI:

Not available

Chinese Library Classification number:

Subject classification code:

Abstract:

In voice recognition, the two main problems are speech recognition (what was said) and speaker recognition (who was speaking). The usual method for speaker recognition is to postulate a model in which the speaker identity corresponds to the parameters of the model, whose estimation can be time-consuming when the number of candidate speakers is large. In this paper, we model the speaker as a high-dimensional point cloud of entropy-based features extracted from the speech signal. The method allows indexing, and hence it can manage large databases. We experimentally assessed the quality of the identification with a publicly available database formed by extracting audio from a collection of YouTube videos of 1,000 different speakers. With 20-second audio excerpts, we were able to identify a speaker with 97% accuracy when the recording environment is not controlled, and with 99% accuracy for controlled recording environments.
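The abstract does not give the feature definition or the matching rule, so the following is only a rough sketch of the general idea of representing a speaker as a cloud of entropy-based feature vectors. Everything in it is an assumption made for illustration: the function names `band_entropy_features` and `identify`, the frame length, hop, and number of bands, the use of per-band Shannon entropy of the normalized power spectrum, and the brute-force nearest-neighbor voting (which merely stands in for the index the abstract mentions).

```python
import numpy as np

def band_entropy_features(signal, frame_len=256, hop=128, n_bands=8):
    """Return one row per frame; each column is the Shannon entropy (bits)
    of the normalized power spectrum inside one frequency band.
    All parameter values are illustrative assumptions."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    feats = np.empty((n_frames, n_bands))
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        # Split the spectrum into contiguous bands of nearly equal width.
        for b, band in enumerate(np.array_split(power, n_bands)):
            p = band / (band.sum() + 1e-12)  # treat the band as a distribution
            feats[i, b] = -np.sum(p * np.log2(p + 1e-12))
    return feats

def identify(query, enrolled):
    """Each query vector votes for the speaker that owns its nearest
    enrolled point; the speaker with the most votes is returned.
    Brute-force search is used here in place of a real index."""
    names = list(enrolled)
    votes = np.zeros(len(names), dtype=int)
    for q in query:
        dists = [np.min(np.linalg.norm(enrolled[n] - q, axis=1)) for n in names]
        votes[int(np.argmin(dists))] += 1
    return names[int(np.argmax(votes))]
```

In use, `enrolled` would map each speaker name to the feature matrix computed from that speaker's enrollment audio, and `query` would be the feature matrix of the excerpt to identify. For a large speaker set, the linear scan in `identify` would presumably be replaced by a nearest-neighbor index.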

Pages: 16803-16815

Page count: 12
