Experimental Evaluation of Features for Robust Speaker Identification

Cited by: 145
Author
Reynolds, Douglas A. [1]
Affiliation
[1] MIT, Lincoln Lab, Lexington, MA 02173 USA
Keywords
Communication channels (information theory) - Database systems - Digital filters - Identification (control systems) - Iterative methods - Mathematical models - Matrix algebra - Robustness (control systems) - Speech analysis - Speech processing - Vectors
DOI
10.1109/89.326623
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This correspondence presents an experimental evaluation of different features and channel compensation techniques for robust speaker identification. The goal is to keep all processing and classification steps constant and to vary only the features and compensations used, to allow a controlled comparison. A general maximum-likelihood classifier based on Gaussian mixture densities is used, and experiments are conducted on the King speech database, a conversational, telephone-speech database. The features examined are mel-frequency and linear-frequency filterbank cepstral coefficients, linear prediction cepstral coefficients, and perceptual linear prediction (PLP) cepstral coefficients. The channel compensation techniques examined are cepstral mean removal, RASTA processing, and a quadratic trend removal technique. It is shown for this database that performance differences between the basic features are small, and the major gains are due to the channel compensation techniques. The best "across-the-divide" recognition accuracy of 92% is obtained for both high-order LPC features and band-limited filterbank features.
Pages: 639-643
Page count: 5
Related papers
50 in total
  • [41] Robust speaker recognition based on biologically inspired features
    Zouhir, Youssef
    Ben Fredj, Ines
    Ouni, Kais
    Zarka, Mohamed
    [J]. INTERNATIONAL JOURNAL OF SIGNAL AND IMAGING SYSTEMS ENGINEERING, 2020, 12 (1-2) : 19 - 27
  • [42] Refining Cosine Distance Features for Robust Speaker Verification
    Balasingam, M. D.
    Kumar, C. Santhosh
    [J]. PROCEEDINGS OF THE 2018 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATION AND SIGNAL PROCESSING (ICCSP), 2018, : 152 - 155
  • [43] Improvement of speaker identification by combining prosodic features with acoustic features
    Zheng, R
    Zhang, SW
    Xu, B
    [J]. ADVANCES IN BIOMETRIC PERSON AUTHENTICATION, PROCEEDINGS, 2004, 3338 : 569 - 576
  • [44] Speaker identification using speech and lip features
    Ou, GB
    Li, X
    Yao, XC
    Jia, HB
    Murphey, YL
    [J]. PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), VOLS 1-5, 2005, : 2565 - 2570
  • [45] Speaker identification using nonlinear dynamical features
    Petry, A
    Barone, DAC
    [J]. CHAOS SOLITONS & FRACTALS, 2002, 13 (02) : 221 - 231
  • [46] Speaker Identification Enhancement Using Emotional Features
    Jabnoun, Jihed
    Zrigui, Ahmed
    Slimi, Anwer
    Ringeval, Fabien
    Schwab, Didier
    Zrigui, Mounir
    [J]. COMPUTATIONAL COLLECTIVE INTELLIGENCE, ICCCI 2023, 2023, 14162 : 526 - 539
  • [47] Visual Speaker Identification with Spatiotemporal Directional Features
    Zhao, Guoying
    Pietikainen, Matti
    [J]. IMAGE ANALYSIS AND RECOGNITION, 2013, 7950 : 1 - 10
  • [48] Learning Discriminative Features for Speaker Identification and Verification
    Yadav, Sarthak
    Rai, Atul
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 2237 - 2241
  • [49] Performance of selective speech features for speaker identification
    Department of Electronics and Communication Engineering, Indian Institute of Technology, Guwahati 781039, India
    [J]. J Inst Eng India Part CP, 2008, MAY (38-46):
  • [50] Speaker identification using orthogonal and discriminative features
    Davarpanah, SH
    Mirzaei, A
    Ziaei, A
    [J]. IWSSIP 2005: Proceedings of the 12th International Workshop on Systems, Signals & Image Processing, 2005, : 293 - 296