Spectral-temporal receptive fields and MFCC balanced feature extraction for robust speaker recognition

Cited by: 1
Authors
Jia-Ching Wang
Chien-Yao Wang
Yu-Hao Chin
Yu-Ting Liu
En-Ting Chen
Pao-Chi Chang
Affiliations
[1] National Central University, Department of Computer Science and Information Engineering
[2] National Central University, Department of Communication Engineering
Source
Keywords
STRF; Speaker recognition; Feature extraction; Speaker authentication;
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
This paper proposes a speaker recognition system using acoustic features based on spectral-temporal receptive fields (STRFs). The STRF is derived from physiological models of the mammalian auditory system in the spectral-temporal domain. With the STRF, a signal is expressed by rate (in Hz) and scale (in cycles/octave), which specify the temporal response and the spectral response, respectively. The proposed STRF-based feature is used to perform speaker recognition. First, the energy of each scale is calculated from the STRF representation. A logarithmic operation is then applied to the scale energies. Finally, a discrete cosine transform is applied to generate the proposed STRF feature. This paper also presents a feature set that combines the proposed STRF feature with conventional Mel-frequency cepstral coefficients (MFCCs). Support vector machines (SVMs) are adopted as the speaker classifiers. To evaluate the performance of the proposed speaker recognition system, experiments on 36-speaker recognition were conducted. Compared with the MFCC baseline, the proposed feature set increases the speaker recognition rate by 3.85% on clean speech and 18.49% on noisy speech. The experimental results demonstrate the effectiveness of adopting the STRF-based feature in speaker recognition.
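The three steps named in the abstract (scale energy, logarithm, discrete cosine transform) and the combination with MFCCs can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the array layout of the STRF responses, the definition of scale energy as a sum of squared magnitudes over rate and frequency channel, the unnormalized DCT-II, and the function names dct_ii, strf_scale_feature, and combine_with_mfcc are all assumptions made here for illustration.

```python
import numpy as np

def dct_ii(x):
    """Unnormalized DCT-II along the last axis:
    X[k] = sum_m x[m] * cos(pi * (m + 0.5) * k / N)."""
    n = x.shape[-1]
    k = np.arange(n)[:, None]          # output index
    m = np.arange(n)[None, :]          # input index
    basis = np.cos(np.pi * (m + 0.5) * k / n)   # shape (N, N), rows indexed by k
    return x @ basis.T

def strf_scale_feature(strf, eps=1e-10):
    """Hypothetical STRF feature: log scale energies followed by a DCT.

    strf: assumed array of STRF responses with shape
          (n_frames, n_scales, n_rates, n_channels).
    Returns an array of shape (n_frames, n_scales).
    """
    # Assumed energy definition: squared magnitude summed over rate and channel.
    energy = np.sum(np.abs(strf) ** 2, axis=(2, 3))
    log_energy = np.log(energy + eps)  # eps guards against log(0)
    return dct_ii(log_energy)

def combine_with_mfcc(strf_feat, mfcc):
    """Concatenate per-frame STRF features and MFCCs (both n_frames rows)."""
    return np.concatenate([strf_feat, mfcc], axis=1)
```

The combined per-frame vectors would then be passed to a classifier such as an SVM; the abstract does not give the feature dimensions or the SVM configuration, so those are left out here.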
Pages: 4055-4068
Number of pages: 13
Related papers
50 in total
  • [31] Optimal MFCC Features Extraction by Differential Evolution Algorithm for Speaker Recognition
    Sadeghi, Mohsen
    Marvi, Hossein
    2017 3RD IRANIAN CONFERENCE ON SIGNAL PROCESSING AND INTELLIGENT SYSTEMS (ICSPIS), 2017, : 169 - 173
  • [32] Robust Spectral-Temporal Two-Dimensional Spectrum Prediction
    Ding, Guoru
    Zhai, Siyu
    Chen, Xiaoming
    Zhang, Yuming
    Liu, Chao
    MACHINE LEARNING AND INTELLIGENT COMMUNICATIONS, 2017, 183 : 393 - 401
  • [33] Speech Recognizer-Based Non-Uniform Spectral Compression for Robust MFCC Feature Extraction
    Ali, Bagher Baba
    Wojcik, Waldemar
    Mamyrbayev, Orken
    Turdalyuly, Mussa
    Mekebayev, Nurbapa
    PRZEGLAD ELEKTROTECHNICZNY, 2018, 94 (06): : 90 - 93
  • [34] Auditory evoked fields elicited by spectral, temporal, and spectral-temporal changes in human cerebral cortex
    Okamoto, Hidehiko
    Teismann, Henning
    Kakigi, Ryusuke
    Pantev, Christo
    FRONTIERS IN PSYCHOLOGY, 2012, 3
  • [35] Classification and Recognition of Underwater Target Based on MFCC Feature Extraction
    Tong, Yuze
    Zhang, Xin
    Ge, Yizhou
    2020 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMMUNICATIONS AND COMPUTING (IEEE ICSPCC 2020), 2020,
  • [36] Hardware Implementation of MFCC Feature Extraction for Speech Recognition on FPGA
    Van-Lan Dao
    Van-Danh Nguyen
    Hai-Duong Nguyen
    Van-Phuc Hoang
    ADVANCES IN INFORMATION AND COMMUNICATION TECHNOLOGY, 2017, 538 : 248 - 254
  • [37] Feature Extraction Methods for Speaker Recognition: A Review
    Chaudhary, Gopal
    Srivastava, Smriti
    Bhardwaj, Saurabh
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2017, 31 (12)
  • [38] On the Use of MFCC Feature Vector Clustering for Efficient Text Dependent Speaker Recognition
    Samal, Ankit
    Parida, Deebyadeep
    Satapathy, Mihir Ranjan
    Mohanty, Mihir Narayan
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON FRONTIERS OF INTELLIGENT COMPUTING: THEORY AND APPLICATIONS (FICTA) 2013, 2014, 247 : 305 - 312
  • [39] A Novel Feature Extraction Methods for Speaker Recognition
    Zou, Muchun
    COMMUNICATIONS AND INFORMATION PROCESSING, PT 1, 2012, 288 : 713 - 722
  • [40] Feature extraction for poultry vocalization recognition based on improved MFCC
    Key Laboratory of Agricultural Bioenvironmental Engineering, College of Water Conservancy and Civil Engineering, China Agricultural University, Beijing 100083, China
    Nongye Gongcheng Xuebao/Transactions of the Chinese Society of Agricultural Engineering, 2008, 24 (11): : 202 - 205