Robust Speaker Recognition Using Improved GFCC and Adaptive Feature Selection

Cited by: 0
Authors
Zhang, Xingyu [1, 2]
Zou, Xia [1, 2]
Sun, Meng [1, 2]
Wu, Penglong [1, 2]
Affiliations
[1] Army Engn Univ, Nanjing, Jiangsu, Peoples R China
[2] PLA Army Engn Univ, Lab Intelligent Informat Proc, Nanjing, Jiangsu, Peoples R China
Keywords
Gammatone Frequency Cepstrum Coefficients (GFCC); i-vector; Robust speaker recognition; Mel-Frequency Cepstrum Coefficient (MFCC); Adaptive feature selection;
DOI
10.1007/978-3-030-16946-6_13
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Speaker recognition systems perform well in noise-free environments, but their performance deteriorates severely in the presence of noise. At the front end of these systems, Mel-Frequency Cepstral Coefficients (MFCC), or the relatively noise-robust Gammatone Frequency Cepstral Coefficients (GFCC), are commonly used as the time-frequency feature. To further improve the noise robustness of GFCC, signal processing techniques such as DC removal, pre-emphasis and Cepstral Mean Variance Normalization (CMVN) are investigated in the extraction of GFCC. Given the respective advantages and disadvantages of MFCC and GFCC, an adaptive strategy is proposed that selects between the two features based on the quality of the speech. Experiments were conducted on the TIMIT dataset to evaluate the approach. Compared with ordinary GFCC and MFCC features, the method significantly reduced the equal error rate (EER) on speech data with various SNRs.
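The three preprocessing steps named in the abstract (DC removal, pre-emphasis, CMVN) and the idea of quality-based feature selection can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the pre-emphasis coefficient of 0.97, the 15 dB threshold, and the rule "use GFCC below the threshold, MFCC otherwise" are all assumptions made for illustration, and the abstract does not specify how speech quality is measured.

```python
import numpy as np


def remove_dc(signal):
    """Subtract the mean so the waveform is centered on zero."""
    signal = np.asarray(signal, dtype=float)
    return signal - signal.mean()


def pre_emphasis(signal, coeff=0.97):
    """First-order high-pass filter: y[n] = x[n] - coeff * x[n-1].

    coeff=0.97 is a common default, assumed here; the abstract gives no value.
    """
    signal = np.asarray(signal, dtype=float)
    return np.append(signal[:1], signal[1:] - coeff * signal[:-1])


def cmvn(features, eps=1e-8):
    """Cepstral mean and variance normalization.

    `features` has shape (num_frames, num_coefficients); each coefficient
    is normalized to zero mean and unit variance across frames.
    """
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    return (features - mean) / (std + eps)


def select_feature(snr_db, threshold_db=15.0):
    """Hypothetical selection rule: GFCC for noisy speech, MFCC otherwise.

    Both the SNR-based quality measure and the threshold are assumptions.
    """
    return "GFCC" if snr_db < threshold_db else "MFCC"
```

In this sketch, `remove_dc` and `pre_emphasis` would be applied to the raw waveform before the filterbank analysis, and `cmvn` to the resulting cepstral matrix of one utterance.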
Pages: 159-169
Number of pages: 11
Related papers
50 in total
  • [31] FEATURE SELECTION USING ADAPTIVE LEARNING NETWORKS FOR TEXT-INDEPENDENT SPEAKER VERIFICATION
    CHEUNG, RS
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1978, 64 : S183 - S183
  • [32] Improved wavelet feature extraction using kernel analysis for text independent speaker recognition
    Lung, Shung-Yung
    DIGITAL SIGNAL PROCESSING, 2010, 20 (05) : 1400 - 1407
  • [33] An information-theoretic perspective on feature selection in speaker recognition
    Eriksson, T
    Kim, S
    Kang, HG
    Lee, C
    IEEE SIGNAL PROCESSING LETTERS, 2005, 12 (07) : 500 - 503
  • [34] Improved Emotion Recognition With a Novel Speaker-Independent Feature
    Kim, Eun Ho
    Hyun, Kyung Hak
    Kim, Soo Hyun
    Kwak, Yoon Keun
    IEEE-ASME TRANSACTIONS ON MECHATRONICS, 2009, 14 (03) : 317 - 325
  • [35] Robust object tracking with adaptive feature selection
    Qi, Yuan-Chen, 1600, Northeast University (29):
  • [36] Robust Threshold Selection for Environment Specific Voice in Speaker Recognition
    Soumen Kanrar
    Wireless Personal Communications, 2022, 126 : 3071 - 3092
  • [37] ROBUST SPEECH RECOGNITION THROUGH SELECTION OF SPEAKER AND ENVIRONMENT TRANSFORMS
    Bilgi, Raghavendra
    Joshi, Vikas
    Umesh, S.
    Garcia, L.
    Benitez, C.
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4333 - 4336
  • [38] Robust Threshold Selection for Environment Specific Voice in Speaker Recognition
    Kanrar, Soumen
    WIRELESS PERSONAL COMMUNICATIONS, 2022, 126 (04) : 3071 - 3092
  • [39] Speech Emotion Recognition using Feature Selection with Adaptive Structure Learning
    Rayaluru, Akshay
    Bandela, Surekha Reddy
    Kumar, T. Kishore
    2019 IEEE INTERNATIONAL SYMPOSIUM ON SMART ELECTRONIC SYSTEMS (ISES 2019), 2019, : 233 - 236
  • [40] Gabor feature selection for face recognition using improved AdaBoost learning
    Shen, LL
    Bai, L
    Bardsley, D
    Wang, YS
    ADVANCES IN BIOMETRIC PERSON AUTHENTICATION, PROCEEDINGS, 2005, 3781 : 39 - 49