Speaker age and gender classification using GMM supervector and NAP channel compensation method

Author
Ergün Yücesoy
Affiliation
[1] Ordu University, Vocational School of Technical Sciences
Keywords
Speaker age and gender classification; Gaussian mixture model (GMM); Nuisance attribute projection (NAP); Support vector machine (SVM); Maximum a posteriori (MAP)
DOI
Not available
Abstract
One of the most important factors affecting the performance of speech-based recognition systems is the mismatch between training and test conditions. Nuisance attribute projection (NAP) is an effective method for eliminating these differences, which are called channel effects. This study investigates the effect of NAP on the determination of age and gender groups. Mel-frequency cepstral coefficients and their delta coefficients are used as features, and Gaussian mixture models (GMMs) adapted from a universal background model by the maximum a posteriori (MAP) method are used to model the age and gender classes. The GMM corresponding to each utterance is converted into a mean supervector and passed to a support vector machine (SVM), which classifies the utterance according to the age and gender group of the speaker. A linear GMM kernel based on the Kullback–Leibler divergence is used instead of the standard SVM kernels. To determine the optimum parameter values, the NAP channel subspace size is varied between 20 and 200 and the number of GMM components between 32 and 512. In tests on the aGender database, the optimum number of components is found to be 128 and the optimum NAP channel subspace size 45. With these parameters, the use of NAP raises the joint age and gender classification accuracy of the system from 60.52% to 62.03%. In addition, age classification accuracy rises from 60.23% to 61.82% and gender classification accuracy from 91.71% to 92.30%.
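The processing chain described in the abstract (MAP-adapted means, a supervector scaled for the Kullback–Leibler linear kernel, and NAP) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the function names, the relevance factor of 16, the diagonal-covariance UBM, and the choice of estimating the nuisance subspace from within-group scatter are all hypothetical, since the abstract does not give these details.

```python
import numpy as np

def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a diagonal-covariance UBM to one utterance.

    frames: (T, D) features; weights: (C,); means, variances: (C, D).
    The relevance factor is an assumed value, not taken from the paper.
    """
    diff = frames[:, None, :] - means[None, :, :]              # (T, C, D)
    log_prob = -0.5 * np.sum(
        diff ** 2 / variances + np.log(2.0 * np.pi * variances), axis=2
    )                                                          # (T, C)
    log_post = np.log(weights) + log_prob
    log_post -= log_post.max(axis=1, keepdims=True)            # numerical stability
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)                    # responsibilities
    counts = post.sum(axis=0)                                  # soft counts, (C,)
    data_means = (post.T @ frames) / np.maximum(counts, 1e-10)[:, None]
    alpha = (counts / (counts + relevance))[:, None]           # adaptation weight
    return alpha * data_means + (1.0 - alpha) * means

def kl_supervector(adapted_means, weights, variances):
    """Stack means scaled by sqrt(w_c) / sigma_c, so that a plain dot product
    between two supervectors equals the KL-based linear GMM kernel."""
    scaled = np.sqrt(weights)[:, None] * adapted_means / np.sqrt(variances)
    return scaled.ravel()

def nap_basis(supervectors, group_ids, rank):
    """Estimate an orthonormal nuisance basis U of shape (dim, rank).

    Assumption: nuisance directions are the principal directions of the
    scatter that remains after removing each group's mean supervector.
    """
    centered = np.array(supervectors, dtype=float)             # copy
    group_ids = np.asarray(group_ids)
    for g in np.unique(group_ids):
        mask = group_ids == g
        centered[mask] -= centered[mask].mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:rank].T

def apply_nap(supervectors, basis):
    """Project out the nuisance subspace: x' = (I - U U^T) x."""
    return supervectors - (supervectors @ basis) @ basis.T
```

In this sketch, `rank` plays the role of the NAP channel subspace size that the abstract varies between 20 and 200, and the number of rows `C` of `means` corresponds to the number of GMM components. The projected supervectors could then be passed to any linear SVM.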
Pages: 3633-3642
Number of pages: 9
Related papers
50 in total
  • [1] Speaker age and gender classification using GMM supervector and NAP channel compensation method
    Yucesoy, Ergun
    [J]. JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING, 2020, 13 (7) : 3633 - 3642
  • [2] SVM based speaker verification using a GMM supervector kernel and nap variability compensation
    Campbell, W. M.
    Sturim, D. E.
    Reynolds, D. A.
    Solomonoff, A.
    [J]. 2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 97 - 100
  • [3] SVM based speaker selection using GMM supervector for rapid speaker adaptation
    Wang, Jian
    Lei, Jianjun
    Guo, Jun
    Yang, Zhen
    [J]. SIMULATED EVOLUTION AND LEARNING, PROCEEDINGS, 2006, 4247 : 617 - 624
  • [4] GMM-based speaker age and gender classification in Czech and Slovak
    Pribil, Jiri
    Pribilova, Anna
    Matousek, Jindrich
    [J]. JOURNAL OF ELECTRICAL ENGINEERING-ELEKTROTECHNICKY CASOPIS, 2017, 68 (01): : 3 - 12
  • [5] Enhanced Speaker Verification Using GMM-Supervector Based Modified Adaptive GMM Training
    Trinh, Tan Dat
    Park, Min Kyung
    Kim, Jin Young
    Lee, Kyong Rok
    Cho, Keeseong
    [J]. MOBILE AND WIRELESS TECHNOLOGY 2015, 2015, 310 : 147 - 153
  • [6] GMM-Based Speaker Gender and Age Classification After Voice Conversion
    Pribil, Jiri
    Pribilova, Anna
    Matousek, Jindrich
    [J]. 2016 FIRST INTERNATIONAL WORKSHOP ON SENSING, PROCESSING AND LEARNING FOR INTELLIGENT MACHINES (SPLINE), 2016,
  • [7] Speaker Verification Using SVM Kernel with GMM-Supervector Based on the Mahalanobis Distance
    Kim, Hyoung-Gook
    Shin, Dong
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, 2010, 29 (03): : 216 - 221
  • [8] Gender Identification of a Speaker Using MFCC and GMM
    Yucesoy, Ergun
    Nabiyev, Vasif V.
    [J]. 2013 8TH INTERNATIONAL CONFERENCE ON ELECTRICAL AND ELECTRONICS ENGINEERING (ELECO), 2013, : 626 - 629
  • [9] Multi-feature Fusion using Multi-GMM Supervector for SVM Speaker Verification
    Liu, Minghui
    Huang, Zhongwei
    [J]. PROCEEDINGS OF THE 2009 2ND INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, VOLS 1-9, 2009, : 4332 - 4335
  • [10] Analysis of feature extraction and channel compensation in a GMM speaker recognition system
    Burget, Lukas
    Matejka, Pavel
    Schwarz, Petr
    Glembek, Ondrej
    Cernocky, Jan 'Honza'
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (07): : 1979 - 1986