Prediction of emotional dimensions PAD for emotional speech recognition

Cited by: 0
Authors
Sun Y. [1]
Hu Y.-X. [1]
Zhang X.-Y. [1]
Duan S.-F. [1]
Affiliation
[1] College of Information and Computer, Taiyuan University of Technology, Taiyuan
Keywords
Grey relational analysis (GRA); Least squares support vector machine (LSSVM); PAD dimensions; Principal component analysis (PCA); Speech emotion recognition
DOI
10.3785/j.issn.1008-973X.2019.10.022
Abstract
The continuous emotional dimensions PAD (pleasure, arousal, dominance) were introduced into speech emotion recognition, because existing emotional features analyze emotion only at the signal level and cannot directly reflect the emotional state. The experimental samples covered three emotions (sadness, anger and happiness) from the TYUT2.0 database and the Berlin voice library, from which emotional features (prosodic features, formants, MFCC and nonlinear features) were extracted. Grey relational analysis (GRA) was used to select the main features that affect P, A and D, in order to obtain objective and accurate PAD dimension values. Principal component analysis (PCA) was then used to extract the principal components of the selected features, which served as the input of a least squares support vector machine (LSSVM) to predict P, A and D. The emotional features, the PAD dimensions and their fusion were each used for emotion recognition with a support vector machine. The experimental results show that the prediction method improves the prediction accuracy of P, A and D to a certain extent. The predicted values can effectively identify the emotion and complement the emotional features in emotion recognition. © 2019, Zhejiang University Press. All rights reserved.
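The GRA feature-selection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which GRA formulation the authors used, so this assumes Deng's grey relational grade with min-max normalization and a distinguishing coefficient of 0.5. The function names and the toy numbers are hypothetical and are not taken from the paper, the TYUT2.0 database or the Berlin voice library.

```python
from typing import List, Sequence


def min_max_normalize(series: Sequence[float]) -> List[float]:
    """Scale a series to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(x - lo) / (hi - lo) for x in series]


def grey_relational_grades(
    reference: Sequence[float],
    features: Sequence[Sequence[float]],
    rho: float = 0.5,  # distinguishing coefficient (assumed value)
) -> List[float]:
    """Return one grey relational grade per feature series.

    `reference` is the target dimension (e.g. annotated P, A or D values)
    and each entry of `features` is one acoustic feature over the same samples.
    """
    ref = min_max_normalize(reference)
    cols = [min_max_normalize(col) for col in features]
    # Absolute difference between the reference and each feature, per sample.
    deltas = [[abs(r - c) for r, c in zip(ref, col)] for col in cols]
    d_min = min(min(row) for row in deltas)
    d_max = max(max(row) for row in deltas)
    if d_max == 0.0:
        # Every feature coincides with the reference after normalization.
        return [1.0 for _ in cols]
    grades = []
    for row in deltas:
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades.append(sum(coeffs) / len(coeffs))
    return grades


def select_top_features(
    reference: Sequence[float],
    features: Sequence[Sequence[float]],
    k: int,
    rho: float = 0.5,
) -> List[int]:
    """Indices of the k features with the highest grey relational grade."""
    grades = grey_relational_grades(reference, features, rho)
    order = sorted(range(len(grades)), key=lambda i: grades[i], reverse=True)
    return order[:k]


# Toy illustration: four utterances, one target dimension, two candidate features.
arousal = [0.1, 0.4, 0.6, 0.9]
candidates = [
    [1.0, 2.0, 3.0, 4.0],  # hypothetical feature that tracks arousal
    [5.0, 1.0, 4.0, 2.0],  # hypothetical feature unrelated to arousal
]
print(grey_relational_grades(arousal, candidates))
print(select_top_features(arousal, candidates, k=1))
```

In a pipeline like the one the abstract describes, the selected feature columns would then be passed to PCA and the resulting components to a regressor such as an LSSVM; those stages are omitted here.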
Pages: 2041-2048
Number of pages: 7