Ranking Speech Features for Their Usage in Singing Emotion Classification

Cited by: 1
Authors
Zaporowski, Szymon [1]
Kostek, Bozena [1]
Affiliations
[1] Gdansk Univ Technol, Fac Elect Telecommun & Informat, Multimedia Syst Dept, PL-80233 Gdansk, Poland
Keywords
Mel Frequency Cepstral Coefficients (MFCC); MPEG-7 low-level audio descriptors; Feature selection; Singing expression classification
DOI
10.1007/978-3-030-59491-6_21
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper aims to identify speech descriptors that may be useful for classifying emotions in singing. For this purpose, Mel Frequency Cepstral Coefficients (MFCC) and selected low-level MPEG-7 descriptors were calculated for the RAVDESS dataset, which contains recordings of emotional speech and singing by professional actors presenting six different emotions. The descriptors with the best ranking results were determined using a feature selection algorithm based on the forest-of-trees method. The emotions were then classified using a Support Vector Machine (SVM); training was performed several times, and the results were averaged. It was found that descriptors used for emotion detection in speech are not as useful for singing. An approach using a Convolutional Neural Network (CNN) operating on spectrogram representations of the audio signals was also tested. Several parameters for singing were determined which, according to the obtained results, allow a significant reduction in the dimensionality of the feature vectors while increasing the classification efficiency of emotion detection.
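The ranking-then-classification workflow summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It assumes scikit-learn, uses `ExtraTreesClassifier` as one possible reading of the "forest of trees" ranking, and replaces the RAVDESS descriptors with a synthetic feature matrix from `make_classification`. The constants `TOP_K` and `N_RUNS`, the helper `run_once`, and all hyperparameters are hypothetical choices made for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

N_CLASSES = 6   # six emotions, as stated in the abstract
TOP_K = 10      # assumed number of top-ranked descriptors to keep
N_RUNS = 5      # assumed number of repeated trainings to average

# Synthetic stand-in for per-recording descriptors (e.g. MFCC and MPEG-7
# statistics); a real experiment would load features extracted from audio.
X, y = make_classification(
    n_samples=600, n_features=40, n_informative=8, n_redundant=4,
    n_classes=N_CLASSES, n_clusters_per_class=1, random_state=0,
)


def run_once(seed):
    """Rank features on the training split, then score SVMs on the test split."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )

    # Forest-of-trees ranking: impurity-based importances, highest first.
    forest = ExtraTreesClassifier(n_estimators=200, random_state=seed)
    forest.fit(X_tr, y_tr)
    ranking = np.argsort(forest.feature_importances_)[::-1]
    selected = ranking[:TOP_K]

    def svm_accuracy(cols):
        # Standardize the chosen columns, then fit an RBF-kernel SVM.
        model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
        model.fit(X_tr[:, cols], y_tr)
        return model.score(X_te[:, cols], y_te)

    all_cols = np.arange(X.shape[1])
    return selected, svm_accuracy(all_cols), svm_accuracy(selected)


# Repeat the training several times and average the accuracies.
results = [run_once(seed) for seed in range(N_RUNS)]
acc_all = float(np.mean([r[1] for r in results]))
acc_selected = float(np.mean([r[2] for r in results]))

print(f"mean accuracy, all {X.shape[1]} features: {acc_all:.3f}")
print(f"mean accuracy, top {TOP_K} features: {acc_selected:.3f}")
```

The ranking is computed on the training split only, so the held-out accuracy is not influenced by the feature selection step. Whether the reduced feature set matches or beats the full set depends on the data; the synthetic example makes no claim about the results reported in the paper.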
Pages: 225-234
Number of pages: 10