Emotion in the singing voice-a deeper look at acoustic features in the light of automatic classification

Cited by: 23
Authors
Eyben, Florian [1, 3, 8]
Salomao, Glaucia L. [2, 5]
Sundberg, Johan [2, 6]
Scherer, Klaus R. [3]
Schuller, Bjoern W. [1, 3, 4, 7]
Affiliations
[1] Tech Univ Munich, MISP Grp, D-80290 Munich, Germany
[2] KTH Royal Inst Technol, Sch Comp Sci & Commun, Dept Speech Mus Hearing, Stockholm, Sweden
[3] Univ Geneva, Geneva, Switzerland
[4] Univ London Imperial Coll Sci Technol & Med, Dept Comp, London, England
[5] Stockholm Univ, Dept Linguist, S-10691 Stockholm, Sweden
[6] Univ Coll Mus Educ, Stockholm, Sweden
[7] Univ Passau, Chair Complex & Intelligent Syst, D-94032 Passau, Germany
[8] AudEERING UG Ltd, Gilching, Germany
Funding
Swiss National Science Foundation; European Research Council;
Keywords
Emotion recognition; Singing voice; Acoustic features; Feature selection; RECOGNITION; EXPRESSION;
DOI
10.1186/s13636-015-0057-6
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification codes
070206; 082403;
Abstract
We investigate the automatic recognition of emotions in the singing voice and study the worth and role of a variety of relevant acoustic parameters. The data set contains phrases and vocalises sung by eight renowned professional opera singers in ten different emotions and a neutral state. The states are mapped to ternary arousal and valence labels. We propose a small set of relevant acoustic features based on our previous findings on the same data and compare it with a large-scale state-of-the-art feature set for paralinguistics recognition, the baseline feature set of the Interspeech 2013 Computational Paralinguistics ChallengE (ComParE). A feature importance analysis with respect to classification accuracy and correlation of features with the targets is provided in the paper. Results show that the classification performance with both feature sets is similar for arousal, while the ComParE set is superior for valence. Intra-singer feature ranking criteria further improve the classification accuracy significantly in a leave-one-singer-out cross-validation.
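The evaluation protocol named in the abstract (leave-one-singer-out cross-validation, with features ranked by their correlation with the target) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the singer IDs, the two toy feature columns, and the numeric coding of the ternary arousal label as -1/0/1 are all hypothetical, and the data are made up. It uses a plain Pearson correlation and does not reproduce the paper's intra-singer ranking criteria.

```python
from statistics import mean


def leave_one_singer_out(singer_ids):
    """Yield (held_out_singer, train_indices, test_indices), one fold per singer."""
    for held_out in sorted(set(singer_ids)):
        train = [i for i, s in enumerate(singer_ids) if s != held_out]
        test = [i for i, s in enumerate(singer_ids) if s == held_out]
        yield held_out, train, test


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def rank_features(rows, labels):
    """Return feature indices sorted by |Pearson r| with the labels, strongest first."""
    n_features = len(rows[0])
    scores = [abs(pearson([r[j] for r in rows], labels)) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


# Toy data (hypothetical): three singers, two recordings each.
singers = ["s1", "s1", "s2", "s2", "s3", "s3"]
arousal = [-1, 1, -1, 1, 0, 1]  # assumed coding: -1 low, 0 medium, 1 high
features = [
    [-1.0, 0.3],
    [1.0, 0.1],
    [-1.0, 0.2],
    [1.0, 0.3],
    [0.0, 0.1],
    [1.0, 0.2],
]

for held_out, train_idx, test_idx in leave_one_singer_out(singers):
    # Rank features on the training singers only, so the held-out singer
    # never influences feature selection.
    ranking = rank_features(
        [features[i] for i in train_idx],
        [arousal[i] for i in train_idx],
    )
    print(held_out, "train:", train_idx, "test:", test_idx, "ranking:", ranking)
```

Ranking inside each fold, rather than once on the full data, keeps the held-out singer unseen during feature selection; a classifier would then be trained on the top-ranked features of the training singers and scored on the held-out one.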
Pages: 9
Related papers
45 in total
  • [1] Emotion in the singing voice-a deeper look at acoustic features in the light of automatic classification
    Florian Eyben
    Gláucia L Salomão
    Johan Sundberg
    Klaus R Scherer
    Björn W Schuller
    [J]. EURASIP Journal on Audio, Speech, and Music Processing, 2015
  • [2] Automatic classification of singing voice quality
    Kostek, B
    Zwan, P
    [J]. 5th International Conference on Intelligent Systems Design and Applications, Proceedings, 2005: 444-449
  • [3] Comparing the acoustic expression of emotion in the speaking and the singing voice
    Scherer, Klaus R.
    Sundberg, Johan
    Tamarit, Lucas
    Salomao, Glaucia L.
    [J]. COMPUTER SPEECH AND LANGUAGE, 2015, 29 (01): 218-235
  • [4] The expression of emotion in the singing voice: Acoustic patterns in vocal performance
    Scherer, Klaus R.
    Sundberg, Johan
    Fantini, Bernardino
    Trznadel, Stephanie
    Eyben, Florian
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2017, 142 (04): 1805-1815
  • [5] Perceptual (but not acoustic) features predict singing voice preferences
    Bruder, Camila
    Poeppel, David
    Larrouy-Maestri, Pauline
    [J]. SCIENTIFIC REPORTS, 2024, 14 (01)
  • [6] Ranking Speech Features for Their Usage in Singing Emotion Classification
    Zaporowski, Szymon
    Kostek, Bozena
    [J]. FOUNDATIONS OF INTELLIGENT SYSTEMS (ISMIS 2020), 2020, 12117: 225-234
  • [7] Improving Automatic Singing Skill Evaluation with Timbral Features, Attention, and Singing Voice Separation
    Ju, Yaolong
    Xu, Chunyang
    Guo, Yichen
    Li, Jinhu
    Lui, Simon
    [J]. 2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023: 612-617
  • [8] A Deeper Look into Ancillary Features in LI-RADS Classification
    Crivellaro, Priscila
    [J]. RADIOLOGY, 2024, 310 (02)
  • [9] Speech Emotion Classification using Acoustic Features
    Chen, Shizhe
    Jin, Qin
    Li, Xirong
    Yang, Gang
    Xu, Jieping
    [J]. 2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014: 579-583
  • [10] An Investigation of Acoustic Features for Singing Voice Conversion based on Perceptual Age
    Kobayashi, Kazuhiro
    Doi, Hironori
    Toda, Tomoki
    Nakano, Tomoyasu
    Goto, Masataka
    Neubig, Graham
    Sakti, Sakriani
    Nakamura, Satoshi
    [J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013: 1056-1060