COMPARISON OF SPEAKER DEPENDENT AND SPEAKER INDEPENDENT EMOTION RECOGNITION

Cited: 22
Authors
Rybka, Jan [1 ]
Janicki, Artur [2 ]
Affiliations
[1] Warsaw Univ Technol, Inst Comp Sci, PL-00665 Warsaw, Poland
[2] Warsaw Univ Technol, Inst Telecommun, PL-00665 Warsaw, Poland
Keywords
speech processing; emotion recognition; EMO-DB; support vector machines; artificial neural networks; SPEECH;
DOI
10.2478/amcs-2013-0060
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
This paper describes a study of emotion recognition based on speech analysis. The theoretical introduction reviews the emotion inventories used in various studies of emotion recognition, as well as the speech corpora applied, methods of speech parametrization, and the most commonly employed classification algorithms. In the current study, the EMO-DB speech corpus and three selected classifiers, k-Nearest Neighbors (k-NN), Artificial Neural Networks (ANNs), and Support Vector Machines (SVMs), were used in experiments. SVMs provided the best classification accuracy, 75.44%, in the speaker dependent mode, that is, when speech samples from the same speaker were included in the training corpus. Various speaker dependent and speaker independent configurations were analyzed and compared. Emotion recognition in speaker dependent conditions usually yielded higher accuracy than a comparable speaker independent configuration; the improvement was especially pronounced when the base recognition rate of a given speaker was low. Happiness and anger, as well as boredom and neutrality, proved to be the pairs of emotions most often confused.
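The key experimental distinction in the abstract is between speaker dependent evaluation (the test speaker's utterances also appear in training) and speaker independent evaluation (the test speaker is entirely held out). A minimal sketch of the speaker-independent, leave-one-speaker-out split is below; this is illustrative only, not the authors' code, and the sample records and function names are hypothetical:

```python
# Sketch (assumption, not the paper's implementation): leave-one-speaker-out
# folds for speaker-INDEPENDENT emotion recognition. Each sample is a
# (speaker_id, feature_vector, emotion_label) tuple; data is hypothetical.
from collections import defaultdict

def loso_folds(samples):
    """Yield (held_out_speaker, train_set, test_set) triples in which all
    utterances of the held-out speaker form the test set."""
    by_speaker = defaultdict(list)
    for s in samples:
        by_speaker[s[0]].append(s)
    for held_out in by_speaker:
        train = [s for spk, group in by_speaker.items()
                 if spk != held_out for s in group]
        yield held_out, train, by_speaker[held_out]

samples = [
    ("spk1", [0.1], "anger"),    ("spk1", [0.2], "boredom"),
    ("spk2", [0.3], "anger"),    ("spk2", [0.4], "happiness"),
]
for spk, train, test in loso_folds(samples):
    # Unlike a speaker-dependent random split, the test speaker
    # never contributes utterances to the training set.
    assert all(s[0] != spk for s in train)
```

In speaker dependent mode, by contrast, a random utterance-level split would let each speaker appear in both sets, which is what the abstract reports as yielding higher accuracy.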
Pages: 797-808
Page count: 12
Related Papers (50 total)
  • [1] Comparison Between Speaker Dependent Mode and Speaker Independent Mode for Voice Recognition
    Mrvaljevic, Nikola
    Sun, Ying
    2009 35TH ANNUAL NORTHEAST BIOENGINEERING CONFERENCE, 2009, : 187 - 188
  • [2] Speaker Dependent, Speaker Independent and Cross Language Emotion Recognition From Speech Using GMM and HMM
    Bhaykar, Manav
    Yadav, Jainath
    Rao, K. Sreenivasa
    2013 NATIONAL CONFERENCE ON COMMUNICATIONS (NCC), 2013,
  • [3] Performance Comparison of Speaker and Emotion Recognition
    Revathy, A.
    Shanmugapriya, P.
    Mohan, V.
    2015 3RD INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMMUNICATION AND NETWORKING (ICSCN), 2015,
  • [4] Large Vocabulary Speech Recognition: Speaker Dependent and Speaker Independent
    Hemakumar, G.
    Punitha, P.
    INFORMATION SYSTEMS DESIGN AND INTELLIGENT APPLICATIONS, VOL 1, 2015, 339 : 73 - 80
  • [5] On Speaker-Independent, Speaker-Dependent, and Speaker-Adaptive Speech Recognition
    Huang, Xuedong
    Lee, Kai-Fu
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1993, 1 (02): : 150 - 157
  • [6] Discriminative Adversarial Learning for Speaker Independent Emotion Recognition
    Kasun, Chamara
    Ahn, Chung Soo
    Rajapakse, Jagath C.
    Lin, Zhiping
    Huang, Guang-Bin
    INTERSPEECH 2022, 2022, : 4975 - 4979
  • [7] Cascaded Adversarial Learning for Speaker Independent Emotion Recognition
    Lekamalage, Chamara Kasun Liyanaarachchi
    Lin, Zhiping
    Huang, Guang-Bin
    Rajapakse, Jagath Chandana
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [8] Speaker independent speech emotion recognition by ensemble classification
    Schuller, B
    Reiter, S
    Müller, R
    Al-Hames, M
    Lang, M
    Rigoll, G
    2005 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), VOLS 1 AND 2, 2005, : 865 - 868
  • [9] Speaker Adversarial Neural Network (SANN) for Speaker-independent Speech Emotion Recognition
    Fahad, Md Shah
    Ranjan, Ashish
    Deepak, Akshay
    Pradhan, Gayadhar
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2022, 41 (11) : 6113 - 6135