Audio-Visual Emotion Recognition Based on Facial Expression and Affective Speech

Cited by: 0
Authors
Zhang, Shiqing [1, 2]
Li, Lemin [1 ]
Zhao, Zhijin [3 ]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Commun & Informat Engn, Chengdu 611731, Peoples R China
[2] Taizhou Univ, Sch Phys & Elect Engn, Taizhou 318000, Peoples R China
[3] Hangzhou Dianzi Univ, Sch Communicat Engn, Hangzhou 310018, Peoples R China
Keywords
Emotion recognition; Local binary patterns; Acoustic features; Support vector machines; Human-computer interaction
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
This paper investigates the performance of audio-visual emotion recognition that integrates facial expression and affective speech. For facial expression recognition alone, local binary patterns (LBP) features are extracted to represent the facial images. For speech emotion recognition alone, three typical types of acoustic features are extracted: prosody features, voice quality features, and Mel-Frequency Cepstral Coefficients (MFCC). The two modalities, facial expression and affective speech, are then fused at the feature level to perform audio-visual emotion recognition. Support vector machines (SVM) are used for all emotion classification tasks. Experimental results on the publicly available eNTERFACE'05 emotional audio-visual database show that the presented audio-visual method obtains an accuracy of 66.51%, outperforming either single modality.
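The abstract does not give implementation details, so the following is only a minimal sketch of two of the steps it names: a basic 3x3 LBP histogram as the visual descriptor, and feature-level fusion. It assumes that feature-level fusion means simple concatenation of the visual and acoustic vectors, which the abstract does not state. The function names (`lbp_code`, `lbp_histogram`, `fuse_features`), the neighbour ordering, and the `>=` thresholding convention are illustrative choices, not taken from the paper.

```python
from typing import List, Sequence

# Offsets (row, col) of the 8 neighbours, clockwise from the top-left pixel.
# The ordering is an arbitrary choice for this sketch.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image: Sequence[Sequence[int]], r: int, c: int) -> int:
    """Return the 8-bit LBP code of the interior pixel at (r, c)."""
    center = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        # A neighbour contributes a 1 bit when it is at least as bright
        # as the centre pixel (assumed convention).
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image: Sequence[Sequence[int]]) -> List[float]:
    """Return the normalised 256-bin histogram of LBP codes over interior pixels."""
    hist = [0.0] * 256
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            hist[lbp_code(image, r, c)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def fuse_features(visual: Sequence[float], acoustic: Sequence[float]) -> List[float]:
    """Feature-level fusion, assumed here to be plain concatenation."""
    return list(visual) + list(acoustic)
```

In such a pipeline, the fused vector would then be passed to a classifier such as an SVM. That step, and the extraction of the prosody, voice quality, and MFCC features, are omitted here.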
Pages: 46 / +
Page count: 3