Attention to Emotions: Body Emotion Recognition In-the-Wild Using Self-attention Transformer Network

Authors
Paiva, Pedro V. V. [1, 3]
Ramos, Josue J. G. [2]
Gavrilova, Marina [3]
Carvalho, Marco A. G. [1]
Affiliations
[1] Univ Estadual Campinas, Sch Technol, Limeira, Brazil
[2] Renato Archer IT Ctr, Cyber Phys Syst Div, Campinas, Brazil
[3] Univ Calgary, Dept Comp Sci, Calgary, AB, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC); São Paulo Research Foundation (FAPESP), Brazil
Keywords
Body emotion recognition; Affective computing; Video and image processing; Gait analysis; Attention-based design; Graph convolutional networks
DOI
10.1007/978-3-031-66743-5_10
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Body movements are an essential part of non-verbal communication, as they help to express and interpret human emotions. The potential of Body Emotion Recognition (BER) is immense, as it can provide insights into user preferences, automate real-time exchanges, and enable machines to respond to human emotions. BER finds applications in customer service, healthcare, entertainment, emotion-aware robots, and other areas. While facial expression-based techniques are extensively researched, detecting emotions from body movements in the real world presents several challenges, including variations in body posture, occlusions, and background. Recent research has established the efficacy of transformer deep-learning models beyond the language domain for solving video and image-related problems. A key component of transformers is the self-attention mechanism, which captures relationships among features across different spatial locations, allowing contextual information extraction. In this study, we aim to understand the role of body movements in emotion expression and to explore the use of transformer networks for body emotion recognition. Our method proposes a novel linear projection function for the visual transformer, which enables the transformation of 2D joint coordinates into a conventional matrix representation. Using an original method of contextual information learning, the developed approach enables more accurate recognition of emotions by establishing unique correlations among an individual's body motions over time. Our results demonstrate that the self-attention mechanism achieves high accuracy in predicting emotions from body movements, surpassing the performance of other recent deep-learning methods. In addition, the impact of dataset size and frame rate on classification performance is analyzed.
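The abstract mentions two building blocks: a linear projection of 2D joint coordinates and a self-attention step relating body motion across time. The sketch below illustrates only the generic technique (scaled dot-product self-attention over per-frame tokens) and is not the authors' implementation. Treating each frame as one token, the embedding size `d_model`, the random stand-in weights, the absence of positional encoding, and every function name are assumptions made for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Stand-in for learned weights (hypothetical; a real model would train these)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def self_attention(frames, d_model=8, seed=0):
    """frames: list of T frames, each a list of J (x, y) joint coordinates.

    Returns (outputs, weights): outputs is T x d_model, weights is T x T.
    """
    rng = random.Random(seed)
    # Flatten each frame's 2D joints into one vector of length 2 * J.
    flat = [[c for joint in frame for c in joint] for frame in frames]
    d_in = len(flat[0])
    # Linear projection of the flattened coordinates into token embeddings
    # (assumption: one token per frame).
    tokens = matmul(flat, random_matrix(d_in, d_model, rng))
    q = matmul(tokens, random_matrix(d_model, d_model, rng))
    k = matmul(tokens, random_matrix(d_model, d_model, rng))
    v = matmul(tokens, random_matrix(d_model, d_model, rng))
    # Scaled dot-product scores between every pair of frames.
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    scale = math.sqrt(d_model)
    weights = [softmax([s / scale for s in row]) for row in scores]
    # Each output row is a weighted mix of the value vectors of all frames.
    return matmul(weights, v), weights


# Toy input: 4 frames, 3 joints each (made-up coordinates).
frames = [[(0.1 * t, 0.2 * j) for j in range(3)] for t in range(4)]
outputs, weights = self_attention(frames)
```

Each row of `weights` sums to 1 and indicates how strongly one frame attends to every other frame; a trained model would learn the projection matrices rather than sample them at random.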
Pages: 206-228
Page count: 23