MAGIC-TBR: Multiview Attention Fusion for Transformer-based Bodily Behavior Recognition in Group Settings

Authors
Madan, Surbhi [1]
Jain, Rishabh [1]
Sharma, Gulshan [1]
Subramanian, Ramanathan [2]
Dhall, Abhinav [1,3]
Affiliations
[1] Indian Inst Technol Ropar, Rupnagar, Punjab, India
[2] Univ Canberra, Canberra, ACT, Australia
[3] Monash Univ, Clayton, Vic, Australia
Keywords
Bodily Behavior; Multiview Attention; DCT; Transformer
DOI
10.1145/3581783.3612858
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Bodily behavioral language is an important social cue, and its automated analysis helps in enhancing the understanding of artificial intelligence systems. Furthermore, behavioral language cues are essential for active engagement in social agent-based user interactions. Despite the progress made in computer vision for tasks like head and body pose estimation, there is still a need to explore the detection of finer behaviors such as gesturing, grooming, or fumbling. This paper proposes a multiview attention fusion method named MAGIC-TBR that combines features extracted from videos and their corresponding Discrete Cosine Transform coefficients via a transformer-based approach. The experiments are conducted on the BBSI dataset and the results demonstrate the effectiveness of the proposed feature fusion with multiview attention. The code is available at: https://github.com/surbhimadan92/MAGIC-TBR
Pages: 9526-9530
Page count: 5