Multimodal Framework for Analyzing the Affect of a Group of People

Citations: 19
Authors
Huang, Xiaohua [1 ]
Dhall, Abhinav [2 ]
Goecke, Roland [3 ]
Pietikainen, Matti [1 ]
Zhao, Guoying [1 ,4 ]
Affiliations
[1] Univ Oulu, Ctr Machine Vis & Signal Anal, Oulu 90014, Finland
[2] Indian Inst Technol Ropar, Dept Comp Sci & Engn, Rupnagar 140001, India
[3] Univ Canberra, Human Ctr Technol Res Ctr, Bruce, ACT 2617, Australia
[4] Northwest Univ, Sch Informat & Technol, Xian 710069, Shaanxi, Peoples R China
Funding
Academy of Finland; National Natural Science Foundation of China;
Keywords
Facial expression recognition; Group-level emotion recognition; Feature descriptor; Information aggregation; Multi-modality; TEXTURE CLASSIFICATION; REPRESENTATION; EMOTIONS; MODEL;
DOI
10.1109/TMM.2018.2818015
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
With the advances in multimedia and the World Wide Web, users upload millions of images and videos every day on social networking platforms on the Internet. From the perspective of automatic human behavior understanding, it is of interest to analyze and model the affect exhibited by groups of people who are participating in social events in these images. However, analyzing the affect expressed by multiple people is challenging due to the varied indoor and outdoor settings. Recently, a few interesting works have investigated face-based group-level emotion recognition (GER). In this paper, we propose a multimodal framework for enhancing the affective analysis ability of GER in challenging environments. Specifically, to encode a person's information in a group-level image, we first propose an information aggregation method for generating feature descriptions of the face, upper body, and scene. We then revisit localized multiple kernel learning to fuse face, upper body, and scene information for GER in challenging environments. Intensive experiments are performed on two challenging group-level emotion databases (HAPPEI and GAFF) to investigate the roles of the face, upper body, and scene information, as well as of the multimodal framework. Experimental results demonstrate that the multimodal framework achieves promising performance for GER.
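The abstract describes two technical steps: per-modality information aggregation (pooling the descriptors of all faces in an image into one group-level vector, alongside upper-body and scene descriptors) and multimodal fusion via localized multiple kernel learning. The sketch below is a minimal, hypothetical illustration of that pipeline, not the authors' implementation: it assumes toy feature dimensions, uses simple mean pooling as a stand-in for the paper's aggregation method, and fuses modalities with fixed global kernel weights rather than the localized per-sample weights the paper learns. Function names such as aggregate_faces and fused_kernel are assumptions for illustration only.

```python
# Hypothetical sketch of the group-level emotion recognition (GER) pipeline:
# aggregate per-face descriptors, build one kernel per modality
# (face / upper body / scene), fuse with fixed weights, and classify with an SVM.
# The paper's localized MKL learns per-sample weights; fixed weights are a simplification.

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def aggregate_faces(face_descriptors):
    """Mean-pool the descriptors of all detected faces in one image.

    A simple stand-in for the paper's information-aggregation step.
    """
    return np.mean(face_descriptors, axis=0)

def rbf_kernel(X, Y, gamma=0.1):
    """RBF kernel matrix between row-vector sets X and Y."""
    sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * sq)

# Toy data: 60 group images, each with a variable number of faces (assumed sizes).
n_images, d_face, d_body, d_scene = 60, 32, 24, 48
face_feats = np.stack([
    aggregate_faces(rng.normal(size=(rng.integers(2, 8), d_face)))
    for _ in range(n_images)
])
body_feats = rng.normal(size=(n_images, d_body))
scene_feats = rng.normal(size=(n_images, d_scene))
labels = rng.integers(0, 3, size=n_images)          # e.g. negative / neutral / positive

# Fixed-weight kernel fusion (simplified multiple kernel learning).
weights = {"face": 0.5, "body": 0.2, "scene": 0.3}   # assumed, not learned
train, test = np.arange(0, 45), np.arange(45, 60)

def fused_kernel(idx_a, idx_b):
    """Weighted sum of per-modality RBF kernels between two index sets."""
    return (weights["face"]  * rbf_kernel(face_feats[idx_a],  face_feats[idx_b])
          + weights["body"]  * rbf_kernel(body_feats[idx_a],  body_feats[idx_b])
          + weights["scene"] * rbf_kernel(scene_feats[idx_a], scene_feats[idx_b]))

clf = SVC(kernel="precomputed")
clf.fit(fused_kernel(train, train), labels[train])
pred = clf.predict(fused_kernel(test, train))
print("toy accuracy:", np.mean(pred == labels[test]))
```

In this simplification, the fused kernel is a convex combination of per-modality RBF kernels fed to a precomputed-kernel SVM; localized multiple kernel learning, as revisited in the paper, instead learns gating functions so that the modality weights vary from sample to sample.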
Pages: 2706-2721
Page count: 16
Related Papers
50 in total
  • [1] Analyzing multimodal interaction: A methodological framework.
    Jewitt, C
    APPLIED LINGUISTICS, 2005, 26 (02) : 275 - 278
  • [2] A Multimodal Framework for Analyzing Websites as Cultural Expressions
    Pauwels, Luc
    JOURNAL OF COMPUTER-MEDIATED COMMUNICATION, 2012, 17 (03): : 247 - 265
  • [3] Analyzing multimodal interaction. A methodological framework.
    Patino Santos, Adriana
    DISCOURSE STUDIES, 2006, 8 (05) : 714 - 716
  • [4] A critical multimodal framework for reading and analyzing pedagogical materials
    Huang, Shin-ying
    ENGLISH TEACHING-PRACTICE AND CRITIQUE, 2019, 18 (01): : 52 - 69
  • [5] Group Affect Prediction Using Multimodal Distributions
    Shamsi, Saqib Nizam
    Singh, Bhanu Pratap
    Wadhwa, Manya
    2018 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WORKSHOPS (WACVW 2018), 2018, : 77 - 83
  • [6] Analyzing controller conflicts in multimodal smart grids - framework design
    Nieße A.
    Shahbakhsh A.
    Energy Informatics, 1 (Suppl 1) : 319 - 325
  • [7] A Framework for Identifying Influential People by Analyzing Social Media Data
    Ahsan, Md. Sabbir Al
    Arefin, Mohammad Shamsul
    Kayes, A. S. M.
    Hammoudeh, Mohammad
    Aldabbas, Omar
    APPLIED SCIENCES-BASEL, 2020, 10 (24): : 1 - 16
  • [8] A Multi-Layered Framework for Analyzing Primary Students' Multimodal Reasoning in Science
    Xu, Lihua
    van Driel, Jan
    Healy, Ryan
    EDUCATION SCIENCES, 2021, 11 (12):
  • [9] The More the Merrier: Analysing the Affect of a Group of People in Images
    Dhall, Abhinav
    Joshi, Jyoti
    Sikka, Karan
    Goecke, Roland
    Sebe, Nicu
    2015 11TH IEEE INTERNATIONAL CONFERENCE AND WORKSHOPS ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG), VOL. 1, 2015,
  • [10] A Joint Multimodal Group Analysis Framework for Modeling Corticomuscular Activity
    Chen, Xun
    Chen, Xiang
    Ward, Rabab Kreidieh
    Wang, Z. Jane
    IEEE TRANSACTIONS ON MULTIMEDIA, 2013, 15 (05) : 1049 - 1059