Audio and video feature fusion for activity recognition in unconstrained videos

Cited by: 0
Authors
Lopes, Jose [1]
Singh, Sameer [1]
Affiliations
[1] Univ Loughborough, Res Sch Informat, Loughborough LE11 3TU, Leics, England
Source
INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2006, PROCEEDINGS | 2006 / Vol. 4224
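Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes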
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Combining audio and image processing for understanding video content has several benefits compared with using each modality on its own. For the task of context and activity recognition in video sequences, it is important to explore both data streams to gather relevant information. In this paper we describe a video context and activity recognition model. Our work extracts a range of audio and visual features, followed by feature reduction and information fusion. We show that combining audio-based and video-based decision making improves the quality of context and activity recognition in videos by 4% over audio data alone and 18% over image data alone.
Pages: 823 - 831
Page count: 9
Related papers
50 in total
  • [31] Audio-Adaptive Activity Recognition Across Video Domains
    Zhang, Yunhua
    Doughty, Hazel
    Shao, Ling
    Snoek, Cees G. M.
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 13781 - 13790
  • [32] A fusion approach to unconstrained iris recognition
    Santos, Gil
    Hoyle, Edmundo
    PATTERN RECOGNITION LETTERS, 2012, 33 (08) : 984 - 990
  • [33] Hierarchical feature representation for unconstrained video analysis
    Mohammadi, Eman
    Wu, Q. M. Jonathan
    Saif, Mehrdad
    Yang, Yimin
    NEUROCOMPUTING, 2019, 363 : 182 - 194
  • [34] Audio-Visual Domain Adaptation Feature Fusion for Speech Emotion Recognition
    Wei, Jie
    Hu, Guanyu
    Yang, Xinyu
    Luu, Anh Tuan
    Dong, Yizhuo
    INTERSPEECH 2022, 2022, : 1988 - 1992
  • [35] Active Speaker Recognition using Cross Attention Audio-Video Fusion
    Mocanu, Bogdan
    Tapu, Ruxandra
    2022 10TH EUROPEAN WORKSHOP ON VISUAL INFORMATION PROCESSING (EUVIP), 2022,
  • [36] Exploring Emotion Features and Fusion Strategies for Audio-Video Emotion Recognition
    Zhou, Hengshun
    Meng, Debin
    Zhang, Yuanyuan
    Peng, Xiaojiang
    Du, Jun
    Wang, Kai
    Qiao, Yu
    ICMI'19: PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2019, : 562 - 566
  • [37] A Novel Feature-Selection Method for Human Activity Recognition in Videos
    Tweit, Nadia
    Obaidat, Muath A.
    Rawashdeh, Majdi
    Bsoul, Abdalraoof K.
    Al Zamil, Mohammed GH.
    ELECTRONICS, 2022, 11 (05)
  • [38] Object and motion cues based collaborative approach for human activity localization and recognition in unconstrained videos
    Ullah, Javid
    Jaffar, Muhammad Arfan
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2018, 21 (01) : 311 - 322
  • [39] Object and motion cues based collaborative approach for human activity localization and recognition in unconstrained videos
    Javid Ullah
    Muhammad Arfan Jaffar
    Cluster Computing, 2018, 21 : 311 - 322
  • [40] Acoustic Event Detection Based on Feature-Level Fusion of Audio and Video Modalities
    Butko, Taras
    Canton-Ferrer, Cristian
    Segura, Carlos
    Giro, Xavier
    Nadeu, Climent
    Hernando, Javier
    Casas, Josep R.
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2011,