Audio and video feature fusion for activity recognition in unconstrained videos

Cited by: 0

Authors:

Lopes, Jose [1]

Singh, Sameer [1]

Affiliations:

[1] Univ Loughborough, Res Sch Informat, Loughborough LE11 3TU, Leics, England

Source:

INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2006, PROCEEDINGS | 2006 / Vol. 4224

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Combining audio and image processing to understand video content has several benefits over using either modality on its own. For context and activity recognition in video sequences, it is important to exploit both data streams to gather relevant information. In this paper we describe a video context and activity recognition model. Our approach extracts a range of audio and visual features, then applies feature reduction and information fusion. We show that combining audio-based with video-based decision making improves the quality of context and activity recognition in videos by 4% over audio data alone and 18% over image data alone.
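
The abstract does not state which fusion rule is used, so the following is only a minimal sketch of one common form of decision-level fusion: a weighted average of the per-class scores produced by a separate audio classifier and video classifier. The function names (`fuse_decisions`, `predict`), the `audio_weight` parameter, the class labels, and the scores are all hypothetical and are not taken from the paper.

```python
from typing import Dict


def fuse_decisions(audio_probs: Dict[str, float],
                   video_probs: Dict[str, float],
                   audio_weight: float = 0.5) -> Dict[str, float]:
    """Weighted-sum fusion of two per-class score tables (an assumed rule)."""
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    if audio_probs.keys() != video_probs.keys():
        raise ValueError("both modalities must score the same classes")
    # Blend the two modalities class by class.
    fused = {
        label: audio_weight * audio_probs[label]
        + (1.0 - audio_weight) * video_probs[label]
        for label in audio_probs
    }
    total = sum(fused.values())
    if total <= 0.0:
        raise ValueError("fused scores must have a positive sum")
    # Renormalise so the fused scores sum to one.
    return {label: score / total for label, score in fused.items()}


def predict(fused: Dict[str, float]) -> str:
    """Return the class label with the highest fused score."""
    return max(fused, key=fused.get)


# Made-up scores for illustration only; not results from the paper.
audio = {"indoor": 0.6, "outdoor": 0.4}
video = {"indoor": 0.3, "outdoor": 0.7}
fused = fuse_decisions(audio, video, audio_weight=0.5)
label = predict(fused)
```

With equal weights the two modalities disagree and the more confident one decides the label. In practice the weight would typically be tuned on held-out data, and the feature reduction step mentioned in the abstract would happen before each classifier produces its scores.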


Pages: 823-831

Page count: 9
