MMA: a multi-view and multi-modality benchmark dataset for human action recognition

Cited by: 5
Authors
Gao, Zan [1 ,2 ]
Han, Tao-tao [1 ,2 ]
Zhang, Hua [1 ,2 ]
Xue, Yan-bing [1 ,2 ]
Xu, Guang-ping [1 ,2 ]
Affiliations
[1] Tianjin Univ Technol, Key Lab Comp Vis & Syst, Minist Educ, Tianjin 300384, Peoples R China
[2] Tianjin Univ Technol, Tianjin Key Lab Intelligence Comp & Novel Softwar, Tianjin 300384, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Action recognition; Benchmark dataset; Multi-view; Multi-modality; Cross-view; Multi-task; Cross-domain; FEATURE-SELECTION
DOI
10.1007/s11042-018-5833-8
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Human action recognition is an active research topic in both the computer vision and machine learning communities, with broad applications including surveillance, biometrics and human-computer interaction. Although several well-known action datasets have been released over the past decades, they remain limited in the number of action categories and samples, in camera views and in the variety of scenarios. Moreover, most of them are designed for only a subset of the learning problems, such as the single-view, cross-view and multi-task learning problems. In this paper, we introduce a multi-view, multi-modality benchmark dataset for human action recognition (abbreviated to MMA). MMA consists of 7080 action samples from 25 action categories, including 15 single-subject actions and 10 double-subject interactive actions in three views of two different scenarios. Further, we systematically benchmark state-of-the-art approaches on MMA with respect to all three learning problems using different temporal-spatial feature representations. Experimental results demonstrate that MMA is challenging on all three learning problems due to significant intra-class variations, occlusion issues, view and scene variations, and multiple similar action categories. Meanwhile, we provide a baseline for the evaluation of existing state-of-the-art algorithms.
Pages: 29383 - 29404
Number of pages: 22
Related papers
  • [1] MMA: a multi-view and multi-modality benchmark dataset for human action recognition
    Zan Gao
    Tao-tao Han
    Hua Zhang
    Yan-bing Xue
    Guang-ping Xu
    [J]. Multimedia Tools and Applications, 2018, 77 : 29383 - 29404
  • [2] A Multi-modal & Multi-view & Interactive Benchmark Dataset for Human Action Recognition
    Xu, Ning
    Liu, Anan
    Nie, Weizhi
    Wong, Yongkang
    Li, Fuwu
    Su, Yuting
    [J]. MM'15: PROCEEDINGS OF THE 2015 ACM MULTIMEDIA CONFERENCE, 2015, : 1195 - 1198
  • [3] Multi-modality learning for human action recognition
    Ziliang Ren
    Qieshi Zhang
    Xiangyang Gao
    Pengyi Hao
    Jun Cheng
    [J]. Multimedia Tools and Applications, 2021, 80 : 16185 - 16203
  • [4] Multi-modality learning for human action recognition
    Ren, Ziliang
    Zhang, Qieshi
    Gao, Xiangyang
    Hao, Pengyi
    Cheng, Jun
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (11) : 16185 - 16203
  • [5] Human Action Recognition Via Multi-modality Information
    Gao, Zan
    Song, Jian-ming
    Zhang, Hua
    Liu, An-An
    Xue, Yan-Bing
    Xu, Guang-ping
    [J]. JOURNAL OF ELECTRICAL ENGINEERING & TECHNOLOGY, 2014, 9 (02) : 739 - 748
  • [6] Continuous Multi-View Human Action Recognition
    Wang, Qiang
    Sun, Gan
    Dong, Jiahua
    Wang, Qianqian
    Ding, Zhengming
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (06) : 3603 - 3614
  • [7] Multi-view human action recognition: A survey
    Iosifidis, Alexandros
    Tefas, Anastasios
    Pitas, Ioannis
    [J]. 2013 NINTH INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION HIDING AND MULTIMEDIA SIGNAL PROCESSING (IIH-MSP 2013), 2013, : 522 - 525
  • [8] Generative Multi-View Human Action Recognition
    Wang, Lichen
    Ding, Zhengming
    Tao, Zhiqiang
    Liu, Yunyu
    Fu, Yun
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 6221 - 6230
  • [9] Multi-view representation learning for multi-view action recognition
    Hao, Tong
    Wu, Dan
    Wang, Qian
    Sun, Jin-Sheng
    [J]. JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2017, 48 : 453 - 460
  • [10] Multi-modality Fusion Network for Action Recognition
    Huang, Kai
    Qin, Zheng
    Xu, Kaiping
    Ye, Shuxiong
    Wang, Guolong
    [J]. ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2017, PT II, 2018, 10736 : 139 - 149