MMA: a multi-view and multi-modality benchmark dataset for human action recognition

Cited by: 5
Authors
Gao, Zan [1 ,2 ]
Han, Tao-tao [1 ,2 ]
Zhang, Hua [1 ,2 ]
Xue, Yan-bing [1 ,2 ]
Xu, Guang-ping [1 ,2 ]
Affiliations
[1] Tianjin Univ Technol, Key Lab Comp Vis & Syst, Minist Educ, Tianjin 300384, Peoples R China
[2] Tianjin Univ Technol, Tianjin Key Lab Intelligence Comp & Novel Softwar, Tianjin 300384, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Action recognition; Benchmark dataset; Multi-view; Multi-modality; Cross-view; Multi-task; Cross-domain; FEATURE-SELECTION;
DOI
10.1007/s11042-018-5833-8
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Subject classification code
0812 ;
Abstract
Human action recognition is an active research topic in both the computer vision and machine learning communities, with broad applications including surveillance, biometrics and human-computer interaction. Although several well-known action datasets have been released in past decades, they still have limitations, including limited action categories and samples, camera views and variety of scenarios. Moreover, most of them are designed for only a subset of the learning problems, such as the single-view, cross-view and multi-task learning problems. In this paper, we introduce a multi-view, multi-modality benchmark dataset for human action recognition (abbreviated to MMA). MMA consists of 7080 action samples from 25 action categories, including 15 single-subject actions and 10 double-subject interactive actions, captured in three views of two different scenarios. Further, we systematically benchmark state-of-the-art approaches on MMA with respect to all three learning problems using different temporal-spatial feature representations. Experimental results demonstrate that MMA is challenging on all three learning problems due to significant intra-class variations, occlusion issues, view and scene variations, and multiple similar action categories. We also provide a baseline for the evaluation of existing state-of-the-art algorithms.
Pages: 29383 - 29404
Number of pages: 22