Human Action Adverb Recognition: ADHA Dataset and A Three-Stream Hybrid Model

Cited by: 2
Authors
Pang, Bo [1]
Zha, Kaiwen [1]
Lu, Cewu [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
DOI
10.1109/CVPRW.2018.00308
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce the first benchmark for a new problem - recognizing human action adverbs (HAA): "Adverbs Describing Human Actions" (ADHA). We demonstrate some key features of ADHA: a semantically complete set of adverbs describing human actions, a set of common, describable human actions, and an exhaustive labelling of simultaneously emerging actions in each video. We conduct an in-depth analysis of how currently effective action recognition and image captioning models perform on adverb recognition, and the results show that such methods are unsatisfactory. Furthermore, we propose a novel three-stream hybrid model to tackle the HAA problem, which achieves better performance and relatively promising results.
Pages: 2388-2397
Page count: 10