Human Action Adverb Recognition: ADHA Dataset and A Three-Stream Hybrid Model

Cited by: 2

Authors:

Pang, Bo [1]

Zha, Kaiwen [1]

Lu, Cewu [1]

Affiliations:

[1] Shanghai Jiao Tong Univ, Shanghai, Peoples R China

Source:

PROCEEDINGS 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW) | 2018

DOI:

10.1109/CVPRW.2018.00308

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We introduce the first benchmark for a new problem, recognizing human action adverbs (HAA): "Adverbs Describing Human Actions" (ADHA). We highlight key features of ADHA: a semantically complete set of adverbs describing human actions, a set of common, describable human actions, and an exhaustive labelling of simultaneously emerging actions in each video. We conduct an in-depth analysis of how current effective models from action recognition and image captioning perform on adverb recognition, and the results reveal that such methods are unsatisfactory. Furthermore, we propose a novel three-stream hybrid model to tackle the HAA problem, which achieves better performance and yields relatively promising results.

Pages: 2388-2397

Page count: 10
