Joint Segmentation and Classification of Human Actions in Video

Cited by: 0

Authors:

Minh Hoai [1]

Lan, Zhen-Zhong [1]

De la Torre, Fernando [1]

Affiliation:

[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA

Source:

2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) | 2011

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Automatic video segmentation and action recognition have been long-standing problems in computer vision. Much work in the literature treats video segmentation and action recognition as two independent problems: segmentation is often done without a temporal model of the activity, while action recognition is usually performed on pre-segmented clips. In this paper, we propose a novel method that avoids the limitations of the above approaches by jointly performing video segmentation and action recognition. Unlike standard approaches based on extensions of dynamic Bayesian networks, our method is based on a discriminative temporal extension of the spatial bag-of-words model that has been very popular in object recognition. The classification is performed robustly within a multi-class SVM framework, whereas the inference over the segments is done efficiently with dynamic programming. Experimental results on the honeybee, Weizmann, and Hollywood datasets illustrate the benefits of our approach compared to state-of-the-art methods.
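
The abstract mentions dynamic-programming inference over segments scored by a multi-class linear classifier. The following is only a minimal sketch of that general idea, not the authors' implementation. It assumes each frame is already summarized as a bag-of-words histogram, that a segment's score is a linear function of its summed histogram, and that segment lengths are bounded. The function name `segment_and_label` and all parameters are hypothetical.

```python
import numpy as np

def segment_and_label(frames, weights, biases, min_len, max_len):
    """Illustrative joint segmentation and labeling by dynamic programming.

    frames:  (T, D) per-frame bag-of-words histograms (assumed given).
    weights: (K, D) linear classifier weights, one row per class (assumed trained).
    biases:  (K,) per-class biases.
    Returns (segments, total_score); segments are (start, end, label) with end exclusive.
    """
    T, D = frames.shape
    # Prefix sums give the histogram of any segment as prefix[e] - prefix[s].
    prefix = np.vstack([np.zeros(D), np.cumsum(frames, axis=0)])
    best = np.full(T + 1, -np.inf)  # best[t]: best total score for frames[:t]
    best[0] = 0.0
    back = [None] * (T + 1)         # back[t]: (start, label) of the last segment
    for t in range(1, T + 1):
        for length in range(min_len, min(max_len, t) + 1):
            s = t - length
            if best[s] == -np.inf:
                continue  # frames[:s] cannot be segmented under the length limits
            scores = weights @ (prefix[t] - prefix[s]) + biases
            k = int(np.argmax(scores))  # best class for this candidate segment
            total = best[s] + scores[k]
            if total > best[t]:
                best[t] = total
                back[t] = (s, k)
    if back[T] is None:
        raise ValueError("no valid segmentation for these length limits")
    # Walk the back-pointers from the end to recover the segments.
    segments = []
    t = T
    while t > 0:
        s, k = back[t]
        segments.append((s, t, k))
        t = s
    return segments[::-1], float(best[T])
```

With T frames, K classes, and D-dimensional histograms, this sketch runs in O(T · max_len · K · D) time, since every candidate segment is scored once against all classes.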

Pages: 8

