Joint Segmentation and Classification of Human Actions in Video

Cited by: 0
Authors
Minh Hoai [1]
Lan, Zhen-Zhong [1]
De la Torre, Fernando [1]
Affiliation
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Source
2011 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2011
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Automatic video segmentation and action recognition has been a long-standing problem in computer vision. Much work in the literature treats video segmentation and action recognition as two independent problems; while segmentation is often done without a temporal model of the activity, action recognition is usually performed on pre-segmented clips. In this paper we propose a novel method that avoids the limitations of the above approaches by jointly performing video segmentation and action recognition. Unlike standard approaches based on extensions of dynamic Bayesian networks, our method is based on a discriminative temporal extension of the spatial bag-of-words model that has been very popular in object recognition. The classification is performed robustly within a multi-class SVM framework whereas the inference over the segments is done efficiently with dynamic programming. Experimental results on honeybee, Weizmann, and Hollywood datasets illustrate the benefits of our approach compared to state-of-the-art methods.
Pages: 8
Related papers
50 in total
  • [21] Multimodal topic segmentation and classification of news video
    Raaijmakers, S
    den Hartog, J
    Baan, J
    IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOL I AND II, PROCEEDINGS, 2002, : A33 - A36
  • [22] Temporal video segmentation and classification of edit effects
    Porter, S
    Mirmehdi, M
    Thomas, B
    IMAGE AND VISION COMPUTING, 2003, 21 (13-14) : 1097 - 1106
  • [23] Video segmentation via temporal pattern classification
    Cooper, Matthew
    Liu, Ting
    Rieffel, Eleanor
    IEEE TRANSACTIONS ON MULTIMEDIA, 2007, 9 (03) : 610 - 618
  • [24] Automatic Annotation of Human Actions in Video
    Duchenne, Olivier
    Laptev, Ivan
    Sivic, Josef
    Bach, Francis
    Ponce, Jean
    2009 IEEE 12TH INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2009, : 1491 - 1498
  • [25] Analysis of Human Actions for Video Indexing
    Chen, Zhuoyuan
    Cui, Peng
    Sun, Lifeng
    Yang, Shiqiang
    Advances in Multimedia Information Processing - PCM 2008, 9th Pacific Rim Conference on Multimedia, 2008, 5353 : 635 - 642
  • [26] A Joint Segmentation and Classification Framework for Sentence Level Sentiment Classification
    Tang, Duyu
    Qin, Bing
    Wei, Furu
    Dong, Li
    Liu, Ting
    Zhou, Ming
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (11) : 1750 - 1761
  • [27] A Joint Segmentation and Classification Framework for Sentence Level Sentiment Classification
    Tang, Duyu
    Qin, Bing
    Wei, Furu
    Dong, Li
    Liu, Ting
    Zhou, Ming
    IEEE Transactions on Audio, Speech and Language Processing, 2015, 23 (11) : 1750 - 1761
  • [28] Joint Segmentation and Path Classification of Curvilinear Structures
    Mosinska, Agata
    Kozinski, Mateusz
    Fua, Pascal
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2020, 42 (06) : 1515 - 1521
  • [29] Joint Segmentation and Classification with Fully Convolutional Networks
    Shen, Falong
    Gan, Rui
    2016 3RD INTERNATIONAL CONFERENCE ON SYSTEMS AND INFORMATICS (ICSAI), 2016, : 338 - 343
  • [30] VMASS: Massive Dataset of Multi-camera Video for Learning, Classification and Recognition of Human Actions
    Kulbacki, Marek
    Segen, Jakub
    Wereszczynski, Kamil
    Gudys, Adam
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS, PT II, 2014, 8398 : 565 - 574