Contextual Explainable Video Representation: Human Perception-based Understanding

Cited by: 2
Authors
Vo, Khoa [1]
Yamazaki, Kashu [1]
Nguyen, Phong X. [2]
Nguyen, Phat [2]
Luu, Khoa [1]
Le, Ngan [1]
Affiliations
[1] Univ Arkansas, Dept CSCE, Fayetteville, AR 72701 USA
[2] FPT Software, AI Lab, Ho Chi Minh City, Vietnam
Funding
National Science Foundation (USA)
Keywords
video understanding; action detection; dense video captioning; attention; human perception; explainable ML
DOI
10.1109/IEEECONF56349.2022.10052051
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video understanding is a growing field and a subject of intense research. It includes many tasks that require understanding both spatial and temporal information, e.g., action detection, action recognition, video captioning, and video retrieval. One of the most challenging problems in video understanding is feature extraction, i.e., extracting a contextual visual representation from a given untrimmed video, which is difficult because of the long and complicated temporal structure of unconstrained videos. Unlike existing approaches, which apply a pre-trained backbone network as a black box to extract visual representations, our approach aims to extract the most contextual information with an explainable mechanism. As we observed, humans typically perceive a video through the interactions between three main factors: the actors, the relevant objects, and the surrounding environment. Therefore, it is crucial to design a contextual, explainable video representation extraction method that captures each of these factors and models the relationships between them. In this paper, we discuss approaches that incorporate the human perception process into modeling actors, objects, and the environment. We choose video paragraph captioning and temporal action detection to illustrate the effectiveness of human perception-based contextual representation in video understanding. Source code is publicly available at https://github.com/UARK-AICV/Video_Representation.
Pages: 1326-1333
Number of pages: 8