Predicting Important Objects for Egocentric Video Summarization

Cited: 0
Authors
Yong Jae Lee
Kristen Grauman
Affiliations
[1] University of California, Department of Computer Science
[2] University of Texas at Austin, Department of Computer Science
Source
International Journal of Computer Vision
Keywords
Egocentric vision; Video summarization; Category discovery; Saliency detection
DOI
Not available
Abstract
We present a video summarization approach for egocentric or “wearable” camera data. Given hours of video, the proposed method produces a compact storyboard summary of the camera wearer’s day. In contrast to traditional keyframe selection techniques, the resulting summary focuses on the most important objects and people with which the camera wearer interacts. To accomplish this, we develop region cues indicative of high-level saliency in egocentric video—such as the nearness to hands, gaze, and frequency of occurrence—and learn a regressor to predict the relative importance of any new region based on these cues. Using these predictions and a simple form of temporal event detection, our method selects frames for the storyboard that reflect the key object-driven happenings. We adjust the compactness of the final summary given either an importance selection criterion or a length budget; for the latter, we design an efficient dynamic programming solution that accounts for importance, visual uniqueness, and temporal displacement. Critically, the approach is neither camera-wearer-specific nor object-specific; that means the learned importance metric need not be trained for a given user or context, and it can predict the importance of objects and people that have never been seen previously. Our results on two egocentric video datasets show the method’s promise relative to existing techniques for saliency and summarization.
Pages: 38–55
Page count: 17
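
The length-budgeted storyboard selection described in the abstract lends itself to a small dynamic-programming sketch. The Python below is a minimal illustration, not the authors' implementation: the function name budgeted_storyboard, the linear weights alpha and beta, and cosine similarity as the uniqueness measure are all assumptions. It picks k frames so that summed importance is high, consecutive picks look visually dissimilar, and picks are spread out in time.

```python
import numpy as np

def budgeted_storyboard(importance, features, times, k, alpha=1.0, beta=0.1):
    # Hypothetical sketch, not the paper's exact objective: choose k frames
    # maximizing summed importance, minus a visual-similarity penalty between
    # consecutive picks, plus a reward for temporal spread.
    importance = np.asarray(importance, dtype=float)
    times = np.asarray(times, dtype=float)
    feats = np.asarray(features, dtype=float)
    feats /= np.linalg.norm(feats, axis=1, keepdims=True) + 1e-8
    sim = feats @ feats.T                # cosine similarity between frames

    n = len(importance)
    dp = np.full((k, n), -np.inf)        # dp[j, i]: best score with j+1 picks ending at frame i
    back = np.zeros((k, n), dtype=int)
    dp[0] = importance                   # a single pick scores its importance alone

    for j in range(1, k):
        for i in range(j, n):            # need j earlier frames available
            cand = (dp[j - 1, :i] + importance[i]
                    - alpha * sim[:i, i]                  # uniqueness: penalize look-alike consecutive picks
                    + beta * (times[i] - times[:i]))      # displacement: reward temporal spread
            p = int(np.argmax(cand))
            dp[j, i] = cand[p]
            back[j, i] = p

    # Trace back the selected frame indices from the best final pick.
    i = int(np.argmax(dp[k - 1]))
    picks = [i]
    for j in range(k - 1, 0, -1):
        i = int(back[j, i])
        picks.append(i)
    return picks[::-1]
```

For example, budgeted_storyboard(scores, cnn_feats, timestamps, k=5) returns the indices of five storyboard frames in temporal order; an O(n²k) loop like this is fast enough once hours of video have been reduced to a few thousand candidate frames.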