FastClip: An Efficient Video Understanding System with Heterogeneous Computing and Coarse-to-fine Processing

Cited by: 0
Authors
Zhao, Liming [1]
Sun, Siyang [1]
Zhang, Yanhao [1]
Zheng, Yun [1]
Pan, Pan [1]
Affiliation
[1] Alibaba Grp, Hangzhou, Peoples R China
Keywords
video understanding; heterogeneous computing; system speedup
DOI
10.1145/3487553.3524209
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video media has recently grown exponentially in many areas, such as e-commerce shopping and gaming. Understanding video content is critical for real-world applications. However, processing long videos is usually time-consuming and expensive. In this paper, we present an efficient video understanding system that speeds up video processing with a coarse-to-fine two-stage pipeline and a heterogeneous computing framework. First, a coarse but fast multi-modal filtering module recognizes and removes useless segments from a long video; it can be deployed on an edge device and reduces the computation needed in later stages. Second, several semantic models finely parse the remaining sequences. To accelerate model inference, we propose a novel heterogeneous computing framework that trains a model with lightweight and heavyweight backbones to support distributed deployment across a powerful device (e.g., cloud or GPU) and a different device (e.g., edge or CPU). In this way, the model can be both efficient and effective. The proposed system has been widely used at Alibaba, including in "Taobao Live Analysis" and "Commodity Short-Video Generation", where it achieves a 10x speedup for the system.
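The coarse-to-fine idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the segment representation, the function names (`coarse_filter`, `fine_parse`, `run_pipeline`), the scoring callback, and the threshold are all hypothetical stand-ins chosen only to show how a cheap first stage limits the work passed to an expensive second stage.

```python
from typing import Callable, List, Tuple

# Hypothetical segment representation: (start_sec, end_sec).
Segment = Tuple[float, float]


def coarse_filter(segments: List[Segment],
                  cheap_score: Callable[[Segment], float],
                  threshold: float) -> List[Segment]:
    """Stage 1 (assumed): keep segments whose cheap score reaches the threshold."""
    return [seg for seg in segments if cheap_score(seg) >= threshold]


def fine_parse(segments: List[Segment],
               heavy_model: Callable[[Segment], str]) -> List[Tuple[Segment, str]]:
    """Stage 2 (assumed): run the expensive model on surviving segments only."""
    return [(seg, heavy_model(seg)) for seg in segments]


def run_pipeline(segments: List[Segment],
                 cheap_score: Callable[[Segment], float],
                 heavy_model: Callable[[Segment], str],
                 threshold: float = 0.5) -> List[Tuple[Segment, str]]:
    """Chain the two stages: filter first, then parse what remains."""
    kept = coarse_filter(segments, cheap_score, threshold)
    return fine_parse(kept, heavy_model)
```

In this sketch the expensive callback is invoked once per retained segment, so any saving comes from how many segments the cheap stage discards; the 10x figure reported in the abstract is the authors' measurement and is not something this toy code reproduces.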
Pages: 67-71
Page count: 5