FastClip: An Efficient Video Understanding System with Heterogeneous Computing and Coarse-to-fine Processing

Authors
Zhao, Liming [1]
Sun, Siyang [1]
Zhang, Yanhao [1]
Zheng, Yun [1]
Pan, Pan [1]
Affiliations
[1] Alibaba Group, Hangzhou, People's Republic of China
Keywords
video understanding; heterogeneous computing; system speedup
DOI
10.1145/3487553.3524209
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video media is growing exponentially in areas such as e-commerce and gaming, and understanding video content is critical for real-world applications. However, processing long videos is usually time-consuming and expensive. In this paper, we present an efficient video understanding system that speeds up video processing with a coarse-to-fine two-stage pipeline and a heterogeneous computing framework. First, a coarse but fast multi-modal filtering module recognizes and removes useless segments from a long video; this module can be deployed on an edge device and reduces the computation needed in later stages. Second, several semantic models finely parse the remaining sequences. To accelerate model inference, we propose a novel heterogeneous computing framework, which trains a model with lightweight and heavyweight backbones to support distributed deployment across a powerful device (e.g., cloud or GPU) and a second, different device (e.g., edge or CPU). In this way, the model can be both efficient and effective. The proposed system has been widely used at Alibaba, in applications including "Taobao Live Analysis" and "Commodity Short-Video Generation", where it can achieve a 10x speedup for the system.
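The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `Segment`, `coarse_score`, `coarse_filter`, `fine_parse`, `run_pipeline`, and the 0.5 threshold are all hypothetical names and values chosen for illustration, and a precomputed score stands in for the multi-modal filtering module.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Segment:
    start: float          # segment start time, in seconds
    end: float            # segment end time, in seconds
    coarse_score: float   # hypothetical usefulness score from the fast filter


def coarse_filter(segments: List[Segment], threshold: float = 0.5) -> List[Segment]:
    """Stage 1 (coarse): drop segments the cheap filter scores below the threshold."""
    return [s for s in segments if s.coarse_score >= threshold]


def fine_parse(
    segments: List[Segment], models: List[Callable[[Segment], Any]]
) -> List[Dict[str, Any]]:
    """Stage 2 (fine): run every heavier model, but only on surviving segments."""
    return [{"segment": s, "labels": [m(s) for m in models]} for s in segments]


def run_pipeline(
    segments: List[Segment],
    models: List[Callable[[Segment], Any]],
    threshold: float = 0.5,
) -> List[Dict[str, Any]]:
    """Chain the coarse filter and the fine parsers."""
    return fine_parse(coarse_filter(segments, threshold), models)


if __name__ == "__main__":
    video = [Segment(0, 10, 0.9), Segment(10, 20, 0.1), Segment(20, 30, 0.7)]
    # Placeholder "semantic model": reports the segment duration.
    models = [lambda s: s.end - s.start]
    results = run_pipeline(video, models)
    print(len(results), "of", len(video), "segments reached the fine stage")
```

The saving in such a design comes from the expensive models never seeing the filtered-out segments; how much is saved depends on how many segments the coarse stage rejects.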
Pages: 67-71
Page count: 5