Video scene parsing: An overview of deep learning methods and datasets

Cited by: 10
Authors
Yan, Xiyu [1 ,2 ]
Gong, Huihui [2 ]
Jiang, Yong [1 ,3 ]
Xia, Shu-Tao [1 ,3 ]
Zheng, Feng [2 ]
You, Xinge [4 ]
Shao, Ling [5 ,6 ]
Affiliations
[1] Tsinghua Univ THU, Tsinghua Shenzhen Int Grad Sch, Shenzhen, Peoples R China
[2] Southern Univ Sci & Technol SUSTech, Dept Comp Sci & Engn, Shenzhen 518055, Peoples R China
[3] PCL Res Ctr Networks & Commun, Peng Cheng Lab, Shenzhen, Peoples R China
[4] Huazhong Univ Sci & Technol HUST, Sch Elect Informat & Commun, Wuhan 430074, Peoples R China
[5] Incept Inst Artificial Intelligence, Abu Dhabi, U Arab Emirates
[6] Mohamed bin Zayed Univ Artificial Intelligence, Abu Dhabi, U Arab Emirates
Funding
National Natural Science Foundation of China;
Keywords
Video Scene Parsing; Deep Learning; Overview; SEGMENTATION; PREDICTION;
DOI
10.1016/j.cviu.2020.103077
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Video scene parsing (VSP) has become a key problem in the field of computer vision in recent years due to its wide range of applications in numerous domains (e.g., autonomous driving). With the renaissance of deep learning (DL) techniques, various VSP methods under this framework have demonstrated promising performance. However, no thorough review has been provided to comprehensively summarize the advantages and disadvantages of these methods, their datasets, or the directions for development. To remedy this, we provide an overview of the different DL methods applied to VSP in various scientific and engineering areas. First, we describe several indispensable preliminaries of this field, defining essential background concepts as well as fundamental terminologies and differentiating between VSP and other similar problems. Second, according to their principles, contributions and importance, recent advanced DL methods for VSP are meticulously classified and thoroughly analyzed. Third, we elaborate on the most frequently used datasets and describe common evaluation metrics for VSP. In addition, extensive experimental results for the aforementioned methods are presented to demonstrate their advantages and disadvantages. This is followed by further comparisons and discussions on the main challenges faced by researchers. Finally, we sum up the paper by drawing conclusions on the state-of-the-art methods for VSP and highlight potential research directions as well as promising future work for DL techniques applied to VSP.
Pages: 18