Features to Text: A Comprehensive Survey of Deep Learning on Semantic Segmentation and Image Captioning

Cited by: 10
Authors
Oluwasammi, Ariyo [1]
Aftab, Muhammad Umar [2]
Qin, Zhiguang [1]
Son Tung Ngo [3]
Thang Van Doan [3]
Son Ba Nguyen [3]
Son Hoang Nguyen [3]
Giang Hoang Nguyen [3]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Informat & Software Engn, Chengdu 610054, Peoples R China
[2] Natl Univ Comp & Emerging Sci, Dept Comp Sci, Chiniot Faisalabad Campus, Islamabad 35400, Chiniot, Pakistan
[3] FPT Univ, ICT Dept, Hanoi 10000, Vietnam
Funding
National Natural Science Foundation of China
Keywords
RANDOM-FIELDS; CLASSIFICATION; CONNECTIONS; ATTENTION; NETWORKS; LANGUAGE; VISION; FUSION; MODELS;
DOI
10.1155/2021/5538927
Chinese Library Classification
O1 [Mathematics]
Discipline classification codes
0701; 070101
Abstract
With the emergence of deep learning, computer vision has witnessed extensive advancement and has seen immense applications in multiple domains. Specifically, image captioning has become an attractive focal direction for most machine learning experts, which includes the prerequisite of object identification, location, and semantic understanding. In this paper, semantic segmentation and image captioning are comprehensively investigated based on traditional and state-of-the-art methodologies. In this survey, we deliberate on the use of deep learning techniques on the segmentation analysis of both 2D and 3D images using a fully convolutional network and other high-level hierarchical feature extraction methods. First, each domain's preliminaries and concept are described, and then semantic segmentation is discussed alongside its relevant features, available datasets, and evaluation criteria. Also, the semantic information capturing of objects and their attributes is presented in relation to their annotation generation. Finally, analysis of the existing methods, their contributions, and relevance are highlighted, informing the importance of these methods and illuminating a possible research continuation for the application of semantic image segmentation and image captioning approaches.
Pages: 19