Traffic Scene Captioning with Multi-Stage Feature Enhancement

Cited by: 1
Authors
Zhang, Dehai [1]
Ma, Yu [1]
Liu, Qing [1]
Wang, Haoxing [1]
Ren, Anquan [1]
Liang, Jiashu [1]
Affiliations
[1] Yunnan Univ, Sch Software, Kunming 650091, Peoples R China
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2023 / Vol. 76 / Issue 03
Keywords
Traffic scene captioning; sustainable transportation; feature enhancement; encoder-decoder structure; multi-level granularity; scene knowledge graph
DOI
10.32604/cmc.2023.038264
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Traffic scene captioning automatically generates one or more sentences describing a traffic scene by analyzing the input image, which supports road safety and provides an important decision-making aid for sustainable transportation. To describe complex traffic scenes comprehensively and reasonably, this paper proposes a traffic scene semantic captioning model with multi-stage feature enhancement. The model follows an encoder-decoder structure. First, multi-level granularity visual features are used for feature enhancement during encoding, which enables the model to learn finer details of the traffic scene image. Second, a scene knowledge graph is applied during decoding: the semantic features it provides enhance the features learned by the decoder a second time, so that the model learns the attributes of objects in the scene and the relationships between them and generates more reasonable captions. Extensive experiments on the challenging MS-COCO dataset, evaluated with five standard automatic metrics, show that the proposed model improves significantly on all metrics compared with state-of-the-art methods, reaching a CIDEr-D score of 129.0. These results indicate that the model can provide a more reasonable and comprehensive description of traffic scenes.
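The two enhancement stages described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the use of scaled dot-product attention with residual addition, the function names (`attend`, `encode`, `decode_step`), the feature shapes, and the random inputs are all assumptions chosen only to show where each stage of enhancement would sit in an encoder-decoder pipeline.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend(queries, memory):
    # Scaled dot-product attention: each query row becomes a
    # weighted average of the memory rows.
    scores = queries @ memory.T / np.sqrt(queries.shape[-1])
    return softmax(scores, axis=-1) @ memory

def encode(grid_feats, region_feats):
    # Stage 1 (assumed form): enhance coarse grid features with
    # finer-grained region features via a residual attention update.
    return grid_feats + attend(grid_feats, region_feats)

def decode_step(hidden, visual_feats, graph_feats):
    # Stage 2 (assumed form): enhance the decoder state with visual
    # context, then again with scene-knowledge-graph semantics.
    h = hidden + attend(hidden[None, :], visual_feats)[0]
    return h + attend(h[None, :], graph_feats)[0]

# Hypothetical inputs: random vectors stand in for real features.
rng = np.random.default_rng(0)
d = 8
grid = rng.normal(size=(49, d))     # e.g. a 7x7 grid of image features
regions = rng.normal(size=(10, d))  # e.g. 10 detected object regions
graph = rng.normal(size=(5, d))     # e.g. 5 knowledge-graph node embeddings
hidden = rng.normal(size=d)         # current decoder hidden state

visual = encode(grid, regions)
enhanced_hidden = decode_step(hidden, visual, graph)
print(visual.shape, enhanced_hidden.shape)
```

In a real captioning model the enhanced hidden state would feed a word-prediction layer at each decoding step; that part is omitted here because the abstract does not describe it.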
Pages: 2901-2920
Page count: 20
Related papers
50 in total
  • [1] Swin-Caption: Swin Transformer-Based Image Captioning with Feature Enhancement and Multi-Stage Fusion
    Liu, Lei
    Jiao, Yidi
    Li, Xiaoran
    Li, Jing
    Wang, Haitao
    Cao, Xinyu
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE AND APPLICATIONS, 2024,
  • [2] Multi-Stage Feature Interaction Model with Abundant Semantic Information for Image Captioning
    Li, Xueting
    An, Gaoyun
    Ruan, Qiuqi
    PROCEEDINGS OF 2020 IEEE 15TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP 2020), 2020, : 407 - 410
  • [3] MuSeFFF: Multi-stage feature fusion framework for traffic prediction
    Kumar A.
    Sunitha R.
    Intelligent Systems with Applications, 2023, 18
  • [4] Multi-Stage Multi-Task Feature Learning
    Gong, Pinghua
    Ye, Jieping
    Zhang, Changshui
    JOURNAL OF MACHINE LEARNING RESEARCH, 2013, 14 : 2979 - 3010
  • [5] Multi-stage multi-task feature learning
    Gong, Pinghua
    Ye, Jieping
    Zhang, Changshui
    Journal of Machine Learning Research, 2013, 14 : 2979 - 3010
  • [6] CNN Pruning with Multi-Stage Feature Decorrelation
    Zhu, Qiuyu
    Liu, Chengfei
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2024, 38 (15)
  • [7] Multi-stage convex relaxation for feature selection
    Zhang, Tong
    BERNOULLI, 2013, 19 (5B) : 2277 - 2293
  • [8] Multi-stage Feature Selection for On-Line Flow Peer-to-Peer Traffic Identification
    Abdalla, Bushra Mohammed Ali
    Jamil, Haitham A.
    Hamdan, Mosab
    Bassi, Joseph Stephen
    Ismail, Ismahani
    Marsono, Muhammad Nadzir
    MODELING, DESIGN AND SIMULATION OF SYSTEMS, ASIASIM 2017, PT II, 2017, 752 : 509 - 523
  • [9] Multi-stage Progressive Speech Enhancement Network
    Xu, Xinmeng
    Wang, Yang
    Xu, Dongxiang
    Peng, Yiyuan
    Zhang, Cong
    Jia, Jie
    Chen, Binbin
    INTERSPEECH 2021, 2021, : 2691 - 2695
  • [10] Multi-Stage Feature Enhancement Pyramid Network for Detecting Objects in Optical Remote Sensing Images
    Zhang, Kaihua
    Shen, Haikuo
    REMOTE SENSING, 2022, 14 (03)