Dense Image Captioning Based on Precise Feature Extraction

Cited by: 4
Authors
Zhang, Zhiqiang [1 ]
Zhang, Yunye [1 ]
Shi, Yan [1 ]
Yu, Wenxin [1 ]
Nie, Li [1 ]
He, Gang [2 ]
Fan, Yibo [3 ]
Yang, Zhuo [4 ]
Affiliations
[1] Southwest Univ Sci & Technol, Mianyang, Sichuan, Peoples R China
[2] Xidian Univ, Xian, Peoples R China
[3] Fudan Univ, State Key Lab ASIC & Syst, Shanghai, Peoples R China
[4] Guangdong Univ Technol, Guangzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Dense captioning; Computer vision; Feature extraction; Location and description; Deep learning;
DOI
10.1007/978-3-030-36802-9_10
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Image captioning is a challenging problem in computer vision with numerous practical applications. Recently, dense image captioning has emerged, which aims at a full understanding of an image by localizing and describing multiple salient regions that cover it. Although state-of-the-art approaches have made encouraging progress, their ability to localize target regions and describe them accordingly still falls short of expectations. To address this, the paper proposes a precise feature extraction (PFE) method to further improve dense image captioning. The model is evaluated on the Visual Genome dataset, and the results show that the method outperforms other state-of-the-art methods.
Pages: 83-90
Number of pages: 8
Related papers
50 in total
  • [1] Salient Feature Extraction Mechanism for Image Captioning
    Wang X.
    Song Y.-H.
    Zhang Y.-L.
    Zidonghua Xuebao/Acta Automatica Sinica, 2022, 48 (03) : 735 - 746
  • [2] Dense Image Captioning in Hindi
    Gill, Karanjit
    Saha, Sriparna
    Mishra, Santosh Kumar
    2021 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2021, : 2894 - 2899
  • [3] Delving into Precise Attention in Image Captioning
    Hu, Shaohan
    Huang, Shenglei
    Wang, Guolong
    Li, Zhipeng
    Qin, Zheng
    NEURAL INFORMATION PROCESSING, ICONIP 2019, PT V, 2019, 1143 : 74 - 82
  • [4] Image Graph Production by Dense Captioning
    Sahba, Amin
    Das, Arun
    Rad, Paul
    Jamshidi, Mo
    2018 WORLD AUTOMATION CONGRESS (WAC), 2018, : 193 - 198
  • [5] CASCADE ATTENTION: MULTIPLE FEATURE BASED LEARNING FOR IMAGE CAPTIONING
    Shi, Jiahe
    Li, Yali
    Wang, Shengjin
    2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 1970 - 1974
  • [6] An Object Localization-based Dense Image Captioning Framework in Hindi
    Mishra, Santosh Kumar
    Harshit
    Saha, Sriparna
    Bhattacharyya, Pushpak
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2023, 22 (02)
  • [7] Dense semantic embedding network for image captioning
    Xiao, Xinyu
    Wang, Lingfeng
    Ding, Kun
    Xiang, Shiming
    Pan, Chunhong
    PATTERN RECOGNITION, 2019, 90 : 285 - 296
  • [8] Incorporating retrieval-based method for feature enhanced image captioning
    Zhao, Shanshan
    Li, Lixiang
    Peng, Haipeng
    APPLIED INTELLIGENCE, 2023, 53 (08) : 9731 - 9743
  • [9] Auxiliary feature extractor and dual attention-based image captioning
    Qian Zhao
    Guichang Wu
    Signal, Image and Video Processing, 2024, 18 : 3615 - 3626
  • [10] Multiple-Level Feature-Based Network for Image Captioning
    Zheng, Kaidi
    Zhu, Chen
    Lu, Shaopeng
    Liu, Yonggang
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING, PT I, 2018, 11164 : 94 - 103