Research on image content description in Chinese based on fusion of image global and local features

Cited by: 0
Authors
Kong, Dongyi [1 ]
Zhao, Hong [1 ]
Zeng, Xiangyan [2 ]
Affiliations
[1] Lanzhou Univ Technol, Sch Comp & Commun, Lanzhou, Peoples R China
[2] Ft Valley State Univ, Dept Math & Comp Sci, Fort Valley, GA USA
Source
PLOS ONE | 2022 / Vol. 17 / Issue 08
Funding
National Natural Science Foundation of China;
Keywords
GENERATION;
DOI
10.1371/journal.pone.0271322
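Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract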
Most image content modelling methods are designed for English description, which differs from Chinese in syntactic structure. The few existing Chinese image description models do not fully integrate the global and local features of an image, which limits their ability to represent image details. In this paper, an encoder-decoder architecture based on the fusion of global and local features is used to describe image content in Chinese. In the encoding stage, the global and local features of the image are extracted by a Convolutional Neural Network (CNN) and a target detection network and fed to the feature fusion module. In the decoding stage, an image feature attention mechanism is used to calculate the weights of word vectors, and a new gating mechanism is added to the traditional Long Short-Term Memory (LSTM) network to emphasize the fused image features and the corresponding word vectors. In the description generation stage, the beam search algorithm is used to optimize the word vector generation process. Across these three stages, the integration of global and local image features is strengthened so that the model can fully understand the details of the image. The experimental results show that the model improves the quality of Chinese descriptions of image content. Compared with the baseline model, the CIDEr score improves by 20.07%, and the other evaluation indices also improve significantly.
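The abstract names beam search as the algorithm used in the description generation stage but gives no implementation details. The following is a minimal, generic sketch of beam search over a toy next-token distribution, not the authors' code. The names `beam_search`, `step_fn`, and `toy_step`, the token set, and all probabilities are hypothetical and chosen only for illustration; in a real captioning model, `step_fn` would be replaced by the decoder's softmax output.

```python
import math
from typing import Callable, Dict, List, Tuple

Sequence = Tuple[str, ...]


def beam_search(
    step_fn: Callable[[Sequence], Dict[str, float]],
    beam_width: int = 3,
    max_len: int = 10,
    eos: str = "<eos>",
) -> List[Tuple[Sequence, float]]:
    """Return hypotheses sorted by total log-probability (best first).

    step_fn is a hypothetical stand-in for a decoder: given the tokens
    generated so far, it returns a next-token probability distribution.
    """
    beams: List[Tuple[Sequence, float]] = [((), 0.0)]
    finished: List[Tuple[Sequence, float]] = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates = []
        for seq, score in beams:
            for token, prob in step_fn(seq).items():
                if prob > 0.0:
                    candidates.append((seq + (token,), score + math.log(prob)))
        if not candidates:
            break
        # Keep only the beam_width highest-scoring expansions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            if seq[-1] == eos:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    # Hypotheses that hit max_len without <eos> are still returned.
    # No length normalization is applied in this sketch.
    finished.extend(beams)
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished


def toy_step(prefix: Sequence) -> Dict[str, float]:
    """Toy distribution (assumed values) where the greedy choice is suboptimal."""
    table = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"x": 0.5, "y": 0.5},
        ("b",): {"z": 0.9, "w": 0.1},
    }
    return table.get(prefix, {"<eos>": 1.0})


if __name__ == "__main__":
    for width in (1, 2):
        seq, score = beam_search(toy_step, beam_width=width)[0]
        print(width, seq, round(math.exp(score), 2))
```

With `beam_width=1` the search is greedy and commits to the locally most probable first token, while a width of 2 keeps the second-best prefix alive long enough to find a sequence with higher overall probability. That trade-off between search cost and output quality is the usual reason for using beam search in caption generation.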
Pages: 16
Related papers
50 in total
  • [1] Content-based image retrieval using a fusion of global and local features
    Bu, Hee Hyung
    Kim, Nam Chul
    Kim, Sung Ho
    ETRI JOURNAL, 2023, 45 (03) : 505 - 518
  • [2] Content-Based Image Retrieval Based on Visual Words Fusion Versus Features Fusion of Local and Global Features
    Zahid Mehmood
    Fakhar Abbas
    Toqeer Mahmood
    Muhammad Arshad Javid
    Amjad Rehman
    Tabassam Nawaz
    Arabian Journal for Science and Engineering, 2018, 43 : 7265 - 7284
  • [3] Content-Based Image Retrieval Based on Visual Words Fusion Versus Features Fusion of Local and Global Features
    Mehmood, Zahid
    Abbas, Fakhar
    Mahmood, Toqeer
    Javid, Muhammad Arshad
    Rehman, Amjad
    Nawaz, Tabassam
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2018, 43 (12) : 7265 - 7284
  • [4] Texture image retrieval based on fusion of local and global features
    Wang, Hengbin
    Qu, Huaijing
    Xu, Jia
    Wang, Jiwei
    Wei, Yanan
    Zhang, Zhisheng
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (10) : 14081 - 14104
  • [5] Texture image retrieval based on fusion of local and global features
    Hengbin Wang
    Huaijing Qu
    Jia Xu
    Jiwei Wang
    Yanan Wei
    Zhisheng Zhang
    Multimedia Tools and Applications, 2022, 81 : 14081 - 14104
  • [6] Content based image retrieval on big image data using local and global features
    Kanaparthi S.K.
    Raju U.S.N.
    International Journal of Information Technology, 2022, 14 (1) : 49 - 68
  • [7] Incomplete image perception: Local features and global description
    Shelepin, Y.
    Harauzov, A.
    Chihman, V.
    Pronin, S.
    Fokin, V.
    Foreman, N.
    INTERNATIONAL JOURNAL OF PSYCHOPHYSIOLOGY, 2008, 69 (03) : 164 - 164
  • [8] Fusion of Local and Global Features using Stationary Wavelet Transform for Efficient Content Based Image Retrieval
    Chaudhary, Manoj D.
    Upadhyay, Abhay B.
    2014 IEEE STUDENTS' CONFERENCE ON ELECTRICAL, ELECTRONICS AND COMPUTER SCIENCE (SCEECS), 2014,
  • [9] Content Based Image Retrieval using Local and Global features descriptor
    Kabbai, Leila
    Abdellaoui, Mehrez
    Douik, Ali
    2016 2ND INTERNATIONAL CONFERENCE ON ADVANCED TECHNOLOGIES FOR SIGNAL AND IMAGE PROCESSING (ATSIP), 2016, : 151 - 154
  • [10] Local versus global features for content-based image retrieval
    Shyu, CR
    Brodley, CE
    Kak, AC
    Kosaka, A
    Aisen, A
    Broderick, L
    IEEE WORKSHOP ON CONTENT-BASED ACCESS OF IMAGE AND VIDEO LIBRARIES - PROCEEDINGS, 1998, : 30 - 34