Deep Convolutional Neural Network for Bidirectional Image-Sentence Mapping

Cited by: 1
Authors
Yu, Tianyuan [1]
Bai, Liang [1]
Guo, Jinlin [1]
Yang, Zheng [1]
Xie, Yuxiang [1]
Affiliations
[1] National University of Defense Technology, College of Information Systems and Management, Changsha 410073, Hunan, People's Republic of China
DOI
10.1007/978-3-319-51814-5_12
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
With the rapid development of the Internet and the explosion of data volume, it is important to access cross-media big data, including text, images, audio, and video, efficiently and accurately. However, content heterogeneity and the semantic gap make it challenging to retrieve such cross-media archives. Existing approaches try to learn the connection between multiple modalities by directly using hand-crafted low-level features, and the learned correlations are merely constructed with high-level feature representations without considering semantic information. To further exploit the intrinsic structures of multimodal data representations, it is essential to build an interpretable correlation between these heterogeneous representations. In this paper, a deep model is proposed to first learn a high-level feature representation shared by different modalities, such as text and images, with a convolutional neural network (CNN). Moreover, the learned CNN features can reflect the salient objects as well as the details in the images and sentences. Experimental results demonstrate that the proposed approach outperforms current state-of-the-art baseline methods on the public Flickr8K dataset.
Pages: 136-147
Page count: 12