Deep Visual Semantic Embedding with Text Data Augmentation and Word Embedding Initialization

Cited by: 4
Authors
He, Hai [1]
Yang, Haibo [2]
Affiliations
[1] Chongqing City Management Coll, Sch Big Data & Informat Ind, Chongqing 401331, Peoples R China
[2] Chongqing Med Univ, Informat Ctr, Chongqing 400016, Peoples R China
Keywords
SENTIMENT CLASSIFICATION;
DOI
10.1155/2021/6654071
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Language and vision are the two most essential parts of human intelligence for interpreting the world around us. Connecting language and vision is a central problem in current research. Multimodal methods such as visual semantic embedding, which map images and their corresponding texts into a shared feature space, have been widely studied recently. Inspired by recent progress in text data augmentation, in particular the simple but powerful technique EDA (easy data augmentation), we use EDA to expand the information available in the given data and thereby improve model performance. In this paper, we apply text data augmentation and word embedding initialization to multimodal retrieval. We use EDA for text data augmentation, word embedding initialization for a text encoder based on recurrent neural networks, and a triplet ranking loss with hard negative mining to minimize the gap between the two spaces. On two Flickr-based datasets, we achieve the same recall with only 60% of the training data as standard training achieves with all available data. Experimental results show that, on all datasets used in this paper (Flickr8k, Flickr30k, and MS-COCO), our model performs better on both image annotation and image retrieval. The experiments also indicate that text data augmentation is better suited to smaller datasets, while word embedding initialization is better suited to larger ones.
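The triplet ranking loss with hard negative mining named in the abstract can be illustrated with a minimal sketch. It assumes the common max-of-hinges formulation over a batch of matched image and caption embeddings, with cosine similarity and a margin of 0.2. The abstract specifies neither the similarity function nor the margin, so the function names, the margin value, and the batch averaging below are illustrative assumptions rather than the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def hard_negative_triplet_loss(image_embs, text_embs, margin=0.2):
    """Bidirectional triplet ranking loss using only the hardest negative.

    image_embs[i] and text_embs[i] are assumed to be a matched pair.
    The margin value is an assumption, not taken from the abstract.
    """
    n = len(image_embs)
    if n == 0:
        return 0.0
    # sim[i][j] is the similarity between image i and caption j.
    sim = [[cosine(img, txt) for txt in text_embs] for img in image_embs]
    total = 0.0
    for i in range(n):
        positive = sim[i][i]
        # Hardest negative caption for image i (image annotation direction).
        neg_captions = [sim[i][j] for j in range(n) if j != i]
        # Hardest negative image for caption i (image retrieval direction).
        neg_images = [sim[j][i] for j in range(n) if j != i]
        if neg_captions:
            total += max(0.0, margin + max(neg_captions) - positive)
        if neg_images:
            total += max(0.0, margin + max(neg_images) - positive)
    return total / n


# Perfectly aligned pairs incur no loss; swapped pairs are penalized.
aligned = hard_negative_triplet_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[1.0, 0.0], [0.0, 1.0]])
swapped = hard_negative_triplet_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[0.0, 1.0], [1.0, 0.0]])
```

Taking only the hardest negative in each direction, rather than summing over all negatives, focuses each update on the most confusable mismatch in the batch. This is the usual motivation for hard negative mining in visual semantic embedding.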
Pages: 8