Exploiting Query Knowledge Embedding and Trilinear Joint Embedding for Visual Question Answering

Authors
Chen, Zheng [1]
Wen, Yaxin [1]
Affiliation
[1] Univ Elect Sci & Technol China, Sch Informat & Software Engn, Chengdu 611731, Sichuan, Peoples R China
Keywords
Visual question answering; Attention mechanism; Knowledge base; Joint embedding;
DOI
10.1007/978-981-99-4752-2_64
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Visual Question Answering (VQA) aims to answer natural language questions about a given image. Researchers generally believe that incorporating external knowledge can improve performance on VQA tasks. However, existing methods are limited in how they acquire and use such knowledge, which prevents them from effectively enhancing a model's question-answering capability. In this paper, we propose a novel VQA approach that performs Knowledge Embedding through question-based queries. In our approach, we design question query rules to obtain critical external knowledge and then embed this knowledge by integrating it with the question as the input features for the text modality. Traditional multimodal feature fusion techniques rely solely on local features, which may result in the loss of global information. To address this issue, we introduce a feature fusion method based on Trilinear Joint Embedding. Using an attention mechanism, we generate a feature matrix composed of question, knowledge, and image components. This matrix is then jointly embedded in a trilinear fashion to form a novel global feature vector. Because the trilinear joint embedding process produces high-dimensional vectors that are computationally expensive to handle, we employ Tensor Decomposition to break this vector down into a sum of several low-rank tensors. Subsequently, we feed the global feature vector into a classifier to obtain the answer as a multi-class classification problem. Experimental results on the VQAv2, OKVQA, and VizWiz public datasets demonstrate that our approach achieves accuracy improvements of 1.78%, 3.95%, and 1.16%, respectively. Our code is available at https://github.com/yxNoth/KB-VLT.
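The trilinear fusion and its low-rank factorization described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say which tensor decomposition is used, so a CP-style sum of rank-one terms is assumed here, and every name (full_trilinear, low_rank_trilinear, dense_core_from_factors, the factor tensors Uq, Uk, Uv, and all dimensions) is hypothetical. The attention step is also omitted; q, k, and v stand in for already pooled question, knowledge, and image feature vectors.

```python
import numpy as np


def full_trilinear(W, q, k, v):
    """Dense trilinear map: z[o] = sum_{i,j,l} W[i,j,l,o] * q[i] * k[j] * v[l].

    W has shape (dq, dk, dv, do), so its size grows with the product of all
    three input dimensions, which is the cost the abstract refers to.
    """
    return np.einsum("ijlo,i,j,l->o", W, q, k, v)


def low_rank_trilinear(Uq, Uk, Uv, q, k, v):
    """Assumed CP-style factorization of the same map.

    Uq, Uk, Uv have shapes (R, dq, do), (R, dk, do), (R, dv, do).
    Each modality is projected separately, the projections are multiplied
    element-wise, and the R rank-one terms are summed.
    """
    pq = np.einsum("rio,i->ro", Uq, q)  # (R, do) question projections
    pk = np.einsum("rjo,j->ro", Uk, k)  # (R, do) knowledge projections
    pv = np.einsum("rlo,l->ro", Uv, v)  # (R, do) image projections
    return (pq * pk * pv).sum(axis=0)


def dense_core_from_factors(Uq, Uk, Uv):
    """Rebuild the dense core tensor implied by the factors (for checking only)."""
    return np.einsum("rio,rjo,rlo->ijlo", Uq, Uk, Uv)


# Toy dimensions, chosen only for illustration.
dq, dk, dv, do, R = 6, 5, 7, 4, 3
rng = np.random.default_rng(0)
Uq = rng.standard_normal((R, dq, do))
Uk = rng.standard_normal((R, dk, do))
Uv = rng.standard_normal((R, dv, do))
q = rng.standard_normal(dq)
k = rng.standard_normal(dk)
v = rng.standard_normal(dv)

z_low_rank = low_rank_trilinear(Uq, Uk, Uv, q, k, v)
z_dense = full_trilinear(dense_core_from_factors(Uq, Uk, Uv), q, k, v)

# Parameter counts: the dense core versus the three factor tensors.
dense_params = dq * dk * dv * do
factor_params = R * (dq + dk + dv) * do
```

In this sketch the factorized form never materializes the dense core: its parameter count scales with the sum of the input dimensions rather than their product, while producing the same output as the dense map built from the same factors.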
Pages: 780-791
Page count: 12