MLSFF: Multi-level structural features fusion for multi-modal knowledge graph completion

Cited by: 1
Authors
Zhai, Hanming [1 ]
Lv, Xiaojun [2 ]
Hou, Zhiwen [1 ]
Tong, Xin [1 ]
Bu, Fanliang [1 ]
Affiliations
[1] People's Public Security University of China, School of Information Network Security, Beijing 100038, People's Republic of China
[2] China Academy of Railway Sciences Corporation Limited, Institute of Computing Technology, Beijing 100081, People's Republic of China
Funding
National Natural Science Foundation of China;
Keywords
knowledge graph completion; multi-modal knowledge graph; link prediction; multi-modal feature fusion; graph neural network; transformer;
DOI
10.3934/mbe.2023630
CLC Classification
Q [Biological Sciences];
Subject Classification Codes
07; 0710; 09;
Abstract
With the rise of multi-modal methods, multi-modal knowledge graphs have become an increasingly attractive choice for storing human knowledge. However, because knowledge is unbounded and constantly updated, knowledge graphs inevitably suffer from incompleteness, which motivates the task of knowledge graph completion. Existing multi-modal knowledge graph completion methods mostly rely on either embedding-based representations or graph neural networks, leaving room for improvement in interpretability and in handling multi-hop tasks. We therefore propose a new method for multi-modal knowledge graph completion. Our method learns multi-level graph structural features to fully explore hidden relationships within the knowledge graph and to improve reasoning accuracy. Specifically, we first use a Transformer architecture to learn separate representations for the image and text modalities. Then, multi-modal gating units filter out irrelevant information and fuse the features into a unified encoding of the knowledge representation. Furthermore, we extract multi-level path features with a width-adjustable sliding window and learn structural features of the knowledge graph through graph convolution operations. Finally, a scoring function evaluates the plausibility of encoded triples to complete the prediction task. To demonstrate the effectiveness of the model, we conduct experiments on two publicly available datasets, FB15K-237-IMG and WN18-IMG, achieving improvements of 1.8% and 0.7%, respectively, in the Hits@1 metric.
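The following is a minimal sketch of two components the abstract describes: a multi-modal gating unit that fuses text and image embeddings, and a width-adjustable sliding window over path embeddings. It assumes PyTorch; all names, dimensions, and fusion details (GatedMultiModalFusion, path_windows, the tanh projections) are hypothetical illustrations, not the authors' actual implementation.

```python
# Hedged sketch of the gated fusion and sliding-window steps from the
# abstract. Layer sizes and the exact gating formula are assumptions.
import torch
import torch.nn as nn

class GatedMultiModalFusion(nn.Module):
    """Fuse text and image embeddings with a learned per-dimension gate,
    filtering out irrelevant modality information before fusion."""

    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)   # gate computed from both modalities
        self.proj_text = nn.Linear(dim, dim)
        self.proj_image = nn.Linear(dim, dim)

    def forward(self, text: torch.Tensor, image: torch.Tensor) -> torch.Tensor:
        # Sigmoid gate in [0, 1] weighs text against image per dimension.
        z = torch.sigmoid(self.gate(torch.cat([text, image], dim=-1)))
        return z * torch.tanh(self.proj_text(text)) + (1.0 - z) * torch.tanh(self.proj_image(image))

def path_windows(path: torch.Tensor, width: int) -> torch.Tensor:
    """Slice a path of fused embeddings (path_len, dim) into overlapping
    windows (path_len - width + 1, width, dim); varying `width` yields
    multi-level (short- to long-range) structural features."""
    return path.unfold(0, width, 1).transpose(1, 2)

# Usage: fuse 128-d embeddings for a 5-element path, then take 3-wide windows.
fusion = GatedMultiModalFusion(dim=128)
path = fusion(torch.randn(5, 128), torch.randn(5, 128))
print(path_windows(path, width=3).shape)  # torch.Size([3, 3, 128])
```

In the pipeline the abstract outlines, windows of several widths would then be encoded by graph convolutions and passed to the scoring function; those stages are omitted from this sketch.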
Pages: 14096-14116
Number of pages: 21
Related Papers
50 records in total
  • [1] NativE: Multi-modal Knowledge Graph Completion in the Wild
    Zhang, Yichi; Chen, Zhuo; Guo, Lingbing; Xu, Yajing; Hu, Binbin; Liu, Ziqi; Zhang, Wen; Chen, Huajun
    [C]. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2024), 2024: 91-101
  • [2] Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion
    Chen, Xiang; Zhang, Ningyu; Li, Lei; Deng, Shumin; Tan, Chuanqi; Xu, Changliang; Huang, Fei; Si, Luo; Chen, Huajun
    [C]. Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '22), 2022: 904-915
  • [3] Multi-Modal Fusion with Multi-level Attention for Visual Dialog
    Zhang, Jingping; Wang, Qiang; Han, Yahong
    [J]. Information Processing & Management, 2020, 57 (04)
  • [4] Multi-hop Neighbor Fusion Enhanced Hierarchical Transformer for Multi-modal Knowledge Graph Completion
    Wang, Yunpeng; Ning, Bo; Wang, Xin; Li, Guanyu
    [J]. World Wide Web: Internet and Web Information Systems, 2024, 27 (05)
  • [5] Multi-Level Interaction Based Knowledge Graph Completion
    Wang, Jiapu; Wang, Boyue; Gao, Junbin; Hu, Simin; Hu, Yongli; Yin, Baocai
    [J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024, 32: 386-396
  • [6] News Video Story Segmentation Using Fusion of Multi-level Multi-modal Features in TRECVID 2003
    Hsu, W; Kennedy, L; Huang, CW; Chang, SF; Lin, CY; Iyengar, G
    [C]. 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. III: Image and Multidimensional Signal Processing Special Sessions, 2004: 645-648
  • [7] Multi-level Shared Knowledge Guided Learning for Knowledge Graph Completion
    Shan, Yongxue; Zhou, Jie; Peng, Jie; Zhou, Xin; Yin, Jiaqian; Wang, Xiaodong
    [J]. Transactions of the Association for Computational Linguistics, 2024, 12: 1027-1042
  • [8] Multi-level Fusion of Multi-modal Semantic Embeddings for Zero Shot Learning
    Kong, Zhe; Wang, Xin; Gao, Neng; Zhang, Yifei; Liu, Yuhan; Tu, Chenyang
    [C]. Proceedings of the 2022 International Conference on Multimodal Interaction (ICMI 2022), 2022: 310-318
  • [9] SiamMMF: Multi-modal Multi-level Fusion Object Tracking Based on Siamese Networks
    Yang, Zhen; Huang, Peng; He, Dunyun; Cai, Zhongwang; Yin, Zhijian
    [J]. Machine Vision and Applications, 2023, 34 (01)
  • [10] MLMFNet: A Multi-level Modality Fusion Network for Multi-modal Accelerated MRI Reconstruction
    Zhou, Xiuyun; Zhang, Zhenxi; Du, Hongwei; Qiu, Bensheng
    [J]. Magnetic Resonance Imaging, 2024, 111: 246-255