Graph structure prefix injection transformer for multi-modal entity alignment

Cited: 0
Authors
Zhang, Yan [1 ,2 ,3 ,4 ,5 ]
Luo, Xiangyu [2 ]
Hu, Jing [2 ]
Zhang, Miao [1 ,3 ,4 ]
Xiao, Kui [1 ,3 ,4 ]
Li, Zhifei [1 ,2 ,3 ,4 ,5 ]
Affiliations
[1] School of Computer Science, Hubei University, Wuhan 430062, China
[2] School of Cyber Science and Technology, Hubei University, Wuhan 430062, China
[3] Hubei Key Laboratory of Big Data Intelligent Analysis and Application, Hubei University, Wuhan 430062, China
[4] Key Laboratory of Intelligent Sensing System and Security (Hubei University), Ministry of Education, Wuhan 430062, China
[5] Hubei Provincial Engineering Research Center of Intelligent Connected Vehicle Network Security, Hubei University, Wuhan 430062, China
Source
Information Processing and Management | 2025, Vol. 62, Issue 3
Keywords
Contrastive Learning
DOI
10.1016/j.ipm.2024.104048
Abstract
Multi-modal entity alignment (MMEA) aims to integrate corresponding entities across different multi-modal knowledge graphs (MMKGs). However, previous studies have not adequately considered the impact of graph structural heterogeneity on the alignment task: different MMKGs typically vary in their graph structural features, so the same entity relationship can have distinct structural representations, and their overall topologies also differ. To tackle these challenges, we introduce GSIEA, an MMEA framework that integrates structural prefix injection and modality fusion. Unlike methods that fuse structural data directly with multi-modal features to perform alignment, GSIEA processes structural data separately from multi-modal data such as images and attributes, incorporating a prefix injection interaction module within a multi-head attention mechanism to make better use of multi-modal information while minimizing the impact of graph structural differences. GSIEA additionally employs a convolutional enhancement module to extract fine-grained multi-modal features and computes cross-modal weights to fuse them. Experimental evaluations on two public datasets, containing 12,846 and 11,199 entity pairs respectively, demonstrate that GSIEA outperforms baseline models, with an average improvement of 3.26% in MRR (maximum gain 12.5%) and an average improvement of 4.96% in Hits@1 (maximum gain 16.92%). The code of our model is available at https://github.com/HubuKG/GSIEA. © 2024 Elsevier Ltd
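The abstract describes two mechanisms: structural embeddings injected as prefixes into multi-head attention, and softmax-weighted cross-modal fusion. The following is a minimal PyTorch sketch of one plausible reading, inferred from the abstract alone; the class names (PrefixInjectionAttention, CrossModalFusion), the prefix length, and all dimensions are illustrative assumptions, not the authors' released implementation (the actual code is at the GitHub link above).

```python
# Hypothetical sketch of the abstract's two mechanisms; all names and sizes
# are assumptions, not taken from the GSIEA repository.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PrefixInjectionAttention(nn.Module):
    """Multi-head attention over per-modality features, with a graph-structure
    embedding injected as prefix key/value slots instead of fused directly."""
    def __init__(self, dim=128, heads=4, prefix_len=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Project one structural embedding into `prefix_len` key/value slots.
        self.to_prefix_k = nn.Linear(dim, prefix_len * dim)
        self.to_prefix_v = nn.Linear(dim, prefix_len * dim)
        self.prefix_len, self.dim = prefix_len, dim

    def forward(self, modal_feats, struct_emb):
        # modal_feats: (batch, n_modalities, dim); struct_emb: (batch, dim)
        b = struct_emb.size(0)
        pk = self.to_prefix_k(struct_emb).view(b, self.prefix_len, self.dim)
        pv = self.to_prefix_v(struct_emb).view(b, self.prefix_len, self.dim)
        # Queries stay purely multi-modal; structure only steers attention
        # through the prepended keys/values, limiting structural interference.
        k = torch.cat([pk, modal_feats], dim=1)
        v = torch.cat([pv, modal_feats], dim=1)
        out, _ = self.attn(modal_feats, k, v)
        return out

class CrossModalFusion(nn.Module):
    """Learned scalar weight per modality, softmax-normalized, then summed."""
    def __init__(self, dim=128):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, modal_feats):                    # (batch, n_mod, dim)
        w = F.softmax(self.score(modal_feats), dim=1)  # (batch, n_mod, 1)
        return (w * modal_feats).sum(dim=1)            # (batch, dim)

# Toy usage: 3 modality vectors (e.g., image/attribute/name) per entity.
feats, struct = torch.randn(2, 3, 128), torch.randn(2, 128)
fused = CrossModalFusion()(PrefixInjectionAttention()(feats, struct))
print(fused.shape)  # torch.Size([2, 128])
```

Keeping the queries purely multi-modal while confining structure to the prefix keys/values matches the abstract's claim that structural and multi-modal data are processed separately rather than fused outright.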
Related papers (50 in total)
  • [31] On Graph Calculi for Multi-modal Logics
    Veloso, Paulo A. S.
    Veloso, Sheila R. M.
    Benevides, Mario R. F.
    ELECTRONIC NOTES IN THEORETICAL COMPUTER SCIENCE, 2015, 312 : 231 - 252
  • [32] PSNEA: Pseudo-Siamese Network for Entity Alignment between Multi-modal Knowledge Graphs
    Ni, Wenxin
    Xu, Qianqian
    Jiang, Yangbangyan
    Cao, Zongsheng
    Cao, Xiaochun
    Huang, Qingming
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 3489 - 3497
  • [33] A Multi-Modal Transformer network for action detection
    Korban, Matthew
    Youngs, Peter
    Acton, Scott T.
    PATTERN RECOGNITION, 2023, 142
  • [34] Multi-Modal Adversarial Example Detection with Transformer
    Ding, Chaoyue
    Sun, Shiliang
    Zhao, Jing
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
  • [35] Multi-modal Transformer for Brain Tumor Segmentation
    Cho, Jihoon
    Park, Jinah
    BRAINLESION: GLIOMA, MULTIPLE SCLEROSIS, STROKE AND TRAUMATIC BRAIN INJURIES, BRAINLES 2022, 2023, 13769 : 138 - 148
  • [36] Multi-modal transformer for fake news detection
    Yang, Pingping
    Ma, Jiachen
    Liu, Yong
    Liu, Meng
    MATHEMATICAL BIOSCIENCES AND ENGINEERING, 2023, 20 (08) : 14699 - 14717
  • [37] Multi-hop neighbor fusion enhanced hierarchical transformer for multi-modal knowledge graph completion
    Wang, Yunpeng
    Ning, Bo
    Wang, Xin
    Li, Guanyu
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2024, 27 (05)
  • [38] Multi-modal Alignment using Representation Codebook
    Duan, Jiali
    Chen, Liqun
    Tran, Son
    Yang, Jinyu
    Xu, Yi
    Zeng, Belinda
    Chilimbi, Trishul
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 15630 - 15639
  • [39] Representation, Alignment, Fusion: A Generic Transformer-Based Framework for Multi-modal Glaucoma Recognition
    Zhou, You
    Yang, Gang
    Zhou, Yang
    Ding, Dayong
    Zhao, Jianchun
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT VII, 2023, 14226 : 704 - 713
  • [40] Multi-Modal Decentralized Interaction in Multi-Entity Systems
    Olaru, Andrei
    Pricope, Monica
    SENSORS, 2023, 23 (06)