Visual Relation Extraction via Multi-modal Translation Embedding Based Model

Cited by: 0

Authors:

Li, Zhichao [1]

Han, Yuping [1]

Xu, Yajing [1]

Gao, Sheng [1]

Affiliations:

[1] Beijing University of Posts and Telecommunications, Beijing, People's Republic of China

Source:

ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2018, PT I | 2018, Vol. 10937

Keywords:

Visual relation extraction; Multi-modal network; Translation embedding;

DOI:

10.1007/978-3-319-93034-3_43

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

A visual relation, such as "person holds dog", is an effective semantic unit for image understanding, as well as a bridge between computer vision and natural language. Recent work extracts the object features in an image with the aid of the corresponding textual description. However, very little work combines the multi-modal information to model subject-predicate-object relation triplets and so obtain deeper scene understanding. In this paper, we propose a novel visual relation extraction model, named Multi-modal Translation Embedding Based Model, that integrates the visual information with the corresponding textual knowledge base. To do so, the proposed model places the objects of the image and their semantic relationships in two different low-dimensional spaces, where a relation can be modeled as a simple translation vector that connects the entity descriptions in the knowledge graph. We also propose a visual phrase learning method that captures the interactions between objects in the image to improve visual relation extraction. Experiments on two real-world datasets show that the proposed model benefits from incorporating language information into the relation embeddings and significantly improves on state-of-the-art methods.
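
The abstract does not give the scoring function, so the following is only a minimal sketch of the general translation-embedding idea (in the style of TransE), not the authors' model: a triplet is plausible when the subject embedding plus the predicate vector lands close to the object embedding. The function names, the toy 2-dimensional vectors, and the use of a single shared space are all assumptions made for illustration; the paper itself describes two separate low-dimensional spaces.

```python
import math

def translation_score(subject_vec, predicate_vec, object_vec):
    """Negative L2 distance between (subject + predicate) and object.

    Higher (closer to zero) means the triplet is more plausible under
    the translation assumption subject + predicate ≈ object.
    """
    squared = sum(
        (s + p - o) ** 2
        for s, p, o in zip(subject_vec, predicate_vec, object_vec)
    )
    return -math.sqrt(squared)

def rank_predicates(subject_vec, object_vec, predicates):
    """Return predicate names sorted from most to least plausible."""
    return sorted(
        predicates,
        key=lambda name: translation_score(
            subject_vec, predicates[name], object_vec
        ),
        reverse=True,
    )

# Hypothetical toy embeddings, chosen by hand for illustration only.
person = [0.0, 1.0]
dog = [1.0, 1.0]
candidate_predicates = {
    "holds": [1.0, 0.0],   # person + holds lands exactly on dog
    "rides": [0.0, -1.0],  # person + rides lands far from dog
}

best_first = rank_predicates(person, dog, candidate_predicates)
print(best_first)
```

In a trained model the embeddings would be learned from image features and textual descriptions rather than written by hand; the ranking step is the part that corresponds to predicting the predicate for a detected subject-object pair.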


Pages: 538-548

Page count: 11

50 items in total

[1] Multi-modal semantics fusion model for domain relation extraction via information bottleneck
Tian, Zhao
Zhao, Xuan
Li, Xiwang
Ma, Xiaoping
Li, Yinghao
Wang, Youwei
EXPERT SYSTEMS WITH APPLICATIONS, 2024, 244
[2] TransFusion: Multi-Modal Fusion for Video Tag Inference via Translation-based Knowledge Embedding
Jin, Di
Qi, Zhongang
Luo, Yingmin
Shan, Ying
PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 1093 - 1101
[3] Video Visual Relation Detection via Multi-modal Feature Fusion
Sun, Xu
Ren, Tongwei
Zi, Yuan
Wu, Gangshan
PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019, : 2657 - 2661
[4] A Chinese Multi-modal Relation Extraction Model for Internet Security of Finance
Lai, Qinghan
Ding, Shuai
Gong, Jinghao
Cui, Jin'an
Liu, Song
52ND ANNUAL IEEE/IFIP INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS WORKSHOP VOLUME (DSN-W 2022), 2022, : 123 - 128
[5] Latent Variable Model for Multi-modal Translation
Calixto, Iacer
Rios, Miguel
Aziz, Wilker
57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 6392 - 6405
[6] Visual Agreement Regularized Training for Multi-Modal Machine Translation
Yang, Pengcheng
Chen, Boxing
Zhang, Pei
Sun, Xu
THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 9418 - 9425
[7] Adding visual attention into encoder-decoder model for multi-modal machine translation
Xu, Chun
Yu, Zhengqing
Shi, Xiayang
Chen, Fang
JOURNAL OF ENGINEERING RESEARCH, 2023, 11 (02):
[8] Visual Entity Linking via Multi-modal Learning
Zheng, Qiushuo
Wen, Hao
Wang, Meng
Qi, Guilin
DATA INTELLIGENCE, 2022, 4 (01) : 1 - 19
[9] Metaknowledge Extraction Based on Multi-Modal Documents
Liu, Shu-Kan
Xu, Rui-Lin
Geng, Bo-Ying
Sun, Qiao
Duan, Li
Liu, Yi-Ming
IEEE ACCESS, 2021, 9 : 50050 - 50060
[10] MUSE: MULTI-MODAL TARGET SPEAKER EXTRACTION WITH VISUAL CUES
Pan, Zexu
Tao, Ruijie
Xu, Chenglin
Li, Haizhou
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6678 - 6682
