Multimodal Logical Inference System for Visual-Textual Entailment

Cited by: 0

Authors:

Suzuki, Riko [1]

Yanaka, Hitomi [1,2]

Yoshikawa, Masashi [3]

Mineshima, Koji [1]

Bekki, Daisuke [1]

Affiliations:

[1] Ochanomizu Univ, Tokyo, Japan

[2] RIKEN Ctr Adv Intelligence Project, Tokyo, Japan

[3] Nara Inst Sci & Technol, Nara, Japan

Source:

57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019): STUDENT RESEARCH WORKSHOP | 2019

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

A large amount of research about multimodal inference across text and vision has been recently developed to obtain visually grounded word and sentence representations. In this paper, we use logic-based representations as unified meaning representations for texts and images and present an unsupervised multimodal logical inference system that can effectively prove entailment relations between them. We show that by combining semantic parsing and theorem proving, the system can handle semantically complex sentences for visual-textual inference.

Pages: 386-392

Page count: 7
