ECENet: Explainable and Context-Enhanced Network for Multi-modal Fact Verification

Cited by: 1
Authors
Zhang, Fanrui [1 ]
Liu, Jiawei [1 ]
Zhang, Qiang [1 ]
Sun, Esther [2 ]
Xie, Jingyi [1 ]
Zha, Zheng-Jun [1 ]
Affiliations
[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
[2] Univ Toronto, Toronto, ON, Canada
Funding
National Natural Science Foundation of China; National Key R&D Program of China
Keywords
Multi-modal fact verification; Attention mechanism; Deep reinforcement learning; Interpretability
DOI
10.1145/3581783.3612183
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Recently, falsified claims incorporating both text and images have been disseminated more effectively than those containing text alone, raising significant concerns for multi-modal fact verification. Existing research contributes to multi-modal feature extraction and interaction, but fails to fully utilize and enhance the valuable and intricate semantic relationships between distinct features. Moreover, most detectors merely provide a single outcome judgment and lack an inference process or explanation. Taking these factors into account, we propose a novel Explainable and Context-Enhanced Network (ECENet) for multi-modal fact verification, making the first attempt to integrate multi-clue feature extraction, multi-level feature reasoning, and justification (explanation) generation within a unified framework. Specifically, we propose an Improved Coarse- and Fine-grained Attention Network, equipped with two types of level-grained attention mechanisms, to facilitate a comprehensive understanding of contextual information. Furthermore, we propose a novel justification generation module based on deep reinforcement learning that does not require additional labels. In this module, a sentence extractor agent measures the relevance of each document sentence to the query claim at each time step, selecting a suitable number of high-scoring sentences to be rewritten as the model's explanation. Extensive experiments demonstrate the effectiveness of the proposed method.
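The sentence-selection idea described in the abstract can be sketched as follows. This is a simplified illustration, not the paper's actual RL agent: the relevance score here is a plain bag-of-words cosine similarity and the fixed top-k selection is an assumption, standing in for the learned scoring and selection policy the module would use.

```python
# Hypothetical sketch of extractive explanation selection: score each
# document sentence against the query claim, keep the top-k sentences.
# The scoring function and k are illustrative assumptions, not the
# authors' method.
from collections import Counter
import math


def relevance(claim: str, sentence: str) -> float:
    """Cosine similarity over bag-of-words counts (a crude stand-in
    for a learned claim-sentence relevance score)."""
    a = Counter(claim.lower().split())
    b = Counter(sentence.lower().split())
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def extract_explanation(claim: str, sentences: list[str], k: int = 2) -> list[str]:
    """Pick the k sentences most relevant to the claim; in the paper,
    these would then be rewritten into a fluent justification."""
    ranked = sorted(sentences, key=lambda s: relevance(claim, s), reverse=True)
    return ranked[:k]
```

In the actual module, a reinforcement-learning agent would make this selection step by step and be rewarded without extra explanation labels; the sketch only shows the shape of the input and output.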
Pages: 1231-1240 (10 pages)