Operation-Augmented Numerical Reasoning for Question Answering

Cited by: 1
Authors
Zhou, Yongwei [1]
Bao, Junwei [2]
Wu, Youzheng [2]
He, Xiaodong [2]
Zhao, Tiejun [1]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Machine Intelligence & Translat Lab, Harbin 150001, Peoples R China
[2] JD AI Res, Beijing 101111, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Cognition; Task analysis; Semantics; Speech processing; Sorting; Question answering (information retrieval); Predictive models; Numerical reasoning; symbolic operations; semantic augmentation; mixture-of-experts;
DOI
10.1109/TASLP.2023.3316448
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Question answering that requires numerical reasoning, which generally involves symbolic operations such as sorting, counting, and addition, is a challenging task. To address it, existing mixture-of-experts (MoE)-based methods design several specialized answer predictors to handle different question types and achieve promising performance. However, they neither model nor exploit the fine-grained reasoning-related operations that underlie numerical reasoning, which limits both their reasoning capability and their interpretability. To alleviate this issue, we propose OPERA, an operation-augmented numerical reasoning framework. Concretely, we systematically define a scalable operation set to model numerical reasoning. We first identify the reasoning-related operations based on the context and then softly execute them, imitating the answer reasoning procedure via an operation-aware cross-attention mechanism. Finally, we use the operation-augmented semantic representation of the execution results to support answer prediction. We verify the effectiveness and generalization of OPERA in two scenarios with different knowledge sources and reasoning capabilities. Specifically, we conduct extensive experiments on two textual datasets, DROP and RACENum, and on a table-text hybrid dataset, TAT-QA. Experimental results show that OPERA outperforms previous strong methods on all three datasets. Further, we analyze its interpretability statistically and visually.
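The three steps named in the abstract (identify operations, softly execute them with cross-attention, use the resulting representation) can be illustrated with a toy sketch. This is not the OPERA implementation: the function name `operation_augmented`, the mean pooling, the dot-product scoring, and the use of operation embeddings as attention queries are all assumptions made for illustration, and the abstract does not specify any of them.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def weighted_sum(weights, vectors):
    """Combine equal-length vectors using the given weights."""
    dim = len(vectors[0])
    return [sum(w * vec[i] for w, vec in zip(weights, vectors)) for i in range(dim)]


def operation_augmented(tokens, op_embeddings):
    """Hypothetical sketch: returns (operation probabilities, augmented vector).

    tokens:        list of context token vectors (assumed already encoded)
    op_embeddings: one vector per operation, e.g. sorting, counting, addition
    """
    # Step 1 (assumed form): score each operation against a mean-pooled context.
    pooled = weighted_sum([1.0 / len(tokens)] * len(tokens), tokens)
    op_probs = softmax([dot(op, pooled) for op in op_embeddings])

    # Step 2 (assumed form): "soft execution" as cross-attention, with each
    # operation embedding acting as the query over the context tokens.
    op_results = []
    for op in op_embeddings:
        attention = softmax([dot(op, tok) for tok in tokens])
        op_results.append(weighted_sum(attention, tokens))

    # Step 3 (assumed form): mix the per-operation results by probability to
    # obtain a single operation-augmented representation.
    augmented = weighted_sum(op_probs, op_results)
    return op_probs, augmented


if __name__ == "__main__":
    toy_tokens = [[1.0, 0.0], [0.0, 1.0]]
    toy_ops = [[2.0, 0.0], [0.0, 2.0]]
    probs, vector = operation_augmented(toy_tokens, toy_ops)
    print(probs, vector)
```

In this toy setting every operation is executed and the results are blended by probability, which is one way to read "softly execute"; a real system would learn the embeddings and feed the augmented vector to its answer predictors.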
Pages: 15-28
Number of pages: 14