Template-Based Contrastive Distillation Pretraining for Math Word Problem Solving

Cited: 0
Authors
Qin, Jinghui [1 ]
Yang, Zhicheng [2 ,3 ]
Chen, Jiaqi [1 ]
Liang, Xiaodan [2 ]
Lin, Liang [1 ]
Affiliations
[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou 510275, Peoples R China
[2] Sun Yat Sen Univ, Sch Intelligent Syst Engn, Shenzhen Campus, Shenzhen, Peoples R China
[3] Dark Matter Inc, Guangzhou 511457, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Mathematical models; Task analysis; Semantics; Problem solving; Linguistics; Representation learning; Predictive models; Contrastive learning; automatic math word problem (MWP) solving; model pretraining; natural language understanding
DOI
10.1109/TNNLS.2023.3265173
Chinese Library Classification
TP18 [artificial intelligence theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Since math word problem (MWP) solving aims to transform a natural language problem description into executable solution equations, an MWP solver must not only comprehend the real-world narrative described in the problem text but also identify the relationships among the quantifiers and variables implied in the problem and map them into a reasonable solution-equation logic. Although deep learning models have recently made great progress on MWP solving, they ignore the grounding equation logic implied by the problem text. Moreover, pretrained language models (PLMs) encode a wealth of knowledge and high-quality semantic representations that could help solve MWPs, but they have not yet been explored for the MWP-solving task. To harvest both the equation logic and real-world knowledge, we propose a template-based contrastive distillation pretraining (TCDP) approach built on a PLM-based encoder: it incorporates mathematical logic knowledge via multiview contrastive learning while retaining rich real-world knowledge and high-quality semantic representations via knowledge distillation. We name the PLM-based encoder pretrained with our approach MathEncoder. Specifically, mathematical logic is first summarized by clustering the symbolic solution templates of the MWPs and then injected into the deployed PLM-based encoder through supervised contrastive learning over these templates, which represent the underlying solving logic of the problems. Meanwhile, rich knowledge and high-quality semantic representations are retained by distilling them from a well-trained PLM-based teacher encoder into our MathEncoder. To validate the effectiveness of our pretrained MathEncoder, we construct a new solver, MathSolver, by replacing the GRU-based encoder of GTS, a state-of-the-art MWP solver, with our pretrained MathEncoder.
The experimental results demonstrate that our method raises a solver's understanding of MWPs to a new level, outperforming existing state-of-the-art methods on two widely adopted benchmarks, Math23K and CM17K. Code will be available at https://github.com/QinJinghui/tcdp.
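The abstract describes two training signals: a supervised contrastive loss in which MWPs sharing a symbolic solution template are treated as positives, and a distillation loss that keeps the student encoder close to a teacher PLM's representations. The sketch below illustrates the general shape of these two losses in plain Python on toy sentence embeddings; it is a simplified illustration only (the function names, the MSE form of distillation, and the toy vectors are assumptions, not the authors' implementation).

```python
import math

def sup_con_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss: embeddings whose problems share the
    same solution-template label are pulled together, others pushed apart."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def normalize(u):
        n = math.sqrt(dot(u, u))
        return [a / n for a in u]

    z = [normalize(e) for e in embeddings]
    loss, n = 0.0, len(z)
    for i in range(n):
        # Positives: other problems with the same template label.
        pos = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not pos:
            continue
        denom = sum(math.exp(dot(z[i], z[k]) / temperature)
                    for k in range(n) if k != i)
        for j in pos:
            loss += -math.log(
                math.exp(dot(z[i], z[j]) / temperature) / denom) / len(pos)
    return loss / n

def distill_loss(student, teacher):
    """Illustrative distillation term: mean-squared error between student
    and teacher sentence embeddings for the same batch of problems."""
    return sum(
        sum((s - t) ** 2 for s, t in zip(se, te)) / len(se)
        for se, te in zip(student, teacher)
    ) / len(student)
```

In practice the two terms would be computed on batches of PLM sentence embeddings and combined into a single weighted objective; clustering problems by their symbolic solution templates supplies the `labels`.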
Pages: 12823-12835 (13 pages)