Template-Based Contrastive Distillation Pretraining for Math Word Problem Solving

Cited: 0
Authors
Qin, Jinghui [1 ]
Yang, Zhicheng [2 ,3 ]
Chen, Jiaqi [1 ]
Liang, Xiaodan [2 ]
Lin, Liang [1 ]
Affiliations
[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou 510275, Peoples R China
[2] Sun Yat Sen Univ, Sch Intelligent Syst Engn, Shenzhen Campus, Shenzhen, Peoples R China
[3] Dark Matter Inc, Guangzhou 511457, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Mathematical models; Task analysis; Semantics; Problem solving; Linguistics; Representation learning; Predictive models; Contrastive learning; automatic math word problem (MWP) solving; model pretraining; natural language understanding
DOI
10.1109/TNNLS.2023.3265173
Chinese Library Classification
TP18 [artificial intelligence theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Since math word problem (MWP) solving aims to transform a natural language problem description into executable solution equations, an MWP solver must not only comprehend the real-world narrative described in the problem text but also identify the relationships among the quantifiers and variables implied in the problem and map them into a reasonable solution-equation logic. Although deep learning models have recently made great progress on MWP solving, they ignore the grounding equation logic implied by the problem text. Moreover, pretrained language models (PLMs) encode a wealth of knowledge and high-quality semantic representations that could help solve MWPs, but they have not yet been explored for the MWP-solving task. To harvest both the equation logic and real-world knowledge, we propose a template-based contrastive distillation pretraining (TCDP) approach built on a PLM-based encoder: it incorporates mathematical logic knowledge via multiview contrastive learning while retaining rich real-world knowledge and high-quality semantic representations via knowledge distillation. We name the PLM-based encoder pretrained with our approach MathEncoder. Specifically, mathematical logic is first summarized by clustering the symbolic solution templates of the MWPs and then injected into the deployed PLM-based encoder through supervised contrastive learning over these templates, which represent the underlying solving logic of the problems. Meanwhile, rich knowledge and high-quality semantic representations are retained by distilling them from a well-trained PLM-based teacher encoder into our MathEncoder. To validate the effectiveness of our pretrained MathEncoder, we construct a new solver, MathSolver, by replacing the GRU-based encoder of GTS, a state-of-the-art MWP solver, with our pretrained MathEncoder.
The experimental results demonstrate that our method raises a solver's understanding of MWPs to a new level, outperforming existing state-of-the-art methods on two widely adopted benchmarks, Math23K and CM17K. Code will be available at https://github.com/QinJinghui/tcdp.
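The abstract describes two training signals: a supervised contrastive loss in which MWPs sharing a symbolic solution template are treated as positives, and a distillation loss that keeps the student encoder close to a teacher PLM's representations. The sketch below illustrates the general shape of these two losses in plain Python on toy sentence embeddings; it is a simplified illustration only (the function names, the MSE form of distillation, and the toy vectors are assumptions, not the authors' implementation).

```python
import math

def sup_con_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss: embeddings whose problems share the
    same solution-template label are pulled together, others pushed apart."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def normalize(u):
        n = math.sqrt(dot(u, u))
        return [a / n for a in u]

    z = [normalize(e) for e in embeddings]
    loss, n = 0.0, len(z)
    for i in range(n):
        # Positives: other problems with the same template label.
        pos = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not pos:
            continue
        denom = sum(math.exp(dot(z[i], z[k]) / temperature)
                    for k in range(n) if k != i)
        for j in pos:
            loss += -math.log(
                math.exp(dot(z[i], z[j]) / temperature) / denom) / len(pos)
    return loss / n

def distill_loss(student, teacher):
    """Illustrative distillation term: mean-squared error between student
    and teacher sentence embeddings for the same batch of problems."""
    return sum(
        sum((s - t) ** 2 for s, t in zip(se, te)) / len(se)
        for se, te in zip(student, teacher)
    ) / len(student)
```

In practice the two terms would be computed on batches of PLM sentence embeddings and combined into a single weighted objective; clustering problems by their symbolic solution templates supplies the `labels`.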
Pages: 12823-12835 (13 pages)