EBERT: A lightweight expression-enhanced large-scale pre-trained language model for mathematics education

Cited by: 0
Authors
Duan, Zhiyi [1 ,2 ]
Gu, Hengnian [1 ]
Ke, Yuan [1 ]
Zhou, Dongdai [1 ]
Affiliations
[1] Northeast Normal Univ, Sch Informat Sci & Technol, Changchun 130117, Jilin, Peoples R China
[2] Inner Mongolia Univ, Dept Comp Sci, Hohhot 010021, Inner Mongolia, Peoples R China
Keywords
Pre-trained model; Question&Answer tree; Expression enhanced matrix; Question&Answer matching
DOI
10.1016/j.knosys.2024.112118
Chinese Library Classification (CLC)
TP18 [Theory of artificial intelligence]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Within the realm of mathematics education, educators and researchers face several challenging supervised tasks, such as question difficulty prediction and mathematical expression understanding. To address these challenges, researchers have introduced unsupervised pre-trained models specifically tailored for mathematics education, yielding promising outcomes. However, the existing literature fails to consider the domain-specific characteristics of mathematics, particularly the structural features of pre-training corpora and their extensive expressions, which makes these models costly and time-consuming. To tackle this problem, we propose a lightweight expression-enhanced large-scale pre-trained language model, called EBERT, for mathematics education. Specifically, we select a large number of expression-enriched exercises to further pre-train the original BERT. To capture the structural features inherent in expressions, the first step is to build an Operator Tree for each expression. Each exercise is then transformed into a corresponding Question&Answer tree (QAT) that serves as the model input. Notably, to preserve semantic integrity within the QAT, a specialized Expression Enhanced Matrix is devised to confine the visibility of individual tokens. Additionally, a new pre-training task, referred to as Question&Answer Matching, is introduced to capture exercise-related structural information at the semantic level. Through three downstream tasks in mathematics education, we demonstrate that EBERT outperforms several state-of-the-art baselines (such as MathBERT and GPT-3) in terms of accuracy (ACC) and F1-score.
Pages: 8
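
As a purely illustrative aid (this record contains only the abstract, not the authors' implementation), the short Python sketch below shows one hypothetical way to linearize an expression into an operator-tree token sequence and to build a token-visibility mask in the spirit of the Expression Enhanced Matrix. The function names, the pre-order linearization, and the visibility rule (expression tokens attend only within their own expression, while ordinary text tokens see everything) are assumptions for illustration, not EBERT's actual design.

# Hypothetical sketch only, not the authors' code: it illustrates the two ideas
# named in the abstract (an Operator Tree per expression and an expression-aware
# visibility mask) under simplifying assumptions.
import ast
import numpy as np

def operator_tree_tokens(expression):
    """Linearize a Python-parsable arithmetic expression into a pre-order
    operator-tree token sequence, e.g. "(a + b) * c" -> ['*', '+', 'a', 'b', 'c']."""
    ops = {ast.Add: '+', ast.Sub: '-', ast.Mult: '*', ast.Div: '/', ast.Pow: '^'}

    def walk(node):
        if isinstance(node, ast.BinOp):
            return [ops[type(node.op)]] + walk(node.left) + walk(node.right)
        if isinstance(node, ast.Name):
            return [node.id]
        if isinstance(node, ast.Constant):
            return [str(node.value)]
        raise ValueError("unsupported expression node: %r" % node)

    return walk(ast.parse(expression, mode="eval").body)

def expression_enhanced_mask(seq_len, expr_spans):
    """Build a token-visibility matrix: ordinary text tokens attend to every
    position, while tokens inside an expression span attend only to their own
    expression (an assumed reading of the Expression Enhanced Matrix)."""
    mask = np.ones((seq_len, seq_len), dtype=np.int8)
    for start, end in expr_spans:          # spans are [start, end) token indices
        mask[start:end, :] = 0             # hide the rest of the sequence ...
        mask[start:end, start:end] = 1     # ... except the expression itself
    return mask

print(operator_tree_tokens("(a + b) * c"))   # ['*', '+', 'a', 'b', 'c']
print(expression_enhanced_mask(6, [(2, 5)])) # 6 x 6 visibility matrix

Such a 0/1 visibility matrix could be supplied as an attention mask to a BERT-style encoder; how EBERT actually constructs and applies its matrix is described in the paper itself.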