EBERT: A lightweight expression-enhanced large-scale pre-trained language model for mathematics education

Cited by: 0
Authors
Duan, Zhiyi [1 ,2 ]
Gu, Hengnian [1 ]
Ke, Yuan [1 ]
Zhou, Dongdai [1 ]
Affiliations
[1] Northeast Normal Univ, Sch Informat Sci & Technol, Changchun 130117, Jilin, Peoples R China
[2] Inner Mongolia Univ, Dept Comp Sci, Hohhot 010021, Inner Mongolia, Peoples R China
Keywords
Pre-trained model; Question&Answer tree; Expression enhanced matrix; Question&Answer matching;
DOI
10.1016/j.knosys.2024.112118
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Within the realm of mathematics education, educators and researchers encounter several challenging supervised tasks, such as question difficulty prediction and mathematical expression understanding. To address these challenges, researchers have introduced unsupervised pre-trained models tailored specifically for mathematics education, with promising results. However, the existing literature fails to consider the domain-specific characteristics of mathematics, particularly the structural features of pre-training corpora and the abundance of expressions, which makes these models costly and time-consuming. To tackle this problem, we propose a lightweight expression-enhanced large-scale pre-trained language model for mathematics education, called EBERT. Specifically, we select a large number of expression-rich exercises to further pre-train the original BERT. To capture the inherent structural features of expressions, we first construct an Operator Tree for each expression. Each exercise is then transformed into a corresponding Question&Answer tree (QAT) that serves as the model input. Notably, to preserve semantic integrity within the QAT, a specialized Expression Enhanced Matrix is devised to confine the visibility of individual tokens. Additionally, a new pre-training task, Question&Answer Matching, is introduced to capture exercise-related structural information at the semantic level. On three downstream tasks in mathematics education, we show that EBERT outperforms several state-of-the-art baselines (such as MathBERT and GPT-3) in terms of accuracy (ACC) and F1-score.
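The abstract outlines two mechanisms: parsing each expression into an Operator Tree, and using an Expression Enhanced Matrix to restrict which tokens are visible to each other. The Python sketch below is only an illustration of this general idea, not the authors' implementation; the use of SymPy for parsing, the helper names (operator_tree, flatten, visibility_matrix), and the ancestor/descendant visibility rule are all assumptions made for demonstration.

```python
# Minimal sketch (illustrative assumptions, not the EBERT implementation):
# (1) turn a mathematical expression into an operator tree, and
# (2) derive a token-level visibility mask in which a token may only
#     attend to its ancestors and descendants in that tree.
import numpy as np
from sympy import sympify

def operator_tree(expr):
    """Recursively convert a SymPy expression into (node_label, children) tuples."""
    if expr.args:  # internal node: an operator such as Add, Mul, Pow
        return (expr.func.__name__, [operator_tree(a) for a in expr.args])
    return (str(expr), [])  # leaf node: a symbol or a number

def flatten(tree, parent=None, out=None, parents=None):
    """Pre-order flatten the tree into tokens, recording each token's parent index."""
    if out is None:
        out, parents = [], []
    idx = len(out)
    out.append(tree[0])
    parents.append(parent)
    for child in tree[1]:
        flatten(child, idx, out, parents)
    return out, parents

def visibility_matrix(parents):
    """Token i may see token j iff one is an ancestor of the other (assumed rule)."""
    n = len(parents)
    def ancestors(i):
        chain = {i}
        while parents[i] is not None:
            i = parents[i]
            chain.add(i)
        return chain
    anc = [ancestors(i) for i in range(n)]
    mask = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(n):
            if j in anc[i] or i in anc[j]:
                mask[i, j] = 1
    return mask

tokens, parents = flatten(operator_tree(sympify("x**2 + 3*x")))
print(tokens)                      # pre-order tokens, e.g. ['Add', 'Pow', 'x', '2', 'Mul', '3', 'x']
print(visibility_matrix(parents))  # 0/1 visibility mask over expression tokens
```

In a BERT-style encoder, such a 0/1 matrix could plausibly be supplied as an additive attention mask for the expression span, while plain-text question and answer tokens remain fully visible to one another; how EBERT combines the two is described in the paper itself.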
Pages: 8