EBERT: A lightweight expression-enhanced large-scale pre-trained language model for mathematics education

Cited by: 0
Authors
Duan, Zhiyi [1 ,2 ]
Gu, Hengnian [1 ]
Ke, Yuan [1 ]
Zhou, Dongdai [1 ]
Affiliations
[1] Northeast Normal Univ, Sch Informat Sci & Technol, Changchun 130117, Jilin, Peoples R China
[2] Inner Mongolia Univ, Dept Comp Sci, Hohhot 010021, Inner Mongolia, Peoples R China
Keywords
Pre-trained model; Question&Answer tree; Expression enhanced matrix; Question&Answer matching;
DOI
10.1016/j.knosys.2024.112118
CLC Classification
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Within the realm of mathematics education, educators and researchers encounter several challenging supervised tasks, such as question difficulty prediction and mathematical expression understanding. To address these challenges, researchers have introduced unsupervised pre-trained models specifically tailored for mathematics education, yielding promising outcomes. However, the existing literature fails to consider the domain-specific characteristics of mathematics, particularly the structural features of pre-training corpora and extensive expressions, which makes these models costly and time-consuming. To tackle this problem, we propose a lightweight expression-enhanced large-scale pre-trained language model, called EBERT, for mathematics education. Specifically, we select a large number of expression-enriched exercises to further pre-train the original BERT. To capture the structural features inherent in expressions, we first build an Operator Tree for each expression. Each exercise is then transformed into a corresponding Question&Answer tree (QAT) to serve as the model input. Notably, to preserve semantic integrity within the QAT, a specialized Expression Enhanced Matrix is devised to confine the visibility of individual tokens. Additionally, a new pre-training task, referred to as Question&Answer Matching, is introduced to capture exercise-related structural information at the semantic level. Through three downstream tasks in mathematics education, we show that EBERT outperforms several state-of-the-art baselines (such as MathBERT and GPT-3) in terms of accuracy and F1-score.
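For intuition only, the sketch below is not the authors' code; all function names are hypothetical. It illustrates, under simple assumptions, how an expression such as (a + b) * 2 might be flattened into operator-tree tokens and how a token-visibility mask in the spirit of the Expression Enhanced Matrix could then be derived, with each expression token allowed to attend only to itself, its parent operator, and its direct children.

# Hypothetical illustration (not the authors' implementation): flatten a
# simple arithmetic expression into operator-tree tokens and build a
# visibility mask in the spirit of the Expression Enhanced Matrix.
import ast
import numpy as np

def operator_tree_tokens(expr: str):
    """Return (tokens, parents): a pre-order list of operator/operand tokens
    and, for each token, the index of its parent operator (-1 for the root)."""
    tokens, parents = [], []
    ops = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/", ast.Pow: "^"}

    def visit(node, parent_idx):
        if isinstance(node, ast.BinOp):
            idx = len(tokens)
            tokens.append(ops[type(node.op)])
            parents.append(parent_idx)
            visit(node.left, idx)
            visit(node.right, idx)
        elif isinstance(node, ast.Constant):
            tokens.append(str(node.value))
            parents.append(parent_idx)
        elif isinstance(node, ast.Name):
            tokens.append(node.id)
            parents.append(parent_idx)

    visit(ast.parse(expr, mode="eval").body, -1)
    return tokens, parents

def expression_enhanced_mask(parents):
    """mask[i, j] = 1 means token i may attend to token j: each expression
    token sees only itself, its parent operator, and its direct children."""
    n = len(parents)
    mask = np.eye(n, dtype=int)
    for i, p in enumerate(parents):
        if p >= 0:
            mask[i, p] = mask[p, i] = 1
    return mask

tokens, parents = operator_tree_tokens("(a + b) * 2")
print(tokens)                          # ['*', '+', 'a', 'b', '2']
print(expression_enhanced_mask(parents))

In a full model, such a mask would presumably be merged with unrestricted attention over the natural-language question and answer tokens, so that only the expression part of the QAT is constrained by its tree structure.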
Pages: 8
Related Papers
(50 total)
  • [41] Zhou, Juexiao; He, Xiaonan; Sun, Liyuan; Xu, Jiannan; Chen, Xiuying; Chu, Yuetan; Zhou, Longxi; Liao, Xingyu; Zhang, Bin; Afvari, Shawn; Gao, Xin. Pre-trained multimodal large language model enhances dermatological diagnosis using SkinGPT-4. NATURE COMMUNICATIONS, 2024, 15 (01)
  • [42] Kortemeyer, G. Performance of the pre-trained large language model GPT-4 on automated short answer grading. Discover Artificial Intelligence, 2024, 4 (01)
  • [43] Ling, J.; Afzaal, M. Automatic question-answer pairs generation using pre-trained large language models in higher education. Computers and Education: Artificial Intelligence, 2024, 6
  • [44] He, Guoxiu; Huang, Chen. Few-shot medical relation extraction via prompt tuning enhanced pre-trained language model. NEUROCOMPUTING, 2025, 633
  • [45] Yu, Tao; Song, Rui; Pinto, Sandro; Gomes, Tiago; Tavares, Adriano; Xu, Hao. GEML: a graph-enhanced pre-trained language model framework for text classification via mutual learning. APPLIED INTELLIGENCE, 2024, 54 (23): 12215-12229
  • [46] Tang, Zhan; Guo, Xuchao; Bai, Zhao; Diao, Lei; Lu, Shuhan; Li, Lin. A Protein-Protein Interaction Extraction Approach Based on Large Pre-trained Language Model and Adversarial Training. KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2022, 16 (03): 771-791
  • [47] Hassanin, Mohammed; Keshk, Marwa; Salim, Sara; Alsubaie, Majid; Sharma, Dharmendra. PLLM-CS: Pre-trained Large Language Model (LLM) for cyber threat detection in satellite networks. AD HOC NETWORKS, 2025, 166
  • [48] Lin, Chia-Yi; Chen, Jun-Cheng; Wu, Ja-Ling. Enhancing Real-Time Semantic Segmentation with Textual Knowledge of Pre-Trained Vision-Language Model: A Lightweight Approach. 2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023: 551-558
  • [49] Zhuo, Terry Yue; Li, Zhuang; Huang, Yujin; Shiri, Fatemeh; Wang, Weiqing; Haffari, Gholamreza; Li, Yuan-Fang. On Robustness of Prompt-based Semantic Parsing with Large Pre-trained Language Model: An Empirical Study on Codex. 17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023: 1090-1102
  • [50] Shima, Yoshihiro; Nakashima, Yumi; Yasuda, Michio. Classifying for a Mixture of Object Images and Character Patterns by Using CNN Pre-trained for Large-scale Object Image Dataset. PROCEEDINGS OF THE 2018 13TH IEEE CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA 2018), 2018: 2360-2365