EBERT: A lightweight expression-enhanced large-scale pre-trained language model for mathematics education

Times Cited: 0
Authors
Duan, Zhiyi [1 ,2 ]
Gu, Hengnian [1 ]
Ke, Yuan [1 ]
Zhou, Dongdai [1 ]
Affiliations
[1] Northeast Normal Univ, Sch Informat Sci & Technol, Changchun 130117, Jilin, Peoples R China
[2] Inner Mongolia Univ, Dept Comp Sci, Hohhot 010021, Inner Mongolia, Peoples R China
Keywords
Pre-trained model; Question&Answer tree; Expression enhanced matrix; Question&Answer matching;
DOI
10.1016/j.knosys.2024.112118
CLC Number
TP18 [Artificial intelligence theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Within the realm of mathematics education, educators and researchers encounter several challenging supervised tasks, such as question difficulty prediction and mathematical expression understanding. To address these challenges, researchers have introduced unsupervised pre-trained models specifically tailored for mathematics education, yielding promising outcomes. However, the existing literature fails to consider the domain-specific characteristics of mathematics, particularly the structural features of pre-training corpora and of extensive expressions, which makes these models costly and time-consuming to train. To tackle this problem, we propose a lightweight expression-enhanced large-scale pre-trained language model, called EBERT, for mathematics education. Specifically, we select a large number of expression-enriched exercises to further pre-train the original BERT. To capture the structural features inherent in expressions, we first construct an Operator Tree for each expression. Subsequently, each exercise is transformed into a corresponding Question&Answer tree (QAT) to serve as the model input. Notably, to preserve semantic integrity within the QAT, a specialized Expression Enhanced Matrix is devised to confine the visibility of individual tokens. Additionally, a new pre-training task, referred to as Question&Answer Matching, is introduced to capture exercise-related structural information at the semantic level. On three downstream tasks in mathematics education, we demonstrate that EBERT outperforms several state-of-the-art baselines (such as MathBERT and GPT-3) in terms of accuracy (ACC) and F1-score.
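The following is a minimal, illustrative Python sketch of two ideas mentioned in the abstract: parsing an expression into an Operator Tree and confining token visibility with an expression-level mask. It is not the authors' implementation; the function names (`operator_tree_tokens`, `expression_enhanced_matrix`), the use of Python's `ast` module in place of a real mathematical-expression parser, and the exact visibility rule are assumptions made for illustration only.

```python
# Illustrative sketch only; details not specified in the abstract are assumed.
import ast
import numpy as np


def operator_tree_tokens(expression: str) -> list:
    """Pre-order traversal of the operator tree of a simple infix expression."""
    def walk(node):
        if isinstance(node, ast.BinOp):
            # Operator node: emit the operator name, then its two subtrees.
            return [type(node.op).__name__] + walk(node.left) + walk(node.right)
        if isinstance(node, ast.Name):
            return [node.id]
        if isinstance(node, ast.Constant):
            return [str(node.value)]
        raise ValueError(f"unsupported node: {ast.dump(node)}")
    return walk(ast.parse(expression, mode="eval").body)


def expression_enhanced_matrix(segment_ids):
    """0/1 visibility mask over a token sequence.

    segment_ids[i] is 0 for a plain-text token and k > 0 for a token that
    belongs to the k-th expression of the exercise. Under the rule assumed
    here, expression tokens see only tokens of the same expression plus the
    plain text, while plain-text tokens see every token.
    """
    n = len(segment_ids)
    mask = np.zeros((n, n), dtype=np.int8)
    for i in range(n):
        for j in range(n):
            same_expression = segment_ids[i] == segment_ids[j]
            involves_text = segment_ids[i] == 0 or segment_ids[j] == 0
            mask[i, j] = 1 if (same_expression or involves_text) else 0
    return mask


if __name__ == "__main__":
    # Operator tree of "x**2 + 3*x" in pre-order: Add, Pow, x, 2, Mult, 3, x
    print(operator_tree_tokens("x**2 + 3*x"))
    # A toy exercise: two plain-text tokens (0) and two expressions (ids 1, 2).
    print(expression_enhanced_matrix([0, 1, 1, 0, 2, 2]))
```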
Pages: 8