EBERT: A lightweight expression-enhanced large-scale pre-trained language model for mathematics education

Times Cited: 0
Authors
Duan, Zhiyi [1 ,2 ]
Gu, Hengnian [1 ]
Ke, Yuan [1 ]
Zhou, Dongdai [1 ]
Affiliations
[1] Northeast Normal Univ, Sch Informat Sci & Technol, Changchun 130117, Jilin, Peoples R China
[2] Inner Mongolia Univ, Dept Comp Sci, Hohhot 010021, Inner Mongolia, Peoples R China
Keywords
Pre-trained model; Question&Answer tree; Expression enhanced matrix; Question&Answer matching;
DOI
10.1016/j.knosys.2024.112118
CLC Number
TP18 [Artificial intelligence theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Within the realm of mathematics education, educators and researchers encounter several challenging supervised tasks, such as question difficulty prediction and mathematical expression understanding. To address these challenges, researchers have introduced unsupervised pre-trained models specifically tailored for mathematics education, yielding promising outcomes. However, the existing literature fails to consider the domain-specific characteristics of mathematics, particularly the structural features of pre-training corpora and of extensive expressions, which makes these models costly and time-consuming to train. To tackle this problem, we propose a lightweight expression-enhanced large-scale pre-trained language model, called EBERT, for mathematics education. Specifically, we select a large number of expression-enriched exercises to further pre-train the original BERT. To capture the structural features inherent in expressions, we first construct an Operator Tree for each expression. Subsequently, each exercise is transformed into a corresponding Question&Answer tree (QAT) to serve as the model input. Notably, to preserve semantic integrity within the QAT, a specialized Expression Enhanced Matrix is devised to confine the visibility of individual tokens. Additionally, a new pre-training task, referred to as Question&Answer Matching, is introduced to capture exercise-related structural information at the semantic level. On three downstream tasks in mathematics education, we demonstrate that EBERT outperforms several state-of-the-art baselines (such as MathBERT and GPT-3) in terms of accuracy (ACC) and F1-score.
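The following is a minimal, illustrative Python sketch of two ideas mentioned in the abstract: parsing an expression into an Operator Tree and confining token visibility with an expression-level mask. It is not the authors' implementation; the function names (`operator_tree_tokens`, `expression_enhanced_matrix`), the use of Python's `ast` module in place of a real mathematical-expression parser, and the exact visibility rule are assumptions made for illustration only.

```python
# Illustrative sketch only; details not specified in the abstract are assumed.
import ast
import numpy as np


def operator_tree_tokens(expression: str) -> list:
    """Pre-order traversal of the operator tree of a simple infix expression."""
    def walk(node):
        if isinstance(node, ast.BinOp):
            # Operator node: emit the operator name, then its two subtrees.
            return [type(node.op).__name__] + walk(node.left) + walk(node.right)
        if isinstance(node, ast.Name):
            return [node.id]
        if isinstance(node, ast.Constant):
            return [str(node.value)]
        raise ValueError(f"unsupported node: {ast.dump(node)}")
    return walk(ast.parse(expression, mode="eval").body)


def expression_enhanced_matrix(segment_ids):
    """0/1 visibility mask over a token sequence.

    segment_ids[i] is 0 for a plain-text token and k > 0 for a token that
    belongs to the k-th expression of the exercise. Under the rule assumed
    here, expression tokens see only tokens of the same expression plus the
    plain text, while plain-text tokens see every token.
    """
    n = len(segment_ids)
    mask = np.zeros((n, n), dtype=np.int8)
    for i in range(n):
        for j in range(n):
            same_expression = segment_ids[i] == segment_ids[j]
            involves_text = segment_ids[i] == 0 or segment_ids[j] == 0
            mask[i, j] = 1 if (same_expression or involves_text) else 0
    return mask


if __name__ == "__main__":
    # Operator tree of "x**2 + 3*x" in pre-order: Add, Pow, x, 2, Mult, 3, x
    print(operator_tree_tokens("x**2 + 3*x"))
    # A toy exercise: two plain-text tokens (0) and two expressions (ids 1, 2).
    print(expression_enhanced_matrix([0, 1, 1, 0, 2, 2]))
```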
Pages: 8