Math-LLMs: AI Cyberinfrastructure with Pre-trained Transformers for Math Education

Cited by: 3
Authors
Zhang, Fan [1 ]
Li, Chenglu [2 ]
Henkel, Owen [3 ]
Xing, Wanli [1 ]
Baral, Sami [4 ]
Heffernan, Neil [4 ]
Li, Hai [1 ]
Affiliations
[1] Univ Florida, Gainesville, FL 32611 USA
[2] Univ Utah, Salt Lake City, UT USA
[3] Rising Acad Network, Freetown, Sierra Leone
[4] Worcester Polytech Inst, Worcester, MA USA
Keywords
LLMs; Math education; Pre-training; Mathematics
DOI
10.1007/s40593-024-00416-y
CLC Classification Number
TP39 [Computer Applications]
Subject Classification Number
081203; 0835
Abstract
In recent years, the pre-training of Large Language Models (LLMs) in the educational domain has garnered significant attention. However, a discernible gap exists in the application of these models to mathematics education. This study aims to bridge this gap by pre-training LLMs on authentic K-12 mathematical dialogue datasets. Our research is structured around three primary research questions (RQs) that investigate the impact of fine-tuning data size and of pre-training on downstream Natural Language Processing (NLP) tasks, as well as the efficacy of LLMs in text generation tasks within the mathematical context. Our findings indicate that data size plays a pivotal role in the performance of LLMs on downstream NLP tasks, with larger datasets yielding more consistent and improved results. Furthermore, pre-trained models consistently outperformed their non-pre-trained counterparts, underscoring the importance of leveraging prior knowledge in LLMs. In the realm of text generation, we found that our model not only enhances mathematical understanding and performance on downstream math tasks but also generates more engaging and human-like language.
Pages: 24
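
The abstract above describes a two-stage recipe: domain-adaptive pre-training of a Transformer on K-12 math dialogue text, followed by fine-tuning on downstream NLP tasks. The sketch below shows what such a pipeline could look like using the Hugging Face transformers and datasets libraries. The base checkpoint (bert-base-uncased), the corpus file math_dialogues.txt, the masked-language-modeling objective, and all hyperparameters are illustrative assumptions, not the configuration reported in the paper.

# Minimal sketch of domain-adaptive pre-training on math dialogue text.
# Base model, data file, and hyperparameters are assumptions for illustration.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

BASE_MODEL = "bert-base-uncased"  # placeholder checkpoint, not the paper's model
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForMaskedLM.from_pretrained(BASE_MODEL)

# Hypothetical corpus of K-12 math tutoring dialogues, one utterance per line.
corpus = load_dataset("text", data_files={"train": "math_dialogues.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = corpus.map(tokenize, batched=True, remove_columns=["text"])

# Masked-language-modeling objective for continued (domain-adaptive) pre-training.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="math-llm-pretrain",
        per_device_train_batch_size=16,
        num_train_epochs=3,
    ),
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
trainer.save_model("math-llm-pretrain")
tokenizer.save_pretrained("math-llm-pretrain")

The saved checkpoint could then be reloaded (for example with AutoModelForSequenceClassification) and fine-tuned on a downstream math-education task, varying the size of the fine-tuning set to probe the data-size effect the abstract reports.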