Math-LLMs: AI Cyberinfrastructure with Pre-trained Transformers for Math Education

Cited by: 3
Authors
Zhang, Fan [1 ]
Li, Chenglu [2 ]
Henkel, Owen [3 ]
Xing, Wanli [1 ]
Baral, Sami [4 ]
Heffernan, Neil [4 ]
Li, Hai [1 ]
Affiliations
[1] Univ Florida, Gainesville, FL 32611 USA
[2] Univ Utah, Salt Lake City, UT USA
[3] Rising Acad Network, Freetown, Sierra Leone
[4] Worcester Polytech Inst, Worcester, MA USA
Keywords
LLMs; Math education; Pre-train; MATHEMATICS;
DOI
10.1007/s40593-024-00416-y
CLC classification
TP39 [Computer applications]
Discipline codes
081203; 0835
Abstract
In recent years, the pre-training of Large Language Models (LLMs) in the educational domain has garnered significant attention. However, a discernible gap exists in the application of these models to mathematics education. This study aims to bridge this gap by pre-training LLMs on authentic K-12 mathematical dialogue datasets. Our research is structured around three primary research questions (RQs) that investigate the impact of fine-tuning data size and of pre-training on downstream Natural Language Processing (NLP) tasks, as well as the efficacy of LLMs in text generation tasks within the mathematical context. Our findings indicate that data size plays a pivotal role in the performance of LLMs on downstream NLP tasks, with larger datasets yielding more consistent and improved results. Furthermore, pre-trained models consistently outperformed their non-pre-trained counterparts, emphasizing the importance of leveraging prior knowledge in LLMs. In the realm of text generation, we found that our model can not only enhance mathematical understanding and performance on downstream math tasks but also generate more engaging and human-like language.
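The abstract describes a two-stage workflow: continued (domain-adaptive) pre-training on math dialogue data, followed by fine-tuning on downstream tasks with varying data sizes. The snippet below is a purely illustrative sketch of such a workflow using the Hugging Face transformers and datasets libraries; the base checkpoint bert-base-uncased, the toy dialogues, and all hyperparameters are assumptions for illustration and are not taken from the paper.

```python
# Illustrative sketch only: continued (domain-adaptive) pre-training of a
# masked-language model on math tutoring dialogue. Base model, data, and
# hyperparameters are placeholders, NOT the authors' released setup.
from datasets import Dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Toy stand-in for an authentic K-12 math dialogue corpus.
dialogues = [
    "Tutor: What is 3/4 plus 1/8? Student: We need a common denominator first.",
    "Student: Is the slope of y = 2x + 1 equal to 2? Tutor: Yes, the coefficient of x is the slope.",
]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

dataset = Dataset.from_dict({"text": dialogues}).map(
    tokenize, batched=True, remove_columns=["text"]
)

# Random masking turns the raw dialogue into a masked-language-modeling task,
# i.e., the "pre-training on math dialogue" step the abstract refers to.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="math-llm-pretrain",
        num_train_epochs=1,
        per_device_train_batch_size=2,
    ),
    train_dataset=dataset,
    data_collator=collator,
)
trainer.train()
# The resulting checkpoint can then be fine-tuned on downstream math NLP tasks
# (e.g., classifying student responses) and compared against a non-pre-trained
# baseline, mirroring the comparisons described in the study's RQs.
```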
Pages: 24
Related papers
50 records in total
  • [41] Finding and Editing Multi-Modal Neurons in Pre-Trained Transformers
    Pan, Haowen
    Cao, Yixin
    Wang, Xiaozhi
    Yang, Xun
    Wang, Meng
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 1012 - 1037
  • [42] Fast and accurate Bayesian optimization with pre-trained transformers for constrained engineering problems
    Picard, Cyril
    Ahmed, Faez
    STRUCTURAL AND MULTIDISCIPLINARY OPTIMIZATION, 2025, 68 (3)
  • [43] Selecting Better Samples from Pre-trained LLMs: A Case Study on Question Generation
    Yuan, Xingdi
    Wang, Tong
    Wang, Yen-Hsiang
    Fine, Emery
Abdelghani, Rania
    Sauzeon, Helene
    Oudeyer, Pierre-Yves
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 12952 - 12965
  • [44] CAN ARTIFICIAL INTELLIGENCE (AI) LARGE LANGUAGE MODELS (LLMS) SUCH AS GENERATIVE PRE-TRAINED TRANSFORMER (GPT) BE USED TO AUTOMATE LITERATURE REVIEWS?
    Guerra, I
    Gallinaro, J.
    Rtveladze, K.
    Lambova, A.
    Asenova, E.
    VALUE IN HEALTH, 2023, 26 (12) : S410 - S411
  • [45] Contextualized and Personalized Math Word Problem Generation in Authentic Contexts Using Generative Pre-trained Transformer and Its Influences on Geometry Learning
    Utami, Ika Qutsiati
    Hwang, Wu-Yuin
    Hariyanti, Uun
    JOURNAL OF EDUCATIONAL COMPUTING RESEARCH, 2024, 62 (06) : 1604 - 1639
  • [46] Weakly Supervised Deep Learning for Arabic Tweet Sentiment Analysis on Education Reforms: Leveraging Pre-Trained Models and LLMs With Snorkel
    Alotaibi, Alanoud
    Nadeem, Farrukh
    Hamdy, Mohamed
    IEEE ACCESS, 2025, 13 : 30523 - 30542
  • [47] Harnessing Pre-Trained Sentence Transformers for Offensive Language Detection in Indian Languages
    MKSSS Cummins College of Engineering for Women, Maharashtra, Pune, India
    CEUR Workshop Proc., (427-434)
  • [48] Biologically Inspired Design Concept Generation Using Generative Pre-Trained Transformers
    Zhu, Qihao
    Zhang, Xinyu
    Luo, Jianxi
    JOURNAL OF MECHANICAL DESIGN, 2023, 145 (04)
  • [49] Detecting Propaganda Techniques in English News Articles using Pre-trained Transformers
    Abdullah, Malak
    Altiti, Ola
    Obiedat, Rasha
    2022 13TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION SYSTEMS (ICICS), 2022, : 301 - 308
  • [50] EAPT: An encrypted traffic classification model via adversarial pre-trained transformers
    Zhan, Mingming
    Yang, Jin
    Jia, Dongqing
    Fu, Geyuan
    COMPUTER NETWORKS, 2025, 257