Math-LLMs: AI Cyberinfrastructure with Pre-trained Transformers for Math Education

Cited by: 3
Authors
Zhang, Fan [1 ]
Li, Chenglu [2 ]
Henkel, Owen [3 ]
Xing, Wanli [1 ]
Baral, Sami [4 ]
Heffernan, Neil [4 ]
Li, Hai [1 ]
Affiliations
[1] Univ Florida, Gainesville, FL 32611 USA
[2] Univ Utah, Salt Lake City, UT USA
[3] Rising Acad Network, Freetown, Sierra Leone
[4] Worcester Polytech Inst, Worcester, MA USA
Keywords
LLMs; Math education; Pre-train; MATHEMATICS;
DOI
10.1007/s40593-024-00416-y
Chinese Library Classification (CLC) number
TP39 [Applications of Computers]
Subject classification codes
081203; 0835
Abstract
In recent years, the pre-training of Large Language Models (LLMs) for the educational domain has garnered significant attention. However, a discernible gap exists in the application of these models to mathematics education. This study aims to bridge this gap by pre-training LLMs on authentic K-12 mathematical dialogue datasets. Our research is structured around three primary research questions (RQs) that investigate the impact of fine-tuning data size and of pre-training on downstream Natural Language Processing (NLP) tasks, as well as the efficacy of LLMs in text generation tasks within the mathematical context. Our findings indicate that data size plays a pivotal role in the performance of LLMs on downstream NLP tasks, with larger fine-tuning datasets yielding more consistent and improved results. Furthermore, pre-trained models consistently outperformed their non-pre-trained counterparts, underscoring the importance of leveraging prior knowledge in LLMs. In text generation, our model not only enhanced mathematical understanding and performance on downstream math tasks but also produced more engaging and human-like language.
Pages: 24
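
The abstract describes continued (domain-adaptive) pre-training of transformer models on K-12 math dialogue text before fine-tuning them on downstream NLP tasks. As a rough illustration only, the sketch below shows how such continued pre-training might look with the Hugging Face Transformers library; the base checkpoint (bert-base-uncased), the file math_dialogues.txt, and all hyperparameters are illustrative assumptions, not the paper's actual configuration.

# Illustrative sketch of domain-adaptive ("continued") pre-training on an
# in-domain math dialogue corpus. Model, data path, and hyperparameters
# are assumptions for demonstration, not the authors' setup.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

BASE_MODEL = "bert-base-uncased"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForMaskedLM.from_pretrained(BASE_MODEL)

# Hypothetical plain-text file with one K-12 math tutoring utterance per line.
dataset = load_dataset("text", data_files={"train": "math_dialogues.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Masked-language-modeling objective over the in-domain corpus.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="math-llm-pretrained",
        per_device_train_batch_size=16,
        num_train_epochs=3,
    ),
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()  # the saved checkpoint would then be fine-tuned per downstream task

After this continued pre-training step, the resulting checkpoint would be fine-tuned separately on each downstream math NLP task, which is where the abstract's comparison of fine-tuning data sizes and of pre-trained versus non-pre-trained models applies.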