Zero-shot Mathematical Problem Solving via Generative Pre-trained Transformers

Cited: 0
Authors
Galatolo, Federico A. [1 ]
Cimino, Mario G. C. A. [1 ]
Vaglini, Gigliola [1 ]
Affiliations
[1] University of Pisa, Department of Information Engineering, Largo L. Lazzarino 1, Pisa, Italy
Keywords
Deep Learning; Natural Language Processing; Generative Pre-trained Transformers; Zero-shot Learning; Mathematical Problem Solving
DOI
10.5220/0011032400003179
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Mathematics is an effective testbed for measuring the problem-solving ability of machine learning models. The current benchmark for deep learning-based solutions is grade-school math problems: given a natural-language description of a problem, the task is to analyse the problem, exploit heuristics learned from a very large set of solved examples, and then generate an answer. In this paper, a descendant of the third-generation Generative Pre-trained Transformer (GPT-3) is used to develop a zero-shot learning approach to this problem. The proposed approach shows that code-based problem solving is more effective than natural-language reasoning. Specifically, the architectural solution is built upon OpenAI Codex, a descendant of GPT-3 for programming tasks, trained on code from public GitHub repositories, the world's largest source-code hosting service. Experimental results clearly show the potential of the approach: using Python as the target programming language, the proposed pipeline achieves an 18.63% solve rate, against 6.82% for GPT-3. Finally, with a fine-tuned verifier, the correctness of an answer can be ranked at runtime and improved by generating a predefined number of trials. With 10 trials and an ideal verifier, the proposed pipeline achieves a 54.20% solve rate.
Pages: 479-483
Page count: 5
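
The abstract above describes two mechanisms: prompting a code model (Codex) to emit a Python program whose execution yields the answer, and ranking the answers from several sampled trials with a verifier. The following is a minimal sketch of that idea, not the authors' released code; `generate_candidates` (the sampling interface to the code model) and `score_answer` (the fine-tuned verifier) are hypothetical stand-ins.

    from typing import Callable, List, Optional, Tuple

    def run_python_solution(code: str) -> Optional[str]:
        """Execute one generated Python snippet and return whatever it
        leaves in a conventional `answer` variable, or None on failure."""
        scope: dict = {}
        try:
            exec(code, scope)  # WARNING: sandbox untrusted model output in real use
        except Exception:
            return None
        value = scope.get("answer")
        return None if value is None else str(value)

    def solve_with_trials(
        problem: str,
        generate_candidates: Callable[[str, int], List[str]],  # hypothetical sampler
        score_answer: Callable[[str, str], float],             # hypothetical verifier
        n_trials: int = 10,
    ) -> Optional[str]:
        """Sample n candidate programs for one problem, execute each,
        and return the answer the verifier ranks highest."""
        # Zero-shot prompt: the problem text plus an instruction to solve it in code.
        prompt = (
            f'"""{problem}\n'
            'Write Python code that stores the final numeric answer in `answer`."""\n'
        )
        scored: List[Tuple[float, str]] = []
        for code in generate_candidates(prompt, n_trials):
            answer = run_python_solution(code)
            if answer is not None:
                scored.append((score_answer(problem, answer), answer))
        return max(scored)[1] if scored else None

With an ideal verifier (1.0 for a correct answer, 0.0 otherwise), this best-of-n scheme reduces to checking whether any of the n executed programs is correct, which is the reading under which the 54.20% figure for 10 trials is reported.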