A Mathematical Interpretation of Autoregressive Generative Pre-Trained Transformer and Self-Supervised Learning

Cited by: 10
Authors
Lee, Minhyeok [1 ]
Affiliations
[1] Chung Ang Univ, Sch Elect & Elect Engn, Seoul 06974, South Korea
Keywords
generative pre-trained transformer; GPT; ChatGPT; self-supervised learning; deep learning; natural language processing; NLP
DOI
10.3390/math11112451
Chinese Library Classification (CLC)
O1 [Mathematics]
Subject Classification Code
0701; 070101
Abstract
In this paper, we present a rigorous mathematical examination of generative pre-trained transformer (GPT) models and their autoregressive self-supervised learning mechanisms. We begin by defining natural language space and knowledge space, which are two key concepts for understanding the dimensionality reduction process in GPT-based large language models (LLMs). By exploring projection functions and their inverses, we establish a framework for analyzing the language generation capabilities of these models. We then investigate the GPT representation space, examining its implications for the models' approximation properties. Finally, we discuss the limitations and challenges of GPT models and their learning mechanisms, considering trade-offs between complexity and generalization, as well as the implications of incomplete inverse projection functions. Our findings demonstrate that GPT models possess the capability to encode knowledge into low-dimensional vectors through their autoregressive self-supervised learning mechanism. This comprehensive analysis provides a solid mathematical foundation for future advancements in GPT-based LLMs, promising improvements in natural language processing tasks such as language translation, text summarization, and question answering through a better understanding and optimization of model training and performance.
Pages: 19
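
As a reading aid, the following is a minimal sketch, in standard notation, of the autoregressive self-supervised objective and the projection framework summarized in the abstract. The symbols L (natural language space), K (knowledge space), the projection phi, and the model parameters theta are assumed for illustration and need not match the paper's notation.

% Autoregressive factorization: the model p_theta predicts each token of a
% sequence x = (x_1, ..., x_T) from its prefix.
\[
  p_\theta(x_1, \dots, x_T) = \prod_{t=1}^{T} p_\theta\bigl(x_t \mid x_{<t}\bigr)
\]

% Self-supervised (next-token prediction) objective: maximize the likelihood of
% unlabeled text, i.e. minimize the negative log-likelihood over sequences drawn
% from the natural language space \mathcal{L}.
\[
  \mathcal{J}(\theta) = -\,\mathbb{E}_{x \sim \mathcal{L}} \left[ \sum_{t=1}^{T} \log p_\theta\bigl(x_t \mid x_{<t}\bigr) \right]
\]

% Dimensionality-reduction view: a projection from the natural language space
% onto a low-dimensional knowledge space, with an approximate (incomplete)
% inverse used for generation.
\[
  \phi : \mathcal{L} \to \mathcal{K} \subset \mathbb{R}^{d}, \qquad d \ll \dim(\mathcal{L}), \qquad \phi^{-1} : \mathcal{K} \to \mathcal{L} \ \text{(approximate)}
\]

The last display reflects the abstract's claim that knowledge is encoded into low-dimensional vectors and that the inverse projection is only partially recoverable, which underlies the complexity-generalization trade-offs discussed in the paper.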