Extending Context Window of Large Language Models via Semantic Compression

Cited by: 0
Authors
Fei, Weizhi [1 ,2 ]
Niu, Xueyan [1 ,2 ]
Zhou, Pingyi [3 ]
Hou, Lu [3 ]
Bai, Bo [2 ]
Deng, Lei [2 ]
Han, Wei [2 ]
Affiliations
[1] Tsinghua Univ, Dept Math Sci, Beijing, Peoples R China
[2] Huawei Technol Co Ltd, Theory Lab, 2012 Labs, Shenzhen, Peoples R China
[3] Huawei Technol Co Ltd, Noahs Ark Lab, 2012 Labs, Shenzhen, Peoples R China
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Owing to the quadratic complexity of attention, Transformer-based Large Language Models (LLMs) often impose limits on the length of the input text in order to ensure the generation of fluent and relevant responses. These constraints restrict their applicability in long-text scenarios. In this paper, we propose a novel semantic compression method that enables generalization to texts 6-8 times longer without incurring significant computational cost or requiring fine-tuning. The proposed framework draws inspiration from source coding in information theory and employs a pre-trained model to reduce the semantic redundancy of long inputs before passing them to the LLM for downstream tasks. Experimental results demonstrate that our method effectively extends the context window of LLMs across a range of tasks, including question answering, summarization, few-shot learning, and information retrieval. Furthermore, the proposed semantic compression method exhibits consistent fluency in text generation while reducing the associated computational overhead.
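The abstract describes a compress-then-prompt pipeline: a pre-trained model first reduces the semantic redundancy of a long input, and the shortened text is then passed to a context-limited LLM for the downstream task. The sketch below illustrates that idea only generically and is not the authors' implementation; the Hugging Face transformers summarization pipeline, the model name facebook/bart-large-cnn, and the chunk and summary sizes are all illustrative assumptions.

```python
# Minimal sketch of semantic compression as a preprocessing step, assuming the
# Hugging Face `transformers` library and a generic summarization model.
# Model name and chunk/summary sizes are illustrative, not the authors' setup.
from transformers import pipeline


def compress_long_input(text: str, chunk_chars: int = 4000, summary_tokens: int = 128) -> str:
    """Split a long input into chunks, summarize each chunk to strip redundant
    content, and concatenate the summaries into a shorter prompt."""
    summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
    chunks = [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]
    summaries = [
        summarizer(chunk, max_length=summary_tokens, truncation=True)[0]["summary_text"]
        for chunk in chunks
    ]
    return "\n".join(summaries)


# The compressed text is then placed in the prompt of a context-limited LLM for
# the downstream task (question answering, summarization, retrieval, ...).
```

Compressing chunk by chunk keeps each summarization call within the compressor's own context limit; the achievable extension factor depends on how aggressively each chunk is condensed.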
Pages: 5169-5181
Number of pages: 13
Related Papers
50 records in total
  • [31] The Inadequacy of Reinforcement Learning From Human Feedback-Radicalizing Large Language Models via Semantic Vulnerabilities
    McIntosh, Timothy R.
    Susnjak, Teo
    Liu, Tong
    Watters, Paul
    Halgamuge, Malka N.
    IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2024, 16 (04) : 1561 - 1574
  • [32] Mitigating Hallucinations in Large Language Models via Semantic Enrichment of Prompts: Insights from BioBERT and Ontological Integration
    Penkov, Stanislav
    PROCEEDINGS OF THE SIXTH INTERNATIONAL CONFERENCE COMPUTATIONAL LINGUISTICS IN BULGARIA, CLIB 2024, 2024, : 272 - 276
  • [33] Exploring the applicability of large language models to citation context analysis
    Nishikawa, Kai
    Koshiba, Hitoshi
    SCIENTOMETRICS, 2024, 129 (11) : 6751 - 6777
  • [34] Compressing Context to Enhance Inference Efficiency of Large Language Models
    Li, Yucheng
    Dong, Bo
    Guerin, Frank
    Lin, Chenghua
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 6342 - 6353
  • [35] Context is everything in regulatory application of large language models (LLMs)
    Tong, Weida
    Renaudin, Michael
    DRUG DISCOVERY TODAY, 2024, 29 (04)
  • [36] Smart-Pikachu: Extending Interactivity of Stuffed Animals with Large Language Models
    Itagaki, Toma
    Li, Richard
    ADJUNCT PROCEEDINGS OF THE 36TH ANNUAL ACM SYMPOSIUM ON USER INTERFACE SOFTWARE & TECHNOLOGY, UIST 2023 ADJUNCT, 2023,
  • [37] Adaptive In-Context Learning with Large Language Models for Bundle Generation
    Sun, Zhu
    Feng, Kaidong
    Yang, Jie
    Qu, Xinghua
    Fang, Hui
    Ong, Yew-Soon
    Liu, Wenyuan
    PROCEEDINGS OF THE 47TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2024, 2024, : 966 - 976
  • [38] Learning to Retrieve In-Context Examples for Large Language Models
    Wang, Liang
    Yang, Nan
    Wei, Furu
    PROCEEDINGS OF THE 18TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 1752 - 1767
  • [39] Probing the "Creativity" of Large Language Models: Can Models Produce Divergent Semantic Association?
    Chen, Honghua
    Ding, Nai
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 12881 - 12888
  • [40] Attentive Perturbation: Extending Prefix Tuning to Large Language Models Inner Representations
    Falissard, Louis
    Affeldt, Severine
    Nadif, Mohamed
    MACHINE LEARNING, OPTIMIZATION, AND DATA SCIENCE, LOD 2023, PT I, 2024, 14505 : 488 - 496