Extending Context Window of Large Language Models via Semantic Compression

Authors
Fei, Weizhi [1, 2]
Niu, Xueyan [1, 2]
Zhou, Pingyi [3 ]
Hou, Lu [3 ]
Bai, Bo [2 ]
Deng, Lei [2 ]
Han, Wei [2 ]
Affiliations
[1] Tsinghua Univ, Dept Math Sci, Beijing, Peoples R China
[2] Huawei Technol Co Ltd, Theory Lab, 2012 Labs, Shenzhen, Peoples R China
[3] Huawei Technol Co Ltd, Noahs Ark Lab, 2012 Labs, Shenzhen, Peoples R China
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Transformer-based Large Language Models (LLMs) often limit the length of their text input to ensure fluent and relevant responses, because the cost of self-attention grows quadratically with input length. These constraints restrict their applicability in long-text scenarios. In this paper, we propose a novel semantic compression method that enables generalization to texts that are 6-8 times longer without incurring significant computational costs or requiring fine-tuning. Our proposed framework draws inspiration from source coding in information theory and employs a pre-trained model to reduce the semantic redundancy of long inputs before passing them to the LLM for downstream tasks. Experimental results demonstrate that our method effectively extends the context window of LLMs across a range of tasks, including question answering, summarization, few-shot learning, and information retrieval. Furthermore, the proposed semantic compression method maintains consistent fluency in text generation while reducing the associated computational overhead.
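The compress-then-prompt idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the input is segmented or which pre-trained model is used, so the sentence-based chunking, the function names, and the toy `first_sentence_summarizer` (a stand-in for a real pre-trained summarization model) are all assumptions made for illustration.

```python
import re
from typing import Callable, List


def split_into_chunks(text: str, max_words: int) -> List[str]:
    """Group consecutive sentences into chunks of roughly max_words words.

    Assumption: simple sentence-based chunking; the abstract does not
    describe the segmentation actually used.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: List[str] = []
    current: List[str] = []
    count = 0
    for sentence in sentences:
        n = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


def first_sentence_summarizer(chunk: str) -> str:
    """Hypothetical stand-in for a pre-trained summarizer: keep the first sentence."""
    return re.split(r"(?<=[.!?])\s+", chunk, maxsplit=1)[0]


def compress(text: str, summarize: Callable[[str], str], max_words: int = 50) -> str:
    """Summarize each chunk independently and join the summaries in order."""
    return " ".join(summarize(c) for c in split_into_chunks(text, max_words))


if __name__ == "__main__":
    long_text = (
        "Alpha is first. It has details. More details follow. "
        "Beta is second. It also has details."
    )
    # The shorter text is what would be placed in the LLM prompt.
    print(compress(long_text, first_sentence_summarizer, max_words=8))
```

In a realistic setting, `summarize` would wrap a pre-trained summarization model, and the compressed text would be passed to the downstream LLM in place of the original long input.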
Pages: 5169-5181
Page count: 13