Extending Context Window of Large Language Models via Semantic Compression

Cited by: 0
Authors
Fei, Weizhi [1 ,2 ]
Niu, Xueyan [1 ,2 ]
Zhou, Pingyi [3 ]
Hou, Lu [3 ]
Bai, Bo [2 ]
Deng, Lei [2 ]
Han, Wei [2 ]
Affiliations
[1] Tsinghua Univ, Dept Math Sci, Beijing, Peoples R China
[2] Huawei Technol Co Ltd, Theory Lab, 2012 Labs, Shenzhen, Peoples R China
[3] Huawei Technol Co Ltd, Noah's Ark Lab, 2012 Labs, Shenzhen, Peoples R China
Keywords: none listed
DOI: not available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline Classification Codes: 081104; 0812; 0835; 1405
Abstract
Transformer-based Large Language Models (LLMs) often limit the length of input text, because the quadratic complexity of self-attention makes long inputs costly to process while fluent and relevant generation must be preserved. These constraints restrict their applicability in long-text scenarios. In this paper, we propose a novel semantic compression method that generalizes to texts 6-8 times longer than the original context window without incurring significant computational cost or requiring fine-tuning. Our framework draws inspiration from source coding in information theory: it employs a pre-trained model to reduce the semantic redundancy of long inputs before passing them to the LLM for downstream tasks. Experimental results demonstrate that our method effectively extends the context window of LLMs across a range of tasks, including question answering, summarization, few-shot learning, and information retrieval. Moreover, the proposed semantic compression method maintains fluent text generation while reducing the associated computational overhead.
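The abstract describes compressing long inputs with a pre-trained model before the downstream LLM consumes them. Below is a minimal sketch of that idea, not the authors' released implementation: it assumes a Hugging Face summarization model (facebook/bart-large-cnn) as the compressor, and the chunk size, summary lengths, and word-based splitting heuristic are all illustrative choices.

```python
# Minimal sketch of the semantic-compression idea from the abstract:
# compress each chunk of a long input with a pre-trained model, then
# hand the shorter concatenated context to the downstream LLM.
from transformers import pipeline

# Illustrative compressor choice; any pre-trained summarizer could stand in.
compressor = pipeline("summarization", model="facebook/bart-large-cnn")

def semantically_compress(text: str, chunk_words: int = 512) -> str:
    """Split `text` into fixed-size word chunks and summarize each chunk."""
    words = text.split()
    chunks = [" ".join(words[i:i + chunk_words])
              for i in range(0, len(words), chunk_words)]
    summaries = [
        compressor(chunk, max_length=128, min_length=32,
                   truncation=True)[0]["summary_text"]
        for chunk in chunks
    ]
    # Each ~512-word chunk shrinks to <=128 tokens, so the compressed
    # context fits a standard context window several times over.
    return " ".join(summaries)

long_document = "..."  # placeholder: a document far longer than the context window
compressed_context = semantically_compress(long_document)
prompt = compressed_context + "\n\nQuestion: ..."  # feed to the downstream LLM
```

Under these assumptions the compression ratio is governed by `chunk_words / max_length`; the paper's reported 6-8x extension would correspond to roughly that ratio of input to compressed tokens.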
Pages: 5169-5181 (13 pages)
Related Papers
50 records in total
  • [1] Extending Context Window in Large Language Models with Segmented Base Adjustment for Rotary Position Embeddings
    Li, Rongsheng; Xu, Jin; Cao, Zhixiong; Zheng, Hai-Tao; Kim, Hong-Gee
    APPLIED SCIENCES-BASEL, 2024, 14 (07)
  • [2] Context Compression and Extraction: Efficiency Inference of Large Language Models
    Zhou, Junyao; Du, Ruiqing; Tan, Yushan; Yang, Jintao; Yang, Zonghao; Luo, Wei; Luo, Zhunchen; Zhou, Xian; Hu, Wenpeng
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT I, ICIC 2024, 2024, 14875: 221-232
  • [3] Reducing hallucinations of large language models via hierarchical semantic piece
    Liu, Yanyi; Yang, Qingwen; Tang, Jiawei; Guo, Tiezheng; Wang, Chen; Li, Pan; Xu, Sai; Gao, Xianlin; Li, Zhi; Liu, Jun; Wen, Yingyou
    COMPLEX & INTELLIGENT SYSTEMS, 2025, 11 (05)
  • [4] Semantic anomaly detection with large language models
    Elhafsi, Amine; Sinha, Rohan; Agia, Christopher; Schmerling, Edward; Nesnas, Issa A. D.; Pavone, Marco
    AUTONOMOUS ROBOTS, 2023, 47 (08): 1035-1055
  • [5] A Survey on Model Compression for Large Language Models
    Zhu, Xunyu; Li, Jian; Liu, Yong; Ma, Can; Wang, Weiping
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2024, 12: 1556-1577
  • [6] CafeLLM: Context-Aware Fine-Grained Semantic Clustering Using Large Language Models
    Huang, Ryan Yuki; Small, Colin Robert
    GENERALIZING FROM LIMITED RESOURCES IN THE OPEN WORLD, GLOW-IJCAI 2024, 2024, 2160: 66-81
  • [7] Semantic Mechanical Search with Large Vision and Language Models
    Sharma, Satvik; Huang, Huang; Shivakumar, Kaushik; Chen, Lawrence Yunliang; Hoque, Ryan; Ichter, Brian; Goldberg, Ken
    CONFERENCE ON ROBOT LEARNING, VOL 229, 2023, 229
  • [8] Mitigating Reversal Curse in Large Language Models via Semantic-aware Permutation Training
    Guo, Qingyan; Wang, Rui; Guo, Junliang; Tan, Xu; Bian, Jiang; Yang, Yujiu
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024: 11453-11464
  • [9] Extending ontology language for semantic web
    Yu, Qing; Wang, Jinlin
    CIS WORKSHOPS 2007: INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND SECURITY WORKSHOPS, 2007: 116+