Extending Context Window of Large Language Models via Semantic Compression

Cited by: 0
Authors
Fei, Weizhi [1 ,2 ]
Niu, Xueyan [1 ,2 ]
Zhou, Pingyi [3 ]
Hou, Lu [3 ]
Bai, Bo [2 ]
Deng, Lei [2 ]
Han, Wei [2 ]
Affiliations
[1] Tsinghua Univ, Dept Math Sci, Beijing, Peoples R China
[2] Huawei Technol Co Ltd, Theory Lab, 2012 Labs, Shenzhen, Peoples R China
[3] Huawei Technol Co Ltd, Noah's Ark Lab, 2012 Labs, Shenzhen, Peoples R China
Keywords: none listed
DOI: not available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline Classification Codes: 081104; 0812; 0835; 1405
Abstract
Transformer-based Large Language Models (LLMs) often limit the length of input text, because the quadratic complexity of self-attention makes long inputs costly to process while fluent and relevant generation must be preserved. These constraints restrict their applicability in long-text scenarios. In this paper, we propose a novel semantic compression method that generalizes to texts 6-8 times longer than the original context window without incurring significant computational cost or requiring fine-tuning. Our framework draws inspiration from source coding in information theory: it employs a pre-trained model to reduce the semantic redundancy of long inputs before passing them to the LLM for downstream tasks. Experimental results demonstrate that our method effectively extends the context window of LLMs across a range of tasks, including question answering, summarization, few-shot learning, and information retrieval. Moreover, the proposed semantic compression method maintains fluent text generation while reducing the associated computational overhead.
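The abstract describes compressing long inputs with a pre-trained model before the downstream LLM consumes them. Below is a minimal sketch of that idea, not the authors' released implementation: it assumes a Hugging Face summarization model (facebook/bart-large-cnn) as the compressor, and the chunk size, summary lengths, and word-based splitting heuristic are all illustrative choices.

```python
# Minimal sketch of the semantic-compression idea from the abstract:
# compress each chunk of a long input with a pre-trained model, then
# hand the shorter concatenated context to the downstream LLM.
from transformers import pipeline

# Illustrative compressor choice; any pre-trained summarizer could stand in.
compressor = pipeline("summarization", model="facebook/bart-large-cnn")

def semantically_compress(text: str, chunk_words: int = 512) -> str:
    """Split `text` into fixed-size word chunks and summarize each chunk."""
    words = text.split()
    chunks = [" ".join(words[i:i + chunk_words])
              for i in range(0, len(words), chunk_words)]
    summaries = [
        compressor(chunk, max_length=128, min_length=32,
                   truncation=True)[0]["summary_text"]
        for chunk in chunks
    ]
    # Each ~512-word chunk shrinks to <=128 tokens, so the compressed
    # context fits a standard context window several times over.
    return " ".join(summaries)

long_document = "..."  # placeholder: a document far longer than the context window
compressed_context = semantically_compress(long_document)
prompt = compressed_context + "\n\nQuestion: ..."  # feed to the downstream LLM
```

Under these assumptions the compression ratio is governed by `chunk_words / max_length`; the paper's reported 6-8x extension would correspond to roughly that ratio of input to compressed tokens.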
Pages: 5169-5181 (13 pages)
Related Papers
50 records in total
  • [1] Extending Context Window in Large Language Models with Segmented Base Adjustment for Rotary Position Embeddings
    Li, Rongsheng; Xu, Jin; Cao, Zhixiong; Zheng, Hai-Tao; Kim, Hong-Gee
    APPLIED SCIENCES-BASEL, 2024, 14 (07)
  • [2] Context Compression and Extraction: Efficiency Inference of Large Language Models
    Zhou, Junyao; Du, Ruiqing; Tan, Yushan; Yang, Jintao; Yang, Zonghao; Luo, Wei; Luo, Zhunchen; Zhou, Xian; Hu, Wenpeng
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT I, ICIC 2024, 2024, 14875: 221-232
  • [3] Reducing hallucinations of large language models via hierarchical semantic piece
    Liu, Yanyi; Yang, Qingwen; Tang, Jiawei; Guo, Tiezheng; Wang, Chen; Li, Pan; Xu, Sai; Gao, Xianlin; Li, Zhi; Liu, Jun; Wen, Yingyou
    COMPLEX & INTELLIGENT SYSTEMS, 2025, 11 (05)
  • [4] Semantic anomaly detection with large language models
    Elhafsi, Amine; Sinha, Rohan; Agia, Christopher; Schmerling, Edward; Nesnas, Issa A. D.; Pavone, Marco
    AUTONOMOUS ROBOTS, 2023, 47 (08): 1035-1055
  • [5] A Survey on Model Compression for Large Language Models
    Zhu, Xunyu; Li, Jian; Liu, Yong; Ma, Can; Wang, Weiping
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2024, 12: 1556-1577
  • [6] CafeLLM: Context-Aware Fine-Grained Semantic Clustering Using Large Language Models
    Huang, Ryan Yuki; Small, Colin Robert
    GENERALIZING FROM LIMITED RESOURCES IN THE OPEN WORLD, GLOW-IJCAI 2024, 2024, 2160: 66-81
  • [7] Semantic Mechanical Search with Large Vision and Language Models
    Sharma, Satvik; Huang, Huang; Shivakumar, Kaushik; Chen, Lawrence Yunliang; Hoque, Ryan; Ichter, Brian; Goldberg, Ken
    CONFERENCE ON ROBOT LEARNING, VOL 229, 2023, 229
  • [8] Mitigating Reversal Curse in Large Language Models via Semantic-aware Permutation Training
    Guo, Qingyan; Wang, Rui; Guo, Junliang; Tan, Xu; Bian, Jiang; Yang, Yujiu
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024: 11453-11464
  • [9] Extending ontology language for semantic web
    Yu, Qing; Wang, Jinlin
    CIS WORKSHOPS 2007: INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND SECURITY WORKSHOPS, 2007: 116+