Context Compression and Extraction: Efficiency Inference of Large Language Models

Cited by: 0
Authors
Zhou, Junyao [1 ]
Du, Ruiqing [1 ]
Tan, Yushan [2 ]
Yang, Jintao [2 ]
Yang, Zonghao [2 ]
Luo, Wei [2 ]
Luo, Zhunchen [2 ]
Zhou, Xian [2 ]
Hu, Wenpeng [2 ]
Affiliations
[1] Hebei Univ Engn, Sch Informat & Elect Engn, Handan 056000, Peoples R China
[2] Acad Mil Sci Peoples Liberation Army, Beijing 100000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
self-information; mutual information; context compression; large language model;
DOI
10.1007/978-981-97-5663-6_19
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Large language models have shown great capability in handling long contexts. However, when applied to question-answering tasks, excessively long contexts inevitably contain redundant information, which can obscure significant details. Retaining the information relevant to the user's query intent in long contexts is therefore a challenge. To address this problem, our study proposes a novel Context Compression and Extraction (CCE) technique that takes the impact of the user query into account. CCE computes the mutual information between the query and its context and integrates it with self-information to preserve query-relevant information in the compressed context. We validate our approach on diverse datasets that require integrated context processing, such as an arXiv paper dataset and a news article dataset. Our method is effective across tasks including summarization, question answering, and reconstruction of the original context. Experimental results show that our method outperforms a strong baseline on several evaluation metrics, significantly enhancing the quality of text generated in downstream tasks.
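The abstract describes CCE's core scoring signal: combine each context unit's self-information with its mutual information with the query, then keep the highest-scoring units. Below is a minimal, hypothetical Python sketch of that idea using a small causal language model from Hugging Face Transformers. The sentence-level granularity, the pointwise mutual-information approximation log p(c|q) - log p(c), the mixing weight alpha, and the keep_ratio heuristic are illustrative assumptions, not the authors' exact formulation.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def log_likelihood(text, prefix=""):
    # Sum of token log-probabilities of `text`, optionally conditioned on `prefix`.
    prefix_ids = tokenizer(prefix, return_tensors="pt").input_ids if prefix else None
    text_ids = tokenizer(text, return_tensors="pt").input_ids
    ids = torch.cat([prefix_ids, text_ids], dim=1) if prefix_ids is not None else text_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Position i of token_lp is log p(ids[i+1] | ids[:i+1]).
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    token_lp = log_probs.gather(2, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    n_prefix = prefix_ids.shape[1] if prefix_ids is not None else 1
    # Keep only the positions that predict tokens of `text`.
    return token_lp[0, n_prefix - 1:].sum().item()

def cce_score(unit, query, alpha=0.5):
    # Self-information of the context unit: -log p(c).
    log_p_c = log_likelihood(unit)
    self_info = -log_p_c
    # Pointwise mutual-information estimate: log p(c | q) - log p(c).
    pmi = log_likelihood(unit, prefix=query + "\n") - log_p_c
    # alpha is an assumed mixing weight; the paper's combination may differ.
    return alpha * self_info + (1 - alpha) * pmi

def compress(context, query, keep_ratio=0.5):
    # Split the context into sentence-level units (a simplifying assumption).
    units = [s.strip() for s in context.split(".") if s.strip()]
    ranked = sorted(units, key=lambda u: cce_score(u, query), reverse=True)
    kept = set(ranked[: max(1, int(len(ranked) * keep_ratio))])
    # Reassemble the retained sentences in their original order.
    return ". ".join(u for u in units if u in kept) + "."

A call such as compress(long_context, "What dataset was used?", keep_ratio=0.3) would keep roughly the top-scoring third of the sentences; ranking by score but emitting in original order preserves the discourse flow of the compressed context.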
Pages: 221-232 (12 pages)