Context Compression and Extraction: Efficiency Inference of Large Language Models

Cited: 0
Authors
Zhou, Junyao [1 ]
Du, Ruiqing [1 ]
Tan, Yushan [2 ]
Yang, Jintao [2 ]
Yang, Zonghao [2 ]
Luo, Wei [2 ]
Luo, Zhunchen [2 ]
Zhou, Xian [2 ]
Hu, Wenpeng [2 ]
Affiliations
[1] Hebei Univ Engn, Sch Informat & Elect Engn, Handan 056000, Peoples R China
[2] Acad Mil Sci Peoples Liberat Army, Beijing 100000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
self-information; mutual-information; context compression; large language model;
DOI
10.1007/978-981-97-5663-6_19
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Large language models have shown great capability in dealing with long contexts. However, when applied to question-answering tasks, excessively long contexts inevitably contain redundant information, which can lead to the loss of significant details. It is therefore challenging to retain the information relevant to the user's query intent within long contexts. To address this problem, our study proposes a novel Context Compression and Extraction (CCE) technique that takes the impact of the user query into account. CCE computes the mutual information between the query and its context and integrates this with self-information to preserve query-relevant information in the compressed context. We validate our approach across diverse datasets that require integrated context-processing capabilities, such as an arXiv paper dataset and a news article dataset. Our method exhibits efficacy on various tasks, including summarization, question answering, and the reconstruction of original contexts. Experimental results validate the superior performance of our method over a strong baseline across several evaluation metrics, significantly enhancing the quality of text generated in downstream tasks.
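The scoring idea sketched in the abstract (blend each context token's self-information with its mutual information against the query, then keep high-scoring content) can be illustrated with a minimal toy example. This is a hedged sketch, not the authors' implementation: the probabilities here come from stand-in values rather than a real language model, and `cce_score`, `alpha`, and the PMI approximation of mutual information are illustrative assumptions.

```python
import math

def self_information(p):
    """Self-information in bits: rarer tokens carry more information."""
    return -math.log2(p)

def pointwise_mutual_information(p_given_query, p):
    """PMI between a context token and the query: how much conditioning
    on the query raises (or lowers) the token's probability."""
    return math.log2(p_given_query / p)

def cce_score(p, p_given_query, alpha=0.5):
    """Blend self-information with query-token PMI. In a real system,
    tokens or segments scoring below a threshold would be dropped
    from the compressed context."""
    return (alpha * self_information(p)
            + (1 - alpha) * pointwise_mutual_information(p_given_query, p))

# A token that is rare overall but likely given the query scores
# higher than a common, query-irrelevant token.
relevant = cce_score(p=0.01, p_given_query=0.2)
irrelevant = cce_score(p=0.3, p_given_query=0.3)
```

In practice both probability terms would be estimated by a causal language model (token probability with and without the query prepended); the toy unigram values above only serve to show how the two signals combine.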
Pages: 221-232
Page count: 12
Related Papers
50 records in total
  • [1] Measuring and Improving the Energy Efficiency of Large Language Models Inference
    Argerich, Mauricio Fadel
    Patino-Martinez, Marta
    IEEE ACCESS, 2024, 12 : 80194 - 80207
  • [2] Language Models for Lexical Inference in Context
    Schmitt, Martin
    Schuetze, Hinrich
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 1267 - 1280
  • [3] Assessing Inference Time in Large Language Models
    Walkowiak, Bartosz
    Walkowiak, Tomasz
    SYSTEM DEPENDABILITY-THEORY AND APPLICATIONS, DEPCOS-RELCOMEX 2024, 2024, 1026 : 296 - 305
  • [4] Meta-in-context learning in large language models
    Coda-Forno, Julian
    Binz, Marcel
    Akata, Zeynep
    Botvinick, Matthew
    Wang, Jane X.
    Schulz, Eric
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [5] Large language models for generative information extraction: a survey
    Xu, Derong
    Chen, Wei
    Peng, Wenjun
    Zhang, Chao
    Xu, Tong
    Zhao, Xiangyu
    Wu, Xian
    Zheng, Yefeng
    Wang, Yang
    Chen, Enhong
FRONTIERS OF COMPUTER SCIENCE, 2024, 18 (06)
  • [6] Revisiting Relation Extraction in the era of Large Language Models
    Wadhwa, Somin
    Amir, Silvio
    Wallace, Byron C.
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 15566 - 15589
  • [7] Trend Extraction and Analysis via Large Language Models
    Soru, Tommaso
    Marshall, Jim
    18TH IEEE INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING, ICSC 2024, 2024, : 285 - 288
  • [8] Studying large language models as compression algorithms for human culture
    Buttrick, Nicholas
    TRENDS IN COGNITIVE SCIENCES, 2024, 28 (03) : 187 - 189
  • [9] Targeted Training Data Extraction-Neighborhood Comparison-Based Membership Inference Attacks in Large Language Models
    Xu, Huan
    Zhang, Zhanhao
    Yu, Xiaodong
    Wu, Yingbo
    Zha, Zhiyong
    Xu, Bo
    Xu, Wenfeng
    Hu, Menglan
    Peng, Kai
    APPLIED SCIENCES-BASEL, 2024, 14 (16):
  • [10] Integrating Knowledge Graph Data with Large Language Models for Explainable Inference
    Efrain Quintero-Narvaez, Carlos
    Monroy, Raul
    PROCEEDINGS OF THE 17TH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, WSDM 2024, 2024, : 1198 - 1199