Context Compression and Extraction: Efficiency Inference of Large Language Models

Cited: 0
Authors
Zhou, Junyao [1]
Du, Ruiqing [1]
Tan, Yushan [2]
Yang, Jintao [2]
Yang, Zonghao [2]
Luo, Wei [2]
Luo, Zhunchen [2]
Zhou, Xian [2]
Hu, Wenpeng [2]
Affiliations
[1] Hebei Univ Engn, Sch Informat & Elect Engn, Handan 056000, Peoples R China
[2] Acad Mil Sci Peoples Liberation Army, Beijing 100000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
self-information; mutual-information; context compression; large language model;
DOI
10.1007/978-981-97-5663-6_19
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Large language models have shown great capability in handling long contexts. However, when applied to question-answering tasks, excessively long contexts inevitably contain redundant information, which can cause significant details to be lost. Retaining the information relevant to the user's query intent within a long context is therefore challenging. To address this problem, we propose a novel Context Compression and Extraction (CCE) technique that takes the impact of the user query into account. CCE computes the mutual information between the query and its context and integrates it with self-information to preserve query-relevant information in the compressed context. We validate our approach on diverse datasets that require integrated context-processing capabilities, such as an arXiv paper dataset and a news article dataset. Our method is effective across a range of tasks, including summarization, question answering, and reconstruction of the original context. Experimental results show that our method outperforms a strong baseline on several evaluation metrics, significantly improving the quality of text generated in downstream tasks.
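The abstract describes the scoring idea, combining self-information with query-context mutual information, but not its implementation. Below is a minimal, hypothetical Python sketch of one way such a score could be composed: self-information is approximated as the summed token negative log-likelihood under a small causal language model (GPT-2 here), and the query-context mutual information is approximated as pointwise mutual information, log p(s|q) - log p(s). The blending weight alpha, the sentence-level granularity, and the keep_ratio knob are illustrative assumptions, not the authors' method.

```python
# Hypothetical sketch of a CCE-style score: blend self-information with a
# PMI approximation of query-context mutual information. All knobs below
# (alpha, keep_ratio, sentence granularity) are assumptions for illustration.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

device = "cuda" if torch.cuda.is_available() else "cpu"
tok = GPT2TokenizerFast.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").to(device).eval()

@torch.no_grad()
def total_nll(text: str, prefix: str = "") -> float:
    """Summed token negative log-likelihood of `text`, optionally
    conditioned on `prefix` (prefix positions are masked out of the loss)."""
    prefix_ids = tok(prefix)["input_ids"] if prefix else []
    text_ids = tok(text)["input_ids"]
    ids = torch.tensor([prefix_ids + text_ids], device=device)
    labels = ids.clone()
    labels[:, : len(prefix_ids)] = -100          # do not score the prefix
    loss = lm(ids, labels=labels).loss           # mean NLL over scored tokens
    # HF shifts labels right, so the very first token is never scored.
    n_scored = len(text_ids) if prefix_ids else max(1, len(text_ids) - 1)
    return loss.item() * n_scored

def cce_score(sentence: str, query: str, alpha: float = 0.5) -> float:
    """alpha * self-information + (1 - alpha) * PMI(sentence; query),
    with I(s) = -log p(s) and PMI(s; q) = log p(s|q) - log p(s)."""
    self_info = total_nll(sentence)                       # -log p(s)
    cond_nll = total_nll(sentence, prefix=query + "\n")   # -log p(s|q)
    return alpha * self_info + (1 - alpha) * (self_info - cond_nll)

def compress(context: str, query: str, keep_ratio: float = 0.5) -> str:
    """Keep the highest-scoring sentences, preserving original order."""
    sents = [s.strip() for s in context.split(".") if s.strip()]
    n_keep = max(1, int(len(sents) * keep_ratio))
    ranked = sorted(range(len(sents)),
                    key=lambda i: cce_score(sents[i], query), reverse=True)
    return ". ".join(sents[i] for i in sorted(ranked[:n_keep])) + "."

# Example: compress(article_text, "Which datasets were evaluated?", 0.3)
```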
Pages: 221-232
Page count: 12
Related Papers (50 in total)
• [21] Borzunov, Alexander; Ryabinin, Max; Chumachenko, Artem; Baranchuk, Dmitry; Dettmers, Tim; Belkada, Younes; Samygin, Pavel; Raffel, Colin. Distributed Inference and Fine-tuning of Large Language Models Over The Internet. Advances in Neural Information Processing Systems 36 (NeurIPS 2023), 2023.
• [22] Cai, Shanqing; Venugopalan, Subhashini; Tomanek, Katrin; Narayanan, Ajit; Morris, Meredith Ringel; Brenner, Michael P. Context-Aware Abbreviation Expansion Using Large Language Models. NAACL 2022: The 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022: 1261-1275.
• [23] Tizaoui, Tejennour; Tan, Ruomu. Towards a benchmark dataset for large language models in the context of process automation. Digital Chemical Engineering, 2024, 13.
• [24] Salewski, Leonard; Alaniz, Stephan; Rio-Torto, Isabel; Schulz, Eric; Akata, Zeynep. In-Context Impersonation Reveals Large Language Models' Strengths and Biases. Advances in Neural Information Processing Systems 36 (NeurIPS 2023), 2023.
• [25] Dagdelen, John; Dunn, Alexander; Lee, Sanghoon; Walker, Nicholas; Rosen, Andrew S.; Ceder, Gerbrand; Persson, Kristin A.; Jain, Anubhav. Structured information extraction from scientific text with large language models. Nature Communications, 15.
• [26] Konet, Amanda; Thomas, Ian; Gartlehner, Gerald; Kahwati, Leila; Hilscher, Rainer; Kugley, Shannon; Crotty, Karen; Viswanathan, Meera; Chew, Robert. Performance of two large language models for data extraction in evidence synthesis. Research Synthesis Methods, 2024.
• [27] Mischler, Gavin; Li, Yinghao Aaron; Bickel, Stephan; Mehta, Ashesh D.; Mesgarani, Nima. Contextual feature extraction hierarchies converge in large language models and the brain. Nature Machine Intelligence, 2024, 6(12): 1467-1477.
• [28] Bazan, Marek; Gniazdowski, Tomasz; Wolkiewicz, Dawid; Sarna, Juliusz; Marchwiany, Maciej E. Large Language Models for Data Extraction in Slot-Filling Tasks. System Dependability: Theory and Applications, DepCoS-RELCOMEX 2024, 2024, 1026: 1-18.
• [29] Gupta, Sonakshi; Mahmood, Akhlak; Shetty, Pranav; Adeboye, Aishat; Ramprasad, Rampi. Data extraction from polymer literature using large language models. Communications Materials, 2024, 5(1).
• [30] Tekumalla, Ramya; Banda, Juan M. Towards automated phenotype definition extraction using large language models. Genomics & Informatics, 22(1).