Optimizing Microservice Deployment in Edge Computing with Large Language Models: Integrating Retrieval Augmented Generation and Chain of Thought Techniques

Cited by: 0
Authors
Feng, Kan [1]
Luo, Lijun [1]
Xia, Yongjun [2]
Luo, Bin [2]
He, Xingfeng [1]
Li, Kaihong [3]
Zha, Zhiyong [4]
Xu, Bo [1,5]
Peng, Kai [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Elect Informat & Commun, Hubei Key Lab Smart Internet Technol, Wuhan 430074, Peoples R China
[2] Hubei Huazhong Elect Power Technol Dev Co Ltd, Wuhan 430079, Peoples R China
[3] Wuhan Univ, Elect Informat Sch, Wuhan 430072, Peoples R China
[4] State Grid Informat Telecommun Co Ltd, Wuhan 430048, Peoples R China
[5] Hubei ChuTianYun Co Ltd, Wuhan 430076, Peoples R China
Source
SYMMETRY-BASEL | 2024, Vol. 16, No. 11
Keywords
large language models; retrieval augmented generation; microservice deployment; mobile edge computing;
DOI
10.3390/sym16111470
Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Subject Classification Codes
07; 0710; 09
Abstract
Large Language Models (LLMs) have demonstrated impressive capabilities in automatically generating code from natural language instructions provided by humans. We observed that, in the microservice models of edge computing, the deployment latency optimization problem can be formulated as an NP-hard mathematical optimization problem. In the real world, however, deployment strategies at the edge often require immediate updates, while human-engineered code tends to lag behind. To bridge this gap, we integrated LLMs into the decision-making process for microservice deployment. Initially, we constructed a private Retrieval Augmented Generation (RAG) database containing prior knowledge. Subsequently, we employed carefully designed step-by-step inductive instructions and the chain-of-thought (CoT) technique to enable the LLM to learn, reason, reflect, and regenerate. We decomposed the microservice deployment latency optimization problem into a collection of fine-grained sub-problems (described in natural language) and progressively provided instructions to the fine-tuned LLM to generate the corresponding code blocks. The generated code blocks underwent integration and consistency assessment. For comparison, we also prompted the LLM to generate code without the RAG database. We executed the aforementioned code and the comparison algorithms under identical operating environments and simulation parameters and rigorously analyzed the results. Compared with traditional algorithms, our fine-tuned model significantly reduced latency: by 22.8% when handling surges in request flows, by 37.8% when managing complex microservice types, and by 39.5% when processing increased numbers of network nodes. Moreover, our approach demonstrated marked improvements in latency over LLMs that do not use RAG and over reinforcement learning algorithms reported in other literature. The use of LLMs also highlights the concept of symmetry: the symmetrical structure of input-output relationships in microservice deployment models aligns with the LLM's inherent ability to process and generate balanced, optimized code. Symmetry in this context allows more efficient resource allocation and reduces redundant operations, further enhancing the model's effectiveness. We believe that LLMs hold substantial potential for optimizing microservice deployment models.
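The abstract describes a retrieve-and-prompt workflow: prior knowledge is pulled from a private RAG store, the latency optimization problem is decomposed into natural language sub-problems, and the LLM is prompted step by step (CoT) to emit code blocks that are then integrated and checked for consistency. The Python sketch below is an illustrative reconstruction of such a pipeline, not the authors' implementation; llm_generate, RAG_DB, retrieve, build_cot_prompt, and the sub-problem list are hypothetical placeholders for whatever LLM API, vector store, and prompt design the paper actually uses.

# Minimal sketch (assumed names, not the authors' code) of a RAG + CoT
# code-generation loop: retrieve prior knowledge, prompt the LLM per
# sub-problem, and accumulate the generated code blocks for later checking.
from typing import Callable, List

# Hypothetical prior-knowledge store standing in for the private RAG database.
RAG_DB = [
    "Microservice deployment latency = network delay + queuing delay + processing delay.",
    "Edge nodes have heterogeneous CPU/memory capacities; placement is NP-hard.",
    "Greedy placement by descending request rate is a common baseline heuristic.",
]

# Natural-language sub-problems, mirroring the decomposition described in the abstract.
SUB_PROBLEMS = [
    "Model edge nodes, microservice types, and request flows as Python data classes.",
    "Formulate the end-to-end latency objective for a given placement.",
    "Implement a placement heuristic that minimizes the latency objective.",
    "Add a consistency check that validates resource constraints of the placement.",
]

def retrieve(query: str, db: List[str], k: int = 2) -> List[str]:
    """Toy retriever: rank prior-knowledge snippets by word overlap with the query."""
    words = set(query.lower().split())
    scored = sorted(db, key=lambda doc: -len(words & set(doc.lower().split())))
    return scored[:k]

def build_cot_prompt(step: str, context: List[str], previous_code: str) -> str:
    """Chain-of-thought style prompt: retrieved knowledge, code so far, then the task."""
    return (
        "Prior knowledge:\n" + "\n".join(f"- {c}" for c in context) + "\n\n"
        "Code generated so far:\n" + previous_code + "\n\n"
        f"Task: {step}\n"
        "Think step by step, explain your reasoning, then output only a Python code block."
    )

def run_pipeline(llm_generate: Callable[[str], str]) -> str:
    """Iteratively request each sub-problem's code block and concatenate the results."""
    assembled = ""
    for step in SUB_PROBLEMS:
        context = retrieve(step, RAG_DB)
        prompt = build_cot_prompt(step, context, assembled)
        assembled += "\n\n" + llm_generate(prompt)  # integration; consistency checks would follow
    return assembled

# Example usage with a stub LLM (replace with a real API call):
# print(run_pipeline(lambda prompt: "# (code block returned by the LLM)"))

Each generated block is appended to the running context so that later sub-problems can build on earlier definitions, mirroring the integration-and-consistency step described in the abstract.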
Pages: 22