Navigating Challenges and Technical Debt in Large Language Models Deployment

Authors
Menshawy, Ahmed [1]
Nawaz, Zeeshan [1]
Fahmy, Mahmoud [1]
Affiliations
[1] Mastercard, AI Engn, Dublin, Ireland
Keywords
Large Language Models (LLMs); LLMs Deployment; Technical Debt in AI; LLM Model Compression and Pruning; High-Throughput LLM Processing; LLM Deployment Challenges; Scalability Challenges in LLMs Deployment
DOI
10.1145/3642970.3655840
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Large Language Models (LLMs) have become an essential tool in advancing artificial intelligence and machine learning, enabling outstanding capabilities in natural language processing and understanding. However, the efficient deployment of LLMs in production environments reveals a complex landscape of challenges and technical debt. In this paper, we aim to highlight unique forms of challenges and technical debt associated with the deployment of LLMs, including those related to memory management, parallelism strategies, model compression, and attention optimization. These challenges emphasize the need for tailored approaches to deploying LLMs, which demand sophisticated engineering solutions not readily available in general-purpose machine learning libraries or inference engines.
Pages: 192-199
Page count: 8