Enhancing Orthopedic Knowledge Assessments: The Performance of Specialized Generative Language Model Optimization

Cited by: 0
Authors
Zhou, Hong [1 ,2 ]
Wang, Hong-lin [1 ,2 ]
Duan, Yu-yu [2 ,3 ]
Yan, Zi-neng [1 ,2 ]
Luo, Rui [1 ,2 ]
Lv, Xiang-xin [1 ,2 ]
Xie, Yi [1 ,2 ]
Zhang, Jia-yao [1 ,2 ]
Yang, Jia-ming [1 ,2 ]
Xue, Ming-di [1 ,2 ]
Fang, Ying [1 ,2 ]
Lu, Lin [2 ,4 ]
Liu, Peng-ran [1 ,2 ]
Ye, Zhe-wei [1 ,2 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Union Hosp, Tongji Med Coll, Dept Orthoped Surg, Wuhan 430022, Peoples R China
[2] Huazhong Univ Sci & Technol, Union Hosp, Tongji Med Coll, Lab Intelligent Med, Wuhan 430022, Peoples R China
[3] Hubei Univ Chinese Med, Coll Chinese Med, Wuhan 433065, Peoples R China
[4] Wuhan Univ, Dept Orthoped, Renmin Hosp, Wuhan 433060, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
artificial intelligence; large language models; generative artificial intelligence; orthopedics; CLINICAL-PRACTICE GUIDELINE; AMERICAN ACADEMY; HIP-FRACTURES; MANAGEMENT;
DOI
10.1007/s11596-024-2929-4
Chinese Library Classification number
R-3 [Medical research methods]; R3 [Basic medicine];
Discipline classification code
1001 ;
Abstract
Objective: This study aimed to evaluate and compare the effectiveness of knowledge base-optimized and unoptimized large language models (LLMs) in the field of orthopedics, in order to explore optimization strategies for applying LLMs in specific fields.
Methods: A specialized knowledge base was constructed from clinical guidelines of the American Academy of Orthopaedic Surgeons (AAOS) and authoritative orthopedic publications. A total of 30 orthopedic questions covering anatomical knowledge, disease diagnosis, fracture classification, treatment options, and surgical techniques were input into both the knowledge base-optimized and unoptimized versions of GPT-4, ChatGLM, and Spark LLM, and the generated responses were recorded. The overall quality, accuracy, and comprehensiveness of these responses were evaluated by 3 experienced orthopedic surgeons.
Results: Compared with its unoptimized counterpart, the optimized version of GPT-4 showed improvements of 15.3% in overall quality, 12.5% in accuracy, and 12.8% in comprehensiveness; ChatGLM showed improvements of 24.8%, 16.1%, and 19.6%, respectively; and Spark LLM showed improvements of 6.5%, 14.5%, and 24.7%, respectively.
Conclusion: Knowledge base optimization significantly enhanced the quality, accuracy, and comprehensiveness of the responses provided by the 3 models in the orthopedic field. Knowledge base optimization is therefore an effective method for improving the performance of LLMs in specific fields.
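The abstract reports percentage improvements but does not state how they were computed. As a minimal sketch, assuming each figure is the relative change in the mean rater score between the unoptimized and optimized versions of a model, the arithmetic could look like the following. The function name `relative_improvement` and the scores are hypothetical illustrations, not data or code from the study.

```python
from statistics import mean


def relative_improvement(baseline, optimized):
    """Percent change of the optimized mean score relative to the baseline mean.

    Assumption: improvement is measured relative to the baseline mean,
    which the abstract does not confirm.
    """
    base = mean(baseline)
    if base == 0:
        raise ValueError("baseline mean must be non-zero")
    return (mean(optimized) - base) / base * 100.0


# Hypothetical ratings from three raters (NOT data from the study).
baseline_scores = [3.0, 3.5, 3.5]
optimized_scores = [4.0, 4.0, 4.5]

improvement = relative_improvement(baseline_scores, optimized_scores)
print(f"Relative improvement: {improvement:.1f}%")
```

If the figures were instead absolute differences in percentage points, the division by the baseline mean would be dropped.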
Pages: 1001 - 1005
Page count: 5
Related papers
50 in total
  • [21] Enhancing Gait Video Analysis in Neurodegenerative Diseases by Knowledge Augmentation in Vision Language Model
    Wang, Diwei
    Yuan, Kun
    Muller, Candice
    Blanc, Frederic
    Padoy, Nicolas
    Seo, Hyewon
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2024, PT V, 2024, 15005 : 251 - 261
  • [22] Detect-Then-Resolve: Enhancing Knowledge Graph Conflict Resolution with Large Language Model
    Peng, Huang
    Zhang, Pengfei
    Tang, Jiuyang
    Xu, Hao
    Zeng, Weixin
    MATHEMATICS, 2024, 12 (15)
  • [23] Assessing knowledge about medical physics in language-generative AI with large language model: using the medical physicist exam
    Kadoya, Noriyuki
    Arai, Kazuhiro
    Tanaka, Shohei
    Kimura, Yuto
    Tozuka, Ryota
    Yasui, Keisuke
    Hayashi, Naoki
    Katsuta, Yoshiyuki
    Takahashi, Haruna
    Inoue, Koki
    Jingu, Keiichi
    RADIOLOGICAL PHYSICS AND TECHNOLOGY, 2024, 17 (04) : 929 - 937
  • [24] Enhancing large language model capabilities for rumor detection with Knowledge-Powered Prompting
    Yan, Yeqing
    Zheng, Peng
    Wang, Yongjun
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 133
  • [25] Enhancing Visual-Language Prompt Tuning Through Sparse Knowledge-Guided Context Optimization
    Tian, Qiangxing
    Zhang, Min
    ENTROPY, 2025, 27 (03)
  • [26] A KNOWLEDGE GRAPH MODEL FOR PERFORMANCE-BASED GENERATIVE DESIGN AND ITS APPLICATIONS IN ACCELERATED DESIGN
    Wu, Zhaoji
    Wang, Zhe
    Cheng, Jack C. P.
    Kwok, Helen H. L.
    PROCEEDINGS OF THE 29TH INTERNATIONAL CONFERENCE OF THE ASSOCIATION FOR COMPUTER-AIDED ARCHITECTURAL DESIGN RESEARCH IN ASIA, CAADRIA 2024, VOL 1, 2024, : 395 - 404
  • [27] Evaluating accuracy and reproducibility of large language model performance on critical care assessments in pharmacy education
    Yang, Huibo
    Hu, Mengxuan
    Most, Amoreena
    Hawkins, W. Anthony
    Murray, Brian
    Smith, Susan E.
    Li, Sheng
    Sikora, Andrea
    FRONTIERS IN ARTIFICIAL INTELLIGENCE, 2025, 7
  • [28] MAPO: Boosting Large Language Model Performance with Model-Adaptive Prompt Optimization
    Chen, Yuyan
    Wen, Zhihao
    Fan, Ge
    Chen, Zhengyu
    Wu, Wei
    Liu, Dayiheng
    Li, Zhixu
    Liu, Bang
    Xiao, Yanghua
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 3279 - 3304
  • [29] Bats as a Model for Enhancing IUCN Red List Assessments: Real-Time Data, Contributor Networks, and Specialized Training to Address Common Challenges
    Russo, Danilo
    Cistrone, Luca
    Waldien, David L.
    CONSERVATION LETTERS, 2025, 18 (01)
  • [30] Large language model-based planning agent with generative memory strengthens performance in textualized world
    Liu, Junyang
    Hao, Wenning
    Cheng, Kai
    Jin, Dawei
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2025, 148