Enhancing Orthopedic Knowledge Assessments: The Performance of Specialized Generative Language Model Optimization

Cited by: 0

Authors:

Zhou, Hong [1,2]

Wang, Hong-lin [1,2]

Duan, Yu-yu [2,3]

Yan, Zi-neng [1,2]

Luo, Rui [1,2]

Lv, Xiang-xin [1,2]

Xie, Yi [1,2]

Zhang, Jia-yao [1,2]

Yang, Jia-ming [1,2]

Xue, Ming-di [1,2]

Fang, Ying [1,2]

Lu, Lin [2,4]

Liu, Peng-ran [1,2]

Ye, Zhe-wei [1,2]

Affiliations:

[1] Huazhong Univ Sci & Technol, Union Hosp, Tongji Med Coll, Dept Orthoped Surg, Wuhan 430022, Peoples R China

[2] Huazhong Univ Sci & Technol, Union Hosp, Tongji Med Coll, Lab Intelligent Med, Wuhan 430022, Peoples R China

[3] Hubei Univ Chinese Med, Coll Chinese Med, Wuhan 433065, Peoples R China

[4] Wuhan Univ, Dept Orthoped, Renmin Hosp, Wuhan 433060, Peoples R China

Source:

CURRENT MEDICAL SCIENCE | 2024

Funding:

National Natural Science Foundation of China;

Keywords:

artificial intelligence; large language models; generative artificial intelligence; orthopedics; CLINICAL-PRACTICE GUIDELINE; AMERICAN ACADEMY; HIP-FRACTURES; MANAGEMENT;

DOI:

10.1007/s11596-024-2929-4

Chinese Library Classification (CLC) number:

R-3 [Medical research methods]; R3 [Basic medicine];

Discipline classification code:

1001 ;

Abstract:

Objective: This study aimed to evaluate and compare the effectiveness of knowledge base-optimized and unoptimized large language models (LLMs) in orthopedics, in order to explore optimization strategies for applying LLMs in specific fields. Methods: A specialized knowledge base was constructed from clinical guidelines of the American Academy of Orthopaedic Surgeons (AAOS) and authoritative orthopedic publications. A total of 30 orthopedic questions covering anatomical knowledge, disease diagnosis, fracture classification, treatment options, and surgical techniques were input into both the knowledge base-optimized and unoptimized versions of GPT-4, ChatGLM, and Spark LLM, and the generated responses were recorded. The overall quality, accuracy, and comprehensiveness of these responses were evaluated by 3 experienced orthopedic surgeons. Results: Compared with its unoptimized version, the optimized version of GPT-4 showed improvements of 15.3% in overall quality, 12.5% in accuracy, and 12.8% in comprehensiveness; ChatGLM showed improvements of 24.8%, 16.1%, and 19.6%, respectively; and Spark LLM showed improvements of 6.5%, 14.5%, and 24.7%, respectively. Conclusion: Knowledge base optimization significantly enhances the quality, accuracy, and comprehensiveness of the responses provided by the 3 models in the orthopedic field. Therefore, knowledge base optimization is an effective method for improving the performance of LLMs in specific fields.
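As a rough illustration of how an improvement percentage of this kind could be derived from rater scores, the sketch below averages per-question ratings across raters and computes the relative change between the unoptimized and optimized versions of one model. The abstract does not state the rating scale, the aggregation method, or whether the reported improvements are relative or absolute, so the function names, the relative-change formula, and all numbers here are hypothetical and are not taken from the study.

```python
from statistics import mean


def mean_score(ratings):
    """Average the per-rater means (one inner list of question scores per rater)."""
    return mean(mean(rater_scores) for rater_scores in ratings)


def relative_improvement(baseline, optimized):
    """Percentage change of the optimized mean relative to the baseline mean."""
    return (optimized - baseline) / baseline * 100.0


# Hypothetical ratings: 3 raters x 3 questions (not data from the study).
unoptimized = [[3, 4, 3], [4, 3, 3], [3, 3, 4]]
optimized = [[4, 4, 4], [4, 4, 3], [4, 4, 4]]

base = mean_score(unoptimized)
opt = mean_score(optimized)
gain = relative_improvement(base, opt)
print(f"Relative improvement: {gain:.1f}%")
```

The same two functions could be applied separately to each criterion (overall quality, accuracy, comprehensiveness) and to each model, assuming scores are collected per rater and per question.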


Pages: 1001-1005

Page count: 5

