Can a large language model create acceptable dental board-style examination questions? A cross-sectional prospective study

Times cited: 0
Authors
Kim, Hak-Sun [1 ]
Kim, Gyu-Tae [2 ]
Affiliations
[1] Kyung Hee Univ, Dept Oral & Maxillofacial Radiol, Dent Hosp, Seoul, South Korea
[2] Kyung Hee Univ, Coll Dent, Dept Oral & Maxillofacial Surg, 26 Kyungheedae Ro, Seoul 02447, South Korea
Keywords
Dental education; Examination questions; Professional competence; Artificial intelligence; Natural language processing
DOI
10.1016/j.jds.2024.08.020
CLC number
R78 [Stomatology]
Discipline code
1003
Abstract
Background/purpose: Numerous studies have shown that large language models (LLMs) can score above the passing grade on various board examinations. Therefore, this study aimed to evaluate national dental board-style examination questions created by an LLM versus those created by human experts using item analysis.
Materials and methods: This study was conducted in June 2024 and included senior dental students (n = 30) who participated voluntarily. An LLM, ChatGPT 4o, was used to generate 44 national dental board-style examination questions based on textbook content. Twenty questions for the LLM set were randomly selected after removing false questions. Two experts created another set of 20 questions based on the same content and in the same style as the LLM. Participating students simultaneously answered a total of 40 questions, divided into two sets, using Google Forms in the classroom. The responses were analyzed to assess the difficulty index, discrimination index, and distractor efficiency. Statistical comparisons were performed using the Wilcoxon signed-rank test or the linear-by-linear association test, with a confidence level of 95%.
Results: The response rate was 100%. The median difficulty indices of the LLM and human sets were 55.00% and 50.00%, respectively, both within the "excellent" range. The median discrimination indices were 0.29 for the LLM set and 0.14 for the human set. Both sets had a median distractor efficiency of 80.00%. The differences in all criteria were not statistically significant (P > 0.050).
Conclusion: The LLM can create national board-style examination questions of quality equivalent to those created by human experts. (c) 2025 Association for Dental Sciences of the Republic of China. Publishing services by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Pages: 895-900
Page count: 6
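The item-analysis metrics named in the abstract (difficulty index, discrimination index, distractor efficiency) have standard textbook definitions, which can be sketched as follows. This is an illustrative sketch only: the response counts, the upper/lower group size, and the 5% functionality threshold for distractors are assumptions, not values from the study.

```python
def difficulty_index(correct: int, total: int) -> float:
    """Percentage of examinees who answered the item correctly."""
    return 100.0 * correct / total

def discrimination_index(upper_correct: int, lower_correct: int,
                         group_size: int) -> float:
    """Difference in the proportion answering correctly between the
    high-scoring and low-scoring groups (each of `group_size` examinees)."""
    return (upper_correct - lower_correct) / group_size

def distractor_efficiency(distractor_counts: list[int], total: int,
                          threshold: float = 0.05) -> float:
    """Percentage of distractors that are 'functional', i.e. chosen by
    at least `threshold` (conventionally 5%) of examinees."""
    functional = sum(1 for c in distractor_counts if c / total >= threshold)
    return 100.0 * functional / len(distractor_counts)

# Toy item: 30 examinees, 5-option MCQ, 18 correct responses,
# and pick counts [5, 4, 2, 1] for the four wrong options.
print(difficulty_index(18, 30))                # 60.0
print(discrimination_index(7, 3, 8))           # 0.5
print(distractor_efficiency([5, 4, 2, 1], 30)) # 75.0 (1/30 < 5%, so 3 of 4 work)
```

A median difficulty near 50-55% and a discrimination index around 0.29, as reported for the LLM set, would count as well-performing items under these conventional definitions.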