Readability, reliability and quality of responses generated by ChatGPT, Gemini, and Perplexity for the most frequently asked questions about pain

Cited by: 0
Authors
Ozduran, Erkan [1 ]
Akkoc, Ibrahim [2 ]
Buyukcoban, Sibel [3 ]
Erkin, Yueksel [4 ]
Hanci, Volkan [5 ]
Affiliations
[1] Sivas Numune Hosp, Phys Med & Rehabil Pain Med, Sivas, Turkiye
[2] Univ Hlth Sci, Basaksehir Cam & Sakura City Hosp, Anesthesiol & Reanimat, Istanbul, Turkiye
[3] Dokuz Eylul Univ, Anesthesiol & Reanimat, Izmir, Turkiye
[4] Dokuz Eylul Univ, Anesthesiol & Reanimat, Pain Med, Izmir, Turkiye
[5] Dokuz Eylul Univ, Crit Care Med, Anesthesiol & Reanimat, Izmir, Turkiye
Keywords
artificial intelligence; ChatGPT; Gemini; online medical information; pain; Perplexity
DOI
10.1097/MD.0000000000041780
Chinese Library Classification: R5 [Internal Medicine]
Discipline Code: 1002; 100201
Abstract
It is clear that artificial intelligence-based chatbots will become popular applications in healthcare in the near future. More than 30% of the world's population is known to suffer from chronic pain, and individuals often try to access the health information they need through online platforms before presenting to a hospital. This study aimed to examine the readability, reliability, and quality of the responses given by 3 different artificial intelligence chatbots (ChatGPT, Gemini, and Perplexity) to frequently asked questions about pain. The 25 most frequently used keywords related to pain were identified using Google Trends and posed to each of the 3 artificial intelligence chatbots. The readability of the response texts was assessed with the Flesch Reading Ease Score (FRES), Simple Measure of Gobbledygook (SMOG), Gunning Fog, and Flesch-Kincaid Grade Level readability scores. Reliability was assessed with the Journal of the American Medical Association (JAMA) benchmark criteria and the DISCERN instrument. The Global Quality Score (GQS) and the Ensuring Quality Information for Patients (EQIP) score were used for quality assessment. The Google Trends search identified "back pain," "stomach pain," and "chest pain" as the top 3 keywords. The readability level of the answers given by all 3 artificial intelligence applications was above the recommended 6th-grade reading level (P < .001). In the readability evaluation, the order from easiest to most difficult was Google Gemini, ChatGPT, and Perplexity. Gemini had higher GQS scores than the other chatbots (P = .008). Perplexity had higher JAMA, DISCERN, and EQIP scores than the other chatbots (P < .001, P < .001, and P < .05, respectively). The answers given by ChatGPT, Gemini, and Perplexity to pain-related questions were found to be difficult to read, and their reliability and quality were low. These artificial intelligence chatbots therefore cannot replace a comprehensive medical consultation. For artificial intelligence applications, it may be recommended to make text content easier to read, to produce texts containing reliable references, and to have the output checked by a supervisory expert team.
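As a rough illustration of how the readability metrics named above are computed, the sketch below implements the standard FRES, Flesch-Kincaid Grade Level, Gunning Fog, and SMOG formulas in Python. The naive syllable counter and word/sentence splitting are simplifying assumptions for demonstration only and will not exactly reproduce the scores produced by the tools used in the study.

```python
import re
import math

def count_syllables(word: str) -> int:
    """Crude vowel-group heuristic; dedicated scorers use dictionaries and better rules."""
    word = word.lower()
    groups = re.findall(r"[aeiouy]+", word)
    n = len(groups)
    if word.endswith("e") and n > 1:
        n -= 1  # drop a silent trailing 'e'
    return max(n, 1)

def readability_scores(text: str) -> dict:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    # Polysyllabic ("complex") words of 3+ syllables, used by Gunning Fog and SMOG.
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)

    n_sent, n_words = max(len(sentences), 1), max(len(words), 1)
    wps = n_words / n_sent      # average words per sentence
    spw = syllables / n_words   # average syllables per word

    return {
        # Flesch Reading Ease: higher scores mean easier text.
        "FRES": 206.835 - 1.015 * wps - 84.6 * spw,
        # Flesch-Kincaid Grade Level: approximate US school grade required.
        "FKGL": 0.39 * wps + 11.8 * spw - 15.59,
        # Gunning Fog index.
        "GunningFog": 0.4 * (wps + 100 * polysyllables / n_words),
        # SMOG grade (formula is normalized to a 30-sentence sample).
        "SMOG": 1.0430 * math.sqrt(polysyllables * 30 / n_sent) + 3.1291,
    }

if __name__ == "__main__":
    sample = ("Back pain is one of the most common reasons people seek care. "
              "Most episodes improve within a few weeks with simple measures.")
    print(readability_scores(sample))
```

Under these standard formulas, the 6th-grade threshold referenced in the abstract corresponds roughly to a Flesch-Kincaid Grade Level of about 6 and a FRES of roughly 80 to 90.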
Pages: 10