Readability, reliability, and quality of responses generated by ChatGPT, Gemini, and Perplexity for the most frequently asked questions about pain

Times Cited: 0
Authors
Ozduran, Erkan [1 ]
Akkoc, Ibrahim [2 ]
Buyukcoban, Sibel [3 ]
Erkin, Yueksel [4 ]
Hanci, Volkan [5 ]
Affiliations
[1] Sivas Numune Hosp, Phys Med & Rehabil Pain Med, Sivas, Turkiye
[2] Univ Hlth Sci, Basaksehir Cam & Sakura City Hosp, Anesthesiol & Reanimat, Istanbul, Turkiye
[3] Dokuz Eylul Univ, Anesthesiol & Reanimat, Izmir, Turkiye
[4] Dokuz Eylul Univ, Anesthesiol & Reanimat, Pain Med, Izmir, Turkiye
[5] Dokuz Eylul Univ, Crit Care Med, Anesthesiol & Reanimat, Izmir, Turkiye
Keywords
artificial intelligence; ChatGPT; Gemini; online medical information; pain; Perplexity
DOI
10.1097/MD.0000000000041780
Chinese Library Classification
R5 [Internal Medicine]
Subject Classification Codes
1002; 100201
Abstract
It is clear that artificial intelligence-based chatbots will become popular applications in healthcare in the near future. More than 30% of the world's population is known to suffer from chronic pain, and individuals often try to access the health information they need through online platforms before presenting to a hospital. This study aimed to examine the readability, reliability, and quality of the responses given by 3 different artificial intelligence chatbots (ChatGPT, Gemini, and Perplexity) to frequently asked questions about pain. The 25 most frequently searched keywords related to pain were identified using Google Trends and posed to each of the 3 chatbots. The readability of the response texts was assessed with the Flesch Reading Ease Score (FRES), the Simple Measure of Gobbledygook (SMOG), the Gunning Fog Index, and the Flesch-Kincaid Grade Level (FKGL). Reliability was assessed with the Journal of the American Medical Association (JAMA) benchmark criteria and the DISCERN scale. The Global Quality Score (GQS) and the Ensuring Quality Information for Patients (EQIP) score were used for quality assessment. The Google Trends search identified "back pain," "stomach pain," and "chest pain" as the top 3 keywords. The answers given by all 3 chatbots were written above the recommended 6th-grade reading level (P < .001); ordered from easiest to most difficult to read, they ranked Gemini, ChatGPT, and Perplexity. Gemini obtained higher GQS scores than the other chatbots (P = .008), whereas Perplexity obtained higher JAMA, DISCERN, and EQIP scores than the other chatbots (P < .001, P < .001, and P < .05, respectively). The answers given by ChatGPT, Gemini, and Perplexity to pain-related questions were found to be difficult to read, and their reliability and quality were low. These artificial intelligence chatbots therefore cannot replace a comprehensive medical consultation. For such applications, it may be recommended to improve the readability of text content, generate texts containing reliable references, and have outputs reviewed by a supervisory expert team.
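The four readability indices named in the abstract are fixed formulas over sentence, word, and syllable counts. The sketch below is not from the paper: the formulas are the standard published ones, but the sample counts are hypothetical, and real studies would extract counts from each chatbot answer with NLP tooling rather than by hand.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class TextCounts:
    sentences: int
    words: int
    syllables: int       # total syllables across all words
    polysyllables: int   # words with 3+ syllables ("complex" words)


def flesch_reading_ease(c: TextCounts) -> float:
    # Higher = easier; ~80-90 corresponds to roughly 6th-grade text.
    return 206.835 - 1.015 * (c.words / c.sentences) - 84.6 * (c.syllables / c.words)


def flesch_kincaid_grade(c: TextCounts) -> float:
    # Maps the same ratios onto a US school-grade scale.
    return 0.39 * (c.words / c.sentences) + 11.8 * (c.syllables / c.words) - 15.59


def smog(c: TextCounts) -> float:
    # The 30/sentences factor normalizes samples shorter than SMOG's 30-sentence baseline.
    return 1.0430 * sqrt(c.polysyllables * (30 / c.sentences)) + 3.1291


def gunning_fog(c: TextCounts) -> float:
    # Strictly, Fog's "complex words" exclude proper nouns and some suffixed forms;
    # polysyllable count is used here as a simplifying assumption.
    return 0.4 * ((c.words / c.sentences) + 100 * (c.polysyllables / c.words))


if __name__ == "__main__":
    # Hypothetical counts for one chatbot answer.
    answer = TextCounts(sentences=12, words=260, syllables=430, polysyllables=45)
    print(f"FRES {flesch_reading_ease(answer):5.1f}")
    print(f"FKGL {flesch_kincaid_grade(answer):5.1f}")
    print(f"SMOG {smog(answer):5.1f}")
    print(f"Fog  {gunning_fog(answer):5.1f}")
```

Under these assumed counts, all three grade-level indices land well above 6, which is the pattern the study reports for the chatbot responses.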
Pages: 10