Evaluation of the reliability and readability of ChatGPT-4 responses regarding hypothyroidism during pregnancy

Cited by: 19

Authors:

Onder, C. E. [1]

Koc, G. [1]

Gokbulut, P. [1]

Taskaldiran, I. [1]

Kuskonmaz, S. M. [1]

Affiliations:

[1] Ankara Numune Training & Res Hosp, Dept Endocrinol & Metab Dis, Ankara, Turkiye

Source:

SCIENTIFIC REPORTS | 2024 / Volume 14 / Issue 01

Keywords:

HEALTH INFORMATION;

DOI:

10.1038/s41598-023-50884-w

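Chinese Library Classification (CLC) number:

O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];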

Subject classification code:

07 ; 0710 ; 09 ;

Abstract:

Hypothyroidism is characterized by thyroid hormone deficiency and has adverse effects on both pregnancy and fetal health. Chat Generative Pre-trained Transformer (ChatGPT) is a large language model trained on a very large body of data from many sources. Our study aimed to evaluate the reliability and readability of ChatGPT-4 answers about hypothyroidism in pregnancy. A total of 19 questions were created in line with the recommendations in the latest guideline of the American Thyroid Association (ATA) on hypothyroidism in pregnancy and were posed to ChatGPT-4. The reliability and quality of the responses were scored by two independent researchers using the global quality scale (GQS) and modified DISCERN (mDISCERN) tools. The readability of the ChatGPT-4 responses was assessed using the Flesch Reading Ease (FRE) score, Flesch-Kincaid grade level (FKGL), Gunning Fog Index (GFI), Coleman-Liau Index (CLI), and Simple Measure of Gobbledygook (SMOG) tools. No misleading information was found in any of the answers. The mean mDISCERN score of the responses was 30.26 ± 3.14; the median GQS score was 4 (2-4). In terms of reliability, most of the answers showed moderate reliability (78.9%), followed by good reliability (21.1%). In the readability analysis, the median FRE was 32.20 (13.00-37.10). The education level required to read the answers was most often university level [9 answers (47.3%)]. ChatGPT-4 has significant potential and can be used as an auxiliary information source for counseling, creating a bridge between patients and clinicians about hypothyroidism in pregnancy. Efforts should be made to improve the reliability and readability of ChatGPT.
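For readers unfamiliar with the readability measures named in the abstract, the following is a minimal Python sketch of the two Flesch formulas (FRE and FKGL), using their standard published coefficients. The abstract does not say which software the authors used, so this is an illustration only: the function names are hypothetical, and the syllable counter is a crude vowel-group heuristic that will differ from dictionary-based tools.

```python
import re


def count_syllables(word: str) -> int:
    """Assumed heuristic: one syllable per run of vowels, at least one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def text_stats(text: str) -> tuple[int, int, int]:
    """Return (sentence count, word count, syllable count) for the text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(text: str) -> float:
    """FRE: higher scores mean easier text."""
    s, w, syl = text_stats(text)
    return 206.835 - 1.015 * (w / s) - 84.6 * (syl / w)


def flesch_kincaid_grade(text: str) -> float:
    """FKGL: approximate US school grade needed to read the text."""
    s, w, syl = text_stats(text)
    return 0.39 * (w / s) + 11.8 * (syl / w) - 15.59


if __name__ == "__main__":
    sample = "The cat sat on the mat."
    print(round(flesch_reading_ease(sample), 3))
    print(round(flesch_kincaid_grade(sample), 3))
```

On the FRE scale, scores in the 30s, such as the median of 32.20 reported above, are conventionally read as difficult text, which is consistent with the university-level finding in the abstract.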

Pages: 8
