Evaluation of the reliability and readability of ChatGPT-4 responses regarding hypothyroidism during pregnancy
Cited by: 19
Authors: Onder, C. E. [1]; Koc, G. [1]; Gokbulut, P. [1]; Taskaldiran, I. [1]; Kuskonmaz, S. M. [1]
Affiliation:
[1] Ankara Numune Training & Res Hosp, Dept Endocrinol & Metab Dis, Ankara, Turkiye
Hypothyroidism is characterized by thyroid hormone deficiency and has adverse effects on both pregnancy and fetal health. Chat Generative Pre-trained Transformer (ChatGPT) is a large language model trained on a very large body of data from many sources. Our study aimed to evaluate the reliability and readability of ChatGPT-4 answers about hypothyroidism in pregnancy. A total of 19 questions were created in line with the recommendations in the latest guideline of the American Thyroid Association (ATA) on hypothyroidism in pregnancy and were posed to ChatGPT-4. The reliability and quality of the responses were scored by two independent researchers using the global quality scale (GQS) and modified DISCERN (mDISCERN) tools. The readability of the responses was assessed using the Flesch Reading Ease (FRE) score, Flesch-Kincaid grade level (FKGL), Gunning Fog Index (GFI), Coleman-Liau Index (CLI), and Simple Measure of Gobbledygook (SMOG) tools. No misleading information was found in any of the answers. The mean mDISCERN score of the responses was 30.26 ± 3.14; the median GQS score was 4 (2-4). In terms of reliability, most of the answers showed moderate reliability (78.9%), followed by good reliability (21.1%). In the readability analysis, the median FRE was 32.20 (13.00-37.10). The years of education required to read the answers were most often at the university level [9 (47.3%)]. ChatGPT-4 has significant potential and can be used as an auxiliary information source for counseling, creating a bridge between patients and clinicians on hypothyroidism in pregnancy. Efforts should be made to improve the reliability and readability of ChatGPT.
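As an illustration of two of the readability measures named in the abstract, the following is a minimal Python sketch of the standard Flesch Reading Ease and Flesch-Kincaid grade level formulas. It is not the tooling used in the study, which the abstract does not specify. The function names and the vowel-group syllable counter are assumptions made for this example; dedicated readability tools count syllables more carefully, so scores may differ slightly.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count runs of consecutive vowels.

    This heuristic is an assumption for illustration only; it is not a
    dictionary-based count and will miscount some English words.
    """
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def text_counts(text: str) -> tuple:
    """Return (sentences, words, syllables) for a piece of text."""
    # Sentences: non-empty chunks separated by ., ! or ?
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words: alphabetic tokens, keeping internal hyphens and apostrophes
    words = re.findall(r"[A-Za-z]+(?:[-'][A-Za-z]+)*", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(sentences: int, words: int, syllables: int) -> float:
    """Standard FRE formula; higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(sentences: int, words: int, syllables: int) -> float:
    """Standard FKGL formula; the result approximates a US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    sample = "Hypothyroidism is characterized by thyroid hormone deficiency."
    s, w, syl = text_counts(sample)
    print(round(flesch_reading_ease(s, w, syl), 2))
    print(round(flesch_kincaid_grade(s, w, syl), 2))
```

On the FRE scale, scores in the 30s, such as the median of 32.20 reported above, are conventionally read as difficult text, which is consistent with the university-level finding.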
Affiliation:
Univ Hlth Sci, Sehit Prof Dr Ilhan Varank Sancaktepe Training & R, Dept Ophthalmol, Istanbul, Turkiye
Balci, Ali Safa
Cakmak, Semih
Affiliation:
Istanbul Univ, Istanbul Fac Med, Dept Ophthalmol, Istanbul, Turkiye