Bias and Inaccuracy in AI Chatbot Ophthalmologist Recommendations

Cited by: 13
Authors
Oca, Michael C. [1 ]
Meller, Leo [1 ]
Wilson, Katherine [1 ]
Parikh, Alomi O. [2 ]
McCoy, Allison [3 ]
Chang, Jessica [2 ]
Sudharshan, Rasika [2 ]
Gupta, Shreya [2 ]
Zhang-Nunes, Sandy [2 ]
Affiliations
[1] Univ Calif San Diego Hlth, Shiley Eye Inst, Ophthalmol, La Jolla, CA 92093 USA
[2] Univ Southern Calif, Keck Sch Med, USC Roski Eye Inst, Ophthalmol, Los Angeles, CA USA
[3] Del Mar Plast Surg, Plast Surg, San Diego, CA USA
Keywords
AI chatbot; artificial intelligence (AI) in medicine; artificial intelligence in health care; gender bias; patient education
DOI
10.7759/cureus.45911
Chinese Library Classification (CLC) number
R5 [Internal Medicine]
Discipline codes
1002; 100201
Abstract
Purpose and design: To evaluate the accuracy and bias of ophthalmologist recommendations made by three AI chatbots: ChatGPT 3.5 (OpenAI, San Francisco, CA, USA), Bing Chat (Microsoft Corp., Redmond, WA, USA), and Google Bard (Alphabet Inc., Mountain View, CA, USA). The study analyzed chatbot recommendations for the 20 most populous U.S. cities.
Methods: Each chatbot returned 80 total recommendations when given the prompt "Find me four good ophthalmologists in (city)." Physician characteristics, including specialty, location, gender, practice type, and fellowship training, were collected. A one-proportion z-test compared the proportion of female ophthalmologists recommended by each chatbot to the national average (27.2%, per the Association of American Medical Colleges (AAMC)). Pearson's chi-squared test assessed differences among the three chatbots in male versus female recommendations and in recommendation accuracy.
Results: The proportions of female ophthalmologists recommended by Bing Chat (1.61%) and Bard (8.0%) were significantly lower than the national proportion of 27.2% practicing female ophthalmologists (p<0.001 and p<0.01, respectively). The proportion of female ophthalmologists recommended by ChatGPT (29.5%) did not differ significantly from the national average (p=0.722). ChatGPT (73.8%), Bing Chat (67.5%), and Bard (62.5%) all produced high rates of inaccurate recommendations. Compared with the national proportion of academic ophthalmologists (17%), the proportion of recommended ophthalmologists in academic medicine or in combined academic and private practice was significantly greater for all three chatbots.
Conclusion: This study revealed substantial bias and inaccuracy in the AI chatbots' recommendations. The chatbots struggled to recommend ophthalmologists reliably and accurately: most recommendations were physicians in specialties other than ophthalmology, or not in or near the requested city. Bing Chat and Google Bard showed a significant tendency against recommending female ophthalmologists, and all three chatbots favored recommending ophthalmologists in academic medicine.
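The abstract's gender-bias analysis rests on a two-sided one-proportion z-test against the AAMC national proportion of 27.2% female ophthalmologists. A minimal sketch of that test, using illustrative counts that are assumptions rather than the study's raw data (e.g. 1 female ophthalmologist out of 62 resolvable recommendations, roughly matching Bing Chat's reported 1.61%):

```python
from math import sqrt, erf

def one_prop_ztest(successes: int, n: int, p0: float) -> tuple[float, float]:
    """Two-sided one-proportion z-test of an observed count against
    a null proportion p0, using the normal approximation."""
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)          # standard error under the null
    z = (p_hat - p0) / se
    # two-sided p-value from the standard normal CDF (via math.erf)
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical counts, not the study's data: 1 female ophthalmologist
# among 62 recommendations, tested against the AAMC figure of 27.2%.
z, p = one_prop_ztest(1, 62, 0.272)
print(f"z = {z:.2f}, p = {p:.2g}")
```

With these assumed counts the test returns p well below 0.001, consistent with the direction of the Bing Chat result reported in the abstract.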
Pages: 9