Residency Application Selection Committee Discriminatory Ability in Identifying Artificial Intelligence-Generated Personal Statements

Cited by: 0
Authors
Koleilat, Issam [1 ,3 ]
Bongu, Advaith [2 ]
Chang, Sumy [1 ]
Nieman, Dylan [2 ]
Priolo, Steven [1 ]
Patel, Nell Maloney [2 ]
Affiliations
[1] RWJ Barnabas Hlth, Community Med Ctr, Dept Surg, Toms River, NJ USA
[2] Univ Med & Dent New Jersey, Dept Surg, New Brunswick, NJ USA
[3] RWJ Barnabas Hlth, Dept Surg, 67 Route 37 West,Riverwood 1,Suite 200B, Toms River, NJ 08755 USA
Keywords
Personal statement; Artificial intelligence; Residency application
DOI
10.1016/j.jsurg.2024.02.009
Chinese Library Classification
G40 [Education]
Subject Classification Codes
040101; 120403
Abstract
OBJECTIVE: Advances in artificial intelligence (AI) have given rise to sophisticated algorithms capable of generating human-like text. The goal of this study was to evaluate the ability of human reviewers to reliably differentiate personal statements (PS) written by human authors from those generated by AI software.

SETTING: Four personal statements from the archives of two surgical program directors were de-identified and used as the human samples. Two AI platforms were used to generate nine additional PS.

PARTICIPANTS: Four surgeons from the residency selection committees of two surgical residency programs of a large multihospital system served as blinded reviewers. AI was also asked to evaluate each PS sample for authorship.

DESIGN: Sensitivity, specificity, and accuracy of the reviewers in identifying the PS author were calculated. The kappa statistic for agreement between the hypothesized author and the true author was calculated. Inter-rater reliability was calculated using the kappa statistic with Light's modification, given more than two reviewers in a fully crossed design. Logistic regression was performed to model the impact of perceived creativity, writing quality, and authorship on the likelihood of offering an interview.

RESULTS: Human reviewer sensitivity for identifying an AI-generated PS was 0.87, with specificity of 0.37 and overall accuracy of 0.55. Agreement by kappa statistic between the reviewers' estimates of authorship and the true authorship was 0.19 (slight agreement). The reviewers themselves had an inter-rater reliability of 0.067 (poor), with complete agreement (four out of four reviewers) on only two PS, both authored by humans. The odds of offering an interview (compared to a composite of "backup" status or no interview) to a perceived human author were 7 times those for a perceived AI author (95% confidence interval 1.5276 to 32.0758, p = 0.0144). AI hypothesized human authorship for twelve of the PS and was "unsure" about the last one.

CONCLUSIONS: The increasing pervasiveness of AI will have far-reaching effects, including on the resident application and recruitment process. Identifying AI-generated personal statements is exceedingly difficult. With the decreasing availability of objective data to assess applicants, a review and potential restructuring of the approach to resident recruitment may be warranted. (J Surg Ed 81:780-785. © 2024 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.)
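The DESIGN paragraph names several standard statistics: sensitivity/specificity/accuracy, kappa agreement against the true author, Light's kappa across more than two raters, and a logistic regression whose coefficient exponentiates to the reported odds ratio. The sketch below is an illustrative aid only, showing one conventional way to compute such quantities in Python with scikit-learn and statsmodels; the paper publishes no code, and every label, rating, and array here is a hypothetical placeholder, not the study's dataset.

```python
# Hypothetical sketch of the statistics named in the DESIGN section.
# All data below are illustrative placeholders, NOT the study's data.
from itertools import combinations

import numpy as np
import statsmodels.api as sm
from sklearn.metrics import cohen_kappa_score, confusion_matrix

# 13 personal statements: 1 = AI-generated, 0 = human-written (placeholder truth)
truth = np.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0])

# Each row: one reviewer's guesses for the 13 statements (placeholder ratings)
reviews = np.array([
    [1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1],
    [0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1],
])

# Pooled sensitivity, specificity, and accuracy for detecting AI authorship
y_true = np.tile(truth, len(reviews))
y_pred = reviews.ravel()
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("sensitivity:", tp / (tp + fn))
print("specificity:", tn / (tn + fp))
print("accuracy:", (tp + tn) / len(y_true))

# Agreement of each reviewer's guesses with the true author (Cohen's kappa)
for i, r in enumerate(reviews):
    print(f"reviewer {i} vs truth kappa:", cohen_kappa_score(truth, r))

# Light's kappa for a fully crossed design with >2 raters:
# the mean of all pairwise Cohen's kappas between reviewers
pairwise = [cohen_kappa_score(a, b) for a, b in combinations(reviews, 2)]
print("Light's kappa:", np.mean(pairwise))

# Logistic regression: interview offered (1) vs "backup"/no interview (0),
# on perceived authorship, creativity, and writing quality.
# Exponentiated coefficients are odds ratios; predictors here are random noise.
rng = np.random.default_rng(0)
n = 52  # e.g., 13 statements x 4 reviewers
X = np.column_stack([
    rng.integers(0, 2, n),   # perceived human author (placeholder)
    rng.integers(1, 6, n),   # creativity rating, 1-5 (placeholder)
    rng.integers(1, 6, n),   # writing-quality rating, 1-5 (placeholder)
])
y = rng.integers(0, 2, n)    # interview offered (placeholder)
model = sm.Logit(y, sm.add_constant(X)).fit(disp=0)
print("odds ratios:", np.exp(model.params))
print("95% CI:\n", np.exp(model.conf_int()))
```

Light's modification is implemented above in its usual reading, the mean of all pairwise Cohen's kappas; this sketch only mirrors the kinds of quantities reported in RESULTS and will not reproduce the published values.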
Pages: 780-785
Page count: 6