Risk prediction using natural language processing of electronic mental health records in an inpatient forensic psychiatry setting

Cited by: 35
Authors
Duy Van Le [1]
Montgomery, James [1]
Kirkby, Kenneth C. [2]
Scanlan, Joel [1]
Affiliations
[1] Univ Tasmania, Coll Sci & Engn, Sch Technol Environm & Design, Private Bag 87, Hobart, Tas 7001, Australia
[2] Univ Tasmania, Coll Hlth & Med, Sch Med, Private Bag 87, Hobart, Tas 7001, Australia
Keywords
Text mining; Natural language processing; Electronic health record; Mental health; Psychiatry; MEDICAL-RECORDS; ONTOLOGY; DISEASE; INFORMATION; EXTRACTION; VIOLENCE; VERSION; UMLS; TEXT;
DOI
10.1016/j.jbi.2018.08.007
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Subject classification code
081203 ; 0835 ;
Abstract
Objective: Instruments rating risk of harm to self and others are widely used in inpatient forensic psychiatry settings. A potential alternative or supplementary means of risk prediction is the automated analysis of case notes in Electronic Health Records (EHRs) using Natural Language Processing (NLP). This exploratory study rated presence or absence and frequency of words in a forensic EHR dataset, comparing four reference dictionaries. Seven machine learning algorithms and different time periods of EHR analysis were used to probe which dictionary and which time period were most predictive of risk assessment scores on validated instruments.
Materials and methods: The EHR dataset comprised de-identified forensic inpatient notes from the Wilfred Lopes Centre in Tasmania. The data comprised unstructured free-text case note entries and serial ratings of three risk assessment scales: Historical Clinical Risk Management-20 (HCR-20), Short-Term Assessment of Risk and Treatability (START) and Dynamic Appraisal of Situational Aggression (DASA). Four NLP dictionary word lists were selected: 6865 mental health symptom words from the Unified Medical Language System (UMLS), 455 DSM-IV diagnoses from the UMLS repository, 6790 English positive and negative sentiment words, and 1837 high frequency words from the Corpus of Contemporary American English (COCA). Seven machine learning methods (Bagging, J48, JRip, Logistic Model Trees (LMT), Logistic Regression, Linear Regression and Support Vector Machine (SVM)) were used to identify the combination of dictionaries and algorithms that best predicted risk assessment scores.
Results: The most accurate prediction was attained on the DASA dataset using the sentiment dictionary and the LMT and SVM algorithms.
Conclusions: NLP, used in conjunction with NLP dictionaries and machine learning, predicted risk ratings on the HCR-20, START, and DASA, based on EHR content. Further research is required to ascertain the utility of NLP approaches in predicting endpoints of actual self-harm, harm to others or victimisation.
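Illustrative note: the abstract describes rating the presence or absence and the frequency of dictionary words in case notes. A minimal Python sketch of that kind of feature extraction might look like the following. The function name, the three-word dictionary and the note text are all hypothetical and invented for illustration; this is not the authors' pipeline, and the tokenisation rule is an assumption.

```python
import re
from collections import Counter


def dictionary_features(note, dictionary):
    """Return (presence, frequency) maps for each dictionary word in one note.

    Assumption: simple lowercase word tokenisation; the paper does not
    specify its preprocessing in the abstract.
    """
    tokens = re.findall(r"[a-z']+", note.lower())
    counts = Counter(tokens)
    # Frequency: how many times each dictionary word occurs in the note.
    frequency = {word: counts[word] for word in dictionary}
    # Presence or absence: 1 if the word occurs at least once, else 0.
    presence = {word: int(n > 0) for word, n in frequency.items()}
    return presence, frequency


# Hypothetical toy dictionary and made-up note text (not real patient data).
toy_dictionary = ["calm", "angry", "hostile"]
toy_note = "Patient was angry at lunch, then calm. Angry again at night."
presence, frequency = dictionary_features(toy_note, toy_dictionary)
```

Feature vectors of this kind, one per note or per time window, could then be passed to any classifier of the sort the abstract lists, with the risk scale rating as the target.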
Pages: 49 - 58
Page count: 10
Related papers
50 in total
  • [1] Using Natural Language Processing on Electronic Health Records to Enhance Detection and Prediction of Psychosis Risk
    Irving, Jessica
    Patel, Rashmi
    Oliver, Dominic
    Colling, Craig
    Pritchard, Megan
    Broadbent, Matthew
    Baldwin, Helen
    Stahl, Daniel
    Stewart, Robert
    Fusar-Poli, Paolo
    [J]. SCHIZOPHRENIA BULLETIN, 2021, 47 (02) : 405 - 414
  • [2] Using Natural Language Processing to Predict Risk in Electronic Health Records
    Duy Van Le
    Montgomery, James
    Kirkby, Kenneth
    Scanlan, Joel
    [J]. MEDINFO 2023 - THE FUTURE IS ACCESSIBLE, 2024, 310 : 574 - 578
  • [3] Prediction of Psychiatric Readmission Risk in Psychosis Patients With Natural Language Processing of Electronic Health Records
    Mellado, Elena Alvarez
    Holderness, Eben
    Miller, Nicholas
    Bolton, Kirsten
    Cawkwell, Philip
    Pustejovsky, James
    Hall, Mei-Hua
    [J]. NEUROPSYCHOPHARMACOLOGY, 2019, 44 (SUPPL 1) : 187 - 187
  • [4] Detecting inpatient falls by using natural language processing of electronic medical records
    Shin-ichi Toyabe
    [J]. BMC Health Services Research, 12
  • [5] Detecting inpatient falls by using natural language processing of electronic medical records
    Toyabe, Shin-ichi
    [J]. BMC HEALTH SERVICES RESEARCH, 2012, 12
  • [6] Natural Language Processing to Improve Prediction of Incident Atrial Fibrillation Using Electronic Health Records
    Ashburner, Jeffrey M.
    Chang, Yuchiao
    Wang, Xin
    Khurshid, Shaan
    Anderson, Christopher D.
    Dahal, Kumar
    Weisenfeld, Dana
    Cai, Tianrun
    Liao, Katherine P.
    Wagholikar, Kavishwar B.
    Murphy, Shawn N.
    Atlas, Steven J.
    Lubitz, Steven A.
    Singer, Daniel E.
    [J]. JOURNAL OF THE AMERICAN HEART ASSOCIATION, 2022, 11 (15):
  • [8] Prediction and evaluation of combination pharmacotherapy using natural language processing, machine learning and patient electronic health records
    Ding, Pingjian
    Pan, Yiheng
    Wang, Quanqiu
    Xu, Rong
    [J]. JOURNAL OF BIOMEDICAL INFORMATICS, 2022, 133
  • [9] Comparing natural language processing representations of coded disease sequences for prediction in electronic health records
    Beaney, Thomas
    Jha, Sneha
    Alaa, Asem
    Smith, Alexander
    Clarke, Jonathan
    Woodcock, Thomas
    Majeed, Azeem
    Aylin, Paul
    Barahona, Mauricio
    [J]. JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2024, 31 (07) : 1451 - 1462
  • [10] IDENTIFICATION OF PANCREATIC DUCTAL ADENOCARCINOMA RISK FACTORS FROM ELECTRONIC HEALTH RECORDS USING NATURAL LANGUAGE PROCESSING
    Sarwal, Dhruv
    Wang, Liwei
    Gandhi, Sonal
    Sagheb, Elham
    Janssens, Laurens
    Goncalves, Sandy
    Delgado, Adriana
    Doering, Karen
    Liu Hongfang
    Majumder, Shounak
    [J]. GASTROENTEROLOGY, 2022, 162 (07) : S243 - S243