Generation and evaluation of artificial mental health records for Natural Language Processing

Cited by: 30
Authors
Ive, Julia [1]
Viani, Natalia [2]
Kam, Joyce [2]
Yin, Lucia [2]
Verma, Somain [2]
Puntis, Stephen [3]
Cardinal, Rudolf N. [4, 5]
Roberts, Angus [2]
Stewart, Robert [2, 6]
Velupillai, Sumithra [2]
Affiliations
[1] Imperial Coll London, Dept Comp, London SW7 2AZ, England
[2] Kings Coll London, IoPPN, London SE5 8AF, England
[3] Univ Oxford, Warneford Hosp, Dept Psychiat, Oxford OX3 7JX, England
[4] Univ Cambridge, Dept Psychiat, Downing St, Cambridge CB2 3EB, England
[5] Cambridgeshire & Peterborough NHS Fdn, Cambridge Biomed Campus, Box 190, Cambridge CB2 0QQ, England
[6] South London & Maudsley NHS Fdn Trust, London SE5 8AZ, England
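Funding
US National Institutes of Health; UK Research and Innovation; UK Medical Research Council; Swedish Research Council; UK Engineering and Physical Sciences Research Council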
Keywords
DOI
10.1038/s41746-020-0267-x
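Chinese Library Classification (CLC) number
R19 [Health care organizations and services (health services administration)]
Discipline classification code
Abstract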
A serious obstacle to the development of Natural Language Processing (NLP) methods in the clinical domain is the accessibility of textual data. The mental health domain is particularly challenging, partly because clinical documentation relies heavily on free text that is difficult to de-identify completely. This problem could be tackled by using artificial medical data. In this work, we present an approach to generate artificial clinical documents. We apply this approach to discharge summaries from a large mental healthcare provider and discharge summaries from an intensive care unit. We perform an extensive intrinsic evaluation where we (1) apply several measures of text preservation; (2) measure how much the model memorises training data; and (3) estimate clinical validity of the generated text based on a human evaluation task. Furthermore, we perform an extrinsic evaluation by studying the impact of using artificial text in a downstream NLP text classification task. We found that using this artificial data as training data can lead to classification results that are comparable to the original results. Additionally, using only a small amount of information from the original data to condition the generation of the artificial data is successful, which holds promise for reducing the risk of these artificial data retaining rare information from the original data. This is an important finding for our long-term goal of being able to generate artificial clinical data that can be released to the wider research community and accelerate advances in developing computational methods that use healthcare data.
Pages: 9
Related papers
50 in total
  • [1] Generation and evaluation of artificial mental health records for Natural Language Processing
    Julia Ive
    Natalia Viani
    Joyce Kam
    Lucia Yin
    Somain Verma
    Stephen Puntis
    Rudolf N. Cardinal
    Angus Roberts
    Robert Stewart
    Sumithra Velupillai
    [J]. npj Digital Medicine, 3
  • [2] Natural language generation for electronic health records
    Scott H. Lee
    [J]. npj Digital Medicine, 1
  • [3] Natural language generation for electronic health records
    Lee, Scott H.
    [J]. NPJ DIGITAL MEDICINE, 2018, 1
  • [4] Identifying Suicidal Adolescents from Mental Health Records Using Natural Language Processing
    Velupillai, Sumithra
    Epstein, Sophie
    Bittar, Andre
    Stephenson, Thomas
    Dutta, Rina
    Downs, Johnny
    [J]. MEDINFO 2019: HEALTH AND WELLBEING E-NETWORKS FOR ALL, 2019, 264 : 413 - 417
  • [5] Identifying Mentions of Pain in Mental Health Records Text: A Natural Language Processing Approach
    Chaturvedi, Jaya
    Velupillai, Sumithra
    Stewart, Robert
    Roberts, Angus
    [J]. MEDINFO 2023 - THE FUTURE IS ACCESSIBLE, 2024, 310 : 695 - 699
  • [6] Distributions of recorded pain in mental health records: a natural language processing based study
    Chaturvedi, Jaya
    Stewart, Robert
    Ashworth, Mark
    Roberts, Angus
    [J]. BMJ OPEN, 2024, 14 (04)
  • [7] Development of a Corpus Annotated With Mentions of Pain in Mental Health Records: Natural Language Processing Approach
    Chaturvedi, Jaya
    Chance, Natalia
    Mirza, Luwaiza
    Vernugopan, Veshalee
    Velupillai, Sumithra
    Stewart, Robert
    Roberts, Angus
    [J]. JMIR FORMATIVE RESEARCH, 2023, 7
  • [8] The Use of Natural Language Processing to Transform Health Records Information
    Roberts, A.
    [J]. EUROPEAN PSYCHIATRY, 2015, 30
  • [9] Mental Health of Seafarers: Using artificial Intelligence Natural Language processing through Deep Learning Approach
    Reck, Chiara
    Oldenburg, Marcus
    [J]. FLUGMEDIZIN TROPENMEDIZIN REISEMEDIZIN, 2022, 29 (04): 141 - 141
  • [10] Risk prediction using natural language processing of electronic mental health records in an inpatient forensic psychiatry setting
    Duy Van Le
    Montgomery, James
    Kirkby, Kenneth C.
    Scanlan, Joel
    [J]. JOURNAL OF BIOMEDICAL INFORMATICS, 2018, 86 : 49 - 58