Large-scale Analysis of Free-Text Data for Mental Health Surveillance with Topic Modelling

Authors
Gu, Yang [1]
Leroy, Gondy [1]
Affiliations
[1] Univ Arizona, Tucson, AZ 85721 USA
Source
Keywords
Natural language processing; NLP; healthcare analytics; topic modelling; LDA; autism; ASD
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Autism spectrum disorder (ASD) affects 1 in 59 children in the US and costs the US economy $66 billion annually. The Centers for Disease Control and Prevention (CDC) has collected a large set of electronic health records (EHRs) as part of its surveillance in the US. In Arizona, the dataset contains 4,480 EHRs with 10 million free-text tokens collected over ten years, including detailed descriptions of children with ASD-like behaviors. Although knowledge about ASD and its diagnostic criteria have evolved, the data collected in earlier years have not been re-evaluated. To leverage these data more efficiently and uncover causes for the increase in ASD prevalence observed in epidemiological surveillance, we use Latent Dirichlet Allocation (LDA) to analyze the content of the text data automatically. Preliminary results suggest that LDA can model topics in EHR content and reveal variations in content that are consistent with changes in the data collection effort.
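As a rough illustration of the technique named in the abstract, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling in plain Python. It is not the authors' pipeline: the record does not describe their preprocessing, tooling, or parameters, so the function names (fit_lda, top_words, topic_proportions), the hyperparameters (alpha, beta, iterations), and the toy notes are all assumptions made up for this example. The toy notes are invented and are not drawn from the CDC data.

```python
import random


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling (toy scale)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]               # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                             # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    return vocab, doc_topic, topic_word


def top_words(vocab, topic_word, n=3):
    """Return the n most frequently assigned words for each topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]


def topic_proportions(doc_topic, alpha):
    """Smoothed per-document topic proportions (each row sums to 1)."""
    num_topics = len(doc_topic[0])
    return [
        [(count + alpha) / (sum(row) + num_topics * alpha) for count in row]
        for row in doc_topic
    ]


# Invented toy notes for illustration only; NOT taken from the surveillance records.
docs = [
    "delayed speech limited eye contact speech therapy".split(),
    "repetitive hand flapping limited eye contact".split(),
    "school evaluation reading score classroom support".split(),
    "classroom support school evaluation teacher report".split(),
]

ALPHA = 0.1
vocab, doc_topic, topic_word = fit_lda(docs, num_topics=2, alpha=ALPHA)
proportions = topic_proportions(doc_topic, ALPHA)

for k, words in enumerate(top_words(vocab, topic_word)):
    print(f"topic {k}: {', '.join(words)}")
```

Comparing the rows of proportions across groups of documents (for example, by collection year) is one simple way to look for the kind of content variation the abstract mentions. At the scale of millions of tokens, an optimized library implementation would be a more practical choice than this pure-Python loop.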
Pages: 5