Tobacco use status from clinical notes using Natural Language Processing and rule based algorithm

Cited by: 8
Authors
Hegde, Harshad [1 ]
Shimpi, Neel [1 ]
Glurich, Ingrid [1 ]
Acharya, Amit [1 ]
Affiliations
[1] Marshfield Clin Fdn Med Res & Educ, Marshfield Clin Res Inst, Ctr Oral & Syst Hlth, 1000 North Oak Ave, Marshfield, WI 54449 USA
Keywords
Data mining; decision support systems clinical; health information systems; smoking; electronic health records; information storage and retrieval; SMOKING; INFORMATION;
DOI
10.3233/THC-171127
Chinese Library Classification number
R19 [Health care organizations and services (health services administration)]
Abstract
BACKGROUND: This cross-sectional retrospective study used Natural Language Processing (NLP) to extract tobacco-use-associated variables from clinical notes documented in the Electronic Health Record (EHR). OBJECTIVE: To develop a rule-based algorithm for determining the present status of the patient's tobacco use. METHODS: Clinical notes (n = 5,371 documents) from 363 patients were mined and classified by NLP software into four classes: "Current Smoker", "Past Smoker", "Nonsmoker" and "Unknown". Two coders manually classified these documents into the same four classes (document-level gold standard classification (DLGSC)). The same two coders then derived a tobacco-use status per patient (patient-level gold standard classification (PLGSC)) from the status of that patient's individual documents. The DLGSC and PLGSC were compared with the results derived from the NLP software and the rule-based algorithm, respectively. RESULTS: The initial Cohen's kappa (n = 1,000 documents) was 0.9448 (95% CI = 0.9281-0.9615), indicating strong agreement between the two raters. Subsequently, for 371 documents, the Cohen's kappa was 0.9889 (95% CI = 0.979-1.000). The F-measures for the document-level classification for the four classes were 0.700, 0.753, 0.839 and 0.988, while those for the patient-level classification were 0.580, 0.771, 0.730 and 0.933, respectively. CONCLUSIONS: NLP and the rule-based algorithm exhibited utility for deriving the present tobacco-use status of patients. Current strategies target further improvement in precision to enhance the translational value of the tool.
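The two evaluation statistics reported in the abstract, Cohen's kappa for inter-rater agreement and the per-class F-measure, can be illustrated with a minimal Python sketch. This uses the standard textbook definitions and is not the authors' code; the function names `cohens_kappa` and `f_measure` and any labels passed to them are hypothetical, chosen only for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same documents.

    Illustrative helper (hypothetical name), not taken from the paper.
    """
    n = len(rater_a)
    # Observed agreement: fraction of documents given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in labels) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivial.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def f_measure(gold, predicted, label):
    """F-measure (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == label and p == label for g, p in pairs)
    fp = sum(g != label and p == label for g, p in pairs)
    fn = sum(g == label and p != label for g, p in pairs)
    if tp == 0:
        # No true positives: precision or recall is zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In a setup like the one described, `cohens_kappa` would be applied to the two coders' labels, and `f_measure` once per class with the gold standard as `gold` and the NLP or rule-based output as `predicted`.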
Pages: 445-456
Number of pages: 12