Automated classification of lay health articles using natural language processing: a case study on pregnancy health and postpartum depression

Cited by: 2
Authors
Patra, Braja Gopal [1]
Sun, Zhaoyi [1]
Cheng, Zilin [1]
Kumar, Praneet Kasi Reddy Jagadeesh [1]
Altammami, Abdullah [1]
Liu, Yiyang [1]
Joly, Rochelle [2]
Jedlicka, Caroline [3, 4]
Delgado, Diana [4]
Pathak, Jyotishman [1]
Peng, Yifan [1]
Zhang, Yiye [1]
Affiliations
[1] Weill Cornell Med, Dept Populat Hlth Sci, New York, NY 10065 USA
[2] Weill Cornell Med, Dept Obstet & Gynecol, New York, NY USA
[3] CUNY, Kingsborough Community Coll, New York, NY USA
[4] Weill Cornell Med, Samuel J Wood Lib & CV Starr Biomed Informat Ctr, New York, NY USA
Source
Frontiers in Psychiatry | 2023 / Volume 14
Keywords
online health information; health communication; natural language processing; pregnancy; postpartum depression; INTERNET; STRESS
DOI
10.3389/fpsyt.2023.1258887
Chinese Library Classification
R749 [Psychiatry]
Discipline classification code
100205
Abstract
Objective: Evidence suggests that high-quality health education and effective communication within the framework of social support hold significant potential in preventing postpartum depression. Yet, developing trustworthy and engaging health education and communication materials requires extensive expertise and substantial resources. In light of this, we propose an innovative approach that involves leveraging natural language processing (NLP) to classify publicly accessible lay articles based on their relevance and subject matter to pregnancy and mental health.
Materials and methods: We manually reviewed online lay articles from credible and medically validated sources to create a gold standard corpus. This manual review process categorized the articles based on their pertinence to pregnancy and related subtopics. To streamline and expand the classification procedure for relevance and topics, we employed advanced NLP models such as Random Forest, Bidirectional Encoder Representations from Transformers (BERT), and Generative Pre-trained Transformer model (gpt-3.5-turbo).
Results: The gold standard corpus included 392 pregnancy-related articles. Our manual review process categorized the reading materials according to lifestyle factors associated with postpartum depression: diet, exercise, mental health, and health literacy. A BERT-based model performed best (F1 = 0.974) in an end-to-end classification of relevance and topics. In a two-step approach, given articles already classified as pregnancy-related, gpt-3.5-turbo performed best (F1 = 0.972) in classifying the above topics.
Discussion: Utilizing NLP, we can guide patients to high-quality lay reading materials as cost-effective, readily available health education and communication sources. This approach allows us to scale the information delivery specifically to individuals, enhancing the relevance and impact of the materials provided.
Pages: 7