Automated classification of lay health articles using natural language processing: a case study on pregnancy health and postpartum depression

Cited by: 2
Authors
Patra, Braja Gopal [1]
Sun, Zhaoyi [1]
Cheng, Zilin [1]
Kumar, Praneet Kasi Reddy Jagadeesh [1]
Altammami, Abdullah [1]
Liu, Yiyang [1]
Joly, Rochelle [2]
Jedlicka, Caroline [3, 4]
Delgado, Diana [4]
Pathak, Jyotishman [1]
Peng, Yifan [1]
Zhang, Yiye [1]
Affiliations
[1] Weill Cornell Med, Dept Populat Hlth Sci, New York, NY 10065 USA
[2] Weill Cornell Med, Dept Obstet & Gynecol, New York, NY USA
[3] CUNY, Kingsborough Community Coll, New York, NY USA
[4] Weill Cornell Med, Samuel J Wood Lib & CV Starr Biomed Informat Ctr, New York, NY USA
Source
FRONTIERS IN PSYCHIATRY | 2023 / Volume 14
Keywords
online health information; health communication; natural language processing; pregnancy; postpartum depression; INTERNET; STRESS
DOI
10.3389/fpsyt.2023.1258887
Chinese Library Classification number
R749 [Psychiatry]
Discipline classification code
100205
Abstract
Objective: Evidence suggests that high-quality health education and effective communication within the framework of social support hold significant potential in preventing postpartum depression. Yet developing trustworthy and engaging health education and communication materials requires extensive expertise and substantial resources. We therefore propose using natural language processing (NLP) to classify publicly accessible lay articles by their relevance to pregnancy and mental health and by their subject matter.
Materials and methods: We manually reviewed online lay articles from credible and medically validated sources to create a gold standard corpus. This manual review categorized the articles by their pertinence to pregnancy and related subtopics. To streamline and scale the classification of relevance and topics, we employed Random Forest, Bidirectional Encoder Representations from Transformers (BERT), and a Generative Pre-trained Transformer model (gpt-3.5-turbo).
Results: The gold standard corpus included 392 pregnancy-related articles. Our manual review categorized the reading materials according to lifestyle factors associated with postpartum depression: diet, exercise, mental health, and health literacy. A BERT-based model performed best (F1 = 0.974) in an end-to-end classification of relevance and topics. In a two-step approach, given articles already classified as pregnancy-related, gpt-3.5-turbo performed best (F1 = 0.972) in classifying the above topics.
Discussion: Using NLP, we can guide patients to high-quality lay reading materials as cost-effective, readily available sources of health education and communication. This approach allows information delivery to be scaled and tailored to individuals, enhancing the relevance and impact of the materials provided.
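The two-step setup and the F1 metric mentioned in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the keyword-based stand-ins `toy_is_relevant` and `toy_topic` are hypothetical and are not the Random Forest, BERT, or gpt-3.5-turbo models from the study, the label "not relevant" is invented, and the abstract does not state which F1 averaging was used, so macro-averaging is only one plausible choice.

```python
# Hypothetical keyword lists, used only to make the sketch runnable.
TOPIC_KEYWORDS = {
    "diet": ["diet", "nutrition"],
    "exercise": ["exercise", "workout"],
    "mental health": ["depression", "anxiety"],
    "health literacy": ["guideline", "screening"],
}

def toy_is_relevant(text):
    """Step 1 stand-in: is the article pregnancy-related?"""
    return "pregnan" in text.lower()

def toy_topic(text):
    """Step 2 stand-in: pick the first topic whose keyword appears."""
    lowered = text.lower()
    for topic, words in TOPIC_KEYWORDS.items():
        if any(word in lowered for word in words):
            return topic
    return "health literacy"  # arbitrary fallback for the sketch

def two_step(text, is_relevant=toy_is_relevant, classify_topic=toy_topic):
    """Filter for relevance first, then assign a topic to relevant articles."""
    if not is_relevant(text):
        return "not relevant"
    return classify_topic(text)

def f1_for_label(gold, pred, label):
    """F1 for one label: harmonic mean of precision and recall."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(gold, pred, labels):
    """Unweighted mean of the per-label F1 scores."""
    return sum(f1_for_label(gold, pred, label) for label in labels) / len(labels)
```

An end-to-end classifier would instead predict one label from the combined set (the four topics plus "not relevant") in a single pass; the same `macro_f1` function could score either setup against gold standard labels.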
Pages: 7