Harnessing multimodal approaches for depression detection using large language models and facial expressions

Cited by: 0
Authors
Misha Sadeghi [1]
Robert Richer [1]
Bernhard Egger [2]
Lena Schindler-Gmelch [3]
Lydia Helene Rupp [3]
Farnaz Rahimi [1]
Matthias Berking [3]
Bjoern M. Eskofier [1]
Affiliations
[1] Machine Learning and Data Analytics Lab (MaD Lab), Department Artificial Intelligence in Biomedical Engineering (AIBE), Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU)
[2] Chair of Visual Computing (LGDV), Department of Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU)
[3] Chair of Clinical Psychology and Psychotherapy (KliPs), Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU)
[4] Translational Digital Health Group, Institute of AI for Health
[5] Helmholtz Zentrum München - German Research Center for Environmental Health
DOI: 10.1038/s44184-024-00112-8
Abstract
Detecting depression is a critical component of mental health diagnosis, and accurate assessment is essential for effective treatment. This study introduces a novel, fully automated approach to predicting depression severity using the E-DAIC dataset. We employ Large Language Models (LLMs) to extract depression-related indicators from interview transcripts, using the Patient Health Questionnaire-8 (PHQ-8) score to train the prediction model. Additionally, facial data extracted from video frames is integrated with the textual data to create a multimodal model for depression severity prediction. We evaluate three approaches: text-based features, facial features, and a combination of both. Our findings show that the best results are achieved by enhancing the text data with a speech quality assessment, reaching a mean absolute error of 2.85 and a root mean square error of 4.02. This study underscores the potential of automated depression detection, showing that text-only models are robust and effective while paving the way for multimodal analysis.
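To make the multimodal setup concrete, the sketch below shows one way such a pipeline could be wired up. It is a minimal illustration, not the authors' implementation: the feature arrays, their dimensions, and the random-forest regressor are placeholder assumptions standing in for LLM-derived transcript indicators and per-frame facial descriptors, while the feature-level fusion and the MAE/RMSE evaluation mirror the metrics reported in the abstract.

    # Minimal sketch (not the paper's pipeline): fuse hypothetical LLM-derived text
    # features with hypothetical facial features and regress PHQ-8 scores, scoring
    # the result with the same metrics the paper reports (MAE, RMSE).
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import mean_absolute_error, mean_squared_error
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)

    # Placeholder data: in practice, text_features would come from prompting an LLM
    # on interview transcripts and facial_features from video-frame descriptors.
    n_participants = 200
    text_features = rng.normal(size=(n_participants, 16))    # hypothetical text indicators
    facial_features = rng.normal(size=(n_participants, 32))  # hypothetical facial statistics
    phq8 = rng.integers(0, 25, size=n_participants)           # PHQ-8 scores range from 0 to 24

    # Feature-level (early) fusion: concatenate both modalities per participant.
    X = np.concatenate([text_features, facial_features], axis=1)
    X_train, X_test, y_train, y_test = train_test_split(X, phq8, test_size=0.2, random_state=0)

    model = RandomForestRegressor(random_state=0).fit(X_train, y_train)
    pred = model.predict(X_test)

    mae = mean_absolute_error(y_test, pred)
    rmse = np.sqrt(mean_squared_error(y_test, pred))
    print(f"MAE: {mae:.2f}, RMSE: {rmse:.2f}")

Concatenating modality features before regression is only one plausible design; the paper's comparison of text-only, facial-only, and combined models could equally be realized by training a separate regressor per modality and merging their predictions.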