Harnessing multimodal approaches for depression detection using large language models and facial expressions

Cited by: 0
Authors
Misha Sadeghi [1 ]
Robert Richer [1 ]
Bernhard Egger [2 ]
Lena Schindler-Gmelch [3 ]
Lydia Helene Rupp [3 ]
Farnaz Rahimi [1 ]
Matthias Berking [3 ]
Bjoern M. Eskofier [1 ]
Affiliations
[1] Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Machine Learning and Data Analytics Lab (MaD Lab), Department Artificial Intelligence in Biomedical Engineering (AIBE)
[2] Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Chair of Visual Computing (LGDV), Department of Computer Science
[3] Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Chair of Clinical Psychology and Psychotherapy (KliPs)
[4] Institute of AI for Health, Translational Digital Health Group
[5] Helmholtz Zentrum München - German Research Center for Environmental Health
DOI
10.1038/s44184-024-00112-8
Abstract
Detecting depression is a critical component of mental health diagnosis, and accurate assessment is essential for effective treatment. This study introduces a novel, fully automated approach to predicting depression severity using the E-DAIC dataset. We employ Large Language Models (LLMs) to extract depression-related indicators from interview transcripts, utilizing the Patient Health Questionnaire-8 (PHQ-8) score to train the prediction model. Additionally, facial data extracted from video frames is integrated with textual data to create a multimodal model for depression severity prediction. We evaluate three approaches: text-based features, facial features, and a combination of both. Our findings show the best results are achieved by enhancing text data with speech quality assessment, with a mean absolute error of 2.85 and root mean square error of 4.02. This study underscores the potential of automated depression detection, showing text-only models as robust and effective while paving the way for multimodal analysis.
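To make the pipeline described in the abstract concrete, the following is a minimal, illustrative Python sketch, not the authors' published implementation: it fuses hypothetical LLM-derived text features and facial features by simple concatenation, fits a regressor to PHQ-8 scores, and reports MAE and RMSE, the metrics cited above. All array names, dimensions, and the choice of ridge regression are assumptions for illustration only.

```python
# Illustrative sketch (not the paper's code) of multimodal PHQ-8 severity
# regression with MAE/RMSE evaluation. Feature arrays below are placeholders.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
n = 200                                   # hypothetical number of interview sessions
text_feats = rng.normal(size=(n, 32))     # e.g., LLM-derived indicators from transcripts
face_feats = rng.normal(size=(n, 16))     # e.g., per-session facial expression statistics
phq8 = rng.integers(0, 25, size=n)        # PHQ-8 total scores (range 0-24)

X = np.hstack([text_feats, face_feats])   # simple early fusion of both modalities
X_tr, X_te, y_tr, y_te = train_test_split(X, phq8, test_size=0.2, random_state=0)

model = Ridge(alpha=1.0).fit(X_tr, y_tr)  # placeholder regressor
pred = model.predict(X_te)

mae = mean_absolute_error(y_te, pred)
rmse = float(np.sqrt(np.mean((y_te - pred) ** 2)))
print(f"MAE={mae:.2f}  RMSE={rmse:.2f}")  # the paper reports MAE 2.85 / RMSE 4.02 for its best model
```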
Related papers
50 records in total
  • [31] AutoDep: automatic depression detection using facial expressions based on linear binary pattern descriptor
    Tadalagi, Manjunath
    Joshi, Amit M.
    MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING, 2021, 59 (06) : 1339 - 1354
  • [32] Harnessing the Power of Large Language Models in Agricultural Safety & Health
    Shutske, John M.
    JOURNAL OF AGRICULTURAL SAFETY AND HEALTH, 2023, 29 (04) : 205 - 224
  • [33] Omega - harnessing the power of large language models for bioimage analysis
    Royer, Loic A.
    NATURE METHODS, 2024, 21 (08) : 1371 - 1373
  • [34] Clinical decision support for bipolar depression using large language models
    Perlis, Roy H.
    Goldberg, Joseph F.
    Ostacher, Michael J.
    Schneck, Christopher D.
    NEUROPSYCHOPHARMACOLOGY, 2024, 49 (09) : 1412 - 1416
  • [35] The Detection of Depression Using Multimodal Models Based on Text and Voice Quality Features
    Solieman, Hanadi
    Pustozerov, Evgenii A.
    PROCEEDINGS OF THE 2021 IEEE CONFERENCE OF RUSSIAN YOUNG RESEARCHERS IN ELECTRICAL AND ELECTRONIC ENGINEERING (ELCONRUS), 2021, : 1843 - 1848
  • [36] TSFFM: Depression detection based on latent association of facial and body expressions
    Li, Xingyun
    Yi, Xinyu
    Lu, Lin
    Wang, Hao
    Zheng, Yunshao
    Han, Mengmeng
    Wang, Qingxiang
    COMPUTERS IN BIOLOGY AND MEDICINE, 2024, 168
  • [37] Investigating the Catastrophic Forgetting in Multimodal Large Language Models
    Zhai, Yuexiang
    Tong, Shengbang
    Li, Xiao
    Cai, Mu
    Qu, Qing
    Lee, Yong Jae
    Ma, Yi
    CONFERENCE ON PARSIMONY AND LEARNING, VOL 234, 2024, 234 : 202 - 227
  • [38] Woodpecker: hallucination correction for multimodal large language models
    Yin, Shukang
    Fu, Chaoyou
    Zhao, Sirui
    Xu, Tong
    Wang, Hao
    Sui, Dianbo
    Shen, Yunhang
    Li, Ke
    Sun, Xing
    Chen, Enhong
    SCIENCE CHINA INFORMATION SCIENCES, 2024, 67 (12) : 52 - 64
  • [39] A Survey on Multimodal Large Language Models for Autonomous Driving
    Cui, Can
    Ma, Yunsheng
    Cao, Xu
    Ye, Wenqian
    Zhou, Yang
    Liang, Kaizhao
    Chen, Jintai
    Lu, Juanwu
    Yang, Zichong
    Liao, Kuei-Da
    Gao, Tianren
    Li, Erlong
    Tang, Kun
    Cao, Zhipeng
    Zhou, Tong
    Liu, Ao
    Yan, Xinrui
    Mei, Shuqi
    Cao, Jianguo
    Wang, Ziran
    Zheng, Chao
    2024 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WORKSHOPS, WACVW 2024, 2024, : 958 - 979