An Enhanced Cross-Attention Based Multimodal Model for Depression Detection

Cited by: 0
Authors
Kou, Yifan [1]
Ge, Fangzhen [1,2]
Chen, Debao [2,3]
Shen, Longfeng [1,2,4]
Liu, Huaiyu [1]
Affiliations
[1] School of Computer Science and Technology, Huaibei Normal University, Huaibei, China
[2] Anhui Engineering Research Center for Intelligent Computing and Application on Cognitive Behavior (ICACB), Anhui, Huaibei, China
[3] School of Physics and Electronic Information, Huaibei Normal University, Huaibei, China
[4] Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
Funding
National Natural Science Foundation of China
Keywords
Deep learning - Neural networks;
DOI
10.1111/coin.70019
Abstract
Depression, a prevalent mental disorder in modern society, significantly impacts people's daily lives. Recently, there have been advancements in developing automated diagnosis models for detecting depression. However, data scarcity, primarily due to privacy concerns, has posed a challenge. Traditional speech features have limitations in representing knowledge for depression diagnosis, and the complexity of deep learning algorithms necessitates substantial data support. Furthermore, existing multimodal methods based on neural networks overlook the heterogeneity gap between different modalities, potentially resulting in redundant information. To address these issues, we propose a multimodal depression detection model based on the Enhanced Cross-Attention (ECA) Mechanism. This model effectively explores text-speech interactions while considering modality heterogeneity. Data scarcity has been mitigated by fine-tuning pre-trained models. Additionally, we design a modal fusion module based on ECA, which emphasizes similarity responses and updates the weight of each modal feature based on the similarity information between modal features. Furthermore, for speech feature extraction, we have reduced the computational complexity of the model by integrating a multi-window self-attention mechanism with the Fourier transform. The proposed model is evaluated on the public dataset, DAIC-WOZ, achieving an accuracy of 80.0% and an average F1 value improvement of 4.3% compared with relevant methods. © 2025 Wiley Periodicals LLC.
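As a rough illustration of the text-speech interaction the abstract describes, the sketch below shows plain scaled dot-product cross-attention between two modalities in NumPy. It is not the paper's Enhanced Cross-Attention module: the feature shapes, the random inputs, the mean pooling, and the names `cross_attention`, `text_feats`, and `speech_feats` are all assumptions made for this example, and the similarity-based reweighting and Fourier-based speech encoder mentioned in the abstract are not reproduced.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(query_feats, context_feats):
    """Generic scaled dot-product cross-attention (illustrative only).

    Each row of query_feats attends over all rows of context_feats.
    Returns the attended features and the attention weights.
    """
    d = query_feats.shape[-1]
    scores = query_feats @ context_feats.T / np.sqrt(d)  # pairwise similarity
    weights = softmax(scores, axis=-1)                   # each row sums to 1
    return weights @ context_feats, weights


# Hypothetical inputs: 5 text tokens and 8 speech frames, both 16-dimensional.
rng = np.random.default_rng(0)
text_feats = rng.normal(size=(5, 16))
speech_feats = rng.normal(size=(8, 16))

# Text attends to speech, and speech attends to text.
text_attended, text_to_speech = cross_attention(text_feats, speech_feats)
speech_attended, speech_to_text = cross_attention(speech_feats, text_feats)

# Assumed fusion step: mean-pool each stream and concatenate.
fused = np.concatenate([text_attended.mean(axis=0), speech_attended.mean(axis=0)])
print(fused.shape)
```

In this sketch the attention weights are driven by dot-product similarity between the two modalities, which is the general idea behind weighting one modality's features by their similarity to the other's; the published model's actual weighting scheme may differ.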
Related papers
50 in total
  • [41] RECA: Relation Extraction Based on Cross-Attention Neural Network
    Huang, Xiaofeng
    Guo, Zhiqiang
    Zhang, Jialiang
    Cao, Hui
    Yang, Jie
    ELECTRONICS, 2022, 11 (14)
  • [42] Cross-Attention Transformer for Video Interpolation
    Kim, Hannah Halin
    Yu, Shuzhi
    Yuan, Shuai
    Tomasi, Carlo
    COMPUTER VISION - ACCV 2022 WORKSHOPS, 2023, 13848 : 325 - 342
  • [43] Eye Movement Attention Based Depression Detection Model
    Zhao, Ju
    Wang, Qingxiang
    2022 IEEE 9TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA), 2022, : 1062 - 1063
  • [44] Image Caption with Synchronous Cross-Attention
    Wang, Yue
    Liu, Jinlai
    Wang, Xiaojie
    PROCEEDINGS OF THE THEMATIC WORKSHOPS OF ACM MULTIMEDIA 2017 (THEMATIC WORKSHOPS'17), 2017, : 433 - 441
  • [45] A wind power forecasting model based on data decomposition and cross-attention mechanism with cosine similarity
    Jiang, Li
    Wang, Yifan
    ELECTRIC POWER SYSTEMS RESEARCH, 2024, 229
  • [46] Temporal Cross-Attention for Action Recognition
    Hashiguchi, Ryota
    Tamaki, Toru
    COMPUTER VISION - ACCV 2022 WORKSHOPS, 2023, 13848 : 283 - 294
  • [47] A depression detection model based on multimodal graph neural network
    Xia, Yujing
    Liu, Lin
    Dong, Tao
    Chen, Juan
    Cheng, Yu
    Tang, Lin
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (23) : 63379 - 63395
  • [48] AN INTELLIGENT DEPRESSION DETECTION MODEL BASED ON MULTIMODAL FUSION TECHNOLOGY
    Cheng, Zixuan
    Huang, Xisheng
    Ding, Yang
    JOURNAL OF MECHANICS IN MEDICINE AND BIOLOGY, 2024,
  • [49] SCAD: A Siamese Cross-Attention Discrimination Network for Bitemporal Building Change Detection
    Xu, Chuan
    Ye, Zhaoyi
    Mei, Liye
    Shen, Sen
    Zhang, Qi
    Sui, Haigang
    Yang, Wei
    Sun, Shaohua
    REMOTE SENSING, 2022, 14 (24)
  • [50] Robust Image Watermarking based on Cross-Attention and Invariant Domain Learning
    Dasgupta, Agnibh
    Thong, Xin
    2023 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE, CSCI 2023, 2023, : 1125 - 1132