An Enhanced Cross-Attention Based Multimodal Model for Depression Detection

Cited by: 0
Authors
Kou, Yifan [1]
Ge, Fangzhen [1, 2]
Chen, Debao [2, 3]
Shen, Longfeng [1, 2, 4]
Liu, Huaiyu [1]
Affiliations
[1] School of Computer Science and Technology, Huaibei Normal University, Huaibei, China
[2] Anhui Engineering Research Center for Intelligent Computing and Application on Cognitive Behavior (ICACB), Anhui, Huaibei, China
[3] School of Physics and Electronic Information, Huaibei Normal University, Huaibei, China
[4] Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
Funding
National Natural Science Foundation of China
Keywords
Deep learning - Neural networks;
DOI
10.1111/coin.70019
Abstract
Depression, a prevalent mental disorder in modern society, significantly impacts people's daily lives. Recently, there have been advancements in developing automated diagnosis models for detecting depression. However, data scarcity, primarily due to privacy concerns, has posed a challenge. Traditional speech features have limitations in representing knowledge for depression diagnosis, and the complexity of deep learning algorithms necessitates substantial data support. Furthermore, existing multimodal methods based on neural networks overlook the heterogeneity gap between different modalities, potentially resulting in redundant information. To address these issues, we propose a multimodal depression detection model based on the Enhanced Cross-Attention (ECA) Mechanism. This model effectively explores text-speech interactions while considering modality heterogeneity. Data scarcity has been mitigated by fine-tuning pre-trained models. Additionally, we design a modal fusion module based on ECA, which emphasizes similarity responses and updates the weight of each modal feature based on the similarity information between modal features. Furthermore, for speech feature extraction, we have reduced the computational complexity of the model by integrating a multi-window self-attention mechanism with the Fourier transform. The proposed model is evaluated on the public dataset, DAIC-WOZ, achieving an accuracy of 80.0% and an average F1 value improvement of 4.3% compared with relevant methods. © 2025 Wiley Periodicals LLC.
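The abstract describes a fusion module that reweights each modality's features according to the similarity between text and speech features. The record does not give the ECA formulation, so the following is only a minimal sketch of generic scaled dot-product cross-attention, not the authors' method. The function names, the feature dimensions, the random inputs, and the mean-pooling fusion step are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values):
    """Generic scaled dot-product cross-attention without learned projections.

    queries:     (n_q, d) features of one modality
    keys_values: (n_kv, d) features of the other modality
    Returns the attended features (n_q, d) and the attention weights (n_q, n_kv).
    """
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)  # pairwise similarity
    weights = softmax(scores, axis=-1)             # each row sums to 1
    return weights @ keys_values, weights

# Hypothetical inputs: 4 text tokens and 6 speech frames, both 8-dimensional.
rng = np.random.default_rng(0)
text_feats = rng.normal(size=(4, 8))
speech_feats = rng.normal(size=(6, 8))

# Text attends to speech, and speech attends to text.
text_attended, text_to_speech = cross_attention(text_feats, speech_feats)
speech_attended, speech_to_text = cross_attention(speech_feats, text_feats)

# Assumed fusion step: mean-pool each attended sequence and concatenate.
fused = np.concatenate([text_attended.mean(axis=0), speech_attended.mean(axis=0)])
```

In this sketch the attention weights play the role of the similarity information: text tokens that resemble particular speech frames draw more of their updated representation from those frames, and vice versa. A trained model would add learned query, key, and value projections, which are omitted here.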