MTDAN: A Lightweight Multi-Scale Temporal Difference Attention Networks for Automated Video Depression Detection

Cited by: 6
Authors
Zhang, Shiqing [1]
Zhang, Xingnan [1, 2]
Zhao, Xiaoming [1]
Fang, Jiangxiong [1]
Niu, Mingyue [3]
Zhao, Ziping [3]
Yu, Jun [4]
Tian, Qi [5]
Affiliations
[1] Taizhou Univ, Inst Intelligent Informat Proc, Taizhou 318000, Peoples R China
[2] Zhejiang Sci Tech Univ, Sch Informat Sci & Technol, Hangzhou 310023, Peoples R China
[3] Tianjin Normal Univ TJNU, Sch Comp & Informat Engn, Tianjin 300387, Peoples R China
[4] Hangzhou Dianzi Univ, Sch Comp Sci, Hangzhou 310018, Peoples R China
[5] Huawei Cloud & AI, Shenzhen 518129, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Depression; Behavioral sciences; Feature extraction; Deep learning; Computational modeling; Task analysis; Computational complexity; video depression detection; temporal difference; attention; multi-scale; computational complexity;
DOI
10.1109/TAFFC.2023.3312263
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep learning-based video depression analysis has recently become an interesting and challenging topic. Most existing works focus on learning single-scale facial dynamics of participants for depression detection. In addition, they usually adopt expensive deep learning models with high computational complexity, which makes real-time clinical application difficult. To address these two issues, this work proposes a lightweight Multi-scale Temporal Difference Attention Networks (MTDAN) that integrates temporal difference and attention mechanisms to model both short-term and long-term temporal facial behaviors for automated video depression detection. First, two simple yet effective sub-branches, i.e., a Short-term Temporal Difference Attention Network (ST-TDAN) and a Long-term Temporal Difference Attention Network (LT-TDAN), are designed to model short-term and long-term depressive behaviors, respectively. Then, a simple Interactive Multi-head Attention Fusion (IMHAF) strategy is employed to integrate the short-term and long-term spatiotemporal features, followed by a linear fully-connected layer for depression score prediction. Experiments on two public datasets, AVEC2013 and AVEC2014, show that the proposed method not only achieves performance highly competitive with state-of-the-art methods, but also has much lower computational complexity on video depression detection tasks.
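The pipeline described in the abstract (temporal differences at two time scales, attention-based fusion, then a linear layer that outputs a score) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the stride values, the single-head attention (the paper uses multi-head attention), the mean pooling, the random weights, and all function names are assumptions chosen only for illustration, and the learned sub-networks ST-TDAN and LT-TDAN are omitted entirely.

```python
import numpy as np

def temporal_difference(frames, stride):
    """Difference between per-frame features that are `stride` steps apart."""
    return frames[stride:] - frames[:-stride]

def softmax(x, axis=-1):
    """Numerically stable softmax along `axis`."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query, context):
    """Single-head scaled dot-product attention: `query` attends to `context`."""
    d = query.shape[-1]
    weights = softmax(query @ context.T / np.sqrt(d), axis=-1)
    return weights @ context

def predict_score(frames, w, b, short_stride=1, long_stride=8):
    """Hypothetical two-branch pipeline; strides are assumed values."""
    short = temporal_difference(frames, short_stride)  # short-term motion cues
    long = temporal_difference(frames, long_stride)    # long-term motion cues
    # Each branch attends to the other, then is mean-pooled over time.
    short_fused = cross_attention(short, long).mean(axis=0)
    long_fused = cross_attention(long, short).mean(axis=0)
    fused = np.concatenate([short_fused, long_fused])
    # Linear layer mapping the fused feature to a scalar score.
    return float(fused @ w + b)

# Toy input: 32 frames, each with a 16-dimensional feature vector.
rng = np.random.default_rng(0)
frames = rng.normal(size=(32, 16))
w = rng.normal(size=32)  # untrained weights, for shape checking only
score = predict_score(frames, w, b=0.0)
```

With untrained random weights the resulting score is meaningless; the sketch only shows how features from two temporal scales can be combined into a single regression output.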
Pages: 1078-1089
Page count: 12
Related papers
50 in total
  • [1] Lightweight Seizure Detection Based on Multi-Scale Channel Attention
    Wang, Ziwei
    Hou, Sujuan
    Xiao, Tiantian
    Zhang, Yongfeng
    Lv, Hongbin
    Li, Jiacheng
    Zhao, Shanshan
    Zhao, Yanna
    INTERNATIONAL JOURNAL OF NEURAL SYSTEMS, 2023, 33 (12)
  • [2] Steel defect detection based on multi-scale lightweight attention
    Zhou Y.
    Meng J.-N.
    Wang D.-L.
    Tan Y.-Q.
    Kongzhi yu Juece/Control and Decision, 2024, 39 (03) : 901 - 909
  • [3] Multi-Scale Temporal Convolutional Networks and Multi-Head Attention for Robust Log Anomaly Detection
    Zhang, Zhigang
    Li, Wei
    Wang, Yizhe
    Wang, Zhe
    Sheng, Xiang
    Zhou, Tianxiang
    INFORMATION TECHNOLOGY AND CONTROL, 2024, 53 (03):
  • [4] Multi-Scale Spatio-Temporal Memory Network for Lightweight Video Denoising
    Sun, Lu
    Wu, Fangfang
    Ding, Wei
    Li, Xin
    Lin, Jie
    Dong, Weisheng
    Shi, Guangming
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 5810 - 5823
  • [5] Truncated attention-aware proposal networks with multi-scale dilation for temporal action detection
    Li, Ping
    Cao, Jiachen
    Yuan, Li
    Ye, Qinghao
    Xu, Xianghua
    PATTERN RECOGNITION, 2023, 142
  • [6] Multi-scale spatial-temporal attention graph convolutional networks for driver fatigue detection
    Fa, Shuxiang
    Yang, Xiaohui
    Han, Shiyuan
    Feng, Zhiquan
    Chen, Yuehui
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2023, 93
  • [7] Multi-Scale Attention Generative Adversarial Networks for Video Frame Interpolation
    Xiao, Jian
    Bi, Xiaojun
    IEEE ACCESS, 2020, 8 : 94842 - 94851
  • [8] Lightweight multi-scale residual networks with attention for image super-resolution
    Liu, Huan
    Cao, Feilong
    Wen, Chenglin
    Zhang, Qinghua
    KNOWLEDGE-BASED SYSTEMS, 2020, 203
  • [9] Pulmonary nodules detection based on multi-scale attention networks
    Zhang, Hui
    Peng, Yanjun
    Guo, Yanfei
    SCIENTIFIC REPORTS, 2022, 12 (01)