ENCODING TEMPORAL INFORMATION FOR AUTOMATIC DEPRESSION RECOGNITION FROM FACIAL ANALYSIS

Cited by: 0
Authors
de Melo, Wheidima Carneiro [1]
Granger, Eric [2]
Lopez, Miguel Bordallo [1,3]
Affiliations
[1] Univ Oulu, Ctr Machine Vis & Signal Anal CMVS, Oulu, Finland
[2] Ecole Technol Super, Dept Syst Engn, LIVIA, Montreal, PQ, Canada
[3] VTT Tech Res Ctr Finland, Espoo, Finland
Funding
Academy of Finland
Keywords
Affective Computing; Depression Detection; Expression Recognition; Temporal Pooling; Two-stream Model;
DOI
10.1109/icassp40776.2020.9054375
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Depression is a mental illness that may be harmful to an individual's health. Using deep learning models to recognize the facial expressions of individuals captured in videos has shown promising results for automatic depression detection. Typically, depression levels are recognized using 2D Convolutional Neural Networks (CNNs) that are trained to extract static features from video frames, which impairs the capture of dynamic spatio-temporal relations. As an alternative, 3D CNNs may be employed to extract spatio-temporal features from short video clips, although the risk of overfitting increases due to the limited availability of labeled depression video data. To address these issues, we propose a novel temporal pooling method to capture and encode the spatio-temporal dynamics of video clips into an image map. This approach allows fine-tuning a pre-trained 2D CNN to model facial variations, thereby improving the training process and model accuracy. Our proposed method is based on a two-stream model that performs late fusion of appearance and dynamic information. Extensive experiments on two benchmark AVEC datasets indicate that the proposed method is efficient and outperforms state-of-the-art schemes.
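To make the two ideas in the abstract concrete, here is a minimal sketch of collapsing a clip into a single image map and of fusing two stream outputs late. The abstract does not specify the pooling operator or the fusion rule, so everything here is an assumption for illustration: the linear frame weights, the min-max rescaling, the equal-weight averaging, and the function names `temporal_pool` and `late_fusion` are hypothetical and are not the authors' method.

```python
import numpy as np


def temporal_pool(clip):
    """Collapse a clip of shape (T, H, W, C) into one (H, W, C) image map.

    Assumption: linear weights alpha_t = 2t - T - 1 (t = 1..T), a simple
    stand-in that contrasts later frames with earlier ones. The pooling
    operator actually used in the paper is not given in the abstract.
    """
    clip = np.asarray(clip, dtype=np.float64)
    num_frames = clip.shape[0]
    alpha = 2.0 * np.arange(1, num_frames + 1) - num_frames - 1
    # Weighted sum over the time axis.
    pooled = np.tensordot(alpha, clip, axes=(0, 0))
    # Rescale to [0, 255] so the map can be treated like an ordinary image.
    lo, hi = pooled.min(), pooled.max()
    if hi == lo:
        # No temporal variation (or a constant map): return an all-zero image.
        return np.zeros_like(pooled)
    return 255.0 * (pooled - lo) / (hi - lo)


def late_fusion(appearance_score, dynamic_score, weight=0.5):
    """Fuse the two stream predictions by a weighted average (assumed rule)."""
    return weight * appearance_score + (1.0 - weight) * dynamic_score


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clip = rng.integers(0, 256, size=(16, 112, 112, 3))  # hypothetical clip size
    image_map = temporal_pool(clip)
    print(image_map.shape)
    print(late_fusion(10.0, 20.0))
```

In a setup like this, the image map would feed the dynamic stream of a pre-trained 2D CNN, a single video frame would feed the appearance stream, and the two predicted depression scores would be combined only at the end.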
Pages: 1080-1084
Number of pages: 5