Multimodal Depression Detection: Fusion Analysis of Paralinguistic, Head Pose and Eye Gaze Behaviors

Cited by: 92
Authors
Alghowinem, Sharifa [1 ]
Goecke, Roland [2 ]
Wagner, Michael [2 ,3 ,4 ,5 ]
Epps, Julien [6 ]
Hyett, Matthew [6 ]
Parker, Gordon [6 ]
Breakspear, Michael [7 ,8 ]
Affiliations
[1] Prince Sultan Univ, Riyadh 11586, Saudi Arabia
[2] Univ Canberra, Canberra, ACT 2617, Australia
[3] Australian Natl Univ, Canberra, ACT 0200, Australia
[4] Natl Ctr Biometric Studies Pty Ltd, Canberra, ACT 2600, Australia
[5] Tech Univ Berlin, D-10623 Berlin, Germany
[6] Univ New South Wales, Sydney, NSW 2052, Australia
[7] QIMR Berghofer Med Res Inst, Brisbane, Qld 400, Australia
[8] Metro North Mental Hlth Serv, Brisbane, Qld 4029, Australia
Funding
Australian Research Council
Keywords
Depression detection; multimodal fusion; speaking behaviour; eye activity; head pose; audio
DOI
10.1109/TAFFC.2016.2634527
CLC classification
TP18 [Artificial Intelligence Theory]
Subject classification
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
An estimated 350 million people worldwide are affected by depression. Using affective sensing technology, our long-term goal is to develop an objective multimodal system that augments clinical opinion during the diagnosis and monitoring of clinical depression. This paper steps towards developing a classification system-oriented approach, where feature selection, classification and fusion-based experiments are conducted to infer which types of behaviour (verbal and nonverbal) and behaviour combinations can best discriminate between depression and non-depression. Using statistical features extracted from speaking behaviour, eye activity, and head pose, we characterise the behaviour associated with major depression and examine the performance of classification for the individual modalities and when fused. Using a real-world, clinically validated dataset of 30 severely depressed patients and 30 healthy control subjects, a Support Vector Machine is used for classification with several feature selection techniques. Given the statistical nature of the extracted features, feature selection based on T-tests performed better than other methods. Individual modality classification results were considerably higher than chance level (83 percent for speech, 73 percent for eye, and 63 percent for head). Fusing all modalities shows a remarkable improvement compared to unimodal systems, which demonstrates the complementary nature of the modalities. Among the different fusion approaches used here, feature fusion performed best with up to 88 percent average accuracy. We believe this is due to the compatible nature of the extracted statistical features.
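The pipeline described in the abstract (T-test-based feature selection, per-modality SVM classification, and feature-level fusion by concatenation) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the feature dimensions, effect sizes, and significance threshold are assumptions made for the example.

```python
# Hypothetical sketch of the described pipeline: two-sample T-test feature
# selection, unimodal linear-SVM classification, then feature-level fusion.
# All data below is synthetic; dimensions and shifts are illustrative only.
import numpy as np
from scipy.stats import ttest_ind
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 60  # 30 depressed + 30 controls, matching the paper's dataset size
y = np.repeat([0, 1], n // 2)

# Synthetic statistical features per modality (dimensions are made up);
# class means differ more for "speech" to mimic its stronger unimodal result.
modalities = {
    "speech": rng.normal(y[:, None] * 0.8, 1.0, size=(n, 40)),
    "eye":    rng.normal(y[:, None] * 0.5, 1.0, size=(n, 20)),
    "head":   rng.normal(y[:, None] * 0.3, 1.0, size=(n, 10)),
}

def ttest_select(X, y, alpha=0.05):
    """Keep features whose two-sample T-test p-value is below alpha."""
    _, p = ttest_ind(X[y == 0], X[y == 1], axis=0)
    mask = p < alpha
    return X[:, mask] if mask.any() else X  # fall back if nothing passes

# Unimodal classification with a linear SVM, 5-fold cross-validation
for name, X in modalities.items():
    Xs = ttest_select(X, y)
    acc = cross_val_score(SVC(kernel="linear"), Xs, y, cv=5).mean()
    print(f"{name}: {acc:.2f}")

# Feature fusion: concatenate the selected features from all modalities
X_fused = np.hstack([ttest_select(X, y) for X in modalities.values()])
fused_acc = cross_val_score(SVC(kernel="linear"), X_fused, y, cv=5).mean()
print(f"fused: {fused_acc:.2f}")
```

Because all modalities yield statistical features on comparable scales, simple concatenation (feature fusion) is directly applicable, which is consistent with the paper's observation that feature fusion outperformed the other fusion strategies.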
Pages: 478-490 (13 pages)