Multimodal Deep Learning Framework for Mental Disorder Recognition

Cited by: 35
Authors
Zhang, Ziheng [1 ,4 ]
Lin, Weizhe [2 ]
Liu, Mingyu [3 ]
Mahmoud, Marwa [1 ]
Affiliations
[1] Univ Cambridge, Dept Comp Sci & Technol, Cambridge, England
[2] Univ Cambridge, Dept Engn, Cambridge, England
[3] Univ Oxford, Dept Phys, Oxford, England
[4] Tencent Jarvis Lab, Shenzhen, Peoples R China
DOI: 10.1109/FG47880.2020.00033
CLC number: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
Current methods for mental disorder recognition mostly depend on clinical interviews and self-reported scores, which can be highly subjective. Building an automatic recognition system can help detect symptoms early and provide insights into biological markers for diagnosis. It is, however, a challenging task, as it requires taking into account indicators from different modalities such as facial expressions, gestures, acoustic features and verbal content. To address this issue, we propose a general-purpose multimodal deep learning framework in which multiple modalities, including acoustic, visual and textual features, are processed individually while cross-modality correlations are taken into account. Specifically, a Multimodal Deep Denoising Autoencoder (multi-DDAE) is designed to obtain multimodal representations of audio-visual features, followed by Fisher Vector encoding to produce session-level descriptors. For the textual modality, a Paragraph Vector (PV) model embeds the transcripts of interview sessions into document representations that capture cues related to mental disorders. Following an early fusion strategy, the audio-visual and textual features are then fused before being fed to a Multitask Deep Neural Network (DNN) as the final classifier. Our framework is evaluated on the automatic detection of two mental disorders, bipolar disorder (BD) and depression, using two datasets: the Bipolar Disorder Corpus (BDC) and the Extended Distress Analysis Interview Corpus (E-DAIC), respectively. Experimental results show performance comparable to the state of the art in both BD and depression detection, demonstrating effective multimodal representation learning and the capability to generalise across different mental disorders.
Pages: 344-350
Number of pages: 7
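
To make the pipeline described in the abstract concrete, below is a minimal PyTorch sketch of two of its components: a multimodal deep denoising autoencoder that maps audio-visual frame features into a shared representation, and a multitask DNN that classifies early-fused audio-visual and textual descriptors. All dimensions, layer sizes, names (MultiDDAE, MultitaskDNN, noise_std) and the two-task head layout are illustrative assumptions, not the authors' exact configuration; the Fisher Vector encoding of frame-level representations into session-level descriptors is only indicated in a comment.

import torch
import torch.nn as nn

class MultiDDAE(nn.Module):
    """Multimodal deep denoising autoencoder (sketch): modality-specific encoders
    feed a shared bottleneck; decoders reconstruct clean inputs from noisy ones."""
    def __init__(self, audio_dim=88, visual_dim=49, hidden=128, shared=64):
        super().__init__()
        self.enc_audio = nn.Sequential(nn.Linear(audio_dim, hidden), nn.ReLU())
        self.enc_visual = nn.Sequential(nn.Linear(visual_dim, hidden), nn.ReLU())
        self.shared = nn.Sequential(nn.Linear(2 * hidden, shared), nn.ReLU())
        self.dec_audio = nn.Sequential(nn.Linear(shared, hidden), nn.ReLU(),
                                       nn.Linear(hidden, audio_dim))
        self.dec_visual = nn.Sequential(nn.Linear(shared, hidden), nn.ReLU(),
                                        nn.Linear(hidden, visual_dim))

    def forward(self, audio, visual, noise_std=0.1):
        # Denoising objective: corrupt the inputs, reconstruct the clean versions.
        a_noisy = audio + noise_std * torch.randn_like(audio)
        v_noisy = visual + noise_std * torch.randn_like(visual)
        z = self.shared(torch.cat([self.enc_audio(a_noisy),
                                   self.enc_visual(v_noisy)], dim=-1))
        return z, self.dec_audio(z), self.dec_visual(z)

class MultitaskDNN(nn.Module):
    """Early fusion of audio-visual and textual descriptors, then a shared trunk
    with one output head per task (hypothetical two-task setup)."""
    def __init__(self, av_dim=64, text_dim=100, hidden=128, task_classes=(3, 2)):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(av_dim + text_dim, hidden), nn.ReLU())
        self.heads = nn.ModuleList([nn.Linear(hidden, c) for c in task_classes])

    def forward(self, av_feat, text_feat):
        h = self.trunk(torch.cat([av_feat, text_feat], dim=-1))  # early fusion
        return [head(h) for head in self.heads]

# Toy random tensors stand in for real frame-level acoustic and facial features
# and for Paragraph Vector transcript embeddings.
audio = torch.randn(32, 88)
visual = torch.randn(32, 49)
text = torch.randn(32, 100)

ddae = MultiDDAE()
z, rec_audio, rec_visual = ddae(audio, visual)
# In the paper, frame-level representations such as z are further encoded into
# session-level descriptors with Fisher Vectors before classification.
logits = MultitaskDNN()(z, text)
print([t.shape for t in logits])

The sketch only illustrates the flow of data (modality-specific encoding, shared representation, early fusion, multitask heads); training losses, the Fisher Vector step and the actual feature extractors are omitted.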