The Detection of Depression Using Multimodal Models Based on Text and Voice Quality Features

Cited by: 11
Authors
Solieman, Hanadi [1]
Pustozerov, Evgenii A. [1, 2]
Affiliations
[1] St Petersburg Electrotech Univ LETI, St Petersburg, Russia
[2] Almazov Natl Med Res Ctr, St Petersburg, Russia
Keywords
Depression; Deep Learning; text analysis; voice quality; semi-contextual; word-level; speaker-independent; DAIC-WOZ; CLASSIFICATION
DOI
10.1109/ElConRus51938.2021.9396540
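Chinese Library Classification (CLC) codes
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification codes
0808; 0809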
Abstract
This article presents a proof of concept that depression can be diagnosed automatically from audio recordings of individuals' voices. The DAIC-WOZ database was used as the data source. Audio and textual data were preprocessed and converted into a set of optimized parameters for two models. Deep Learning models were then used to detect depression from the transcripts of the audio recordings and from voice quality features. We created a word-level text analysis model using Natural Language Processing (NLP) techniques, and a voice quality analysis model along the tense-to-breathy dimension. The text analysis model achieved its best performance with an F1-score of 0.8 (0.42) for non-depressed (depressed) individuals, while the voice quality model scored 0.76 (0.38). The result is two models that could be implemented in a system for the diagnosis of depression.
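The abstract reports a separate F1-score for each class (non-depressed and depressed). As a minimal sketch of how such per-class scores are computed, the following Python snippet derives them from true and predicted labels. The function names and the toy labels are hypothetical illustrations, not the authors' code or data.

```python
def f1_score(tp, fp, fn):
    """F1 for one class from its true positives, false positives and false negatives."""
    if tp == 0:
        return 0.0  # no correct predictions for this class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def per_class_f1(y_true, y_pred, labels=(0, 1)):
    """Return {label: F1}, treating each label in turn as the positive class."""
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        scores[label] = f1_score(tp, fp, fn)
    return scores


# Hypothetical toy labels (0 = non-depressed, 1 = depressed); not DAIC-WOZ data.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 0, 0, 1, 1]
scores = per_class_f1(y_true, y_pred)
print({label: round(value, 2) for label, value in scores.items()})
```

Reporting both values matters when classes are imbalanced: a model can score well on the majority (non-depressed) class while scoring much lower on the minority (depressed) class, which is the pattern in the figures quoted above.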
Pages: 1843-1848
Number of pages: 6