Glottal Source Features for Automatic Speech-based Depression Assessment

Cited by: 15
Authors
Simantiraki, Olympia [1 ]
Charonyktakis, Paulos [2 ]
Pampouchidou, Anastasia [3 ]
Tsiknakis, Manolis [4 ,5 ]
Cooke, Martin [1 ]
Affiliations
[1] Univ Basque Country, Language & Speech Lab, Vitoria, Spain
[2] Gnosis Data Anal PC, Iraklion, Greece
[3] Univ Burgundy, Le2i Lab, Le Creusot, France
[4] Technol Educ Inst Greece, Iraklion, Greece
[5] FORTH, Iraklion, Greece
Keywords
glottal source; Phase Distortion Deviation; binary classification; machine learning
DOI
10.21437/Interspeech.2017-1251
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Depression is one of the most prominent mental disorders, with an increasing rate that makes it the fourth leading cause of disability worldwide. The field of automated depression assessment has emerged to aid clinicians in the form of a decision support system. Such a system could serve as a pre-screening tool, or even be used to monitor high-risk populations. Related work most commonly involves multimodal approaches, typically combining audio and visual signals to identify depression presence and/or severity. The current study explores categorical assessment of depression using audio features alone. Specifically, since depression-related vocal characteristics impact the glottal source signal, we examine Phase Distortion Deviation, which has previously been applied to the recognition of voice qualities such as hoarseness, breathiness and creakiness, some of which are thought to be features of depressed speech. The proposed method uses as features the DCT coefficients of the Phase Distortion Deviation for each frequency band. An automated machine learning tool, Just Add Data, is used to classify speech samples. The method is evaluated on a benchmark dataset (AVEC2014) in two conditions: read speech and spontaneous speech. Our findings indicate that Phase Distortion Deviation is a promising audio-only feature for automated detection and assessment of depressed speech.
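The feature construction mentioned in the abstract (DCT coefficients of the Phase Distortion Deviation per frequency band) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the Phase Distortion Deviation has already been computed as a frames-by-bands matrix, that the DCT is taken along time within each band, and that only the first few coefficients are kept; the abstract specifies none of these details. The function names `dct2` and `band_dct_features` and the parameter `k_max` are hypothetical.

```python
import math


def dct2(x, k_max):
    """Return the first k_max unnormalised DCT-II coefficients of sequence x."""
    n = len(x)
    return [
        sum(x[t] * math.cos(math.pi * k * (2 * t + 1) / (2 * n)) for t in range(n))
        for k in range(k_max)
    ]


def band_dct_features(pdd, k_max):
    """Build one feature vector from a PDD matrix.

    pdd   : list of frames, each a list of per-band PDD values
            (assumed to be precomputed; not derived here).
    k_max : number of DCT coefficients kept per band (an assumed setting).
    """
    n_bands = len(pdd[0])
    features = []
    for b in range(n_bands):
        # Time trajectory of band b across all frames.
        trajectory = [frame[b] for frame in pdd]
        features.extend(dct2(trajectory, k_max))
    return features


# Toy input: 4 frames, 2 bands, each band constant over time.
toy_pdd = [[1.0, 2.0]] * 4
toy_features = band_dct_features(toy_pdd, k_max=2)
```

With the toy input, each band contributes two values, so `toy_features` has four entries. A vector of this kind would then be passed to a classifier; the study uses the Just Add Data tool for that step, which is not reproduced here.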
Pages: 2700-2704
Page count: 5