Glottal Source Features for Automatic Speech-based Depression Assessment

Cited by: 15

Authors:

Simantiraki, Olympia ^{[1]}

Charonyktakis, Paulos ^{[2]}

Pampouchidou, Anastasia ^{[3]}

Tsiknakis, Manolis ^{[4,5]}

Cooke, Martin ^{[1]}

Affiliations:

[1] Univ Basque Country, Language & Speech Lab, Vitoria, Spain

[2] Gnosis Data Anal PC, Iraklion, Greece

[3] Univ Burgundy, Le2i Lab, Le Creusot, France

[4] Technol Educ Inst Greece, Iraklion, Greece

[5] FORTH, Iraklion, Greece

Source:

18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | 2017

Keywords:

glottal source; Phase Distortion Deviation; binary classification; machine learning;

DOI:

10.21437/Interspeech.2017-1251

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Depression is one of the most prominent mental disorders, with an increasing rate that makes it the fourth leading cause of disability worldwide. The field of automated depression assessment has emerged to aid clinicians in the form of a decision support system. Such a system could serve as a pre-screening tool, or even be used to monitor high-risk populations. Related work most commonly involves multimodal approaches, typically combining audio and visual signals to identify depression presence and/or severity. The current study explores categorical assessment of depression using audio features alone. Specifically, since depression-related vocal characteristics impact the glottal source signal, we examine Phase Distortion Deviation, which has previously been applied to the recognition of voice qualities such as hoarseness, breathiness and creakiness, some of which are thought to be features of depressed speech. The proposed method uses as features the DCT coefficients of the Phase Distortion Deviation for each frequency band. An automated machine learning tool, Just Add Data, is used to classify speech samples. The method is evaluated on a benchmark dataset (AVEC2014) in two conditions: read speech and spontaneous speech. Our findings indicate that Phase Distortion Deviation is a promising audio-only feature for automated detection and assessment of depressed speech.
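
The feature step named in the abstract (DCT coefficients of the Phase Distortion Deviation per frequency band) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that a Phase Distortion Deviation value has already been computed for every frame and band, that the DCT is taken along the time axis of each band, and that only the first few coefficients are kept; the abstract does not confirm any of these details. The names `dct_ii`, `band_dct_features` and `toy_pdd` are hypothetical.

```python
import math


def dct_ii(signal, n_coeffs):
    """Return the first n_coeffs unnormalized DCT-II coefficients of a sequence."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(signal))
        for k in range(n_coeffs)
    ]


def band_dct_features(pdd_frames, n_coeffs=3):
    """Build one feature vector from a frames-by-bands matrix of PDD values.

    Assumption: pdd_frames[t][b] holds the (precomputed) PDD of band b at
    frame t. Each band's trajectory over time is compressed to its first
    n_coeffs DCT-II coefficients, and the per-band results are concatenated.
    """
    n_bands = len(pdd_frames[0])
    features = []
    for b in range(n_bands):
        trajectory = [frame[b] for frame in pdd_frames]
        features.extend(dct_ii(trajectory, n_coeffs))
    return features


# Toy input: 4 frames, 2 bands (values are made up for illustration only).
toy_pdd = [[0.5, 1.0], [0.5, 0.8], [0.5, 0.6], [0.5, 0.4]]
features = band_dct_features(toy_pdd, n_coeffs=2)
```

Under these assumptions, the resulting fixed-length vector is the kind of input a classifier could consume regardless of utterance length, since the number of coefficients per band does not depend on the number of frames.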


Pages: 2700-2704

Number of pages: 5
