Improving Speech-Based Dysarthria Detection using Multi-task Learning with Gradient Projection

Cited by: 0

Authors:

Xiang, Yan [1]

Berisha, Visar [1,2]

Liss, Julie [2]

Chakrabarti, Chaitali [1]

Affiliations:

[1] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85281 USA

[2] Arizona State Univ, Coll Hlth Solut, Tempe, AZ USA

Source:

INTERSPEECH 2024 | 2024

Keywords:

Dysarthria detection; speech processing; deep neural network; multi-task learning; DISEASE;

DOI:

10.21437/Interspeech.2024-1563

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Speech analytic models based on deep learning are popular in clinical diagnostics. However, constraints on clinical data collection and sharing place limits on available dataset sizes, which adversely impacts trained model performance. Multi-task learning (MTL) has been utilized to mitigate the effect of limited sample size by jointly training on multiple tasks that are considered to be related. However, discrepancies between clinical and non-clinical tasks can reduce MTL efficiency and can even cause it to fail, especially when there are gradient conflicts. In this paper, we enhance the performance of dysarthria detection by using MTL with an auxiliary task of learning speaker embeddings. We propose a task-specific gradient projection method to overcome gradient conflicts. Our evaluation shows that the proposed MTL paradigm outperforms both single-task learning and conventional MTL under different data availability settings.
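The abstract does not spell out the projection rule, so the following is only a minimal sketch of one common way to resolve a gradient conflict, not the authors' method. It assumes a PCGrad-style rule applied to the auxiliary task only: when the auxiliary (speaker embedding) gradient has a negative dot product with the main (dysarthria detection) gradient, the component along the main gradient is removed. The function names `project_conflicting` and `combined_update` and the `aux_weight` parameter are hypothetical.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equally sized gradient vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_conflicting(aux_grad: Vector, main_grad: Vector) -> Vector:
    """Return the auxiliary gradient with any conflict removed.

    Assumption: a conflict means a negative dot product with the
    main-task gradient. In that case the auxiliary gradient is
    projected onto the plane orthogonal to the main gradient.
    """
    d = dot(aux_grad, main_grad)
    if d >= 0.0:
        # No conflict: the auxiliary gradient is left unchanged.
        return list(aux_grad)
    norm_sq = dot(main_grad, main_grad)
    if norm_sq == 0.0:
        # A zero main gradient defines no direction to project against.
        return list(aux_grad)
    scale = d / norm_sq
    return [a - scale * m for a, m in zip(aux_grad, main_grad)]


def combined_update(main_grad: Vector, aux_grad: Vector,
                    aux_weight: float = 1.0) -> Vector:
    """Sum the main gradient and the (possibly projected) auxiliary one."""
    projected = project_conflicting(aux_grad, main_grad)
    return [m + aux_weight * p for m, p in zip(main_grad, projected)]


# Toy shared-parameter gradients: the auxiliary task pulls against
# the main task along the first coordinate.
main = [1.0, 0.0]
aux = [-1.0, 1.0]
update = combined_update(main, aux)
```

In this toy case the projected auxiliary gradient no longer opposes the main task, so the combined update keeps the full main-task direction. Projecting only the auxiliary gradient is one reading of "task-specific"; the paper may define it differently.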


Pages: 902-906

Page count: 5

50 items in total

[31] Multi-Task Network Anomaly Detection using Federated Learning
Zhao, Ying
Chen, Junjun
Wu, Di
Teng, Jian
Yu, Shui
SOICT 2019: PROCEEDINGS OF THE TENTH INTERNATIONAL SYMPOSIUM ON INFORMATION AND COMMUNICATION TECHNOLOGY, 2019 : 273 - 279
[32] Fetal Cardiac Structure Detection Using Multi-task Learning
He, Jie
Yang, Lei
Zhu, Yunping
Li, Donglian
Ding, Zhixing
Lu, Yuhuan
Liang, Bocheng
Li, Shengli
ADVANCED INTELLIGENT COMPUTING IN BIOINFORMATICS, PT II, ICIC 2024, 2024, 14882 : 405 - 419
[33] Generalizing Hate Speech Detection Using Multi-Task Learning: A Case Study of Political Public Figures
Yuan, Lanqin
Rizoiu, Marian-Andrei
COMPUTER SPEECH AND LANGUAGE, 2025, 89
[34] A multi-task based deep learning approach for intrusion detection
Liu, Qigang
Wang, Deming
Jia, Yuhang
Luo, Suyuan
Wang, Chongren
KNOWLEDGE-BASED SYSTEMS, 2022, 238
[35] Arabic Offensive and Hate Speech Detection Using a Cross-Corpora Multi-Task Learning Model
Aldjanabi, Wassen
Dahou, Abdelghani
Al-qaness, Mohammed A. A.
Abd Elaziz, Mohamed
Helmi, Ahmed Mohamed
Damasevicius, Robertas
INFORMATICS-BASEL, 2021, 8 (04)
[36] Towards multi-task learning of speech and speaker recognition
Vaessen, Nik
van Leeuwen, David A.
INTERSPEECH 2023, 2023 : 4898 - 4902
[37] IMPROVING SPEECH RECOGNITION IN REVERBERATION USING A ROOM-AWARE DEEP NEURAL NETWORK AND MULTI-TASK LEARNING
Giri, Ritwik
Seltzer, Michael L.
Droppo, Jasha
Yu, Dong
2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015 : 5014 - 5018
[38] Meta Multi-task Learning for Speech Emotion Recognition
Cai, Ruichu
Guo, Kaibin
Xu, Boyan
Yang, Xiaoyan
Zhang, Zhenjie
INTERSPEECH 2020, 2020 : 3336 - 3340
[39] Improving Deep Neural Network Based Speech Synthesis through Contextual Feature Parametrization and Multi-Task Learning
Zhengqi Wen
Kehuang Li
Zhen Huang
Chin-Hui Lee
Jianhua Tao
Journal of Signal Processing Systems, 2018, 90 : 1025 - 1037
[40] Improving Low-Resource Chinese Event Detection with Multi-task Learning
Tong, Meihan
Xu, Bin
Wang, Shuai
Hou, Lei
Li, Juanzi
KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT (KSEM 2020), PT I, 2020, 12274 : 421 - 433
