Improving Speech-Based Dysarthria Detection using Multi-task Learning with Gradient Projection

Cited by: 0
Authors
Xiang, Yan [1]
Berisha, Visar [1, 2]
Liss, Julie [2]
Chakrabarti, Chaitali [1]
Affiliations
[1] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85281 USA
[2] Arizona State Univ, Coll Hlth Solut, Tempe, AZ USA
Source
Interspeech 2024
Keywords
Dysarthria detection; speech processing; deep neural network; multi-task learning; DISEASE;
DOI
10.21437/Interspeech.2024-1563
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Speech analytic models based on deep learning are popular in clinical diagnostics. However, constraints on clinical data collection and sharing place limits on available dataset sizes, which adversely impacts trained model performance. Multi-task learning (MTL) has been utilized to mitigate the effect of limited sample size by jointly training on multiple tasks that are considered to be related. However, discrepancies between clinical and non-clinical tasks can reduce MTL efficiency and can even cause it to fail, especially when there are gradient conflicts. In this paper, we enhance the performance of dysarthria detection by using MTL with an auxiliary task of learning speaker embeddings. We propose a task-specific gradient projection method to overcome gradient conflicts. Our evaluation shows that the proposed MTL paradigm outperforms both single-task learning and conventional MTL under different data availability settings.
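The abstract does not spell out the projection rule, so the following is only a minimal sketch of one common way to resolve a gradient conflict, in the spirit of the gradient-surgery idea listed under related papers. It assumes that a conflict means a negative inner product between the two task gradients, and that only the auxiliary (speaker-embedding) gradient is altered so the main (dysarthria-detection) gradient is left intact. The function names `project_auxiliary` and `combined_update` and the `aux_weight` parameter are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two flattened gradient vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_auxiliary(g_main: Sequence[float], g_aux: Sequence[float]) -> List[float]:
    """Return the auxiliary gradient with any conflicting component removed.

    Assumption: the gradients conflict when their inner product is negative.
    In that case the component of g_aux along g_main is subtracted, leaving
    a vector orthogonal to g_main. Otherwise g_aux is returned unchanged.
    """
    inner = dot(g_main, g_aux)
    norm_sq = dot(g_main, g_main)
    if inner >= 0.0 or norm_sq == 0.0:
        return list(g_aux)
    scale = inner / norm_sq
    return [a - scale * m for a, m in zip(g_aux, g_main)]


def combined_update(
    g_main: Sequence[float],
    g_aux: Sequence[float],
    aux_weight: float = 1.0,
) -> List[float]:
    """Sum the main gradient and the (possibly projected) auxiliary gradient."""
    g_aux_proj = project_auxiliary(g_main, g_aux)
    return [m + aux_weight * a for m, a in zip(g_main, g_aux_proj)]


if __name__ == "__main__":
    main_grad = [1.0, 0.0]   # hypothetical main-task gradient
    aux_grad = [-1.0, 1.0]   # hypothetical auxiliary gradient, conflicts with main_grad
    print(project_auxiliary(main_grad, aux_grad))
    print(combined_update(main_grad, aux_grad))
```

Projecting only the auxiliary gradient is one possible reading of "task-specific": the shared parameters never move against the main task because of the auxiliary task. The method actually proposed in the paper may differ.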
Pages: 902 - 906
Page count: 5
Related papers
  • [1] Multi-task gradient descent for multi-task learning
    Lu Bai
    Yew-Soon Ong
    Tiantian He
    Abhishek Gupta
    Memetic Computing, 2020, 12 : 355 - 369
  • [2] Multi-task gradient descent for multi-task learning
    Bai, Lu
    Ong, Yew-Soon
    He, Tiantian
    Gupta, Abhishek
    MEMETIC COMPUTING, 2020, 12 (04) : 355 - 369
  • [3] IMPROVING SPEECH-BASED PTSD DETECTION VIA MULTI-VIEW LEARNING
    Zhuang, Xiaodan
    Rozgic, Viktor
    Crystal, Michael
    Marx, Brian P.
    2014 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY SLT 2014, 2014 : 260 - 265
  • [4] MULTI-TASK LEARNING IMPROVES SYNTHETIC SPEECH DETECTION
    Mo, Yichuan
    Wang, Shilin
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022 : 6392 - 6396
  • [5] Dysarthria severity classification using multi-head attention and multi-task learning
    Joshy, Amlu Anna
    Rajan, Rajeev
    SPEECH COMMUNICATION, 2023, 147 : 1 - 11
  • [6] Speech Emotion Recognition based on Multi-Task Learning
    Zhao, Huijuan
    Han, Zhijie
    Wang, Ruchuan
    2019 IEEE 5TH INTL CONFERENCE ON BIG DATA SECURITY ON CLOUD (BIGDATASECURITY) / IEEE INTL CONFERENCE ON HIGH PERFORMANCE AND SMART COMPUTING (HPSC) / IEEE INTL CONFERENCE ON INTELLIGENT DATA AND SECURITY (IDS), 2019 : 186 - 188
  • [7] Gradient Surgery for Multi-Task Learning
    Yu, Tianhe
    Kumar, Saurabh
    Gupta, Abhishek
    Levine, Sergey
    Hausman, Karol
    Finn, Chelsea
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [8] Transformer-based transfer learning and multi-task learning for improving the performance of speech emotion recognition
    Park, Sunchan
    Kim, Hyung Soon
    JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, 2021, 40 (05) : 515 - 522
  • [9] A deep neural network based multi-task learning approach to hate speech detection
    Kapil, Prashant
    Ekbal, Asif
    KNOWLEDGE-BASED SYSTEMS, 2020, 210 (210)
  • [10] TO REVERSE THE GRADIENT OR NOT: AN EMPIRICAL COMPARISON OF ADVERSARIAL AND MULTI-TASK LEARNING IN SPEECH RECOGNITION
    Adi, Yossi
    Zeghidour, Neil
    Collobert, Ronan
    Usunier, Nicolas
    Liptchinsky, Vitaliy
    Synnaeve, Gabriel
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019 : 3742 - 3746