Enhancing Speech Recognition for Parkinson’s Disease Patient Using Transfer Learning Technique

Cited by: 7
Authors
Yu Q. [1, 2]
Ma Y. [1, 2]
Li Y. [1, 2]
Affiliations
[1] Department of Micro-Nano Electronics, Shanghai Jiao Tong University, Shanghai
[2] MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University, Shanghai
Source
Journal of Shanghai Jiaotong University (Science), 2022, 27 (01): 90 - 98
Keywords
data augmentation; Parkinson’s disease; scarce data; speech recognition; transfer learning technique
DOI
10.1007/s12204-021-2376-3
CLC number
R 857.3 (document code: A)
Discipline classification code
Abstract
Patients with Parkinson’s disease suffer from speech disorders. The most frequently reported problems are a weak, hoarse, nasal, or monotonous voice, imprecise articulation, slow or fast speech, difficulty starting speech, impaired stress or rhythm, stuttering, and tremor. To improve speech quality and assist patients with speech rehabilitation therapy, we propose a speech recognition model for Parkinson’s disease patients using a transfer learning technique (PSTL). We pre-train a long short-term memory (LSTM) neural network model on a publicly available dataset that we developed from recordings of healthy people collected through a social media platform, and then apply transfer learning to improve the performance of the PSTL framework. A frequency spectrogram masking data augmentation method is used to alleviate over-fitting, which further reduces the word error rate (WER). Even with a limited dataset, the proposed model reduces the WER from 58% to 44.5% on the original speech dataset and from 53.1% to 43% on the denoised speech dataset, which demonstrates the feasibility of the framework. © 2021, Shanghai Jiao Tong University and Springer-Verlag GmbH Germany, part of Springer Nature.
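The abstract names two ideas that a short example may make more concrete: frequency masking of a spectrogram and the word error rate. The following is a minimal illustrative sketch in Python, not the authors' implementation. The function names, the `max_width` parameter, and the choice of a single zero-filled band are assumptions made here for illustration; the abstract does not state the masking parameters used in the paper.

```python
import random


def frequency_mask(spec, max_width, rng):
    """Return a copy of `spec` with one random band of frequency bins zeroed.

    `spec` is a list of rows, one row per frequency bin, each row holding
    the values of that bin over time. `max_width` (hypothetical parameter)
    bounds how many consecutive bins may be masked.
    """
    n_freq = len(spec)
    width = rng.randint(0, min(max_width, n_freq))  # inclusive bounds
    start = rng.randint(0, n_freq - width)
    return [
        [0.0] * len(row) if start <= i < start + width else list(row)
        for i, row in enumerate(spec)
    ]


def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("the cat sat down", "the cat sit")` counts one substitution and one deletion against four reference words, giving 0.5. Masking a band of frequency bins during training hides part of the spectrum from the model, which is one common way to discourage over-fitting when data are scarce.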
Pages: 90 - 98
Page count: 8
Related papers
50 in total
  • [1] Enhancing Speech Recognition for Parkinson’s Disease Patient Using Transfer Learning Technique
    Yu, Qing
    Ma, Yi
    Li, Yongfu
    Journal of Shanghai Jiaotong University (Science), 2022, 27 (01): : 90 - 98
  • [3] Machine Learning Applied to Speech Recordings for Parkinson's Disease Recognition
    Aversano, Lerina
    Bernardi, Mario L.
    Cimitile, Marta
    Iammarino, Martina
    Madau, Antonella
    Verdone, Chiara
    DEEP LEARNING THEORY AND APPLICATIONS, DELTA 2023, 2023, 1875 : 101 - 114
  • [4] Enhancing Speech Emotion Recognition Using Transfer Learning from Speaker Embeddings
    Jakubec, Maros
    Jarina, Roman
    Lieskovska, Eva
    Kasak, Peter
    Spisiak, Michal
    TEXT, SPEECH, AND DIALOGUE, TSD 2024, PT II, 2024, 15049 : 184 - 195
  • [5] Transfer Accent Identification Learning for Enhancing Speech Emotion Recognition
    Dharshini, G. Priya
    Rao, K. Sreenivasa
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2024, 43 (08) : 5090 - 5120
  • [6] Transfer learning for children's speech recognition
    Tong, Rong
    Wang, Lei
    Ma, Bin
    2017 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2017, : 36 - 39
  • [7] Speech Emotion Recognition Using Transfer Learning
    Song, Peng
    Jin, Yun
    Zhao, Li
    Xin, Minghai
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2014, E97D (09): : 2530 - 2532
  • [8] A Triplet Multimodel Transfer Learning Network for Speech Disorder Screening of Parkinson's Disease
    Zhao, Aite
    Wang, Nana
    Niu, Xuesen
    Chen, Ming
    Wu, Huimin
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2024, 2024
  • [9] Prediction of Parkinson's Disease using Speech Signal with Extreme Learning Machine
    Agarwal, Aarushi
    Chandrayan, Spriha
    Sahu, Sitanshu S.
    2016 INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS, AND OPTIMIZATION TECHNIQUES (ICEEOT), 2016, : 3776 - 3779
  • [10] Effective speech recognition system for patients with Parkinson's disease
    Bak, Huiyong
    Kim, Ryul
    Lee, Sangmin
    JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, 2022, 41 (06): : 655 - 661