Enhancing Speech Recognition for Parkinson’s Disease Patient Using Transfer Learning Technique

Cited by: 7
Authors
Yu Q. [1, 2]
Ma Y. [1, 2]
Li Y. [1, 2]
Affiliations
[1] Department of Micro-Nano Electronics, Shanghai Jiao Tong University, Shanghai
[2] MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University, Shanghai
Keywords
data augmentation; Parkinson’s disease; scarce data; speech recognition; transfer learning technique
DOI
10.1007/s12204-021-2376-3
CLC number
R 857.3
Document code
A
Abstract
Patients with Parkinson’s disease suffer from speech disorders. The most frequently reported problems are a weak, hoarse, nasal, or monotonous voice; imprecise articulation; slow or fast speech; difficulty starting speech; impaired stress or rhythm; stuttering; and tremor. To improve speech quality and assist patients with speech rehabilitation therapy, we propose a speech recognition model for Parkinson’s disease patients using a transfer learning technique (PSTL). We pre-train a long short-term memory (LSTM) neural network on a publicly available dataset that we developed from the speech of healthy people collected through a social media platform, and then apply transfer learning to improve the performance of the PSTL framework. Frequency spectrogram masking is used as data augmentation to alleviate over-fitting, which further reduces the word error rate (WER). Even with a limited dataset, the proposed model reduces the WER from 58% to 44.5% on the original speech dataset and from 53.1% to 43% on the denoised speech dataset, demonstrating the feasibility of the framework. © 2021, Shanghai Jiao Tong University and Springer-Verlag GmbH Germany, part of Springer Nature.
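The two quantities the abstract relies on, the word error rate and frequency masking of a spectrogram, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `word_error_rate` and `frequency_mask`, the list-of-rows spectrogram layout, and the single-band masking policy are assumptions made for illustration, and the abstract does not state the masking parameters actually used.

```python
import random


def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def frequency_mask(spectrogram, max_width, rng=random):
    """Return a copy with one random band of frequency bins set to zero.

    Assumed layout (hypothetical): a list of rows, one row per frequency
    bin, each row holding the values of that bin over time.
    """
    n_bins = len(spectrogram)
    width = rng.randint(0, min(max_width, n_bins))  # band height, may be 0
    start = rng.randint(0, n_bins - width)          # first masked bin
    return [
        [0.0] * len(row) if start <= k < start + width else list(row)
        for k, row in enumerate(spectrogram)
    ]
```

Under this definition, the reductions reported in the abstract correspond to 13.5 percentage points (about 23% relative) on the original speech and 10.1 percentage points (about 19% relative) on the denoised speech.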
Pages: 90-98
Page count: 8
Related papers
50 in total
  • [21] Transfer Learning to Detect Parkinson's Disease from Speech In Different Languages Using Convolutional Neural Networks with Layer Freezing
    David Rios-Urrego, Cristian
    Camilo Vasquez-Correa, Juan
    Rafael Orozco-Arroyave, Juan
    Noeth, Elmar
    TEXT, SPEECH, AND DIALOGUE (TSD 2020), 2020, 12284 : 331 - 339
  • [22] Transfer Learning Using Whisper for Dysarthric Automatic Speech Recognition
    Rathod, Siddharth
    Charola, Monil
    Patil, Hemant A.
    SPEECH AND COMPUTER, SPECOM 2023, PT I, 2023, 14338 : 579 - 589
  • [23] Automatic Speech Recognition in Noise for Parkinson's Disease: A Pilot Study
    Goudarzi, Alireza
    Moya-Gale, Gemma
    FRONTIERS IN ARTIFICIAL INTELLIGENCE, 2021, 4
  • [24] Multi-source sparse broad transfer learning for parkinson's disease diagnosis via speech
    Liu, Yuchuan
    Li, Lianzhi
    Rao, Yu
    Cao, Huihua
    Tan, Xiaoheng
    Li, Yongsong
    MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING, 2025
  • [25] Few-shot learning of Parkinson's disease speech data with optimal convolution sparse kernel transfer learning
    Zhang, Xiaoheng
    Ma, Jie
    Li, Yongming
    Wang, Pin
    Liu, Yuchuan
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2021, 69
  • [26] Transfer Learning for Speech Emotion Recognition
    Han Zhijie
    Zhao, Huijuan
    Wang, Ruchuan
    2019 IEEE 5TH INTL CONFERENCE ON BIG DATA SECURITY ON CLOUD (BIGDATASECURITY) / IEEE INTL CONFERENCE ON HIGH PERFORMANCE AND SMART COMPUTING (HPSC) / IEEE INTL CONFERENCE ON INTELLIGENT DATA AND SECURITY (IDS), 2019, : 96 - 99
  • [27] Improving Parkinson's disease recognition through voice analysis using deep learning
    Khaskhoussy, Rania
    Ben Ayed, Yassine
    PATTERN RECOGNITION LETTERS, 2023, 168 : 64 - 70
  • [28] Speech motor learning in Parkinson disease
    Schulz, GM
    Sulc, S
    Leon, S
    Gilligan, G
    JOURNAL OF MEDICAL SPEECH-LANGUAGE PATHOLOGY, 2000, 8 (04) : 243 - 247
  • [29] Machine Learning-Based Classification of Parkinson's Disease Patients Using Speech Biomarkers
    Hossain, Mohammad Amran
    Amenta, Francesco
    JOURNAL OF PARKINSONS DISEASE, 2024, 14 (01) : 95 - 109
  • [30] Implementation and Evaluation of Learning Classifiers in Detecting Parkinson's Disease Using Extensive Speech Parameters
    Mital, Matt Ervin
    PROCEEDINGS OF 2021 13TH INTERNATIONAL CONFERENCE ON INFORMATION & COMMUNICATION TECHNOLOGY AND SYSTEM (ICTS), 2021, : 241 - 246