Dysarthric Speech Recognition Based on Deep Metric Learning

Cited by: 4
Authors
Takashima, Yuki [1]
Takashima, Ryoichi [2]
Takiguchi, Tetsuya [2]
Ariki, Yasuo [2]
Affiliations
[1] Hitachi Ltd., Research & Development Group, Hitachi, Ibaraki, Japan
[2] Kobe University, Graduate School of System Informatics, Kobe, Hyogo, Japan
Source
Keywords
assistive technology; dysarthria; metric learning; speech recognition; features; database
DOI
10.21437/Interspeech.2020-2267
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
This paper presents an automatic speech recognition (ASR) system for a person with an articulation disorder resulting from athetoid cerebral palsy. Because the utterances of people with this disorder are often unstable or unclear, speech recognition systems have difficulty recognizing their speech. For example, their speaking style often fluctuates greatly even when they repeat the same sentence, so their speech varies widely even within a single recognition class. To alleviate this intra-class variation problem, we propose an ASR system based on deep metric learning. The system learns an embedded representation in which the distance between input utterances of the same class is small and the distance between input utterances of different classes is large, which makes it easier for the ASR system to distinguish dysarthric speech. Experimental results show that the proposed approach consistently improves word-recognition accuracy. We also evaluate the combination of the proposed method with transfer learning from unimpaired speech to alleviate the low-resource problem associated with impaired speech.
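The embedding objective described in the abstract (small distance within a class, large distance across classes) can be illustrated with a pairwise contrastive loss. This is only a sketch under stated assumptions: the abstract does not say which metric-learning loss the authors use, so the contrastive form, the `margin` value, the function name `contrastive_loss`, and the toy 2-D embeddings below are all hypothetical choices made for illustration.

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, same_class, margin=1.0):
    """Pairwise contrastive loss (an assumed, generic metric-learning loss).

    emb_a, emb_b: arrays of shape (n_pairs, dim) holding paired embeddings.
    same_class:   array of shape (n_pairs,), 1.0 if the pair shares a class, else 0.0.
    """
    # Euclidean distance between the two embeddings of each pair.
    dist = np.linalg.norm(emb_a - emb_b, axis=1)
    # Same-class pairs are penalized by their squared distance (pulled together).
    pull = same_class * dist ** 2
    # Different-class pairs are penalized only while closer than the margin (pushed apart).
    push = (1.0 - same_class) * np.maximum(0.0, margin - dist) ** 2
    return float(np.mean(pull + push))

# Toy embeddings: pair 0 is close together, pair 1 is far apart.
emb_a = np.array([[0.0, 0.0], [0.0, 0.0]])
emb_b = np.array([[0.1, 0.0], [2.0, 0.0]])

# Labels that agree with the geometry (close pair = same class) give a low loss.
labels_consistent = np.array([1.0, 0.0])
loss_consistent = contrastive_loss(emb_a, emb_b, labels_consistent)

# Flipped labels (close pair = different classes) give a much higher loss.
loss_flipped = contrastive_loss(emb_a, emb_b, 1.0 - labels_consistent)
```

In a real system the embeddings would come from a neural network trained to minimize such a loss over utterance pairs; the toy arrays here only show how the loss rewards the geometry the abstract describes.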
Pages: 4796-4800
Page count: 5