Deep Autoencoder based Speech Features for Improved Dysarthric Speech Recognition

Cited by: 28
Authors
Vachhani, Bhavik [1]
Bhat, Chitralekha [1]
Das, Biswajit [1]
Kopparapu, Sunil Kumar [1]
Affiliations
[1] TCS Innovat Labs, Mumbai, India
Keywords
Autoencoders; Dysarthric Speech; Tempo adaptation; Speech Enhancement; ADAPTATION; PARAMETERS; VOICE;
DOI
10.21437/Interspeech.2017-1318
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Dysarthria is a motor speech disorder that results in mumbled, slurred, or slow speech that is generally difficult for both humans and machines to understand. Traditional Automatic Speech Recognizers (ASR) perform poorly on dysarthric speech recognition tasks. In this paper, we propose the use of deep autoencoders to enhance Mel Frequency Cepstral Coefficient (MFCC) based features in order to improve dysarthric speech recognition. Speech from healthy control speakers is used to train an autoencoder, which is in turn used to obtain an improved feature representation for dysarthric speech. Additionally, we analyze the use of severity-based tempo adaptation followed by autoencoder-based speech feature enhancement. All evaluations were carried out on the Universal Access dysarthric speech corpus. An overall absolute improvement of 16% was achieved using tempo adaptation followed by an autoencoder-based speech front-end representation for DNN-HMM based dysarthric speech recognition.
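The train-on-control, apply-to-dysarthric flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the network here is a toy one-hidden-layer autoencoder (tanh encoder, linear decoder) in pure Python, the 4-dimensional "frames" are synthetic stand-ins for MFCC vectors, and every name (`train_autoencoder`, `enhance`, `healthy`, `dysarthric_frame`) and hyperparameter is a hypothetical choice made for illustration only.

```python
import math
import random


def forward(frame, model):
    """Encode with tanh units, decode linearly; returns (hidden, output)."""
    w_enc, w_dec = model
    h = [math.tanh(sum(w * x for w, x in zip(row, frame))) for row in w_enc]
    y = [sum(w * hj for w, hj in zip(row, h)) for row in w_dec]
    return h, y


def train_autoencoder(frames, hidden=3, epochs=300, lr=0.05, seed=0):
    """Fit the autoencoder to reconstruct its input using plain SGD."""
    rng = random.Random(seed)
    dim = len(frames[0])
    w_enc = [[rng.uniform(-0.3, 0.3) for _ in range(dim)] for _ in range(hidden)]
    w_dec = [[rng.uniform(-0.3, 0.3) for _ in range(hidden)] for _ in range(dim)]
    for _ in range(epochs):
        for x in frames:
            h, y = forward(x, (w_enc, w_dec))
            err = [yi - xi for yi, xi in zip(y, x)]  # d(0.5*||y-x||^2)/dy
            # Back-propagate through the decoder and tanh before updating.
            grad_h = [
                sum(err[i] * w_dec[i][j] for i in range(dim)) * (1.0 - h[j] ** 2)
                for j in range(hidden)
            ]
            for i in range(dim):
                for j in range(hidden):
                    w_dec[i][j] -= lr * err[i] * h[j]
            for j in range(hidden):
                for k in range(dim):
                    w_enc[j][k] -= lr * grad_h[j] * x[k]
    return w_enc, w_dec


def enhance(frame, model):
    """Pass one feature frame through the trained autoencoder."""
    return forward(frame, model)[1]


def mean_reconstruction_error(frames, model):
    """Mean squared reconstruction error over a set of frames."""
    total = 0.0
    for x in frames:
        y = enhance(x, model)
        total += sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)
    return total / len(frames)


# Synthetic "healthy control" frames (assumption: stand-ins for MFCC vectors).
rng = random.Random(1)
basis_u = [0.5, -0.3, 0.2, 0.1]
basis_v = [0.1, 0.4, -0.2, 0.3]
healthy = []
for _ in range(30):
    a, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    healthy.append([a * u + b * v for u, v in zip(basis_u, basis_v)])

untrained = train_autoencoder(healthy, epochs=0)  # random weights, no updates
model = train_autoencoder(healthy)                # trained on control speech only

untrained_err = mean_reconstruction_error(healthy, untrained)
trained_err = mean_reconstruction_error(healthy, model)

# A hypothetical "dysarthric" frame is then mapped through the same model.
dysarthric_frame = [0.45, -0.10, 0.05, 0.30]
enhanced = enhance(dysarthric_frame, model)
```

The sketch only shows the data flow: the model never sees dysarthric frames during training, and its output replaces the original feature vector before recognition. It makes no claim about recognition accuracy; the depth, feature dimensionality, and training setup of the system described in the abstract are not specified in this record.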
Pages: 1854-1858
Number of pages: 5
Related papers
50 in total
  • [1] Autoencoder bottleneck features with multi-task optimisation for improved continuous dysarthric speech recognition
    Yue, Zhengjun
    Christensen, Heidi
    Barker, Jon
    [J]. INTERSPEECH 2020, 2020, : 4581 - 4585
  • [2] The Effectiveness of Time Stretching for Enhancing Dysarthric Speech for Improved Dysarthric Speech Recognition
    Prananta, Luke
    Halpern, Bence Mark
    Feng, Siyuan
    Scharenborg, Odette
    [J]. INTERSPEECH 2022, 2022, : 36 - 40
  • [3] Dysarthric Speech Recognition Based on Deep Metric Learning
    Takashima, Yuki
    Takashima, Ryoichi
    Takiguchi, Tetsuya
    Ariki, Yasuo
    [J]. INTERSPEECH 2020, 2020, : 4796 - 4800
  • [4] A Survey of Automatic Speech Recognition for Dysarthric Speech
    Qian, Zhaopeng
    Xiao, Kejing
    [J]. ELECTRONICS, 2023, 12 (20)
  • [5] Improved Acoustic Modeling for Automatic Dysarthric Speech Recognition
    Sriranjani, R.
    Reddy, M. Ramasubba
    Umesh, S.
    [J]. 2015 TWENTY FIRST NATIONAL CONFERENCE ON COMMUNICATIONS (NCC), 2015,
  • [6] Deep Learning of Speech Features for Improved Phonetic Recognition
    Lee, Jaehyung
    Lee, Soo-Young
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 1256 - 1259
  • [7] Dysarthric Speech Recognition using Time-delay Neural Network based Denoising Autoencoder
    Bhat, Chitralekha
    Das, Biswajit
    Vachhani, Bhavik
    Kopparapu, Sunil Kumar
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 451 - 455
  • [8] Optimization of dysarthric speech recognition
    Chen, FX
    Kostov, A
    [J]. PROCEEDINGS OF THE 19TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOL 19, PTS 1-6: MAGNIFICENT MILESTONES AND EMERGING OPPORTUNITIES IN MEDICAL ENGINEERING, 1997, 19 : 1436 - 1439
  • [9] Dysarthric Speech Transformer: A Sequence-to-Sequence Dysarthric Speech Recognition System
    Shahamiri, Seyed Reza
    Lal, Vanshika
    Shah, Dhvani
    [J]. IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, 2023, 31 : 3407 - 3416
  • [10] Deep Learning-Based Acoustic Feature Representations for Dysarthric Speech Recognition
    Latha M.
    Shivakumar M.
    Manjula G.
    Hemakumar M.
    Kumar M.K.
    [J]. SN Computer Science, 4 (3)