Comparative analysis of deep learning models for dysarthric speech detection

Cited by: 0
Authors
Shanmugapriya, P. [1]
Mohan, V. [1]
Affiliations
[1] Saranathan College of Engineering, Department of Electronics and Communication Engineering, Tiruchirappalli 620012, Tamil Nadu, India
Keywords
Deep learning; Dysarthria detection; Wavelet transformation; Pre-trained CNNs; Intelligibility; Recognition
DOI
10.1007/s00500-023-09302-6
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Dysarthria is a speech communication disorder associated with neurological impairment. To detect this disorder from speech, we present an experimental comparison of deep models built on frequency-domain features: several pre-trained deep networks are compared on dysarthria detection using scalograms of dysarthric speech. Such an automatic detector can also assist physicians and speech specialists in their assessment. Because dysarthric speech contains breathy and semi-whispery segments, experiments are performed only on frequency-domain representations of the speech signal. The time-domain speech signal is transformed into a 2-D scalogram image through wavelet transformation, and the scalogram images are then fed to pre-trained convolutional neural networks whose layers are tuned to the scalogram data through transfer learning. The proposed method of applying scalogram images as input to pre-trained CNNs is evaluated on the TORGO database, and the classification performance of the networks is compared. In this work, AlexNet, GoogLeNet, ResNet-50, and two pre-trained audio CNNs, VGGish and YAMNet, are considered as the pre-trained deep models. The proposed pre-trained, transfer-learned CNNs with scalogram image features achieve better accuracy than other machine learning models in the dysarthria detection task.
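As a rough illustration of the pipeline summarized in the abstract, the sketch below converts a speech waveform into a continuous-wavelet-transform (CWT) scalogram image and fine-tunes an ImageNet-pre-trained ResNet-50 for two-class dysarthria detection via transfer learning. This is a minimal sketch, not the authors' exact implementation: it assumes PyWavelets, PyTorch, and torchvision >= 0.13, and the Morlet wavelet, scale range, 224x224 input size, and frozen-backbone strategy are illustrative assumptions rather than settings reported in the paper.

```python
# Minimal sketch (assumed libraries: PyWavelets, PyTorch, torchvision >= 0.13);
# hyperparameters below are illustrative, not values from the paper.
import numpy as np
import pywt
import torch
import torch.nn as nn
from torchvision import models
from torchvision.transforms.functional import resize


def speech_to_scalogram(signal, fs=16000):
    """CWT scalogram of a 1-D waveform, returned as a 3 x 224 x 224 tensor."""
    scales = np.arange(1, 128)                                 # assumed scale range
    coeffs, _ = pywt.cwt(signal, scales, "morl", sampling_period=1.0 / fs)
    scalogram = np.abs(coeffs)                                 # magnitude scalogram
    scalogram = (scalogram - scalogram.min()) / (scalogram.max() - scalogram.min() + 1e-8)
    img = torch.tensor(scalogram, dtype=torch.float32).unsqueeze(0)  # 1 x H x W
    img = resize(img, [224, 224], antialias=True)              # match the CNN input size
    return img.repeat(3, 1, 1)                                 # replicate to 3 channels


def build_transfer_model(num_classes=2):
    """ImageNet-pre-trained ResNet-50 with a new head for dysarthric vs. control."""
    model = models.resnet50(weights="IMAGENET1K_V1")
    for p in model.parameters():                               # freeze the backbone
        p.requires_grad = False
    model.fc = nn.Linear(model.fc.in_features, num_classes)    # trainable classifier head
    return model


if __name__ == "__main__":
    dummy_speech = np.random.randn(16000)                      # 1 s placeholder waveform
    batch = speech_to_scalogram(dummy_speech).unsqueeze(0)     # shape (1, 3, 224, 224)
    model = build_transfer_model().eval()                      # forward-pass sanity check
    with torch.no_grad():
        logits = model(batch)
    print(logits.shape)                                        # -> torch.Size([1, 2])
```

The same head-replacement recipe applies to AlexNet or GoogLeNet by swapping out their final classification layers; VGGish and YAMNet operate on log-mel spectrogram patches rather than scalogram images, so they are not covered by this image-based sketch.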
Journal: Soft Computing, 2024, Vol. 28
Pages: 5683-5698 (16 pages)