Exploring Spectrogram-Based Audio Classification for Parkinson's Disease: A Study on Speech Classification and Qualitative Reliability Verification

Authors
Jeong, Seung-Min [1]
Kim, Seunghyun [1]
Lee, Eui Chul [2]
Kim, Han Joon [3]
Affiliations
[1] Sangmyung Univ, Grad Sch, Dept AI & Informat, Hongjimun 2 Gil 20, Seoul 03016, South Korea
[2] Sangmyung Univ, Dept Human Centered Artificial Intelligence, Hongjimun 2 Gil 20, Seoul 03016, South Korea
[3] Seoul Natl Univ, Coll Med, Seoul Natl Univ Hosp, Dept Neurol, Daehak Ro 101, Seoul 03080, South Korea
Funding
National Research Foundation, Singapore;
Keywords
PSLA; AST; explainable AI; Parkinson's disease; speech classification;
DOI
10.3390/s24144625
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry];
Discipline classification code
070302; 081704;
Abstract
Patients with Parkinson's disease commonly experience voice impairment. In this study, we introduce models that classify speech from healthy individuals and patients with Parkinson's disease. We used two models: AST (audio spectrogram transformer), a transformer-based speech classification model of a kind that has recently outperformed CNN-based models in many fields, and PSLA (pretraining, sampling, labeling, and aggregation), a CNN-based, high-performance model in the existing speech classification field. This study compares and analyzes the models from both quantitative and qualitative perspectives. First, quantitatively, PSLA outperformed AST by more than 4% in accuracy, and its AUC was also higher: 97.43% for PSLA versus 94.16% for AST. Second, we qualitatively evaluated the ability of the models to capture the acoustic features of Parkinson's disease using several CAM (class activation map)-based XAI (eXplainable AI) methods such as GradCAM and EigenCAM. With PSLA, we found that the model focuses well on the muffled frequency band of Parkinson's speech, and the heatmap analysis of false positives and false negatives shows that the speech features are visually represented even when the model makes incorrect predictions. The contribution of this paper is twofold: we identified a suitable model for diagnosing Parkinson's disease through speech by comparing two different types of models, and we validated the model's predictions in practice.
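The abstract compares the two models on accuracy and AUC. As a rough illustration of how those two metrics are typically computed for a binary classifier, the following minimal sketch may help. It is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made up for this example, and none of the numbers come from the paper.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label.

    The 0.5 threshold is an assumption for illustration only.
    """
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative sample.
    Ties count as half a win (Mann-Whitney formulation)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = Parkinson's speech, 0 = healthy speech.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

print(f"accuracy = {accuracy(labels, scores):.4f}")
print(f"AUC      = {roc_auc(labels, scores):.4f}")
```

Accuracy depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is one reason a study may report both.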
Pages: 18