Voice pathology detection on spontaneous speech data using deep learning models

Authors
Farazi, Sahar [1]
Shekofteh, Yasser [1]
Affiliation
[1] Intelligent Sound Processing Laboratory (ISP-Lab), Faculty of Computer Science and Engineering, Shahid Beheshti University, Tehran, Iran
Keywords
Automatic voice pathology detection; Spontaneous speech; Deep learning; MFCC; Mel-spectrogram; CNN
DOI
10.1007/s10772-024-10134-4
Abstract
Speech problems are common worldwide and can reduce quality of life. The human speech production system involves several components, and dysfunction in any of them can disrupt normal speech production, giving rise to disorders such as laryngopharyngeal reflux, vocal cord paralysis, and vocal fold nodules. Early diagnosis of these disorders is important for the patient's health. Many studies on automatic voice pathology diagnosis have used sustained vowels and read speech as the primary source of speech data. However, spontaneous speech has distinct value. In addition to sharing the characteristics of read speech, it offers a more authentic view of an individual's speech behavior. It captures not only linguistic features but also subtle nuances of human emotions, such as fatigue and excitement, which may cause speech impairments, and it shows their patterns in the speech signal more clearly than read speech does. We therefore explore spontaneous speech for voice pathology detection to determine whether it helps us better understand speech problems. In this research, we examine different deep learning (DL) models trained on two main features (MFCCs and Mel-spectrograms) for binary classification of healthy versus pathological speech, with a specific focus on spontaneous speech. Extensive experiments show that our proposed convolutional neural network (CNN) model trained on MFCC features performs best, achieving the highest accuracy: approximately 85% on test data and 92% on evaluation data. These results highlight the potential of DL approaches for accurate diagnosis of speech disorders through the analysis of spontaneous speech, offering promise for early detection and improved patient care.
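The abstract names MFCCs and Mel-spectrograms as the two input features but gives no extraction settings. As a rough illustration only, the sketch below computes log-mel energies and MFCCs for a single frame in plain Python. It is not the authors' pipeline: the 8 kHz sample rate, 256-sample frame, 20 mel filters, 13 coefficients, Hamming window, and the synthetic 440 Hz test tone are all assumptions chosen for the example, and the function names are hypothetical.

```python
import math


def hz_to_mel(f):
    """Convert frequency in Hz to the mel scale (common 2595*log10 variant)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """One-sided power spectrum of a Hamming-windowed frame via a direct DFT."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spec.append((re * re + im * im) / n)
    return spec


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    mel_max = hz_to_mel(sample_rate / 2)
    edges_hz = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def log_mel_energies(frame, sample_rate, n_filters=20):
    """Log energy in each mel band; one column of a log-mel spectrogram."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    return [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10) for row in bank]


def mfcc(frame, sample_rate, n_filters=20, n_coeffs=13):
    """MFCCs as the (unnormalized) DCT-II of the log-mel energies."""
    e = log_mel_energies(frame, sample_rate, n_filters)
    return [sum(e[j] * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j in range(n_filters))
            for c in range(n_coeffs)]


# Assumed toy input: one 256-sample frame of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 440 * i / sample_rate) for i in range(256)]
log_mel = log_mel_energies(frame, sample_rate)
coeffs = mfcc(frame, sample_rate)
```

Stacking `log_mel` vectors over successive frames would give a Mel-spectrogram, and stacking `coeffs` would give an MFCC matrix; either 2-D array is the kind of input a CNN classifier typically consumes. In practice a library such as librosa would replace the direct DFT used here for brevity.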
Pages: 739-751
Page count: 12