Gammatone spectral latitude features extraction for pathological voice detection and classification

Cited by: 18
Authors
Zhou, Changwei [1]
Wu, Yuanbo [1]
Fan, Ziqi [1]
Zhang, Xiaojun [1]
Wu, Di [1]
Tao, Zhi [1]
Affiliation
[1] Soochow Univ, Sch Optoelect Sci & Engn, 1 Shizi St, Suzhou, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Pathological voice; Gammatone spectral latitude features; Human auditory characteristic; Machine learning; ACOUSTIC ANALYSIS; ENTROPY FEATURES; ROBUST; MODEL; PARAMETER; ENERGY;
DOI
10.1016/j.apacoust.2021.108417
Chinese Library Classification
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
To improve the performance of pathological voice detection and classification, gammatone spectral latitude (GTSL) features were proposed. GTSL features are inspired by the nonlinear phenomena that arise in human phonation and therefore have an explicit physiological meaning. They also incorporate characteristics of human auditory perception. GTSL features quantify turbulent noise through nonlinear compression of the peak value and dynamic range of the spectrum in each frequency channel. For pathological voice detection, GTSL features fitted better with traditional machine learning algorithms than traditional nonlinear features and gammatone cepstral coefficients (GTCCs) did. In classifying healthy, neuromuscular, and structural voices, the proposed features achieved an average accuracy of 99.6% on the Massachusetts Eye and Ear Infirmary (MEEI) database, which is 35.6% higher than other gammatone features. The accuracies on two other databases, the Saarbruecken Voice Database (SVD) and Hospital Universitario Principe de Asturias (HUPA), were 89.9% and 97.4%, respectively. The experimental results indicate that GTSL features can provide objective evaluation of voice diseases with low computational complexity and low database dependency. © 2021 Elsevier Ltd. All rights reserved.
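The abstract does not state the GTSL formulas, so the following is only an illustrative sketch of the general idea it describes: pass a frame through a gammatone filterbank, then take a compressed peak value and dynamic range of the spectrum in each channel. Everything specific here is an assumption rather than the authors' definition: the logarithm as the nonlinear compression, the 16 channels between 100 Hz and 4 kHz, the 25 ms fourth-order impulse responses, and the hypothetical function names. The ERB bandwidth and ERB-number formulas are the standard Glasberg and Moore approximations.

```python
import numpy as np

def erb(f_hz):
    # Equivalent rectangular bandwidth in Hz (Glasberg and Moore approximation).
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def erb_space(f_lo, f_hi, n):
    # Centre frequencies spaced uniformly on the ERB-number scale.
    e_lo = 21.4 * np.log10(4.37 * f_lo / 1000.0 + 1.0)
    e_hi = 21.4 * np.log10(4.37 * f_hi / 1000.0 + 1.0)
    e = np.linspace(e_lo, e_hi, n)
    return (10.0 ** (e / 21.4) - 1.0) / 4.37 * 1000.0

def gammatone_ir(fc, fs, dur=0.025, order=4):
    # Gammatone impulse response: t^(n-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t).
    t = np.arange(int(dur * fs)) / fs
    b = 1.019 * erb(fc)
    g = t ** (order - 1) * np.exp(-2.0 * np.pi * b * t) * np.cos(2.0 * np.pi * fc * t)
    return g / np.sqrt(np.sum(g ** 2))  # normalise to unit energy

def channel_peak_and_range(x, fs, n_channels=16, f_lo=100.0, f_hi=4000.0,
                           n_fft=1024, eps=1e-12):
    # ASSUMPTION: log compression stands in for the paper's unspecified
    # nonlinear compression; this is not the published GTSL definition.
    feats = []
    for fc in erb_space(f_lo, f_hi, n_channels):
        y = np.convolve(x, gammatone_ir(fc, fs), mode="same")
        log_spec = np.log(np.abs(np.fft.rfft(y, n_fft)) ** 2 + eps)
        peak = log_spec.max()                # compressed peak value
        dyn_range = peak - log_spec.min()    # compressed dynamic range
        feats.append((peak, dyn_range))
    return np.asarray(feats)  # shape: (n_channels, 2)

# Example: one 1024-sample frame of a 1 kHz tone sampled at 16 kHz.
fs = 16000
frame = np.sin(2.0 * np.pi * 1000.0 * np.arange(1024) / fs)
features = channel_peak_and_range(frame, fs)
```

Each row of `features` holds the two values for one channel, so a frame yields a vector of length `2 * n_channels` that could be passed to a conventional classifier.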
Pages: 9