Gammatone spectral latitude features extraction for pathological voice detection and classification

Cited by: 18
Authors
Zhou, Changwei [1]
Wu, Yuanbo [1]
Fan, Ziqi [1]
Zhang, Xiaojun [1]
Wu, Di [1]
Tao, Zhi [1]
Affiliations
[1] Soochow Univ, Sch Optoelect Sci & Engn, 1 Shizi St, Suzhou, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Pathological voice; Gammatone spectral latitude features; Human auditory characteristic; Machine learning; ACOUSTIC ANALYSIS; ENTROPY FEATURES; ROBUST; MODEL; PARAMETER; ENERGY;
DOI
10.1016/j.apacoust.2021.108417
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
To improve the performance of pathological voice detection and classification, gammatone spectral latitude (GTSL) features are proposed. GTSL features are inspired by the nonlinear phenomena produced during human phonation, so they carry explicit physiological meaning, and they also incorporate human auditory perception characteristics. GTSL features quantify turbulent noise through nonlinear compression of the peak value and dynamic range of the spectra in each frequency channel. For pathological voice detection, GTSL features fitted better with traditional machine learning algorithms than traditional nonlinear features and gammatone cepstral coefficients (GTCCs). In the classification of healthy, neuromuscular and structural voices, the proposed features achieved an average accuracy of 99.6% on the Massachusetts Eye and Ear Infirmary (MEEI) database, which is 35.6% higher than other gammatone features. The accuracies on the other databases, the Saarbruecken Voice Database (SVD) and Hospital Universitario Principe de Asturias (HUPA), were 89.9% and 97.4%, respectively. The experimental results indicate that GTSL features can provide objective evaluation of voice diseases with low computational complexity and low database dependency. (C) 2021 Elsevier Ltd. All rights reserved.
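The abstract does not give the GTSL formula, so the following is only a rough sketch of the general idea it describes: pass a frame through a gammatone filterbank, then take a compressed peak value and a compressed dynamic range of the spectrum in each channel. Everything specific here is an assumption for illustration and not the authors' method: the function names (`erb_space`, `gammatone_ir`, `channel_peak_and_range`), the 4th-order FIR gammatone approximation, the 16 ERB-spaced channels, and the use of `log1p` as the nonlinear compression.

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore approximation)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def erb_space(f_lo, f_hi, n):
    """n center frequencies equally spaced on the ERB-rate scale."""
    e = np.linspace(21.4 * np.log10(4.37e-3 * f_lo + 1.0),
                    21.4 * np.log10(4.37e-3 * f_hi + 1.0), n)
    return (10.0 ** (e / 21.4) - 1.0) / 4.37e-3

def gammatone_ir(fc, fs, duration=0.025, order=4, b=1.019):
    """Truncated gammatone impulse response, normalized to unit energy."""
    t = np.arange(int(duration * fs)) / fs
    ir = (t ** (order - 1)
          * np.exp(-2.0 * np.pi * b * erb(fc) * t)
          * np.cos(2.0 * np.pi * fc * t))
    return ir / np.sqrt(np.sum(ir ** 2))

def channel_peak_and_range(frame, fs, n_channels=16, f_lo=100.0, f_hi=None):
    """Per-channel [compressed peak, compressed dynamic range].

    ASSUMPTION: log1p stands in for the paper's nonlinear compression,
    and dynamic range is taken as compressed max minus compressed min
    of the channel's magnitude spectrum. The paper may define both differently.
    """
    f_hi = 0.45 * fs if f_hi is None else f_hi
    feats = np.empty((n_channels, 2))
    for k, fc in enumerate(erb_space(f_lo, f_hi, n_channels)):
        # Filter the frame with this channel's gammatone response.
        y = np.convolve(frame, gammatone_ir(fc, fs), mode="same")
        # Magnitude spectrum of the windowed channel output.
        mag = np.abs(np.fft.rfft(y * np.hanning(len(y))))
        peak, floor = np.log1p(mag.max()), np.log1p(mag.min())
        feats[k] = (peak, peak - floor)
    return feats

# Usage on a synthetic frame: a 200 Hz tone plus weak noise (not real voice data).
fs = 16000
rng = np.random.default_rng(0)
t = np.arange(1024) / fs
frame = np.sin(2.0 * np.pi * 200.0 * t) + 0.05 * rng.standard_normal(t.size)
feats = channel_peak_and_range(frame, fs)
```

Each row of `feats` corresponds to one frequency channel. In a detection pipeline such vectors would typically be flattened per frame and passed to a conventional classifier, but the choice of classifier and frame settings is not stated in the abstract.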
Pages: 9
Related papers
50 in total
  • [1] Pathological voice detection based on gammatone short time spectral self-similarity
    Zhao D.
    Zhou C.
    Zhu X.
    Zhang X.
    Tao Z.
    Shengwu Yixue Gongchengxue Zazhi/Journal of Biomedical Engineering, 2022, 39 (04) : 694 - 701
  • [2] Pathological voice detection and binary classification using MPEG-7 audio features
    Muhammad, Ghulam
    Melhem, Moutasem
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2014, 11 : 1 - 9
  • [3] Pathological voice classification based on features dimension optimization
    School of Precision Instrument and Opto-Electronics Engineering, Tianjin University, Tianjin 300072, China
    Not specified
    Trans. Tianjin Univ., 2007, 6 (456-461):
  • [4] Pathological Voice Classification Based on Features Dimension Optimization
    彭策
    徐秋晶
    万柏坤
    陈文西
    Transactions of Tianjin University, 2007, (06) : 456 - 461
  • [5] Spectral Envelope and Periodic Component in Classification Trees for Pathological Voice Diagnostic
    Cordeiro, H.
    Fonseca, J.
    Meneses, C.
    2014 36TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2014, : 4607 - 4610
  • [6] GAMMATONE WAVELET FEATURES FOR SOUND CLASSIFICATION IN SURVEILLANCE APPLICATIONS
    Valero, Xavier
    Alias, Francesc
    2012 PROCEEDINGS OF THE 20TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2012, : 1658 - 1662
  • [7] Anomaly Detection in Voice Conversations through Spectral Features
    Akilandeswari, T.
    Balaji, Navuluri
    Nithish, Paidimarri
    Yadav, Go Su Neeraj
    2ND INTERNATIONAL CONFERENCE ON SUSTAINABLE COMPUTING AND SMART SYSTEMS, ICSCSS 2024, 2024, : 124 - 131
  • [8] Analysis and Detection of Pathological Voice Using Glottal Source Features
    Kadiri, Sudarsana Reddy
    Alku, Paavo
    IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2020, 14 (02) : 367 - 379
  • [9] Pathological voice detection using efficient combination of heterogeneous features
    Lee, Ji-Yeoun
    Jeong, Sangbae
    Hahn, Minsoo
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2008, E91D (02) : 367 - 370
  • [10] Pathological voice classification based on a single vowel's acoustic features
    Peng, Ce
    Chen, Wenxi
    Zhu, Xin
    Wan, Baikun
    Wei, Daming
    2007 CIT: 7TH IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY, PROCEEDINGS, 2007, : 1106 - +