Gammatone spectral latitude features extraction for pathological voice detection and classification

Cited by: 18
Authors
Zhou, Changwei [1]
Wu, Yuanbo [1]
Fan, Ziqi [1]
Zhang, Xiaojun [1]
Wu, Di [1]
Tao, Zhi [1]
Affiliation
[1] Soochow Univ, Sch Optoelect Sci & Engn, 1 Shizi St, Suzhou, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Pathological voice; Gammatone spectral latitude features; Human auditory characteristic; Machine learning; ACOUSTIC ANALYSIS; ENTROPY FEATURES; ROBUST; MODEL; PARAMETER; ENERGY;
DOI
10.1016/j.apacoust.2021.108417
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
To improve the performance of pathological voice detection and classification, gammatone spectral latitude (GTSL) features are proposed. GTSL features are inspired by the nonlinear phenomena produced in human phonation and have an explicit physiological meaning. They also incorporate characteristics of human auditory perception. GTSL features quantify turbulent noise through nonlinear compression of the peak value and dynamic range of the spectrum in each frequency channel. For pathological voice detection, GTSL features worked better with traditional machine learning algorithms than did traditional nonlinear features and gammatone cepstral coefficients (GTCCs). In classifying healthy, neuromuscular, and structural voices, the proposed features achieved an average accuracy of 99.6% on the Massachusetts Eye and Ear Infirmary (MEEI) database, which is 35.6% higher than other gammatone features. The accuracies on two other databases, the Saarbruecken Voice Database (SVD) and Hospital Universitario Principe de Asturias (HUPA), were 89.9% and 97.4%, respectively. The experimental results indicate that GTSL features can provide an objective evaluation of voice diseases with low computational complexity and low database dependency. (C) 2021 Elsevier Ltd. All rights reserved.
Pages: 9
Related papers
50 in total
  • [21] A Joint Time-Frequency and Matrix Decomposition Feature Extraction Methodology for Pathological Voice Classification
    Behnaz Ghoraani
    Sridhar Krishnan
    EURASIP Journal on Advances in Signal Processing, 2009
  • [22] MSDFEN: Multi-scale dynamic feature extraction network for pathological voice detection
    Dai, Zhiyuan
    Jiang, Yuyang
    Cao, Laiyuan
    Zhang, Xiaojun
    Tao, Zhi
    APPLIED ACOUSTICS, 2025, 230
  • [23] UNDERWATER TARGET FEATURE EXTRACTION AND CLASSIFICATION BASED ON GAMMATONE FILTER AND MACHINE LEARNING
    Zhang, Wen
    Wu, Yanqun
    Wang, Dezhi
    Wang, Yongxian
    Wang, Yibo
    Zhang, Lilun
    PROCEEDINGS OF 2018 INTERNATIONAL CONFERENCE ON WAVELET ANALYSIS AND PATTERN RECOGNITION (ICWAPR), 2018, : 42 - 47
  • [24] Comparison of GIF- and SSL-based features in pathological-voice detection
    Sasou, Akira
    Chen, Yang
    INTERSPEECH 2023, 2023, : 2893 - 2897
  • [25] Ensemble and Multimodal Learning for Pathological Voice Classification
    Ariyanti, Whenty
    Hussain, Tassadaq
    Wang, Jia-Ching
    Wang, Chi-Tei
    Fang, Shih-Hau
    Tsao, Yu
    IEEE SENSORS LETTERS, 2021, 5 (07) : 1 - 4
  • [26] Normalized Modulation Spectral Features for Cross-Database Voice Pathology Detection
    Markaki, Maria
    Stylianou, Yannis
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 952 - 955
  • [27] Gammatone Cepstral Coefficients: Biologically Inspired Features for Non-Speech Audio Classification
    Valero, Xavier
    Alias, Francesc
    IEEE TRANSACTIONS ON MULTIMEDIA, 2012, 14 (06) : 1684 - 1689
  • [28] Spectral-Spatial Classification and Shape Features for Urban Road Centerline Extraction
    Shi, Wenzhong
    Miao, Zelang
    Wang, Qunming
    Zhang, Hua
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2014, 11 (04) : 788 - 792
  • [29] Pathological voice classification based on multi-domain features and deep hierarchical extreme learning machine
    Wang, Junlang
    Xu, Huoyao
    Peng, Xiangyu
    Liu, Jie
    He, Chaoming
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2023, 153 (01) : 423 - 435
  • [30] Pathological voice classification based on the features of an asymmetric fluid-structure interaction vocal cord model
    Zhang, Xiaojun
    Zhu, Xincheng
    Zhou, Changwei
    Tao, Zhi
    Zhao, Heming
    APPLIED ACOUSTICS, 2023, 207