Classification of Voice Disorders Using a One-Dimensional Convolutional Neural Network

Cited by: 18
Authors
Fujimura, Shintaro [1]
Kojima, Tsuyoshi [2]
Okanoue, Yusuke [2]
Shoji, Kazuhiko [2]
Inoue, Masato [3]
Omori, Koichi [1]
Hori, Ryusuke [2]
Affiliations
[1] Kyoto Univ, Grad Sch Med, Dept Otolaryngol Head & Neck Surg, Kyoto, Japan
[2] Tenri Hosp, Dept Otolaryngol, Tenri, Nara, Japan
[3] Waseda Univ, Sch Adv Sci & Engn, Dept Elect Engn & Biosci, Shinjuku Ku, Tokyo, Japan
Keywords
Auditory perceptual voice analysis; GRBAS scale; Voice disorder; Deep learning; One-dimensional convolutional neural network; QUALITY; GRBAS; RELIABILITY; INDEX
DOI
10.1016/j.jvoice.2020.02.009
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
Objectives. Auditory-perceptual voice analysis is a standard method for quantifying pathological voice quality, but perceptual ratings are based on subjective evaluations and therefore may vary among examiners. Although many acoustic metrics have been studied for potential use in the objective evaluation of pathological voices, the interpretation of acoustic metrics in individual cases is difficult and the technique is not widely used by clinicians. The aim of this study was to establish standardized methods to discriminate grade, roughness, breathiness, asthenia, strain (GRBAS) scale scores of pathological voices directly using one-dimensional convolutional neural network (1D-CNN) models.
Methods. We constructed an original dataset utilizing 1,377 voice samples of sustained phonation of the vowel /a/. Each voice sample was rated by three experts according to the GRBAS scale and the median values were used as the correct answer label. We designed an end-to-end 1D-CNN model with a raw voice waveform input having a frame width of 9,600 samples. The models were trained with our original dataset for each GRBAS category individually and the model performance was tested by five-fold cross-validation.
Results. The accuracy, F1 score, and quadratic weighted Cohen's kappa for the testing dataset were determined. The metrics for the G scale showed the most balanced model performance, with high accuracy (0.771) and substantial agreement (kappa = 0.710). The model for the R scale had relatively high accuracy (0.765) and F1 score (0.743) with moderate agreement (kappa = 0.536). The accuracy (0.883) and the F1 score (0.865) for the S scale were the highest among the five categories, whereas the Cohen's kappa was the lowest (0.190).
Conclusions. The end-to-end 1D-CNN models can evaluate overall pathological voice quality with a reliability comparable to human evaluations.
The efficiency with which the machine learning models can be trained and evaluated is closely related to the dataset quality.
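The labeling rule and the agreement metric named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, the four-point 0 to 3 rating range is an assumption about the GRBAS scale that the abstract itself does not state, and the kappa formula is the standard quadratic-weighted definition rather than anything taken from the paper.

```python
from statistics import median


def median_label(ratings):
    """Return the median of an odd number of integer ratings.

    Mirrors the abstract's rule of using the median of three expert
    ratings as the reference label (hypothetical helper name).
    """
    if len(ratings) % 2 == 0:
        raise ValueError("expected an odd number of ratings")
    return int(median(ratings))


def quadratic_weighted_kappa(y_true, y_pred, n_classes=4):
    """Quadratic weighted Cohen's kappa for ordinal labels 0..n_classes-1.

    n_classes=4 assumes a 0-3 rating scale (an assumption, see lead-in).
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    # Confusion matrix: observed[i][j] counts items with true i, predicted j.
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1

    row_totals = [sum(observed[i]) for i in range(n_classes)]
    col_totals = [sum(observed[i][j] for i in range(n_classes))
                  for j in range(n_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            # Disagreement weight grows with the squared rating distance.
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count under chance agreement given the marginals.
            weighted_expected += weight * row_totals[i] * col_totals[j] / n

    if weighted_expected == 0.0:
        # Both raters used a single identical category: kappa is undefined.
        return float("nan")
    return 1.0 - weighted_observed / weighted_expected
```

Because the weights penalize large rating gaps more than adjacent ones, and because the expected term depends on how the labels are distributed, a model can score high accuracy yet low kappa when most samples fall in one category. That pattern would be consistent with the S-scale figures reported above, although the abstract does not give the class distribution.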
Pages: 15-20
Number of pages: 6