Classification of Voice Disorders Using a One-Dimensional Convolutional Neural Network

Cited by: 18
Authors
Fujimura, Shintaro [1 ]
Kojima, Tsuyoshi [2 ]
Okanoue, Yusuke [2 ]
Shoji, Kazuhiko [2 ]
Inoue, Masato [3 ]
Omori, Koichi [1 ]
Hori, Ryusuke [2 ]
Affiliations
[1] Kyoto Univ, Grad Sch Med, Dept Otolaryngol Head & Neck Surg, Kyoto, Japan
[2] Tenri Hosp, Dept Otolaryngol, Tenri, Nara, Japan
[3] Waseda Univ, Sch Adv Sci & Engn, Dept Elect Engn & Biosci, Shinjuku Ku, Tokyo, Japan
Keywords
Auditory perceptual voice analysis; GRBAS scale; Voice disorder; Deep learning; One-dimensional convolutional neural network; QUALITY; GRBAS; RELIABILITY; INDEX
DOI
10.1016/j.jvoice.2020.02.009
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
Objectives. Auditory-perceptual voice analysis is a standard method for quantifying pathological voice quality, but perceptual ratings are based on subjective evaluations and therefore may vary among examiners. Although many acoustic metrics have been studied for potential use in the objective evaluation of pathological voices, the interpretation of acoustic metrics in individual cases is difficult and the technique is not widely used by clinicians. The aim of this study was to establish standardized methods to discriminate grade, roughness, breathiness, asthenia, strain (GRBAS) scale scores of pathological voices directly using one-dimensional convolutional neural network (1D-CNN) models.
Methods. We constructed an original dataset utilizing 1,377 voice samples of sustained phonation of the vowel /a/. Each voice sample was rated by three experts according to the GRBAS scale and the median values were used as the correct answer label. We designed an end-to-end 1D-CNN model with a raw voice waveform input having a frame width of 9,600 samples. The models were trained with our original dataset for each GRBAS category individually and the model performance was tested by the five-fold cross validation method.
Results. The accuracy, F1 score, and quadratic weighted Cohen's kappa for the testing dataset were determined. The metrics for the G scale showed the most balanced model performance, with high accuracy (0.771) and substantial agreement (kappa = 0.710). The model for the R scale had relatively high accuracy (0.765) and F1 score (0.743) with moderate agreement (kappa = 0.536). The accuracy (0.883) and the F1 score (0.865) for the S scale were the highest among the five categories, whereas the Cohen's kappa was the lowest (0.190).
Conclusions. The end-to-end 1D-CNN models can evaluate overall pathological voice quality with a reliability comparable to human evaluations.
The efficiency with which the machine learning models can be trained and evaluated is closely related to the dataset quality.
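The labeling and agreement measures named in the abstract can be illustrated with a minimal sketch. It assumes integer GRBAS scores from 0 to 3 (four levels per category) and uses only the Python standard library. The function names and the sample ratings are hypothetical and are not taken from the paper. The F1 score is omitted because the abstract does not state which averaging was used.

```python
from statistics import median


def median_label(ratings):
    """Label for one sample: median of an odd number of integer ratings."""
    return int(median(ratings))


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted score equals the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def quadratic_weighted_kappa(y_true, y_pred, n_classes=4):
    """Cohen's kappa with quadratic disagreement weights.

    Assumes scores are integers in [0, n_classes - 1]. Undefined (division
    by zero) when both raters use a single identical class for all samples.
    """
    n = len(y_true)
    # Observed confusion counts: rows are labels, columns are predictions.
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    row_totals = [sum(observed[i]) for i in range(n_classes)]
    col_totals = [sum(observed[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count under chance agreement with the same marginals.
            weighted_expected += weight * row_totals[i] * col_totals[j] / n
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    # Hypothetical ratings from three experts for five samples.
    expert_ratings = [(0, 0, 1), (1, 2, 2), (3, 2, 3), (1, 1, 1), (0, 2, 1)]
    labels = [median_label(r) for r in expert_ratings]
    predictions = [0, 2, 2, 1, 0]  # hypothetical model outputs
    print("labels:", labels)
    print("accuracy:", accuracy(labels, predictions))
    print("kappa:", quadratic_weighted_kappa(labels, predictions))
```

Because the quadratic weights penalize a prediction by the squared distance from the label, a model that mostly predicts the majority score can reach high accuracy while its kappa stays low, which matches the pattern the abstract reports for the S scale.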
Pages: 15-20
Page count: 6