The Effect of Noise on Deep Learning for Classification of Pathological Voice

Cited by: 1
Authors
Hasebe, Koki [1]
Kojima, Tsuyoshi [1, 2]
Fujimura, Shintaro [1]
Tamura, Keiichi [1]
Kawai, Yoshitaka [1]
Kishimoto, Yo [1]
Omori, Koichi [1]
Affiliations
[1] Kyoto Univ, Grad Sch Med, Dept Otolaryngol Head & Neck Surg, Kyoto, Japan
[2] Kyoto Univ, Grad Sch Med, Dept Otolaryngol Head & Neck Surg, 54 Shogoin Kawahara Cho, Sakyo Ku, Kyoto 6068507, Japan
Source
LARYNGOSCOPE | 2024 / Vol. 134 / Issue 08
Funding
Japan Society for the Promotion of Science
Keywords
1D-CNN; GRBAS scale; machine learning; noise resilience; voice disorders;
DOI
10.1002/lary.31303
Chinese Library Classification number
R-3 [Medical research methods]; R3 [Basic medicine]
Discipline classification code
1001
Abstract
Objective: This study aimed to evaluate the significance of background noise in machine learning models assessing the GRBAS scale for voice disorders.
Methods: A dataset of 1406 voice samples was collected from retrospective data, and a 5-layer 1D convolutional neural network (CNN) model was constructed using TensorFlow. The dataset was divided into training, validation, and test data. Gaussian noise was added to test samples at various intensities to assess the model's noise resilience. The model's performance was evaluated using accuracy, F1 score, and quadratic weighted Cohen's kappa score.
Results: The model's performance on the GRBAS scale generally declined with increasing noise intensities. For the G scale, accuracy dropped from 70.9% (original) to 8.5% (at the highest noise), F1 score from 69.2% to 1.3%, and Cohen's kappa from 0.679 to 0.0. Similar declines were observed for the remaining RBAS components.
Conclusion: The model's performance was affected by background noise, with substantial decreases in evaluation metrics as noise levels intensified. Future research should explore noise-tolerant techniques, such as data augmentation, to improve the model's noise resilience in real-world settings.
Level of Evidence: This study evaluates a machine learning model using a single dataset without comparative controls. Given its non-comparative design and specific focus, it aligns with Level 4 evidence (Case-series) under the 2011 OCEBM guidelines.
Laryngoscope, 2024
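The evaluation procedure described in the abstract (Gaussian noise added to test samples, then scoring with accuracy and quadratic weighted Cohen's kappa) can be illustrated with a minimal Python sketch. This is not the authors' code. The abstract does not say how noise intensity was parameterized, so the sketch assumes it is the standard deviation of zero-mean noise. The names `add_gaussian_noise`, `quadratic_weighted_kappa`, `evaluate_under_noise`, and the `predict` callable are all hypothetical, and the F1 score is omitted for brevity.

```python
import random


def add_gaussian_noise(signal, noise_std, seed=None):
    """Return a copy of `signal` with zero-mean Gaussian noise added.

    Assumption: noise intensity is expressed as a standard deviation.
    """
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, noise_std) for x in signal]


def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Cohen's kappa with quadratic weights for ordinal labels 0..n_classes-1."""
    n = len(y_true)
    # Observed confusion matrix: rows are true labels, columns are predictions.
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    row = [sum(observed[i]) for i in range(n_classes)]
    col = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]
    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(n_classes):
        for j in range(n_classes):
            w = (i - j) ** 2 / (n_classes - 1) ** 2
            num += w * observed[i][j]
            den += w * row[i] * col[j] / n
    # Convention chosen for this sketch: return 0.0 when kappa is undefined.
    return 1.0 - num / den if den else 0.0


def evaluate_under_noise(predict, signals, labels, noise_levels, n_classes, seed=0):
    """Map each noise level to (accuracy, quadratic weighted kappa).

    `predict` is a hypothetical callable: one signal in, one integer grade out.
    """
    results = {}
    for std in noise_levels:
        preds = [
            predict(add_gaussian_noise(s, std, seed=seed + k))
            for k, s in enumerate(signals)
        ]
        accuracy = sum(p == t for p, t in zip(preds, labels)) / len(labels)
        results[std] = (accuracy, quadratic_weighted_kappa(labels, preds, n_classes))
    return results
```

Each GRBAS component is graded from 0 to 3, so `n_classes=4` would apply when scoring one component. With this definition, a classifier that outputs the same grade for every sample scores a kappa of 0 whenever the true grades vary, regardless of its accuracy.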
Pages: 3537-3541
Page count: 5