The Effect of Noise on Deep Learning for Classification of Pathological Voice

Cited by: 1
Authors
Hasebe, Koki [1]
Kojima, Tsuyoshi [1,2]
Fujimura, Shintaro [1]
Tamura, Keiichi [1]
Kawai, Yoshitaka [1]
Kishimoto, Yo [1]
Omori, Koichi [1]
Affiliations
[1] Kyoto Univ, Grad Sch Med, Dept Otolaryngol Head & Neck Surg, Kyoto, Japan
[2] Kyoto Univ, Grad Sch Med, Dept Otolaryngol Head & Neck Surg, 54 Shogoin Kawahara Cho,Sakyo Ku, Kyoto 6068507, Japan
Source
LARYNGOSCOPE | 2024 / Vol. 134 / Issue 08
Funding
Japan Society for the Promotion of Science;
Keywords
1D-CNN; GRBAS scale; machine learning; noise resilience; voice disorders;
DOI
10.1002/lary.31303
Chinese Library Classification (CLC) number
R-3 [Medical research methods]; R3 [Basic medicine];
Discipline classification code
1001;
Abstract
Objective: This study aimed to evaluate the significance of background noise in machine learning models assessing the GRBAS scale for voice disorders.

Methods: A dataset of 1406 voice samples was collected from retrospective data, and a 5-layer 1D convolutional neural network (CNN) model was constructed using TensorFlow. The dataset was divided into training, validation, and test data. Gaussian noise was added to test samples at various intensities to assess the model's noise resilience. The model's performance was evaluated using accuracy, F1 score, and quadratic weighted Cohen's kappa score.

Results: The model's performance on the GRBAS scale generally declined with increasing noise intensities. For the G scale, accuracy dropped from 70.9% (original) to 8.5% (at the highest noise), F1 score from 69.2% to 1.3%, and Cohen's kappa from 0.679 to 0.0. Similar declines were observed for the remaining RBAS components.

Conclusion: The model's performance was affected by background noise, with substantial decreases in evaluation metrics as noise levels intensified. Future research should explore noise-tolerant techniques, such as data augmentation, to improve the model's noise resilience in real-world settings.

Level of Evidence: This study evaluates a machine learning model using a single dataset without comparative controls. Given its non-comparative design and specific focus, it aligns with Level 4 evidence (Case-series) under the 2011 OCEBM guidelines. Laryngoscope, 2024
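The evaluation procedure described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' code: the abstract does not say how noise intensity was parameterized, so `noise_std` (the standard deviation of the added noise) is an assumption, as are the function names and the choice of four ordinal grades (0 to 3) per GRBAS component.

```python
import numpy as np


def add_gaussian_noise(signal, noise_std, rng=None):
    """Return a copy of `signal` with zero-mean Gaussian noise added.

    `noise_std` is a hypothetical intensity parameter; the abstract does not
    specify how the noise levels were defined.
    """
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=np.float64)
    return signal + rng.normal(0.0, noise_std, size=signal.shape)


def quadratic_weighted_kappa(y_true, y_pred, n_classes=4):
    """Quadratic weighted Cohen's kappa for ordinal labels 0..n_classes-1."""
    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)

    # Observed confusion matrix (rows: true grade, columns: predicted grade).
    observed = np.zeros((n_classes, n_classes))
    np.add.at(observed, (y_true, y_pred), 1)

    # Counts expected if true and predicted grades were independent.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()

    # Quadratic penalty: disagreements cost more the further apart the grades are.
    idx = np.arange(n_classes)
    weights = (idx[:, None] - idx[None, :]) ** 2 / (n_classes - 1) ** 2

    denom = (weights * expected).sum()
    if denom == 0:
        # Both label sets contain a single identical grade: kappa is undefined.
        return float("nan")
    return 1.0 - (weights * observed).sum() / denom
```

In a test loop, each waveform would be passed through `add_gaussian_noise` at increasing `noise_std` values before prediction, and the predicted grades compared with the reference grades using `quadratic_weighted_kappa`. A predictor that outputs the same grade for every sample scores a kappa of 0 under this definition, regardless of its raw accuracy.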
Pages: 3537-3541
Page count: 5
Related papers
50 in total
  • [41] Combining Deep Learning with Traditional Features for Classification and Segmentation of Pathological Images of Breast Cancer
    He, Simin
    Ruan, Jun
    Long, Yi
    Wang, Jianlian
    Wu, Chenchen
    Ye, Guanglu
    Zhou, Jingfan
    Yue, Junqiu
    Zhang, Yanggeling
    2018 11TH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID), VOL 1, 2018, : 3 - 6
  • [42] Classification of Kinematic and Electromyographic Signals Associated with Pathological Tremor Using Machine and Deep Learning
    Pascual-Valdunciel, Alejandro
    Lopo-Martinez, Victor
    Beltran-Carrero, Alberto J.
    Sendra-Arranz, Rafael
    Gonzalez-Sanchez, Miguel
    Ricardo Perez-Sanchez, Javier
    Grandas, Francisco
    Farina, Dario
    Pons, Jose L.
    Oliveira Barroso, Filipe
    Gutierrez, Alvaro
    ENTROPY, 2023, 25 (01)
  • [43] Hologram classification of occluded and deformable objects with speckle noise contamination by deep learning
    Lam, H. H. S.
    Tsang, P. W. M.
    Poon, T-C
    JOURNAL OF THE OPTICAL SOCIETY OF AMERICA A-OPTICS IMAGE SCIENCE AND VISION, 2022, 39 (03) : 411 - 417
  • [44] Analyzing the Influence of Diverse Background Noises on Voice Transmission: A Deep Learning Approach to Noise Suppression
    Nogales, Alberto
    Caracuel-Cayuela, Javier
    Garcia-Tejedor, Alvaro J.
    APPLIED SCIENCES-BASEL, 2024, 14 (02):
  • [45] Classification of QPSK Signals with Different Phase Noise Levels Using Deep Learning
    Alhazmi, Hatim
    Almarhabi, Alhussain
    Samarkandi, Abdullah
    Alymani, Mofadal
    Alhazmi, Mohsen H.
    Sheng, Zikang
    Yao, Yu-Dong
    2020 29TH WIRELESS AND OPTICAL COMMUNICATIONS CONFERENCE (WOCC), 2020, : 194 - 198
  • [46] Effects of Label Noise on Deep Learning-Based Skin Cancer Classification
    Hekler, Achim
    Kather, Jakob N.
    Krieghoff-Henning, Eva
    Utikal, Jochen S.
    Meier, Friedegund
    Gellrich, Frank F.
    Belzen, Julius Upmeier Zu
    French, Lars
    Schlager, Justin G.
    Ghoreschi, Kamran
    Wilhelm, Tabea
    Kutzner, Heinz
    Berking, Carola
    Heppt, Markus, V
    Haferkamp, Sebastian
    Sondermann, Wiebke
    Schadendorf, Dirk
    Schilling, Bastian
    Izar, Benjamin
    Maron, Roman
    Schmitt, Max
    Froehling, Stefan
    Lipka, Daniel B.
    Brinker, Titus J.
    FRONTIERS IN MEDICINE, 2020, 7
  • [47] Pathological Voice Classification Based on Wavelet Packet Multiscale Analysis
    Zhang, Xuehui
    Hu, Weiping
    2018 INTERNATIONAL CONFERENCE ON ALGORITHMS, COMPUTING AND ARTIFICIAL INTELLIGENCE (ACAI 2018), 2018,
  • [48] Pathological Voice Analysis and Classification Based on Empirical Mode Decomposition
    Schlotthauer, Gaston
    Torres, Maria E.
    Rufiner, Hugo L.
    DEVELOPMENT OF MULTIMODAL INTERFACES: ACTIVE LISTENING AND SYNCHRONY, 2010, 5967 : 364 - +
  • [49] Fuzzy Logic Based Classification and Assessment of Pathological Voice Signals
    Aghazadeh, B. Seyed
    Heris, H. Khadivi
    2009 ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOLS 1-20, 2009, : 328 - +
  • [50] Classification of pathological and normal voice based on linear discriminant analysis
    Lee, Ji-Yeoun
    Jeong, SangBae
    Hahn, Minsoo
    ADAPTIVE AND NATURAL COMPUTING ALGORITHMS, PT 2, 2007, 4432 : 382 - +