The Effect of Noise on Deep Learning for Classification of Pathological Voice

Cited by: 1
Authors
Hasebe, Koki [1]
Kojima, Tsuyoshi [1, 2]
Fujimura, Shintaro [1]
Tamura, Keiichi [1]
Kawai, Yoshitaka [1]
Kishimoto, Yo [1]
Omori, Koichi [1]
Affiliations
[1] Kyoto Univ, Grad Sch Med, Dept Otolaryngol Head & Neck Surg, Kyoto, Japan
[2] Kyoto Univ, Grad Sch Med, Dept Otolaryngol Head & Neck Surg, 54 Shogoin Kawahara Cho,Sakyo Ku, Kyoto 6068507, Japan
Source
LARYNGOSCOPE | 2024 / Vol. 134 / Issue 08
Funding
Japan Society for the Promotion of Science
Keywords
1D-CNN; GRBAS scale; machine learning; noise resilience; voice disorders
DOI
10.1002/lary.31303
Chinese Library Classification
R-3 [Medical research methods]; R3 [Basic medicine]
Discipline classification code
1001
Abstract
Objective: This study aimed to evaluate the significance of background noise in machine learning models assessing the GRBAS scale for voice disorders.

Methods: A dataset of 1406 voice samples was collected from retrospective data, and a 5-layer 1D convolutional neural network (CNN) model was constructed using TensorFlow. The dataset was divided into training, validation, and test data. Gaussian noise was added to test samples at various intensities to assess the model's noise resilience. The model's performance was evaluated using accuracy, F1 score, and quadratic weighted Cohen's kappa score.

Results: The model's performance on the GRBAS scale generally declined with increasing noise intensities. For the G scale, accuracy dropped from 70.9% (original) to 8.5% (at the highest noise), F1 score from 69.2% to 1.3%, and Cohen's kappa from 0.679 to 0.0. Similar declines were observed for the remaining RBAS components.

Conclusion: The model's performance was affected by background noise, with substantial decreases in evaluation metrics as noise levels intensified. Future research should explore noise-tolerant techniques, such as data augmentation, to improve the model's noise resilience in real-world settings.

Level of Evidence: This study evaluates a machine learning model using a single dataset without comparative controls. Given its non-comparative design and specific focus, it aligns with Level 4 evidence (case series) under the 2011 OCEBM guidelines. Laryngoscope, 2024
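The noise test and the kappa metric named in the abstract can be illustrated with a minimal standard-library sketch. It is not the authors' code: the function names, the noise standard deviation, the seed, and the assumption of four ordinal classes (0 to 3) per GRBAS component are hypothetical choices made for illustration only, and the abstract does not state which noise intensities were used.

```python
import random


def add_gaussian_noise(samples, noise_std, seed=0):
    """Return a copy of `samples` with zero-mean Gaussian noise added.

    `noise_std` and `seed` are illustrative parameters, not values from the study.
    """
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, noise_std) for x in samples]


def quadratic_weighted_kappa(y_true, y_pred, n_classes=4):
    """Quadratic weighted Cohen's kappa for integer labels in [0, n_classes).

    n_classes=4 assumes each component is rated 0, 1, 2, or 3.
    """
    # Confusion matrix: rows are true labels, columns are predicted labels.
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1

    n = len(y_true)
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(n_classes))
                  for j in range(n_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            # Disagreement weight grows with the squared distance between grades.
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count if true and predicted labels were independent.
            weighted_expected += weight * row_totals[i] * col_totals[j] / n

    if weighted_expected == 0.0:
        raise ValueError("kappa is undefined when all labels fall in one class")
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    clean = [0.0, 0.5, -0.5, 0.25]
    noisy = add_gaussian_noise(clean, noise_std=0.1)
    print(noisy)

    true_grades = [0, 1, 2, 3, 1, 2]
    constant_prediction = [0] * len(true_grades)
    print(quadratic_weighted_kappa(true_grades, constant_prediction))
```

Under this definition, a classifier that outputs the same grade for every sample scores a kappa of zero whenever the true grades vary, which is one way a kappa of 0.0 can arise; the abstract itself does not say how the model behaved at the highest noise level.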
Pages: 3537-3541
Page count: 5