Deep convolutional neural network for detection of pathological speech

Cited by: 9
Authors
Vavrek, Lukas [1]
Hires, Mate [1]
Kumar, Dinesh [2]
Drotar, Peter [1]
Affiliations
[1] Tech Univ Kosice, Dept Comp & Informat, Fac Elect Engn & Informat, Kosice, Slovakia
[2] RMIT Univ, Sch Engn, Melbourne, Vic, Australia
Keywords
convolutional neural network; deep learning; pathological voice detection; transfer learning
DOI
10.1109/SAMI50585.2021.9378656
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper investigates the use of deep neural networks (DNNs) for detecting pathological speech. The work is based on transfer learning with the state-of-the-art VGG16 convolutional neural network, and several approaches were trialed. We tested the different architectures on the Saarbrucken Voice Database (SVD). To overcome limitations due to language and education, we restricted the SVD to the /a/, /i/, and /u/ vowel subsets, sustained at natural pitch. The study covered only diseases classified as organic dysphonia. We trained multiple simple networks separately on the different vowel subsets and combined them into a single model ensemble. The ensemble achieved an accuracy of 82% on pathological speech detection. Our results thus show that pre-trained convolutional neural networks can be used for transfer learning when the input is a spectrogram representation of the voice signal. This is significant because it removes the need for the very large datasets required to train a DNN, and it suits computerized speech analysis that does not depend on the patient's language skills.
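The abstract does not say how the per-vowel networks are combined into the ensemble. As a minimal sketch, assuming soft voting (averaging each network's predicted probability of pathology), the combination step could look like the following. The function name `ensemble_predict`, the 0.5 threshold, and the example probabilities are hypothetical and are not taken from the paper.

```python
from statistics import mean


def ensemble_predict(vowel_probs, threshold=0.5):
    """Combine per-vowel pathological probabilities by soft voting.

    vowel_probs maps a vowel label ("a", "i", "u") to the probability,
    produced by the network trained on that vowel subset, that the
    recording is pathological. Averaging is an assumed combination rule.
    """
    if not vowel_probs:
        raise ValueError("need at least one per-vowel probability")
    avg = mean(vowel_probs.values())
    label = "pathological" if avg >= threshold else "healthy"
    return label, avg


# Hypothetical outputs of three separately trained per-vowel networks.
probs = {"a": 0.70, "i": 0.40, "u": 0.65}
label, avg = ensemble_predict(probs)
print(label, round(avg, 3))
```

In this illustration the /i/ network alone would call the recording healthy, but the averaged score still crosses the threshold, which is the kind of disagreement an ensemble is meant to smooth over.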
Pages: 245-249
Number of pages: 5