Deep convolutional neural network for detection of pathological speech

Cited by: 9
Authors
Vavrek, Lukas [1]
Hires, Mate [1]
Kumar, Dinesh [2]
Drotar, Peter [1]
Affiliations
[1] Tech Univ Kosice, Dept Comp & Informat, Fac Elect Engn & Informat, Kosice, Slovakia
[2] RMIT Univ, Sch Engn, Melbourne, Vic, Australia
Keywords
convolutional neural network; deep learning; pathological voice detection; transfer learning
DOI
10.1109/SAMI50585.2021.9378656
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper investigates the use of deep neural networks (DNNs) for the detection of pathological speech. Transfer learning based on the state-of-the-art VGG16 convolutional neural network was the basis of this work, and several approaches were trialed. The architectures were tested on the Saarbrucken Voice Database (SVD). To overcome limitations due to language and education, the SVD was restricted to the /a/, /i/ and /u/ vowel subsets sustained at natural pitch. The study covered only diseases classified as organic dysphonia. Multiple simple networks were trained separately on the different vowel subsets and combined into a single model ensemble, which achieved an accuracy of 82% on pathological speech detection. These results show that pre-trained convolutional neural networks can be used for transfer learning when the input is the spectrogram representation of the voice signal. This is significant because it avoids the need for the very large datasets otherwise required to train a DNN, and it is suitable for computerized speech analysis regardless of the patient's language skills.
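The abstract does not say how the per-vowel networks are combined into the ensemble. As a minimal sketch only, assuming each vowel-specific network outputs a probability that the recording is pathological and that the ensemble averages those probabilities (soft voting), the combination step could look like the following. The function names, the 0.5 threshold, and the example probabilities are hypothetical and are not taken from the paper.

```python
from statistics import mean

# Vowel subsets named in the abstract; one network is assumed per vowel.
VOWELS = ("a", "i", "u")


def ensemble_probability(per_vowel_probs):
    """Average the per-vowel pathological probabilities (soft voting).

    per_vowel_probs maps each vowel to the probability produced by the
    network trained on that vowel. Averaging is an assumption here, not
    the combination rule confirmed by the paper.
    """
    missing = [v for v in VOWELS if v not in per_vowel_probs]
    if missing:
        raise ValueError(f"missing vowel predictions: {missing}")
    return mean(per_vowel_probs[v] for v in VOWELS)


def ensemble_label(per_vowel_probs, threshold=0.5):
    """Map the averaged probability to a class label (hypothetical threshold)."""
    if ensemble_probability(per_vowel_probs) >= threshold:
        return "pathological"
    return "healthy"


# Made-up probabilities for one speaker, for illustration only.
example = {"a": 0.75, "i": 0.5, "u": 0.25}
print(ensemble_probability(example), ensemble_label(example))
```

Other combination rules, such as majority voting over per-vowel labels or a small classifier trained on the three outputs, would fit the same interface.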
Pages: 245-249
Number of pages: 5