Deep convolutional neural network for detection of pathological speech

Cited by: 9
Authors
Vavrek, Lukas [1]
Hires, Mate [1]
Kumar, Dinesh [2]
Drotar, Peter [1]
Affiliations
[1] Tech Univ Kosice, Dept Comp & Informat, Fac Elect Engn & Informat, Kosice, Slovakia
[2] RMIT Univ, Sch Engn, Melbourne, Vic, Australia
Keywords
convolutional neural network; deep learning; pathological voice detection; transfer learning
DOI
10.1109/SAMI50585.2021.9378656
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper investigates the use of deep neural networks (DNNs) for the detection of pathological speech. The work is based on transfer learning with the state-of-the-art VGG16 convolutional neural network, and several approaches were trialed. We tested the different architectures on the Saarbrucken Voice Database (SVD). To overcome limitations due to language and education, the SVD was restricted to the /a/, /i/ and /u/ vowel subsets with sustained natural pitch. The scope of the study was limited to diseases classified as organic dysphonia. We trained multiple simple networks separately on the different vowel subsets and combined them into a single model ensemble. The model ensemble achieved an accuracy of 82% on pathological speech detection. Our results thus show that pre-trained convolutional neural networks can be used for transfer learning when the input is the spectrogram representation of the voice signal. This is significant because it overcomes the need for the very large datasets required to train DNNs, and it is suitable for computerized analysis of speech regardless of the patients' language skills.
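The abstract says that networks trained separately on the /a/, /i/ and /u/ subsets were combined into one ensemble, but this record does not state how the outputs were merged. The following is a minimal sketch that assumes soft voting (averaging each model's pathological-class probability) and a 0.5 decision threshold. The function names, the threshold, and the example probabilities are hypothetical and are not taken from the paper.

```python
from statistics import fmean

# Vowel subsets named in the abstract; one model is assumed per vowel.
VOWELS = ("a", "i", "u")


def ensemble_probability(per_vowel_probs):
    """Average the pathological-class probabilities of the per-vowel models.

    Soft voting is an assumption here; the abstract does not specify
    the combination rule.
    """
    missing = [v for v in VOWELS if v not in per_vowel_probs]
    if missing:
        raise ValueError(f"missing vowel predictions: {missing}")
    return fmean(per_vowel_probs[v] for v in VOWELS)


def ensemble_label(per_vowel_probs, threshold=0.5):
    """Map the averaged probability to a label (threshold is hypothetical)."""
    prob = ensemble_probability(per_vowel_probs)
    return "pathological" if prob >= threshold else "healthy"


# Made-up probabilities for one speaker, one value per vowel recording.
example = {"a": 0.9, "i": 0.4, "u": 0.5}
print(ensemble_label(example))
```

Averaging lets a confident prediction on one vowel outweigh weaker predictions on the others; majority voting over the three per-vowel labels would be an equally plausible reading of "model ensemble".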
Pages: 245-249
Number of pages: 5